Improving Query Performance in DynamoDB with Sparse Indexes and .NET
A Sparse Index is a special type of Index, where the Index does not have an entry for every item in the base table. Sparse Indexes optimize storage and performance by selectively including only a subset of items from the base table. Learn how to use this to your advantage.
Table of Contents
A DynamoDB Sparse Index is a Secondary Index in Amazon DynamoDB that contains a subset of the items from the base table.
Both the Global and Local Secondary Index in DynamoDB is by default sparse. DynamoDB writes an entry into the index only if the Index key values are present. A Sparse Index is a special type of Index, where the Index does not have an entry for every item in the base table.
Sparse Indexes optimize storage and performance by selectively including only a subset of items from the base table.
This article is sponsored by AWS and is part of my AWS Series.
In this post, let’s learn more about,
- DynamoDB Sparse Index
- How to create a DynamoDB Sparse Index
- Populate a DynamoDB Sparse Index
- Query a DynamoDB Sparse Index
The sample code below is on the WeatherForecast
table, which has CityName
and Date
as the hash and range key, respectively.
If you are new to DynamoDB, I highly recommend checking out my Getting Started with AWS DynamoDB For the .NET Developer blog post below, where I also show you how to set up the table used in this blog post.
Creating a DynamoDB Sparse Index
For any item in a table, DynamoDB writes a corresponding index entry only if the index sort key value is present in the item. If the sort key doesn't appear in every table item, the index is said to be sparse.
Since they are either a Global Secondary Index or a Local Secondary Index, you create them just like creating the other Index. You can do this from the AWS Console UI under the `Secondary Indexes' section of the DynamoDB table details.
You can create a Local Secondary Index only at the time of DynamoDB table creation.
You can find more information on Index Creation in the below two blog posts.
- Exploring Global Secondary Index: Advanced Querying in DynamoDB From .NET
- Improving Queries Using Local Secondary Index in DynamoDB with .NET
The difference with Sparse Index is with the key selection for the DynamoDB table Index.
Based on the application and the query that you are optimizing for, choose the keys such that only the items required are copied over and available in the Index.
For example, the below Index on the WeatherForecast
table uses the Date as the partition key and the IsExtermeWeatherConditions
flag on the item as its sort key.
The IsExtermeWeatherConditions
is set on the object only if it was extreme weather conditions on that given day for that city in the original weather forecast object. Since the property is conditionally populated in the base table, all the items do not make it to the Index - hence it will be a Sparse Index.
This Sparse Index allows us to get all the cities that had extreme weather conditions given a specific Date.
Populating DynamoDB Sparse Index From .NET
Based on the programming language you are using to write the DynamoDB items you can conditionally populate the IsExtermeWeatherConditions
property on the WeatherForecast object.
When using the .NET programming language this is easily done by specifying the IsExtermeWeatherConditions
as a Nullable boolean type, as shown below.
public class WeatherForecastTable
{
public string CityName { get; set; }
public DateTime Date { get; set; }
public int TemperatureC { get; set; }
public int TemperatureF => 32 + (int)(TemperatureC / 0.5556);
public string Summary { get; set; }
public bool? IsExtremeWeatherConditions { get; set; }
...
}
When using the DynamoDBContext
to save the item to the table, it includes the IsExtermeWeatherConditions
property only if it has a value specified. It ignores the property if it's set to NULL
.
So from the application, set this value to be true only when the conditions are extreme. This depends on your business and domain logic.
Based on the application you are building, you can choose the property and type appropriate to your data and the business domain.
Only items that have the IsExtermeWeatherConditions
property set on the base table item is copied over to the Index - Date-IsExtremeWeatherConditions-index
Querying DynamoDB Sparse Index From .NET
Since a Sparse Index is a special type of Index, querying it is similar to querying any other Index.
Below is the query that you can use to get all the extreme weather forecast data for a specified date. It specifies the IndexName on the QueryRequest
object and specifies the Date
in the KeyConditionExpression.
Since Date
is a reserved keyword in DynamoDB it passes it as a parameterized key, with the original property names specified in the ExpressionAttributeNames
.
[HttpGet("gsi-query-sparse-index")]
public async Task<IEnumerable<ExtremeWeatherForecastListItem>> GetExtremeWeatherConditionsUsingSparseGSIQuery(DateTime startDate)
{
var request = new QueryRequest()
{
TableName = nameof(WeatherForecastTable),
IndexName = "Date-IsExtremeWeatherConditions-index",
KeyConditionExpression = "#Date = :startDate",
ExpressionAttributeNames = new Dictionary<string, string>()
{
{"#Date", "Date"}
},
ExpressionAttributeValues = new Dictionary<string, AttributeValue>()
{
{":startDate", new AttributeValue(startDate.ToString(AWSSDKUtils.ISO8601DateFormat))},
},
};
var response = await _amazonDynamoDbClient.QueryAsync(request);
return response.Items
.Select(Document.FromAttributeMap)
.Select(_dynamoDbContext.FromDocument<WeatherForecastListItem>);
}
The above query returns all the items for the specified date that had extreme weather conditions.
Since this is a frequently queried data in our application, adding this specific Index and designing it to be a Sparse Index, helps to minimize storage costs and also return all the items quickly.
Rahul Nath Newsletter
Join the newsletter to receive the latest updates in your inbox.