Improving Query Performance in DynamoDB with Sparse Indexes and .NET

A Sparse Index is a special type of Index, where the Index does not have an entry for every item in the base table. Sparse Indexes optimize storage and performance by selectively including only a subset of items from the base table. Learn how to use this to your advantage.

Rahul Pulikkot Nath
Rahul Pulikkot Nath

Table of Contents

A DynamoDB Sparse Index is a Secondary Index in Amazon DynamoDB that contains a subset of the items from the base table.

Both the Global and Local Secondary Index in DynamoDB is by default sparse. DynamoDB writes an entry into the index only if the Index key values are present. A Sparse Index is a special type of Index, where the Index does not have an entry for every item in the base table.

Sparse Indexes optimize storage and performance by selectively including only a subset of items from the base table.

This article is sponsored by AWS and is part of my AWS Series.

In this post, let’s learn more about,

  • DynamoDB Sparse Index
  • How to create a DynamoDB Sparse Index
  • Populate a DynamoDB Sparse Index
  • Query a DynamoDB Sparse Index

The sample code below is on the WeatherForecast table, which has CityName and Date as the hash and range key, respectively.

If you are new to DynamoDB, I highly recommend checking out my Getting Started with AWS DynamoDB For the .NET Developer blog post below, where I also show you how to set up the table used in this blog post.

AWS DynamoDB For The .NET Developer: How To Easily Get Started
Learn how to get started with AWS DynamoDB with .NET Core by updating the default ASP NET Web API template to use DynamoDB as it’s data store. We will learn to do basic Creat, Read, Update and Delete operations from the API.

Creating a DynamoDB Sparse Index

For any item in a table, DynamoDB writes a corresponding index entry only if the index sort key value is present in the item. If the sort key doesn't appear in every table item, the index is said to be sparse.

Since they are either a Global Secondary Index or a Local Secondary Index, you create them just like creating the other Index. You can do this from the AWS Console UI under the `Secondary Indexes' section of the DynamoDB table details.

Create Secondary Index option in AWS Console. You can create Global and Local Index when creating the table.

You can create a Local Secondary Index only at the time of DynamoDB table creation.

You can find more information on Index Creation in the below two blog posts.

The difference with Sparse Index is with the key selection for the DynamoDB table Index.

Based on the application and the query that you are optimizing for, choose the keys such that only the items required are copied over and available in the Index.

In a DynamoDB table, each key value must be unique. However, the key values in a Global Secondary Index do not need to be unique.

For example, the below Index on the WeatherForecast table uses the Date as the partition key and the IsExtermeWeatherConditions flag on the item as its sort key.

The IsExtermeWeatherConditions is set on the object only if it was extreme weather conditions on that given day for that city in the original weather forecast object. Since the property is conditionally populated in the base table, all the items do not make it to the Index - hence it will be a Sparse Index.

This Sparse Index allows us to get all the cities that had extreme weather conditions given a specific Date.

Populating DynamoDB Sparse Index From .NET

Based on the programming language you are using to write the DynamoDB items you can conditionally populate the IsExtermeWeatherConditions property on the WeatherForecast object.

When using the .NET programming language this is easily done by specifying the IsExtermeWeatherConditions as a Nullable boolean type, as shown below.

public class WeatherForecastTable
    public string CityName { get; set; }
    public DateTime Date { get; set; }
    public int TemperatureC { get; set; }
    public int TemperatureF => 32 + (int)(TemperatureC / 0.5556);
    public string Summary { get; set; }
    public bool? IsExtremeWeatherConditions { get; set; }

When using the DynamoDBContext to save the item to the table, it includes the IsExtermeWeatherConditions property only if it has a value specified. It ignores the property if it's set to NULL.

So from the application, set this value to be true only when the conditions are extreme. This depends on your business and domain logic.

Based on the application you are building, you can choose the property and type appropriate to your data and the business domain.

Only items that have the IsExtermeWeatherConditions property set on the base table item is copied over to the Index - Date-IsExtremeWeatherConditions-index

Querying DynamoDB Sparse Index From .NET

Since a Sparse Index is a special type of Index, querying it is similar to querying any other Index.

Below is the query that you can use to get all the extreme weather forecast data for a specified date. It specifies the IndexName on the QueryRequest object and specifies the Date in the KeyConditionExpression.

Since Date is a reserved keyword in DynamoDB it passes it as a parameterized key, with the original property names specified in the ExpressionAttributeNames.

public async Task<IEnumerable<ExtremeWeatherForecastListItem>> GetExtremeWeatherConditionsUsingSparseGSIQuery(DateTime startDate)
    var request = new QueryRequest()
        TableName = nameof(WeatherForecastTable),
        IndexName = "Date-IsExtremeWeatherConditions-index",
        KeyConditionExpression = "#Date = :startDate",
        ExpressionAttributeNames = new Dictionary<string, string>()
            {"#Date", "Date"}
        ExpressionAttributeValues = new Dictionary<string, AttributeValue>()
            {":startDate", new AttributeValue(startDate.ToString(AWSSDKUtils.ISO8601DateFormat))},

    var response = await _amazonDynamoDbClient.QueryAsync(request);

    return response.Items

The above query returns all the items for the specified date that had extreme weather conditions.

Since this is a frequently queried data in our application, adding this specific Index and designing it to be a Sparse Index, helps to minimize storage costs and also return all the items quickly.