How to autoscale a DynamoDB table with a Lambda function

I recently implemented a way to autoscale DynamoDB based on expected (immediate) future workload. This approach works particularly well when a workload is fairly regular or scheduled, because AWS limits you to four provisioned-capacity decreases per table in a single UTC calendar day. If your workload varies greatly throughout the course of the day, I suggest you read this blog post.

Here is the basic architecture of how this works. In this example, files are dropped into an S3 bucket from an external system, and the application needs to load them into DynamoDB.

Let's walk through it together. First, a huge file is dropped into an S3 bucket (big-files). Each row in the file will become an item in DynamoDB. Next, a Lambda function (BatchItems) chunks that file into smaller files for processing. This is mostly necessary because of Lambda's five-minute execution time limit.

Next, a second Lambda function (QueueItems) takes each batched file and publishes every row as a message to an SQS queue (NewItems).

As messages land on the queue, a third Lambda function (LoadItems) retrieves them and loads the items into a DynamoDB table (Items).

If we just wanted to process the data, we'd be done at this point. However, we want to do so as quickly as possible while spending as little as possible.

To implement the autoscaling, we add a Lambda function (AutoScaleDynamo) that monitors the queue. Every minute it gets the approximate queue count and the provisioned capacity of the DynamoDB table. If the queue count is above a specified threshold and the table capacity is below a specified threshold, we programmatically increase the table capacity to help the messages process faster. Conversely, if the queue count is below the threshold and the table capacity is still elevated, we decrease the table capacity to save money.

Here it is in action. I put a large file (100,000 rows) into my S3 bucket. It was batched, loaded into the queue, and each row loaded into DynamoDB.

Here you can see the write capacity is steady at 25 units/second. It then jumps up to 300 units/second as the queue is processed. Once the queue is empty, the capacity is returned to 25 units/second.

Any latency in updating the capacity is covered by DynamoDB's burst capacity, so the queue can be processed as fast as possible. There wasn't a single throttled request to DynamoDB during this run.

Here is the code to monitor the queue and update the table capacity in .NET:

C#:
using System.Collections.Generic;
using System.Threading.Tasks;
using Amazon.DynamoDBv2;
using Amazon.DynamoDBv2.Model;
using Amazon.Lambda.Core;
using Amazon.SQS;
using Amazon.SQS.Model;

namespace AutoScaleDynamo
{
    public class AutoScaleDynamoHandler
    {
        public async Task<string> FunctionHandler(ILambdaContext context)
        {
            // Tables to manage, with their steady-state and elevated write capacities.
            var tableCapacities = new List<DynamoTableCapacity>
            {
                new DynamoTableCapacity
                {
                    TableName = "", // your DynamoDB table name
                    SteadyStateWriteCapacity = 25,
                    ElevatedWriteCapacity = 300
                }
            };

            foreach (var tableCapacity in tableCapacities)
            {
                int approximateNumberOfMessagesInQueue;
                // Scale up when the queue backs up past this many messages;
                // scale back down once it drains below it.
                var messageCountThreshold = 15;
                using (var sqsClient = new AmazonSQSClient())
                {
                    var queueRequest = new GetQueueAttributesRequest
                    {
                        QueueUrl = "", // your NewItems queue URL
                        AttributeNames = new List<string> {
                            "ApproximateNumberOfMessages"
                        }
                    };

                    var queueResponse = await sqsClient.GetQueueAttributesAsync(queueRequest);
                    approximateNumberOfMessagesInQueue = queueResponse.ApproximateNumberOfMessages;
                    context.Logger.LogLine(string.Format("Approximate number of messages in queue: {0}", approximateNumberOfMessagesInQueue));
                }

                var infoRequest = new DescribeTableRequest
                {
                    TableName = tableCapacity.TableName
                };

                long readCapacityUnits;
                long writeCapacityUnits;
                using (var dynamoClient = new AmazonDynamoDBClient())
                {
                    var infoResponse = await dynamoClient.DescribeTableAsync(infoRequest);
                    var description = infoResponse.Table;

                    readCapacityUnits = description.ProvisionedThroughput.ReadCapacityUnits;
                    writeCapacityUnits = description.ProvisionedThroughput.WriteCapacityUnits;

                    context.Logger.LogLine("Provision Throughput (reads/sec): " + description.ProvisionedThroughput.ReadCapacityUnits);
                    context.Logger.LogLine("Provision Throughput (writes/sec): " + description.ProvisionedThroughput.WriteCapacityUnits);
                }

                // The queue is backing up and we're still below elevated capacity: scale up.
                if (approximateNumberOfMessagesInQueue > messageCountThreshold && writeCapacityUnits < tableCapacity.ElevatedWriteCapacity)
                {
                    context.Logger.LogLine("Updating capacity of table: " + tableCapacity.TableName + " ElevatedWriteCapacity: " + tableCapacity.ElevatedWriteCapacity);

                    var updateRequest = new UpdateTableRequest
                    {
                        TableName = tableCapacity.TableName,
                        ProvisionedThroughput = new ProvisionedThroughput()
                        {
                            ReadCapacityUnits = readCapacityUnits,
                            WriteCapacityUnits = tableCapacity.ElevatedWriteCapacity
                        }
                    };

                    using (var dynamoClient = new AmazonDynamoDBClient())
                    {
                        await dynamoClient.UpdateTableAsync(updateRequest);
                    }

                    context.Logger.LogLine("Done updating capacity of table: " + tableCapacity.TableName);
                }
                // The queue has drained and capacity is still elevated: scale back down.
                // Each decrease counts against DynamoDB's daily scale-down limit.
                else if (approximateNumberOfMessagesInQueue < messageCountThreshold && writeCapacityUnits >= tableCapacity.ElevatedWriteCapacity)
                {
                    context.Logger.LogLine("Updating capacity of table: " + tableCapacity.TableName + " SteadyStateWriteCapacity: " + tableCapacity.SteadyStateWriteCapacity);

                    var updateRequest = new UpdateTableRequest
                    {
                        TableName = tableCapacity.TableName,
                        ProvisionedThroughput = new ProvisionedThroughput()
                        {
                            ReadCapacityUnits = readCapacityUnits,
                            WriteCapacityUnits = tableCapacity.SteadyStateWriteCapacity
                        }
                    };

                    using (var dynamoClient = new AmazonDynamoDBClient())
                    {
                        await dynamoClient.UpdateTableAsync(updateRequest);
                    }

                    context.Logger.LogLine("Done updating capacity of table: " + tableCapacity.TableName);
                }
                else
                {
                    context.Logger.LogLine("No capacity change needed");
                }
            }

            context.Logger.LogLine("Done updating all table capacities");

            return "Success";
        }
    }
}

And the DynamoTableCapacity class looks like this:

C#:
namespace AutoScaleDynamo
{
    public class DynamoTableCapacity
    {
        public string TableName { get; set; }
        public int SteadyStateReadCapacity { get; set; }
        public int SteadyStateWriteCapacity { get; set; }
        public int ElevatedReadCapacity { get; set; }
        public int ElevatedWriteCapacity { get; set; }
    }
}

Keep in mind that you could also simplify the above Lambda to just scale up or down based on the time of day. In that case, you'd have two Lambdas, one to scale up and one to scale down, each on its own cron trigger for whenever you wanted. The way I've described above is a bit more robust at responding to real-time demand, provided you can load your items into SQS.

Additionally, you could add more logic to scale up to a variable capacity based on how many messages are in the queue. Remember, you can scale up as many times as you like; it's scaling down that you can only do four times a day.
