Introduction
You may be wondering: what is DAX, why do we need it, and isn't Amazon DynamoDB fast enough on its own? This post explains what DAX is, why and when it should be used, its advantages, and its limitations.
DAX is an AWS add-on service for DynamoDB. Amazon DynamoDB Accelerator (DAX) is a fully managed, highly available, in-memory cache for Amazon DynamoDB that delivers up to a 10 times performance improvement, from milliseconds to microseconds, even at millions of requests per second.
Architecture
DAX is designed to run within an Amazon Virtual Private Cloud (Amazon VPC) environment. Amazon VPC defines a virtual network that closely resembles a traditional data center. With a VPC, you have control over its IP address range, subnets, routing tables, network gateways, and security settings. You can launch a DAX cluster in your virtual network and control access to the cluster using Amazon VPC security groups.
Creating DAX Cluster
- To create a DAX cluster, open the AWS console, search for DynamoDB, and click on DAX Clusters. Then click on Create Cluster.
- Enter a cluster name and description, select the node family and node type, specify the number of nodes, and click on
- Select or create a subnet and security group. When creating the security group, add an inbound rule for port 9111 if the DAX cluster is encrypted, or port 8111 if it is unencrypted.
- Grant the DAX cluster the IAM permissions it needs to access DynamoDB, and enable encryption if you want it.
- Check the advanced settings and review all steps before creating the cluster.
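The same console choices can also be expressed in code. The following is a minimal sketch, assuming the boto3 DAX client's create_cluster call is used; the cluster name, node type, subnet group, security group ID, and role ARN are placeholders rather than values from this post, and the port helper assumes that "encrypted" refers to encryption in transit.

```python
# Sketch only: every concrete value below is a placeholder, not taken from this post.


def dax_port(encrypt_in_transit: bool) -> int:
    """Port to open in the security group: 9111 when encrypted, 8111 otherwise."""
    return 9111 if encrypt_in_transit else 8111


def build_cluster_request(name, node_type, nodes, subnet_group,
                          security_group_id, iam_role_arn,
                          encrypt_in_transit=True, encrypt_at_rest=True):
    """Assemble keyword arguments for the boto3 DAX client's create_cluster call."""
    return {
        "ClusterName": name,
        "Description": "Sample DAX cluster",
        "NodeType": node_type,
        "ReplicationFactor": nodes,  # number of nodes in the cluster
        "SubnetGroupName": subnet_group,
        "SecurityGroupIds": [security_group_id],
        "IamRoleArn": iam_role_arn,  # role that lets DAX access DynamoDB
        "SSESpecification": {"Enabled": encrypt_at_rest},
        "ClusterEndpointEncryptionType": "TLS" if encrypt_in_transit else "NONE",
    }


def create_cluster(request):
    """Send the request. Needs boto3 and AWS credentials, so it is not called here."""
    import boto3  # imported lazily so the rest of the sketch runs without boto3
    return boto3.client("dax").create_cluster(**request)


request = build_cluster_request(
    name="my-cluster",
    node_type="dax.r5.large",
    nodes=3,
    subnet_group="my-dax-subnet-group",
    security_group_id="sg-0123456789abcdef0",
    iam_role_arn="arn:aws:iam::123456789012:role/MyDaxServiceRole",
)
print("Open port", dax_port(encrypt_in_transit=True), "for", request["ClusterName"])
```

Building the request separately from sending it makes it easy to review the settings before anything is created, much like the review step in the console.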
How does DAX Work?
- To test DAX, we will create a DynamoDB table and fetch data from it, then fetch the same data through the DAX cluster and compare the query execution times. First, create an EC2 instance and give it IAM permissions for DAX full access and DynamoDB full access. Make sure it uses the same VPC and security group as your cluster.
- Connect/SSH into your EC2 instance and run the following commands.
- pip install amazon-dax-client
- wget http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/samples/TryDax.zip
- unzip TryDax.zip
- python 01-create-table.py
- python 02-write-data.py
- python 03-getitem-test.py
- python 04-query-test.py
- python 05-scan-test.py
- The commands above install the DAX client, download the sample code, create a DynamoDB table, write some test data into the table, and finally fetch data with get-item, query, and scan operations. The screenshots below show the time needed for query execution after running the query commands. For now, these commands fetch data directly from the database.
- As you can see, we got an average execution time for each of the queries. Now we will run the same queries through the DAX client. To do that, run the following command to get the DAX cluster endpoint, or copy the endpoint from the DAX cluster dashboard, and then pass it to the test scripts.
- aws dax describe-clusters --query "Clusters[*].ClusterDiscoveryEndpoint"
- python 03-getitem-test.py dax://my-cluster.l6fzcv.dax-clusters.us-east-1.amazonaws.com
- python 04-query-test.py dax://my-cluster.l6fzcv.dax-clusters.us-east-1.amazonaws.com
- python 05-scan-test.py dax://my-cluster.l6fzcv.dax-clusters.us-east-1.amazonaws.com
- The following screenshots show the output after running the above commands.
- Comparing the elapsed query time from the database with the time from DAX, we can conclude that DAX is much faster and can improve application response times from milliseconds to microseconds. If required, we can delete the table we created by running the command below.
- python 06-delete-table.py
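The comparison above boils down to timing the same read many times and averaging. Here is a minimal sketch of that measurement, not the sample scripts' actual code: average_latency_ms and make_fake_reader are hypothetical helpers, and the two readers are stand-ins that only simulate a slower and a faster path. In a real test, the operation passed in would be a read issued through a DynamoDB client or through a DAX client.

```python
import time
from statistics import mean


def average_latency_ms(operation, iterations=50):
    """Call `operation` repeatedly and return the mean elapsed time in milliseconds."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        operation()
        samples.append((time.perf_counter() - start) * 1000.0)
    return mean(samples)


def make_fake_reader(delay_seconds):
    """Hypothetical stand-in for a table read; it sleeps instead of using the network."""
    def read():
        time.sleep(delay_seconds)
        return {"pk": 1, "sk": 1}
    return read


# Simulated "direct" and "cached" paths; the delays are illustrative, not measurements.
direct = average_latency_ms(make_fake_reader(0.002), iterations=10)
cached = average_latency_ms(make_fake_reader(0.0), iterations=10)
print(f"direct: {direct:.3f} ms, cached: {cached:.3f} ms")
```

Averaging over several iterations smooths out one-off spikes, which matters when the two numbers being compared differ by orders of magnitude.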
When should we use it?
- Consistent/Burst Traffic: When incoming traffic repeatedly hits the same set of primary/secondary keys in a consistent pattern.
- Faster Response: When you need responses in microseconds rather than milliseconds.
- Eventual Consistency: When your application can tolerate data that is not immediately up to date. Changes in the DynamoDB table might take some time to be reflected in the DAX cluster. This delay is usually short, but it can become significant in heavy traffic scenarios.
- Read Intensive: DAX is a cache, and caches are used mainly for read-intensive operations and data access, not for write-intensive operations. It is best suited to read-intensive applications with heavy traffic that can be split across multiple DAX nodes.
- Save cost on DynamoDB RCU: Another benefit of using DAX is that it can let you reduce your provisioned read capacity in DynamoDB. Because DAX serves cached data from memory, fewer read requests reach your tables. Lowering the provisioned read capacity on your DynamoDB tables also lowers your overall costs.
- Hot Key Data Retrieval: When the same items or queries are requested over and over, DAX serves them from the cache until the TTL expires (the default TTL is 5 minutes); you can increase the TTL if required.
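The points above about TTL, eventual consistency, and RCU savings can be illustrated with a toy model. This is a minimal sketch of a read-through cache with a TTL, not how DAX is implemented; ReadThroughCache and rcus_needed are hypothetical names, and the RCU estimate assumes items of up to 4 KB read with eventual consistency at 0.5 RCU per read.

```python
import math
import time


class ReadThroughCache:
    """Toy read-through cache: serve from memory until an entry's TTL expires."""

    def __init__(self, backend_get, ttl_seconds=300.0, clock=time.monotonic):
        self._backend_get = backend_get
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (value, time the value was stored)
        self.hits = 0
        self.misses = 0

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[1] < self._ttl:
            self.hits += 1
            return entry[0]
        # Miss or expired entry: read from the backing table and cache the result.
        self.misses += 1
        value = self._backend_get(key)
        self._entries[key] = (value, now)
        return value


def rcus_needed(reads_per_second, hit_ratio, rcu_per_read=0.5):
    """Estimate read capacity for the reads that miss the cache (assumption-based)."""
    return math.ceil(reads_per_second * (1.0 - hit_ratio) * rcu_per_read)


# A fake table and a controllable clock keep the example deterministic.
now = [0.0]
table = {"user#1": "v1"}
backend_reads = []


def backend_get(key):
    backend_reads.append(key)
    return table[key]


cache = ReadThroughCache(backend_get, ttl_seconds=300.0, clock=lambda: now[0])
first = cache.get("user#1")    # miss: goes to the table
second = cache.get("user#1")   # hit: served from memory
table["user#1"] = "v2"         # table updated without going through the cache
now[0] = 100.0
stale = cache.get("user#1")    # still the cached value until the TTL expires
now[0] = 301.0
fresh = cache.get("user#1")    # TTL expired: the new value is read from the table
print(first, second, stale, fresh, len(backend_reads))
```

Four reads reach the application but only two reach the table, which is the effect that lets provisioned read capacity be lowered for hot keys. The stale read in the middle is the eventual-consistency trade-off described above.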
Conclusion
As we can see from the above examples, DAX scales across multiple nodes, delivers microsecond read latency, and saves costs on RCUs under heavy read traffic.
Written by – Mohammed Shahid Adoni