Connect DynamoDB to Tableau: 2 Easy Methods

Divij Chawla • Last Modified: December 29th, 2022

DynamoDB to Tableau

Tableau and many other BI tools discover patterns and help visualize data to make it more understandable and boost the analysis process. These help in gaining a better understanding of the metrics, trends, the user needs, performance, and play a key role in various data-driven decisions.  A NoSQL database like DynamoDB comes in handy in managing your daily big data needs. With features like horizontal scaling, no compulsion for using a fixed schema among all stored documents, etc. it lets you bring in data of all different kinds and store it in a non-relational format,  making it the perfect choice to complement a powerful BI tool like Tableau to perform an insightful analysis.  

This article aims at answering all your queries about connecting DynamoDB to Tableau. Follow our easy step-by-step guide to master the skill of connecting DynamoDB to Tableau using various methods.

Table of Contents

Introduction to DynamoDB

DynamoDB Logo

DynamoDB is a managed NoSQL database offered by Amazon. It is available as a part of Amazon’s data warehousing suite called Amazon Web Services (AWS). Being a NoSQL database, it stores data in the form of collections, which further contain documents that store data in the form of a key-value pair. Each record is accessed with the help of a primary key and key-value pairs, using its proprietary querying layer.

DynamoDB makes use of a collection of nodes, each of which contains several primary keys, so when a query executes, only those nodes which contain those primary keys get activated and fetch data. It is known for its scalability, ease of use, reliability & no compulsion for using a fixed schema among all stored documents, giving them the ability to have varying fields (columns)

For further information on DynamoDB, you can check the official site here.

Introduction to Tableau

Tableau Logo.

Tableau is a powerful business intelligence tool used to turn raw data into an understandable format. It is a tool popularly used to visualize data and can be understood even by a non-technical user. 

It creates visualizations with the help of dashboards and worksheets, this helps users perform real-time analysis in a very fast and secure manner. It doesn’t require any programming skill or technical background to operate it. Tableau is not only available as a desktop application but also as an online service that can be accessed by any user on the go. All you need to do is sign in your account on your device.

For further information on Tableau, you can check the official site here.

Simplify Data Analysis with Hevo’s No-code Data Pipeline

Hevo Data, a No-code Data Pipeline helps to load data from any data source such as Databases, SaaS applications, Cloud Storage, SDKs, and Streaming Services and simplifies the ETL process. It supports 100+ data sources (including 30+ free data sources) like Asana and is a 3-step process by just selecting the data source, providing valid credentials, and choosing the destination. Hevo not only loads the data onto the desired Data Warehouse/destination but also enriches the data and transforms it into an analysis-ready form without having to write a single line of code.

GET STARTED WITH HEVO FOR FREE[/hevoButton]

Its completely automated pipeline offers data to be delivered in real-time without any loss from source to destination. Its fault-tolerant and scalable architecture ensure that the data is handled in a secure, consistent manner with zero data loss and supports different forms of data. The solutions provided are consistent and work with different BI tools as well.

Check out why Hevo is the Best:

  • Secure: Hevo has a fault-tolerant architecture that ensures that the data is handled in a secure, consistent manner with zero data loss.
  • Schema Management: Hevo takes away the tedious task of schema management & automatically detects the schema of incoming data and maps it to the destination schema.
  • Minimal Learning: Hevo, with its simple and interactive UI, is extremely simple for new customers to work on and perform operations.
  • Hevo Is Built To Scale: As the number of sources and the volume of your data grows, Hevo scales horizontally, handling millions of records per minute with very little latency.
  • Incremental Data Load: Hevo allows the transfer of data that has been modified in real-time. This ensures efficient utilization of bandwidth on both ends.
  • Live Support: The Hevo team is available round the clock to extend exceptional support to its customers through chat, email, and support calls.
  • Live Monitoring: Hevo allows you to monitor the data flow and check where your data is at a particular point in time.
SIGN UP HERE FOR A 14-DAY FREE TRIAL

Prerequisites

  • Working knowledge of DynamoDB.
  • Working knowledge of Tableau.
  • Tableau installed at the host workstation.
  • Data stored in DynamoDB documents.

Methods to Connect DynamoDB to Tableau

There are many ways in which you can establish a connection between DynamoDB & Tableau:

Download the Whitepaper on Automating Data Integration
Download the Whitepaper on Automating Data Integration
Download the Whitepaper on Automating Data Integration
Learn the key benefits of automating Data Integration

Method 1: Connecting DynamoDB to Tableau via a Data Warehouse

It is essential to load the NoSQL data from DynamoDB into a data warehouse such as Amazon Redshift, Snowflake, etc. to transform it into an analysis-ready form and load it to Tableau to start the analysis. In this blog, you will learn how to transfer your DynamoDB data into Amazon Redshift and then load it into Tableau. This can be implemented as follows:

Step 1: Exporting data from DynamoDB to Amazon Redshift

To export data from DynamoDB, you can use either of the following methods:

Using Amazon Redshift’s Copy Command to export data from DynamoDB

One of the simplest ways of transferring data from DynamoDB to Amazon Redshift is to use the Amazon Redshift’s Copy Command. This command requires you to provide the URL for your DynamoDB data as one of the parameters. It then automatically performs the entire data transfer. You can use the following syntax for using the copy command:

copy <target_tablename> from 'dynamodb://<source_table_name>'
authorization
read ratio '<integer>';

For example, if you want to copy data from the sales table in DynamoDB to a table called sales_red in your Amazon Redshift data warehouse, you can use the following command:

COPY sales_red from dynamodb://sales
credentials ‘aws_access_key_id = <access_key_id>;aws_secret_access_key=<secret_access_key>
readratio 40;

The read ratio parameter specifies the provisioned capacity for the DynamoDB instance being used in the transfer process. If this value is kept very high, it can easily degrade the performance. It is, therefore, important to keep it below 50.

This is how you can use Amazon Redshift’s Copy Command to transfer your data to Amazon Redshift.

Using AWS Data Pipeline to export data from DynamoDB

The AWS Data Pipeline comes in handy for transferring data from one application to another, that is a part of the AWS data warehousing suite. It will first copy the data to S3 and then use the Copy Command to load data into Amazon Redshift.

Use the data management console to create a data pipeline. Select the export DynamoDB table to S3 option from the source drop-down list.

Creating an S3 Pipeline.

You can check our easy step-by-step guide on AWS Pipeline here. This will help you set it up easily.

Once you have exported your data to S3 using the pipeline, provide the path for the JSON file in the source path parameter. The Copy Command automatically identifies the tables using the attributes in the JSON file. You can use the following command for the same:

COPY sales_red from s3://my_bucket/sales.json credentials ‘aws_access_key_id = <access_key_id>;aws_secret_access_key=<secret_access_key> Json = ‘auto’

This is how you can use AWS data pipeline to transfer your data to Amazon Redshift.

Step 2: Loading data from Amazon Redshift to Tableau

Once you have transferred your DynamoDB data into Amazon Redshift, you can use the Tableau’s in-built Amazon Redshift connector to load data into Tableau. 

This method requires you to install the Amazon Redshift ODBC driver. The ODBC driver allows Tableau to interact with the data stored in your Amazon Redshift clusters and hence perform an analysis easily. You can download it from the official AWS site here. This can be implemented as follows:

Launch Tableau on your workstation and select Amazon Redshift from the connect column on the left. This will open a dialogue box where you need to provide the connection details for your cluster.

Tableau.

To connect with Tableau, you need to provide the username and password for the desired database along with information about the server and port (5439 is the default port) which hosts your database.

Click on “ok” to establish a connection. This will enable a connection without SSL. To establish an SSL enabled connection, click the Require SSL checkbox before you sign in.

Amazon Redshift Connection.- Connect DynamoDB to Tableau.

This is how you connect DynamoDB to Tableau via Amazon Redshift.

For further information on how to connect Amazon Redshift to Tableau, you can check our easy step-by-step guide here.

Limitations of this method:

  • The copy command compares attributes of DynamoDB data with the names of the columns of Amazon Redshift to transfer the data. In case it doesn’t find a match, it automatically discards the fields or replaces the values with NULL or empty values.
  • DynamoDB and Amazon Redshift deal with table names differently in terms of allowed characters and limits. Amazon Redshift limits the characters at 127 whereas DynamoDB allows up to 255 characters. This can be a big problem while copying data. 
  • The copy command doesn’t support all data types.
  • This method doesn’t work well if the data transfer is a recurring process. This requires transferring data into a temporary table and then using joins to find common columns to update the target table.

Method 2: Using ODBC Driver to connect DynamoDB to Tableau

ODBC drivers can also be used to establish a connection between DynamoDB to Tableau. The ODBC driver can add a data source such as DynamoDB and then be used to connect the DynamoDB instance to Tableau. 

This can be implemented using the following steps:

Step 1: Installing DynamoDB ODBC Driver and DSN Setup

Download the latest version of Simba ODBC driver for DynamoDB on your system from the official Simba site here.

Once the ODBC driver is installed on your system, launch it to start the DSN set up process. Choose the correct 64 or 32-bit version and select the System DSN tab.

Click on the add button and select Simba ODBC driver for DynamoDB.

You can set up DSN for DynamoDB in two different ways, you can either set it up for connecting with DynamoDB as an AWS service or for connecting with DynamoDB local. Both of them require you to make separate configurations, therefore, carefully perform the set up as per your requirement.

Setting up DSN for DynamoDB driver as an AWS Service

In the System DSN tab, select the Simba ODBC driver for DynamoDB. Once you have selected it, click on configure. The DynamoDB DSN setup box will now appear, where you need to provide details regarding the connection settings and authentication as follows:

Ensure that the DynamoDB local checkbox isn’t selected and fill the necessary fields as follows:

  • Host: Enter the URL that serves as the endpoint for your DynamoDB service.
  • Region: Select the AWS region.
  • IAM Credentials: Enter your access key in the Access Key field and the security key associated with your ID in the Security Key field.

You can encrypt your credentials either by making them available only for the current user with the help of the Current User Only button or make them available for all users by selecting the All users of this machine button. If your credentials are temporary, click on the Temporary Session token option and copy & paste the session token in the field. 

You can create a schema file by clicking on the schema definition button and browse to find the desired directory where you will store your schema file. You can now provide a name for the file and click on save. Click on generate all button to start the process.

Click on ok to close the schema definition dialogue box and select the advanced options button. In the advanced option dialogue box, provide the full path of your schema file in the local file field and click on okay to save your settings.

To test the connection, click on the test button. If the connection is successful, a new dialogue box will open up as follows:

Setting up DSN for DynamoDB local

In the System DSN tab, select the Simba ODBC driver for DynamoDB. Once you have selected it, click on configure. The DynamoDB DSN setup box will now appear, where you need to provide details regarding the connection settings and authentication as follows:

Ensure that the DynamoDB local checkbox is selected and fill the necessary fields as follows:

  • Host: Enter the URL that serves as the endpoint for your DynamoDB service.
  • Port: DynamoDB uses port 8000 by default, however, you change the port number as per your database.

You can create a schema file by clicking on the schema definition button and browse to find the desired directory where you will store your schema file. You can now provide a name for the file and click on save. Click on generate all button to start the process.

Click on ok to close the schema definition dialogue box and select the advanced options button. In the advanced option dialogue box, provide the full path of your schema file in the local file field and click on okay to save your settings.

To test the connection, click on the test button. If the connection is successful, a new dialogue box will open up as follows:

Step 2: Connecting to Tableau

Once you have made the necessary configuration changes, launchTableau on your system. Now, from the connect columns on the left select the other databases (ODBC) option.

In the ODBC dialogue box, select your DSN from a drop-down list. Click on sign in.

To select the desired schema, you can use the schema drop-down list from the column on the left. You can also perform a text-based search to find the desired option. Now similarly find and select the desired table and drag it onto the canvas..

You will now be able to view the data from your desired database. You can now perform an insightful analysis on your DynamoDB data in Tableau.

This is how you can use ODBC to connect DynamoDB to Tableau.

Limitations of this method:

  • This method sometimes requires using third-party tools, which essentially is an extra tool that needs to be maintained.
  • This method results in service delays due to interruptions such as hardware or software updations. This can lead to a complete loss of connection.

Conclusion

This article introduces you to the various methods that can be used to connect DynamoDB to Tableau. It also provides in-depth knowledge about the concepts behind every step to help you understand and implement them efficiently. These methods, however, can be challenging especially for a beginner & this is where Hevo saves the day.

visit our website to explore hevo[/hevoButton]

Hevo Data, a No-code Data Pipeline helps you transfer data from a source of your choice in a fully-automated and secure manner without having to write the code repeatedly. Hevo with its strong integration with 100+ sources & BI tools, allows you to not only export & load data but also transform & enrich your data & make it analysis-ready in a jiffy.

SIGN UP for a 14-day free trial and see the difference!

Tell us about your preferred way to connect DynamoDB to Tableau! Share your thoughts with us in the comments section below.

Visualize Your DynamoDB Data In Tableau Easily