Amazon S3 is one of the most popular services of AWS, which allows you to store and fetch colossal amounts of data to and fro the S3 buckets.

Since Amazon S3 provides High Scalability, Security, and Performance, it is popularly being used among data professionals to build effective Data Pipelines.

  • Besides, the static data present inside the Amazon S3 buckets can be further processed, analyzed, and visualized to extract meaningful insights and reports out of the given data, thereby allowing companies to make data-driven decisions.
  • In order to analyze data present in Amazon S3, developers can use the external visualizations tools, frameworks, and web interfaces.
  • One such data visualization tool is Tableau, which allows you to analyze, interpret, and visualize data to make data-driven decisions.
  • In this article, you will learn about Amazon S3, Tableau, and a step-by-step guide for Tableau S3 connection. 

Methods to Connect Tableau S3

Method 1: Manually Connecting Tableau S3 using Athena JDBC Connector

Connection Prerequisites 

  • pre-installed Tableau Server, 
  • pre-registered Amazon S3, 
  • an active bucket on Amazon S3 space, 
  • and a secret access key for the AWS IAM (Identity and Access Management). 

Since you are about to establish a Tableau S3 connection via Athena JDBC driver, make sure that you installed the latest version of 64-bit Java. A minimum of at least JDK 7.0 or Java 1.7 is required to make a proper Tableau S3 connection. 

Step 1: Downloading Athena JDBC Driver

  • For downloading Athena JDBC drivers, visit the official website of Amazon Athena. On the website, you can find various Athena JDBC drivers in the form of jar files.
  • From those, download the respective JDBC driver that suits your JDK (Java Development Kit) version. After downloading the JDBC jar file, move the respective driver file to the home path or location of Tableau based on your operating system. 

The appropriate JDBC storage locations according to the operating system are given below.

1. For Windows:  C:Program FilesTableauDrivers
2. For macOS: ~/Library/Tableau/Drivers location
3. For Linux: /opt/tableau/tableau_driver/jdbc

Step 2: Setting up Athena

Before setting up and configuring Athena, you have to create a “student” table in CSV format that points to a student-db.csv file in the Amazon S3 bucket.

You should also create a view named “student_view” on top of the student table created before. For creating a student table and student view easily, you can download the respective files from the GitHub repository.

  • Now, open the Amazon S3 console and upload the student-db.csv in the bucket you created before.
  • In the next step, create a “studentdb” database using the following DDL statement in your Athena Console. 
CREATE DATABASE studentdb;
  • After creating the database, execute the below given DDL statement for creating a “student” table inside the “studentdb” database. You should also provide the name of your Amazon S3 bucket inside the location parameter, as shown above. 
CREATE EXTERNAL TABLE student(
  `school` string, 
  `country` string, 
  `sex` string, 
  `age` string, 
  `studytime` int, 
  `failures` int, 
  `preschool` string, 
  `higher` string, 
  `remotestudy` string, 
  `health` string)
ROW FORMAT DELIMITED 
  FIELDS TERMINATED BY ',' 
STORED AS INPUTFORMAT 
'org.apache.hadoop.mapred.TextInputFormat' 
OUTPUTFORMAT 
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  's3://<your_bucket_name>/'
TBLPROPERTIES (
  'has_encrypted_data'='false', 
  'skip.header.line.count'='1', 
  'transient_lastDdlTime'='1595149168')
  • Now, you have to create a view named “student_view.” On creating the view, you can limit the number of fields or columns required to build dashboards in Tableau. 
  • Execute the following command to create a View.
CREATE OR REPLACE VIEW student_view AS 
SELECT
 "school",
 "country",
 "sex",
 "age",
 "health",
 "studytime",
 "failures"
FROM
  student
  • After creating the database, table, and views, you should now check whether the entities are created properly. You can check by executing SQL queries in the Athena console. 
  • Execute the following command to check whether the student_view is created correctly.
SELECT * FROM “studentdb”.”student_view” limit=10;
  • After executing the above command, you will get the following output. 
  • From the above output, you can confirm that the view is successfully created since it only displays the columns that are mentioned inside the SQL view query.

Find a plan that’s right for you, Experience transparent pricing that ensures no billing surprises even as you scale. Get a 14 day free trial with 24×7 support. No credit card required. Get a custom quote tailored to your requirements

Step 3: Connecting Tableau S3 via Athena Connector 

Using the Amazon Athena connector, you can connect Tableau S3 data rapidly and effortlessly. After establishing the connection between both, you can seamlessly perform data visualization operations on data present in Amazon S3 with drag-and-drop flexibility. 

  • Now, open the Tableau Desktop that was previously installed and configured on your local machine.
  • Navigate to Connect > More, and search for Amazon Athena, as shown in the above image.
  • Now, the Amazon Athena dialogue box will pop, where you have to enter the connection configuration details. 
  • In the Server field, enter your response in the athena <region>.amazonaws.com format. The <region> is nothing but your AWS availability zone. Then, enter the appropriate port number in the port field. 
  • In the S3 Staging Directory field, enter the path to the Amazon S3 location where you wish to store query results.
  • You can find the S3 Staging Directory path on the Settings page of Athena Console, which is present in the Query result location field, as shown in the above image.
  • Then, you have to enter the Access Key ID and Secret Access Key field. Enter the appropriate values associated with the AWS IAM. 
  • After filling in all the fields, click on the “Sign in” button.
  • Now, you are directed to the data source pane of Tableau Desktop where you can see the previously created “student_view” and “student” tables. 
  • Drag and drop the “student_view” table from the left side panel to the workspace on the right side. Now, the respective table is ready for you to analyze and visualize using the Tableau Desktop.

Step 4: Analyzing Amazon S3 data using Tableau

  • You can create a new worksheet named “country-wise” to analyze the student data based on the country column or field, as shown below. 
  • Then, you can create another worksheet named “age-wise” to analyze students’ age using the bar chart. 
  • You can merge or comprise the previously created country-wise and age-wise worksheets for easy visualization. In the Dashboard, choose the new dashboard option. Now, drag and drop the country-wise and age-wise worksheets from the left side panel. 
  • On following the above steps, you created the Tableau dashboard successfully by utilizing data present in the Amazon S3.
  • You can also share the dashboard with your colleagues or anyone across the organization by publishing it. You must also configure the plan to update the Athena data sources utilized by the Tableau dashboard before publishing.
  • There are two ways to configure the plan to refresh the Athena data sources with respect to time. One is a Live connection, and the other is Data extract. 
  • Tableau Live connections provide real-time updates, with any changes in the data source reflected in Tableau right away, while Data extracts are snapshots of data that are optimized into system memory and may be retrieved rapidly for viewing. In sophisticated or complex visualizations with vast datasets, filters, and computations, Data extracts are likely to be significantly quicker than live connections.
  • Based on your use cases and preferences, you can choose the configuration plans.
  • With Tableau Desktop, you can also view the raw SQL query generated from the visualizations that were created previously. 
  • In the Athena Console, click on the “History” tab. There you can see the auto-generated SQL queries for all the visualizations you created using Tableau.  
  • The following results are the query generated from the country-wise and age-wise visualizations.

Country-wise

SELECT "student_view"."age" AS "age",
  "student_view"."sex" AS "sex",
  SUM("student_view"."studytime") AS "sum:studytime:ok"
FROM "studentdb"."student_view" "student_view"
GROUP BY "student_view"."age",
  "student_view"."sex"

Age-wise

SELECT "student_view"."country" AS "country",
  SUM("student_view"."studytime") AS "sum:studytime:ok"
FROM "studentdb"."student_view" "student_view"
GROUP BY "student_view"."country"

On following the above-mentioned steps, you successfully established a Tableau S3 connection.

Method 2: Using Hevo Data to Connect Tableau S3

Hevo provides you with a truly efficient and fully automated solution to manage data in real-time and always have analysis-ready data at Tableau.

Now you can transfer data from your desired source to your target Destination for Free using Hevo!

Hevo focuses on three simple steps to get you started

  • Connect: Connect Hevo with Stripe and various other payments, sales & marketing data sources by simply logging in with your credentials.
  • Integrate: Consolidate your payments & customer data from several sources in Hevo’s Managed Data Integration Platform and automatically transform it into an analysis-ready form.
  • Visualize: Connect Hevo with your desired BI tool such as Tableau and easily visualize your unified payments and sales data to gain better insights.

As can be seen, you are simply required to enter the corresponding credentials to implement this fully automated data pipeline without using any code.

Let’s look at some salient features of Hevo

  • Fully Managed
  • Data Transformation 
  • Real-Time
  • Schema Management
  • Live Monitoring
  • Live Support

Move data effortlessly with Hevo’s zero-maintenance data pipelines, Get a demo that’s customized to your unique data integration challenges

Conclusion

In this article, you learned about Amazon S3, Tableau, steps to establish a Tableau S3 connection, and how to analyze S3 data using Tableau Desktop.

This article mainly focused on integrating S3 and Tableau using the Athena JDBC connector or Driver.

However, you can also use third-party drivers or subscribe to online data pipelining platforms to seamlessly integrate the Amazon S3 with Tableau like Hevo.

Ishwarya M
Technical Content Writer, Hevo Data

Ishwarya is a skilled technical writer with over 5 years of experience. She has extensive experience working with B2B SaaS companies in the data industry, she channels her passion for data science into producing informative content that helps individuals understand the complexities of data integration and analysis.

No-code Data Pipeline for Tableau