Data analysis helps an organization identify its best- and worst-performing assets, understand its customers, improve efficiency, and stay competitive. Every department needs analysis, and Tableau is a collaborative Business Intelligence tool that lets you visualize your data and derive meaningful insights from it.
To analyze an organization's data from an Extract, the data is first extracted from an external source and loaded into Tableau. This is done by creating a Tableau Extract. The data in the external source keeps changing, and those changes need to be reflected in the Extract; this is what a Tableau Extract Refresh is for. In this article, you will learn about Tableau Incremental Refresh and Tableau Full Refresh in detail.
Understanding Tableau
Tableau is a BI and Data Analysis software used by organizations to visualize data and derive insights from it. Tableau has helped healthcare, communication & media, education, and many other industries to reduce their analysis time and make their business decisions more data-driven. Tableau helps its customers to seek answers to their questions by analyzing their business data. Tableau ensures robustness and security for your sensitive data.
Understanding Tableau Extracts
Tableau Extracts are used to improve performance. Data from the external source is compressed and stored as a Tableau Extract, and Tableau runs its queries against the Extract to create visualizations. A Tableau Extract can store data in a normalized or a denormalized format. Normalized data requires join queries to bring related tables together, whereas denormalized data can be fetched directly from a single table.
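The difference between the two formats can be illustrated with a small sketch. It uses Python's built-in sqlite3 module rather than Tableau itself, and the table and column names (customers, orders, orders_flat) are hypothetical, chosen only to show why the normalized layout needs a join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized: customers and orders live in separate tables.
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'East'), (2, 'West');
    INSERT INTO orders VALUES (10, 1, 50.0), (11, 1, 25.0), (12, 2, 40.0);

    -- Denormalized: one wide table with the region repeated on every order.
    CREATE TABLE orders_flat (order_id INTEGER PRIMARY KEY, region TEXT, amount REAL);
    INSERT INTO orders_flat VALUES (10, 'East', 50.0), (11, 'East', 25.0), (12, 'West', 40.0);
""")

# Sales by region from the normalized tables needs a join.
normalized_totals = conn.execute("""
    SELECT c.region, SUM(o.amount)
    FROM orders AS o JOIN customers AS c ON c.customer_id = o.customer_id
    GROUP BY c.region ORDER BY c.region
""").fetchall()

# The same question against the flat table is a single-table query.
flat_totals = conn.execute(
    "SELECT region, SUM(amount) FROM orders_flat GROUP BY region ORDER BY region"
).fetchall()

print(normalized_totals)
print(flat_totals)
```

Both queries return the same totals. The trade-off is that the flat table repeats the region on every row, while the normalized tables store it once per customer.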
Understanding the Need for Tableau Incremental Refresh
Tableau Extract Refresh can be an Incremental Refresh or a Full Refresh. Full Refresh is the default option. In a Full Refresh, the entire contents of the Extract are replaced with the current data from the external source. On a large source, this means millions of rows are extracted and loaded on every Refresh, which is time-consuming and expensive. This is where Tableau Incremental Refresh comes in.
In a Tableau Incremental Refresh, you configure the Refresh to add only the rows that are new since the last Refresh. This is useful because sales, marketing, and similar data grow gradually over time, so each Refresh only has to load the latest additions. Keep in mind that an Extract is still a snapshot: it is current as of its last Refresh, not updated in real time.
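The contrast between the two refresh types can be sketched in a few lines of Python. This is a conceptual model only, not Tableau's implementation: rows are plain dictionaries, and the functions full_refresh and incremental_refresh, along with the id key, are hypothetical names used for illustration.

```python
def full_refresh(source_rows):
    """Rebuild the extract from scratch: every source row is copied again."""
    return list(source_rows)


def incremental_refresh(extract_rows, source_rows, key="id"):
    """Append only source rows whose key is greater than the largest key in the extract."""
    if not extract_rows:
        return list(source_rows)
    high_water_mark = max(row[key] for row in extract_rows)
    new_rows = [row for row in source_rows if row[key] > high_water_mark]
    return extract_rows + new_rows


source = [{"id": i, "amount": i * 10} for i in range(1, 6)]
extract = full_refresh(source)           # first load copies all 5 rows

source.append({"id": 6, "amount": 60})   # one new sale arrives in the source
extract = incremental_refresh(extract, source)
print(len(extract))
```

After the second call, only the one new row has been moved, whereas another full refresh would have copied all six rows again.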
Prerequisites
To refresh Tableau Extracts incrementally, you need the following:
- Good working knowledge of Tableau.
- A working Tableau account.
- Admin privileges associated with Tableau.
You also need to be aware of the file format before using Extract Refresh. If you perform an Extract Refresh on a .tde Extract using a version of Tableau that uses the .hyper format (10.5 and later, including 2020.4), Tableau automatically upgrades the Extract to a .hyper Extract. Although there are many benefits to upgrading to a .hyper Extract, you will no longer be able to open the Extract with earlier versions of Tableau Desktop.
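Because the upgrade cannot be opened by earlier versions, one cautious approach is to keep a copy of a .tde file before refreshing it. The helper below is a minimal sketch using only the Python standard library; the function name backup_if_tde and the ".bak" suffix are assumptions for illustration, not part of Tableau.

```python
import shutil
from pathlib import Path


def backup_if_tde(extract_path):
    """Copy a .tde extract next to the original before it is refreshed.

    Returns the backup path, or None when the file is not a .tde extract.
    """
    path = Path(extract_path)
    if path.suffix.lower() != ".tde":
        return None
    backup = path.with_name(path.name + ".bak")
    shutil.copy2(path, backup)  # copy2 also preserves file timestamps
    return backup
```

Calling backup_if_tde("sales.tde") would leave a sales.tde.bak copy beside the original, while a .hyper file is left alone.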
Steps to Set up Tableau Incremental Refresh
To set up a Tableau Incremental Refresh, follow the steps below:
Step 1: Selecting the Data Source
Open your Tableau workbook and log in to your account. Open the Data menu at the top, select the data source, and then click Extract Data.
Step 2: Configuring the Tableau Extract Data
The Extract Data dialog box appears. For the number of rows to extract, select All rows. Incremental Refresh in Tableau can only be configured when you are extracting all rows from the external data source. You cannot incrementally refresh a sample of rows from the source.
Step 3: Selecting the Tableau Incremental Refresh
Select the Incremental Refresh option and then choose the column in the database that will be used to identify new rows. Tableau uses the values in this column to decide which rows have been added since the last Refresh. For example, if you select a Date field, refreshing the Extract adds all rows whose date is after the last time you refreshed the Extract. You can also use an ID column that increments as rows are added to the database.
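The role of the chosen column can be sketched as a query against the source. The example below is illustrative only: it uses sqlite3 as a stand-in for the external database, the sales table and the rows_to_append function are hypothetical, and it assumes the comparison is a strict "greater than" against the last value already extracted.

```python
import sqlite3

source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE sales (sale_id INTEGER, sale_date TEXT, amount REAL)")
source.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, "2024-01-01", 10.0), (2, "2024-01-02", 20.0), (3, "2024-01-03", 30.0)],
)


def rows_to_append(conn, column, last_value):
    """Return source rows whose `column` value is greater than `last_value`."""
    if column not in {"sale_id", "sale_date"}:  # guard: the column name is interpolated below
        raise ValueError(f"unsupported column: {column}")
    query = f"SELECT sale_id, sale_date, amount FROM sales WHERE {column} > ? ORDER BY sale_id"
    return conn.execute(query, (last_value,)).fetchall()


# Date column: ISO-formatted dates stored as text compare correctly as strings.
by_date = rows_to_append(source, "sale_date", "2024-01-02")
# ID column: an auto-incrementing key works the same way.
by_id = rows_to_append(source, "sale_id", 2)
print(by_date)
print(by_id)
```

Under this assumption, a late-arriving row that carries the same date as the last extracted value would be skipped, which is one reason a strictly increasing ID column can be the safer choice.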
Step 4: Starting the Data Extraction
After completing the steps above, click Extract. From now on, refreshing this Extract adds new rows through Tableau Incremental Refresh.
The steps explained above can be used to define a brand new Tableau Extract or to edit an existing Tableau Extract for an Incremental Refresh. If you are modifying an existing Extract, the last Refresh is displayed so that you can be sure you are updating the Extract with the correct data.
You can also review the Extract Refresh history by selecting the data source on the Data menu and then selecting Extract > History. The Extract History dialog box shows the date and time of each Refresh.
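The steps above cover Tableau Desktop. For an Extract that has been published to Tableau Server, a refresh can also be requested from a script with the tabcmd command-line tool. The sketch below only builds the command and does not run it. It assumes tabcmd is installed and a session has already been opened with tabcmd login; the data source name "Sales Extract" is hypothetical, and the option names should be checked against the tabcmd reference for your version.

```python
import shlex


def build_refresh_command(datasource, incremental=True, project=None):
    """Build (but do not run) a tabcmd command that refreshes a published extract."""
    command = ["tabcmd", "refreshextracts", "--datasource", datasource]
    if project:
        command += ["--project", project]
    if incremental:
        # Without this flag, tabcmd requests a full refresh.
        command.append("--incremental")
    return command


command = build_refresh_command("Sales Extract", incremental=True)
print(shlex.join(command))
```

The resulting list can be passed to subprocess.run(command, check=True) once it has been verified in your environment.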
Limitations of Refreshing Tableau Extracts
- If the schema of the external data source changes (for example, a new column is added), you need to run a Full Extract Refresh before you can resume Incremental Refreshes.
- Because a Tableau Incremental Refresh only looks at the column that was chosen when configuring it, it may not recognize changes made elsewhere, such as edits to other columns of rows that are already in the Extract.
- The schema of an external source sometimes has to change to meet the organization's needs, which can leave existing Extracts misconfigured. As a result, the Extract Refresh will not work properly.
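These limitations can be made concrete with a small model. The sketch below is not Tableau code; pick_refresh_type and incremental_refresh are hypothetical helpers, and the id column stands in for whichever column was chosen for the Incremental Refresh.

```python
def pick_refresh_type(extract_columns, source_columns):
    """Return "full" when the source schema no longer matches the extract, else "incremental"."""
    return "incremental" if list(extract_columns) == list(source_columns) else "full"


def incremental_refresh(extract_rows, source_rows, key="id"):
    """Append only rows whose key exceeds the largest key already extracted."""
    high_water_mark = max((row[key] for row in extract_rows), default=None)
    new_rows = [r for r in source_rows if high_water_mark is None or r[key] > high_water_mark]
    return extract_rows + new_rows


extract = [{"id": 1, "status": "open"}, {"id": 2, "status": "open"}]
source = [
    {"id": 1, "status": "closed"},  # existing row edited in the source
    {"id": 2, "status": "open"},
    {"id": 3, "status": "open"},    # genuinely new row
]

refreshed = incremental_refresh(extract, source)
print(refreshed[0]["status"])  # the edit to id 1 is not picked up
print(pick_refresh_type(["id", "status"], ["id", "status", "owner"]))
```

The new row with id 3 is appended, but the row with id 1 keeps its old status, and the added owner column is a case where a Full Refresh would be needed first.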
Conclusion
In this article, you have learned about Tableau Incremental Refresh and Tableau Full Refresh. Although configuring an Extract Refresh can save you time, it has drawbacks: discrepancies sometimes appear, and every time the schema changes you have to manually reconfigure the Extract to make sure it contains up-to-date data.
Easha is a programming enthusiast with 2+ years of experience. She has worked in automation test script creation, regression testing, and integration projects like Thyrocare Integration. She has a bachelor's degree in Computer Science and loves writing technical articles about data engineering. Her goal is to help people solve everyday problems through her work.