Data and Visualization are becoming an essential aspect in today’s digital world. Businesses often use data to interpret the current market scenario and make the decision based on that.
- Visualization helps businesses analyze their data in the form of charts and graphs and allows them to make data-driven decisions.
- One such collaborative BI (Business Intelligence) software that allows you to visualize your data and become more data-driven is Tableau. A new technique termed Data Blending in Tableau allows you to combine big data from multiple sources and merge them onto a single Data Warehouse.
- It also gives you a brief overview of Tableau and its features. Finally, it describes a few limitations of Data Blending in Tableau.
What is Tableau?
Tableau is one of the leading visualization tools available in the market, with several in-house connectors that help users connect to various data sources.
Tableau is available as different products, and users can use it based on their requirements. The various product offerings of Tableau are:
Key Features of Tableau
Tableau has a wide range of features which makes it a better choice over other BI tools. Some of these are as follows:
- Tableau provides extensive features in Dashboard to perform analytical analysis on the data and allows users to create a visual masterpiece.
- Tableau supports real-time data as well as batch data with robust in-memory computation.
- Tableau has over 200+ connectors available in its library, which can connect to any relational and non-relational databases, CSV files, excels, Hive, Snowflake, etc.
- Tableau allows users to develop advanced charts and graphs to create high-quality visuals.
With Hevo, you can seamlessly integrate data from multiple sources into any data warehouse, ensuring your organization has a unified view of its data assets.
Why Use Hevo for Data Warehouse Integration?
- Broad Source and Destination Support: Connect to over 150+ sources, including databases, SaaS applications, and more, and load data into your preferred data warehouse.
- Real-Time Data Sync: Keep your data warehouse up-to-date with real-time data flow, ensuring your analytics are always based on the latest information.
- No-Code Platform: With Hevo’s user-friendly interface, you can easily set up and manage your data pipeline without any technical expertise.
Get Started with Hevo for Free
What is Data Blending?
Data Blending is the process in which data from multiple data sources are combined into a single repository. When you perform Data Blending in Tableau, when the data is combined, the query is sent to the database for each used source and that result when returned from the query will be sent back to Tableau as aggregated data in the form of a Visualization.
To understand Data Blending, consider the below example where the primary table has UserId, and the secondary table has PartonId and the task is to link both the tables. The following attributes need to be understood to accomplish this task:
- The left table, which contains UserId as the key, blends with the right table with PartonId as the key.
- If a row in the left table doesn’t match the row in the right table, it will be represented by null.
- If there are multiple values in the right table, it will be denoted by an asterisk (*).
The tables described above are shown below.
After Data Blending is performed on the tables, the aggregated tables look like the diagram shown below.
Primary and Secondary Tableau Data Sources in Tableau
- The Primary Data Source is the main table for Data Blending, whereas the Secondary Data Source is the additional table.
- Tableau defines graphs and charts on the Primary Data Source. But you can’t perform Data Blending in a Tableau sheet without any Secondary Data Source.
- Only the data values from Secondary Data Sources that are corresponding to the values of Primary Data Sources are taken for Data Blending.
- So, it becomes increasingly important to spend a considerable amount of time and effort before selecting your Primary and Secondary Data Sources. To select a Data Source as primary, simply use the fields of the source first in a chart.
Why Do You Need Data Blending in Tableau?
Data Blending is best suited for the below-mentioned conditions.
- Data Blending in Tableau can be implemented when you want to combine related data from multiple sources in a single view.
- Cross-Database joins do not support connections to cubes such as Oracle Essbase or to some extract-only connections such as Google Analytics. Hence, Data Blending is the best way to combine data from different Databases that are not compatible with Cross-Database joins.
- Data Blending is best suited in cases where transactional values of data are captured at different levels of granularity/levels in each data set. While analyzing transactional data and quota data, transactional data might capture all transactions, but quota data might aggregate transactions at the quarter level.
- Data Blending is also suitable for blending larger data sets.
Preparing Data for Blending
Before you start blending your data in Tableau, it is very essential to make sure that the data is compatible with the process. So, the first step of Data Blending is Data Preparation.
This step requires loading the data from the Primary and Secondary Data Sources. To do so, go to the Menu – Data → New Data Source and browse for the required data file to upload it. Before going ahead, make sure that the data sets connected are distinct.
Data Blending in Tableau
Data Blending in Tableau is performed on two separate data sources that can be combined to view a single sheet. You can perform Data Blending in Tableau and integrate two data sources by following the below steps:
- Connect to the primary data source and set up the data source on the data source page.
- Go to Data> New Data Source to connect to the second data source.
- Ensure that the data sets connected are distinct.
- From the primary data source, drag the field to the view.
- The secondary data source checks if the secondary data source has a blended relationship to the primary data source.
- On the secondary data source, check if there is an orange link icon. If present, it shows the can be blended.
- If there is a broken icon, select the field on the secondary data source that can link two data sources. The grey link will turn into an orange link, and after that, the data can be blended.
- Drag a field into the view from the secondary data source.
Once both the data sources are in the view and blended stage, you can now analyze them in a single blended worksheet. We will be using Sales Target as the primary data source and Sample-superstore as the secondary data source in this use case. This is represented by the below image.
As seen in the above image, the primary data source has a blue checkmark, whereas the secondary data source will have an orange checkmark.
Difference between Data Blending and Join in Tableau
Data Blending is very similar to a traditional left join.
- While combining data with a left join, a query is sent to the Database where the join is performed. It will return all rows from the left table (primary) and the rows from the right table (secondary) that have a corresponding match with rows in the left table.
- The rows are then aggregated by Tableau. You can observe that it first combines the two tables and then aggregates the data in them.
- The main difference between joining two data tables and blending two data tables is the step at which aggregation of data happens. As discussed above, in joining, both the tables are first combined and then the data is aggregated, which may even result in duplicating values (if a single value in the left table corresponds to more than one value in the right table).
- In blending, both the tables are kept separated at the Database. Then it aggregates the data and sends it to Tableau to combine it with no duplicated data. It can often happen that a single value in the left table corresponds to more than one value in the right table.
Limitations of Data Blending in Tableau
Although Data Blending in Tableau can be a vital asset to your organization, it has a few limitations. Some of these limitations are:
- Tableau does not support nonadditive aggregates such as Median, RaqSQL.
- While dealing with highly granular data, Data Blending in Tableau compromises the query speed.
- Calculated fields are not available for sorting the sheets.
- If you are using a cube data source, they can be used as a primary data source.
Conclusion
- This article provided you with a comprehensive guide on Data Blending in Tableau. It also gave you a brief overview of Tableau and its features. It also described a few limitations of this process.
- Overall, Data Blending in Tableau plays a pivotal role especially in extracting data from multiple sources.
- By setting up a proper Data Blending process in Tableau, you can derive meaningful insights from your customers seamlessly.
FAQ
What is data blending in Tableau?
Data blending in Tableau combines data from different sources with separate data connections into a single view by linking common fields, without merging the datasets.
What is an example of blending data?
An example is blending sales data from an Excel file with customer demographics stored in a SQL database, linked by a common “Customer ID” field.
What is the difference between blending and joins in Tableau?
Blending occurs after data is aggregated and allows combining disparate sources, while joins combine tables within the same connection at the row level before aggregation.
Vishal Agarwal is a Data Engineer with 10+ years of experience in the data field. He has designed scalable and efficient data solutions, and his expertise lies in AWS, Azure, Spark, GCP, SQL, Python, and other related technologies. By combining his passion for writing and the knowledge he has acquired over the years, he wishes to help data practitioners solve the day-to-day challenges they face in data engineering. In his article, Vishal applies his analytical thinking and problem-solving approaches to untangle the intricacies of data integration and analysis.