Tableau is a well-known Business Intelligence tool that organizations use to derive insights from their data. Tableau excels at connecting to a variety of data sources, including on-premise and cloud data sources. It allows business analysts to perform ad-hoc analysis of data and build visualizations using the results.

Tableau can be used as an on-premise Business Intelligence tool through Tableau Server or as a fully managed service through Tableau Online. Both flavors allow users to publish their reports and dashboards to other users in the organization. It is also possible to schedule and run periodic jobs to keep the data behind these reports and dashboards fresh.

In this post, you will learn about Tableau Prep Builder and Tableau Prep Conductor: two critical applications that handle data preparation and the scheduling of flows, as data preparation jobs are called in Tableau parlance.

What is Tableau Prep?

Tableau Prep changes how traditional data preparation is done in an organization. It gives analysts and business users a self-service way to combine, shape, and clean data, which simplifies preparation so that they can start their analysis sooner and get to insights faster.

Tableau Prep consists of two products: Tableau Prep Builder, for building your data flows, and Tableau Prep Conductor, for scheduling, monitoring, and managing flows across the organization.

What is Tableau Prep Builder?

Tableau Prep Builder helps you prepare data easily and intuitively for further analysis or reporting. It provides a visual interface where analysts can specify the steps that the data needs to go through. Through its built-in steps, the Builder supports combining, reshaping, and cleaning the data.

Broadly, the application contains a connections pane, a flow pane where the user can visually build the data flow, a profile pane that displays an auto-generated summary of the data, and a data grid that displays the row-level data.

Through the built-in steps, it supports all the typical data operations like filter, split, union, join, and pivot. After creating the datasets through these operations, the results can be published for use by other people.

Others can use the published results through Tableau Server or Tableau Desktop. They can also edit the flow you created and produce their own data based on modifications to the original flow.

[Figure: Tableau Prep Builder]

Tableau Prep Builder Capabilities

Tableau Prep Builder has the following capabilities:

  • You can connect to multiple data sources.
  • Your data can be cleaned using filter, split, and rename operations.
  • Values can be edited directly on rows of data.
  • The data sources can be combined using unions and joins.
  • Pivots and aggregations will shape your data.
  • You can create an output of your cleaned data for analysis in Tableau Desktop or Tableau Server/Online.
  • You can write the output of your cleaned data as a new table, or add to or replace data in an existing table. The output can be in several formats, such as a Tableau extract (.hyper) file, a text (.csv) file, or a table in an external relational database.
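
To make the clean, union, and aggregate capabilities above more concrete, here is a minimal sketch in plain Python of what such steps do to the data. It is not how Tableau Prep Builder works internally, and every field name and sample row (Order ID, Customer, Amt, the east and west tables) is made up for illustration.

```python
# Illustrative only: plain-Python stand-ins for clean, union, and aggregate steps.
# All field names and sample rows are hypothetical.

def clean(rows):
    """Filter out rows without an ID, split 'Customer', and rename 'Amt' to 'Sales'."""
    cleaned = []
    for row in rows:
        if not row.get("Order ID"):  # filter: drop rows with a missing ID
            continue
        # split: break the customer name at the first space
        first, _, last = row["Customer"].partition(" ")
        cleaned.append({
            "Order ID": row["Order ID"],
            "First Name": first,
            "Last Name": last,
            "Sales": float(row["Amt"]),  # rename 'Amt' and convert it to a number
        })
    return cleaned


def union(*tables):
    """Stack tables that share the same fields, similar to a union step."""
    return [row for table in tables for row in table]


east = [
    {"Order ID": "E-1", "Customer": "Ann Lee", "Amt": "10.5"},
    {"Order ID": "", "Customer": "No Id", "Amt": "3"},
]
west = [{"Order ID": "W-1", "Customer": "Bo Chen", "Amt": "7"}]

orders = union(clean(east), clean(west))
total_sales = sum(row["Sales"] for row in orders)  # a simple aggregation
print(len(orders), total_sales)
```

In Prep Builder, each of these functions would correspond to a step that you add visually in the flow pane rather than to code that you write.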

Tableau Prep Builder Features

  • Connect and extract data: You can connect to more than 40 data sources and extract the data. You can also connect through other ODBC and JDBC connections with the relevant credentials.
  • Clean the data: In this step, the data is cleaned using operations such as filtering and replacing values with Null. You can also apply the cleaning operations that Tableau Prep recommends.
  • Group and Replace: Using grouping, you can combine similar categories into one.
  • Common Characters: Tableau Prep’s Common Characters grouping option can combine two similar entities into a single entity. You can also assign a data role to a field and use it to match and group values, so that invalid values are grouped with valid ones based on spelling and pronunciation.
  • Pivoting data: Pivot columns to rows when it’s difficult to analyze data in the crosstab format.
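
As a rough illustration of the columns-to-rows pivot mentioned in the last item, the sketch below reshapes a small crosstab into one row per region and year. The function name `pivot_columns_to_rows`, the field names, and the numbers are all assumptions made for this example, and it does not reflect how Tableau Prep implements its pivot step.

```python
# Illustrative only: a columns-to-rows pivot on a list of dictionaries.
# Function name, field names, and values are hypothetical.

def pivot_columns_to_rows(rows, id_field, value_fields, name_field, value_field):
    """Turn each listed column into its own row, keeping the identifier field."""
    long_rows = []
    for row in rows:
        for field in value_fields:
            long_rows.append({
                id_field: row[id_field],
                name_field: field,        # the old column header becomes a value
                value_field: row[field],  # the old cell becomes the measure
            })
    return long_rows


# Crosstab layout: one column per year, which is hard to filter or aggregate by year.
crosstab = [
    {"Region": "East", "2019": 100, "2020": 120},
    {"Region": "West", "2019": 80, "2020": 95},
]

long_form = pivot_columns_to_rows(crosstab, "Region", ["2019", "2020"], "Year", "Sales")
for row in long_form:
    print(row)
```

After the pivot, Year is an ordinary field, so filtering or aggregating by year no longer requires touching individual columns.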

What is Tableau Prep Conductor?

Once the data sets built using the Tableau Prep Builder are published, they need to be continuously updated for downstream analysts or consumers to derive value from them. 

Tableau Prep Conductor manages the scheduling and execution of these data flows. It enables the users to view and monitor the details of their flow. It has mechanisms to generate alerts in case of failed flows.

Tableau Prep Conductor can also help users view the results of the data flows and even edit the existing flows. Tableau Prep Conductor is closely integrated into Tableau Server and Tableau Online. It is available only if the user has published their data flow to either of them.

[Figure: SuperStore2 Dataset]

Here are a few benefits of Tableau Prep Conductor:

  • In case a flow fails, you get much wider visibility of errors.
  • The flow doesn’t need to be monitored manually every day.
  • Tableau Prep Conductor acts as a central repository for all kinds of flows and promotes reuse.
  • The flow can be maintained and downloaded by anyone with the right permissions.
  • The server that Tableau Prep Conductor runs on is likely to have more processing resources than the author’s computer.

All these benefits provided by Tableau Prep Conductor help build resilient processes based on the Data Preparation work in your organization. The users are encouraged to collate more data given how much time is saved by the usage of Tableau Prep Conductor.
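
The idea of running flows centrally, recording each outcome, and raising an alert on failure can be pictured with a small sketch. This is not Tableau Prep Conductor's implementation or API. The `FlowRun` class, the `run_flows` function, and both example flows are hypothetical names used only to show the pattern.

```python
# Illustrative only: a toy runner that records flow results and alerts on failure.
# Nothing here is part of Tableau; all names are hypothetical.
from dataclasses import dataclass


@dataclass
class FlowRun:
    name: str
    succeeded: bool
    error: str = ""


def run_flows(flows, alert):
    """Run each flow once, keep a history of results, and call alert() for failures."""
    history = []
    for name, task in flows.items():
        try:
            task()
            history.append(FlowRun(name, True))
        except Exception as exc:  # a failed flow must not stop the remaining flows
            run = FlowRun(name, False, str(exc))
            history.append(run)
            alert(run)
    return history


def clean_orders():
    pass  # stand-in for a flow that completes normally


def broken_flow():
    raise ValueError("input file not found")  # stand-in for a failing flow


alerts = []  # a real system might send an email here instead of appending to a list
history = run_flows({"clean_orders": clean_orders, "broken_flow": broken_flow}, alerts.append)

for run in history:
    print(run.name, "ok" if run.succeeded else "failed: " + run.error)
```

Keeping the run history and the alerts in one place is what gives everyone the same view of failures, instead of each author watching their own machine.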

Tableau Prep Builder is a part of the ‘Creator’ subscription package. Tableau Prep Conductor, on the other hand, is licensed through the ‘Data Management’ add-on: you can use it by paying for the add-on, applying the new license key, and restarting the server. Once you do this, Prep Conductor will be present on every node of the server that has a Backgrounder process running on it.

The Backgrounder is a crucial component of Tableau Server, as it runs server jobs such as refreshing data extracts and running Tableau Prep flows. As part of Tableau Server administration, make sure there are enough Backgrounder processes to match the volume of these tasks.

Prep Conductor can be turned off by an admin by going to the ‘Settings’ menu on the server and switching Prep Conductor off in the General tab. If this setting isn’t available to you, contact your server admin for further clarification.

Prerequisites

  • Tableau Server or Tableau Online license with the Data Management add-on.
  • Basic understanding of Data Analysis and SQL.

Working with Tableau Prep Builder and Tableau Prep Conductor

Now that you have a good idea about the purpose of these two applications, it is time to see them in action. The steps below will take you through building a data flow using Tableau Prep Builder and then scheduling and monitoring it using Tableau Prep Conductor.

For this exercise, you will use a sample text file that is available from the Tableau website: the Orders_Central.csv file.

  • Step 1: On the connections pane, click ‘Connect to Data’ and select Text File. Point it to the CSV file you just downloaded.
[Figure: Tableau Prep Builder Flow]
  • Step 2: From the next screen, click the + button near the data element you just added. This is to create a clean step.
[Figure: Creating a Clean Step]
  • Step 3: In the CSV file that you just connected to, the order date is split into three fields: Order Year, Order Month, and Order Day. You will combine the three fields into one field called Order Date. From the toolbar that appears, click Create Calculated Field. In the Calculation editor, name the field Order Date, enter the following formula, and click Save.
MAKEDATE([Order Year],[Order Month],[Order Day])
[Figure: Edit Field Window]
  • Step 4: Now click the + button near the clean step and select the Output step. In the ‘Save Output to’ drop-down, select File and enter the name of the output file. In the Output Type field, you can select CSV.
  • Step 5: After selecting the output file, click Run Flow in the output pane to trigger the flow. Once the flow has run successfully, a status dialog indicating the success will be displayed.
  • Step 6: Publish the flow to Tableau Server or Tableau Online, since Tableau Prep Conductor only works with published flows, and then head to Tableau Prep Conductor. On the overview page, you will be able to see all the existing flows, including the one you just published. Head to the Scheduled Tasks tab and click New to create a new scheduled task. Enter the details and the schedule you want. Tableau Prep Conductor will ensure that the flow runs as configured.
[Figure: Scheduled Tasks Window]

That completes the steps involved in using Tableau Prep Builder and Tableau Prep Conductor to design and schedule a data cleaning operation. 
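
If you want to check what the calculated field in Step 3 should produce, the same logic can be sketched outside Tableau. The Python below is a minimal stand-in for the MAKEDATE formula and the CSV output step, not a replacement for the flow. The sample rows and the Order ID column are made up for illustration, and the real Orders_Central.csv contains other fields as well.

```python
# Illustrative only: mimics MAKEDATE([Order Year],[Order Month],[Order Day])
# on a tiny in-memory CSV. Sample rows and the Order ID column are hypothetical.
import csv
import io
from datetime import date


def add_order_date(rows):
    """Yield each row with an extra 'Order Date' field built from year, month, and day."""
    for row in rows:
        out = dict(row)
        out["Order Date"] = date(
            int(row["Order Year"]), int(row["Order Month"]), int(row["Order Day"])
        ).isoformat()
        yield out


sample_csv = io.StringIO(
    "Order ID,Order Year,Order Month,Order Day\n"
    "CA-1,2018,11,8\n"
    "CA-2,2019,6,12\n"
)
rows = list(add_order_date(csv.DictReader(sample_csv)))

# Write the result as CSV, similar to choosing CSV as the output type in Step 4.
output = io.StringIO()
writer = csv.DictWriter(output, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(rows)
print(output.getvalue())
```

Like MAKEDATE, `date(...)` needs a valid year, month, and day. In this sketch an invalid combination raises a ValueError instead of passing through silently.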

Conclusion

In this article, you learned that Tableau Prep Builder and Tableau Prep Conductor are intuitive tools that help one to visually design data flows and then schedule them. Organizations prefer Tableau over other tools because of this ease of implementation and the richness of visualizations that are possible.

A common roadblock that users hit while using Tableau for business intelligence is the lack of support for data connections to third-party software-as-a-service offerings like HubSpot, WooCommerce, etc.

Extracting complex data from a diverse set of data sources to carry out an insightful analysis can be a challenging task and this is where Hevo saves the day! Hevo offers a faster way to move data from Databases or SaaS applications into your Data Warehouse to be visualized in a BI tool such as Tableau. Hevo is fully automated and hence does not require you to code. You can try Hevo for free by signing up for a 14-day free trial. You can also have a look at the unbeatable pricing that will help you choose the right plan for your business needs.

Talha
Software Developer, Hevo Data

Talha is a Software Developer with over eight years of experience in the field. He is currently driving advancements in data integration at Hevo Data, where he has been instrumental in shaping a cutting-edge data integration platform for the past four years. Prior to this, he spent 4 years at Flipkart, where he played a key role in projects related to their data integration capabilities. Talha loves to explain complex information related to data engineering to his peers through writing. He has written many blogs related to data integration, data management aspects, and key challenges data practitioners face.