Data Analytics and Business Intelligence have become an integral part of strategy and decision-making in business organizations. The data generated and stored during daily operations needs to be analysed with visualization tools to yield actionable insights. Apache Superset is one such Business Intelligence and analysis tool that an analytics team can use for this work.

In this article, you will be introduced to the Apache Superset platform and its key features. You will learn how to install Apache Superset and how to create Superset Dashboards after establishing a connection with a database.

Introduction to Apache Superset

Apache Superset is a Python-based Business Intelligence platform. It is an open-source tool that originated at Airbnb and is now maintained as an Apache Software Foundation project. It is a fast, lightweight, and intuitive tool that lets users work with different kinds of data sources and draw actionable insights from the data using visualizations such as Aggregation Charts, Tables, and Interactive Maps. The interface is simple and user-friendly, and users at any skill level can build anything from simple line charts to complex geospatial charts.

Programming knowledge is not required to explore, filter, and organise data, although a SQL editor / IDE is available for advanced users who want to query data interactively. The platform allows defining groups of users and the default functionalities associated with each group. It also provides statistics on the usage of the platform and an action log for every user.

Apache Superset supports various Databases and Cloud Data Warehouses. Most SQL databases, such as MySQL, PostgreSQL, Oracle, SQL Server, and MariaDB, are supported through SQLAlchemy, a Python SQL toolkit and ORM (Object Relational Mapper). Cloud Data Warehouses such as Amazon Redshift, Google BigQuery, and Snowflake are also supported, and the platform is compatible with Apache Druid.

Official documentation for Apache Superset is available on the project website, superset.apache.org.

Key Features of Apache Superset 

  • Simple yet Effective User Interface: A simple no-code visual builder or the SQL IDE can be used to organise and visualize data according to the requirements.
  • Modern Architecture: The platform is lightweight and highly scalable, and it builds on the power of your existing data infrastructure.
  • Custom Visualisations and Dashboards: Many visualizations are available out of the box, and the plugin architecture makes it easy to build new ones and add them directly to the Superset platform.
  • Secure Permission Model: It supports user access control and authentication through Flask AppBuilder, using database accounts, OpenID, LDAP, OAuth, or REMOTE_USER.
  • Integration with Modern Databases: Multiple SQL-based databases are supported through SQLAlchemy, including Cloud Data Warehouses that operate at petabyte scale.
  • User Field Control: Control the way data sources are displayed and processed using a simple semantic layer.

Understanding Installation of Apache Superset Dashboards

Follow the process below for a standard installation of Apache Superset from a Linux command line.

Superset Dashboards Installation Step 1: Installation of Dependencies

Run the following command to install the OS-level dependencies on a Debian or Ubuntu system. Superset runs on Python 3, so the Python 3 development headers and pip are included:

sudo apt-get install build-essential libssl-dev libffi-dev python3-dev python3-pip libsasl2-dev libldap2-dev

Superset Dashboards Installation Step 2: Installation of Setuptools 

Run the following command to install the latest versions of pip and setuptools:

pip install --upgrade setuptools pip

Superset Dashboards Installation Step 3: Installation of Apache Superset

Run the following commands to install and initialize Apache Superset:

# Install Superset
pip install apache-superset

# Point the Flask command line at Superset
export FLASK_APP=superset

# Initialize the database
superset db upgrade

# Create an admin user (you will be prompted for a username, first and last name, and then a password)
superset fab create-admin

# Load some data to play with
superset load_examples

# Create default roles and permissions
superset init

# Start a development web server on port 8088; use -p to bind to another port
superset run -p 8088 --with-threads --reload --debugger

Superset Dashboards Installation Step 4: Logging In with the Admin Account

After successful installation, open http://localhost:8088 in your browser and log in with the admin credentials you created in Step 3.

To connect to a database such as MySQL, you also need to install the matching database driver, for example:

pip install mysqlclient

Superset Dashboards Installation Step 5: Addition of Connection

Superset Dashboards - databases option

To add a new connection, first log in to Apache Superset, then click on “Sources” and select “Databases”.

Then you need to click the “+ Database” button in the top right corner.

Provide the credentials such as Connection Name and SQLAlchemy URI [Uniform Resource Identifier]. A URI should look similar to this: 

mysql://root:XXXXXXXXXX@104.198.32.xxx:3306/rd_demo_db
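
If the password contains characters such as “@” or “/”, the URI will be misread unless those characters are percent-encoded. The following is a minimal Python sketch, separate from Superset itself, that assembles a URI in the format shown above. The helper name build_sqlalchemy_uri, the host db.example.com, and the credentials are hypothetical placeholders.

```python
from urllib.parse import quote_plus


def build_sqlalchemy_uri(dialect, user, password, host, port, database):
    """Assemble a dialect://user:password@host:port/database URI."""
    # Percent-encode the credentials so that characters such as '@' or '/'
    # are not mistaken for URI delimiters.
    return (
        f"{dialect}://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Hypothetical values, for illustration only.
uri = build_sqlalchemy_uri(
    "mysql", "root", "p@ss/word", "db.example.com", 3306, "rd_demo_db"
)
print(uri)
```

You can paste the printed string into the SQLAlchemy URI field of the dialog box.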

Superset Dashboards - Add Database Dialog Box

Click on the “Test Connection” button to confirm that the connection can be established. On success, click on the “Add” button to save the connection.

Superset Dashboards - Test Connection button

Hevo Data, Seamless Data Migration to Apache Superset

Hevo is a No-code Data Pipeline that offers a fully managed solution to set up data integration from 100+ data sources (including 40+ Free Data Sources) and will let you directly load data to a Data Warehouse and visualize it in a BI tool of your choice such as Apache Superset. It will automate your data flow in minutes without writing any line of code. Its fault-tolerant architecture makes sure that your data is secure and consistent. Hevo provides you with a truly efficient and fully automated solution to manage data in real-time and always have analysis-ready data.

Check out what makes Hevo amazing:

  • Secure: Hevo has a fault-tolerant architecture that ensures that the data is handled in a secure, consistent manner with zero data loss.
  • Schema Management: Hevo takes away the tedious task of schema management & automatically detects schema of incoming data and maps it to the destination schema.
  • Minimal Learning: Hevo, with its simple and interactive UI, is extremely simple for new customers to work on and perform operations.
  • Hevo Is Built To Scale: As the number of sources and the volume of your data grows, Hevo scales horizontally, handling millions of records per minute with minimal latency.
  • Incremental Data Load: Hevo allows the transfer of data that has been modified in real-time. This ensures efficient utilization of bandwidth on both ends.
  • Live Support: The Hevo team is available round the clock to extend exceptional support to its customers through chat, email, and support calls.
  • Live Monitoring: Hevo allows you to monitor the data flow and check where your data is at a particular point in time.

Creating Apache Superset Dashboards

Creating Apache Superset Dashboards starts with the Data Sources [Databases and Tables], continues with Data Slices (individual charts), and ends with the Dashboards themselves.

Each Dashboard can contain multiple Slices, and each Slice can appear on multiple Dashboards.
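
One way to picture this many-to-many relationship is as a set of (dashboard, slice) pairs. The sketch below is illustrative Python rather than Superset's internal model, and all dashboard and slice names in it are hypothetical.

```python
# Each pair links one dashboard to one slice (hypothetical names).
links = {
    ("Sales Overview", "Revenue by Month"),
    ("Sales Overview", "Orders by Region"),
    ("Executive Summary", "Revenue by Month"),
}


def slices_on(dashboard):
    """Return the slices shown on a given dashboard, sorted by name."""
    return sorted(s for d, s in links if d == dashboard)


def dashboards_using(slice_name):
    """Return the dashboards that include a given slice, sorted by name."""
    return sorted(d for d, s in links if s == slice_name)


print(slices_on("Sales Overview"))
print(dashboards_using("Revenue by Month"))
```

Here "Revenue by Month" is built once as a Slice and reused on two Dashboards, which is the reuse this relationship makes possible.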

Superset Dashboards - Sources, Slice, Dashboard relationship

After a connection with the data source has been established, follow these steps to create a Dashboard:

Superset Dashboards Creation Step 1: Registration of a New Table

Start by registering the specific tables [termed Datasets in Superset] that you want to query in Superset Dashboards. Select “Data”, then “Datasets”, and finally click the “+Dataset” button.

Choose the Database Schema and then the Table of your choice from the options being displayed.

Superset Dashboards - Add Dataset Dialog Box

Superset Dashboards Creation Step 2: Customisation of Column Properties

Column Properties control how each column can be used in the Explore Workflow.

Below you can observe how the Temporal, Filterable, and Dimension properties are set for each column in the dataset.

Superset Dashboards - Edit Dataset Window
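
As a rough illustration of what these three properties decide, the Python sketch below filters a hypothetical column list by each flag. The key names is_temporal, is_filterable, and is_dimension are assumptions chosen to mirror the labels in the window; they are not Superset's internal field names.

```python
# Hypothetical column settings for one dataset.
columns = [
    {"name": "report_date", "is_temporal": True, "is_filterable": True, "is_dimension": False},
    {"name": "country", "is_temporal": False, "is_filterable": True, "is_dimension": True},
    {"name": "confirmed", "is_temporal": False, "is_filterable": False, "is_dimension": False},
]


def names_with(flag):
    """Return the names of the columns for which the given flag is enabled."""
    return [column["name"] for column in columns if column[flag]]


time_columns = names_with("is_temporal")      # candidates for a time axis
filter_columns = names_with("is_filterable")  # candidates for filters
group_by_columns = names_with("is_dimension")  # candidates for grouping

print(time_columns, filter_columns, group_by_columns)
```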

The Superset Semantic Layer can store two types of computed data:

  • Virtual Metrics: SQL expressions that aggregate values across rows, possibly from several columns, fall into this category. Example: 
SUM(recovered) / SUM(confirmed)
Superset Dashboards - Edit Dataset virtual metrics
  • Virtual Calculated Columns: SQL expressions that customize the appearance and behaviour of a specific column fall into this category. Example: 
CAST(recovery_rate as FLOAT)
Superset Dashboards - edit dataset virtual calculated columns
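
The difference between the two is that a calculated column is evaluated once per row, while a metric is evaluated once per group of rows. The sketch below runs both example expressions against an in-memory SQLite table using Python's standard library. The table name covid_stats and its rows are hypothetical, and the numeric columns are declared REAL on the assumption that you want a fractional result, because SQLite would otherwise divide integers as integers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE covid_stats ("
    "country TEXT, confirmed REAL, recovered REAL, recovery_rate TEXT)"
)
# Hypothetical rows; recovery_rate is stored as text to give CAST something to do.
conn.executemany(
    "INSERT INTO covid_stats VALUES (?, ?, ?, ?)",
    [
        ("A", 100, 50, "0.5"),
        ("A", 100, 100, "1.0"),
        ("B", 300, 150, "0.5"),
    ],
)

# Calculated column: the expression is applied to every row.
per_row = conn.execute(
    "SELECT country, CAST(recovery_rate AS FLOAT) "
    "FROM covid_stats ORDER BY rowid"
).fetchall()

# Metric: the expression aggregates all rows in each group.
per_group = conn.execute(
    "SELECT country, SUM(recovered) / SUM(confirmed) "
    "FROM covid_stats GROUP BY country ORDER BY country"
).fetchall()

print(per_row)
print(per_group)
conn.close()
```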

Superset Dashboards Creation Step 3: Creation of Charts on Explore View

There are two main interfaces to explore data in Superset:

  • Explore: It is a no-code visual builder in which the user can select a chart and customize the appearance. 
  • SQL Lab: It is an SQL IDE that can be used for cleaning, joining, and preparing data for Explore Workflow. 
Superset Dashboards - explore data

Using the Explore builder, you can check the Dataset view on the left side of the window, which displays the columns and metrics scoped to the currently selected dataset.

Superset Dashboards - Dataset View

A Time-series Bar Chart can be easily created by clicking the options in the drop-down menu. 

Superset Dashboards - Time Series Bar Chart

Superset Dashboards Creation Step 4: Creation of Slice and Dashboard

You can save your chart and add it either to an existing Dashboard or to a new Dashboard of your choice.

Superset Dashboards - Save chart Dialog Box

The chart can be published by clicking “Save & Go To Dashboard”.

Superset Dashboards - Sales Dashboard

To resize the chart, first click the “Pencil” button in the top right corner to switch the Dashboard to edit mode.

Superset Dashboards - Pencil Button

Click and drag the bottom right corner of the chart until it snaps into the position you want on the underlying grid.

Superset Dashboards - Click and Drag for better positioning

Click on the “Save” button to save your progress. 

Understanding Various Apache Superset Dashboards Operations

Beyond basic charts, Superset Dashboards support several other common operations. The following sections cover building a Pivot Table and a Line Chart, adding Markup text to a Dashboard, and comparing a metric across time periods.

Superset Dashboards Operations: Pivot Table

A Pivot Table is a powerful tool to calculate, summarize, and analyze data, allowing you to identify patterns and trends and set up comprehensive comparisons. To create one, start a new Chart from the top right corner, choose the data source, and click on the visualization type to open the “Visualisation” menu. Then select the Pivot Table visualization.

Superset Dashboards - Create a new chart Dialog Box

Enter the parameters in the window according to your requirements, taking into account the context of the data being handled.

Superset Dashboards - Time parameters

After you add the Query conditions in the window, the Pivot Table will be generated like the one shown below.

Pivot Table - Superset Dashboards
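
Conceptually, a Pivot Table groups records by one field for its rows and another for its columns, then aggregates a value in each cell. The Python sketch below shows that idea with a sum. It is independent of Superset, and the sales records and the function name pivot_sum are hypothetical.

```python
from collections import defaultdict

# Hypothetical sales records: (region, quarter, amount).
records = [
    ("North", "Q1", 120),
    ("North", "Q2", 80),
    ("South", "Q1", 200),
    ("North", "Q1", 30),
    ("South", "Q2", 50),
]


def pivot_sum(rows):
    """Sum values into a {row_key: {column_key: total}} table."""
    table = defaultdict(lambda: defaultdict(int))
    for row_key, column_key, value in rows:
        table[row_key][column_key] += value
    # Convert to plain dicts so the result prints and compares cleanly.
    return {row_key: dict(cells) for row_key, cells in table.items()}


pivot = pivot_sum(records)
for region in sorted(pivot):
    print(region, pivot[region])
```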

Superset Dashboards Operations: Line Chart

A line chart displays information as a series of data points connected by straight line segments. In Superset, you first select the Data Range from the Dataset of your choice. You can also apply queries to that Data Range and plot the results on the Line chart.

Setting Metrics- Superset Dashboards

After you select the Data Range and enter the required Query, click on the “Run Query” button to show the data on the chart.

Pivot Chart- Superset Dashboards
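
The points on such a chart typically come from restricting the rows to the selected range and then aggregating a value per time bucket. The Python sketch below illustrates this with a monthly average. It is not Superset's query engine, and the ticket records, the choice of a monthly average, and the function name monthly_average are assumptions made for the example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical ticket records: (travel date, price).
tickets = [
    (date(2024, 1, 5), 100.0),
    (date(2024, 1, 20), 140.0),
    (date(2024, 2, 3), 90.0),
    (date(2024, 3, 15), 200.0),
]


def monthly_average(rows, start, end):
    """Average the value per month for rows with start <= day < end."""
    buckets = defaultdict(list)
    for day, value in rows:
        if start <= day < end:
            buckets[(day.year, day.month)].append(value)
    # One (month, average) point per bucket, in chronological order.
    return [
        (month, sum(values) / len(values))
        for month, values in sorted(buckets.items())
    ]


series = monthly_average(tickets, date(2024, 1, 1), date(2024, 3, 1))
print(series)
```

Each (month, average) pair corresponds to one point on the line; the March record falls outside the selected range and is left out.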

Superset Dashboards Operations: Markup

You can add Markup and annotations to a dashboard as needed. To add text, navigate to the dashboard using the top menu and switch to edit mode by selecting “Edit Dashboard”.

Then, from the “Insert Components” pane, drag and drop a Markdown box onto the dashboard.

insert components- Superset Dashboards

You can edit the text in the textbox in Markdown format and toggle between Edit and Preview mode using the menu at the top of the window.

Edit and Preview Mode- Superset Dashboards

Superset Dashboards Operations: Time Comparison

Comparing a metric across time periods is crucial for evaluating an organization's progress. Navigate to the Time Comparison option in the Advanced Analytics section of the Visualization menu. You can then run a query to generate a new chart with an additional series that contains the same values shifted one week back in time.

Time Comparison - Superset Dashboards

You can then change the Calculation type to Absolute difference and select Run Query to see a single series again, this time showing the difference between the two series.

Absolute Difference Calculation Type - Superset Dashboards
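
The arithmetic behind this comparison can be sketched in a few lines of Python. The example below shifts a series by one week and subtracts the shifted values from the current ones. It assumes that "absolute" means a difference in raw units (signed, as opposed to a percentage), and the daily values and function names are hypothetical.

```python
from datetime import date, timedelta

# Hypothetical metric values, one per week.
daily = {
    date(2024, 3, 1): 10,
    date(2024, 3, 8): 14,
    date(2024, 3, 15): 11,
}


def shifted(series, shift):
    """Move every point forward by `shift`, so each date carries the earlier value."""
    return {day + shift: value for day, value in series.items()}


def difference(series, shift):
    """For dates present in both series, current value minus the value `shift` earlier."""
    previous = shifted(series, shift)
    return {
        day: series[day] - previous[day]
        for day in sorted(series)
        if day in previous
    }


week = timedelta(weeks=1)
diff = difference(daily, week)
for day, change in diff.items():
    print(day.isoformat(), change)
```

The first date has no value from a week earlier, so it does not appear in the difference series.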

Conclusion 

In this article, you learned about the Apache Superset platform and its key features. You also learned how to install Apache Superset from the command line and how to create Superset Dashboards after establishing a connection with a database. Various operations that can be performed on Dashboard data were also discussed. If you are interested in carrying out a similar analysis on the Looker platform, you can find the guide here.

Integrating and analyzing data from a huge set of diverse sources can be challenging, and this is where Hevo comes into the picture. Hevo Data, a No-code Data Pipeline, helps you transfer data from a source of your choice in a fully automated and secure manner without having to write the code repeatedly. With its strong integration with 100+ sources and BI tools such as Apache Superset, Hevo allows you to not only export and load data but also transform and enrich it and make it analysis-ready in a jiffy.

Get started with Hevo today! Sign up for a 14-day free trial and experience the feature-rich Hevo suite firsthand. Check out the pricing details to get a better understanding of which plan suits you the most.

Abhishek Duggal
Former Research Analyst, Hevo Data

Abhishek is a data analysis enthusiast with a passion for data, software architecture, and writing technical content. He has experience writing articles on diverse topics such as data integration and infrastructure.
