Data Analytics and Business Intelligence have become an integral part of strategy creation and decision-making for Business organizations. The data generated and stored due to daily operations require to be analysed using visualizations tools to obtain actionable insights. Apache Superset is one such Business Intelligence and analysis tool that can be used by an analytics team to complete their tasks.
In this article, you will be introduced to the Apache Superset platform and its key features. You will learn the process of installation of Apache Superset Dashboards and the steps that need to be followed for creating Superset Dashboards after establishing a connection with a database.
Table of Contents
Introduction to Apache Superset
Apache Superset is a Python-based Business Intelligence platform. It is an open-source tool offered by Airbnb. It is a fast, lightweight, and intuitive tool that enables users to interact with different kinds of data sources and provide actionable insights of the data using visualizations such as Aggregation Charts, Tables, and Interactive Maps. The interface is simple and user-friendly and can be operated by users at any skill level to create simple line charts to complex geospatial charts.
Programming knowledge is not required to enable users to explore, filter, and organise data. Although there is additional support for SQL / IDE editor to facilitate interactive querying for advanced users. The platform allows defining a list of users and default functionalities associated with that group of users. It also provides statistics on the usage of the platform and an action log for every user.
Various Databases and Cloud Data Warehouses are supported by Apache Superset. Most SQL databases such as MySQL, PostgreSQL, Oracle, SQL Server, MariaDB, etc. including Python ORM (Object Relational Mappers) [SQLAlchemy] are supported. Cloud Data Warehouses such as Amazon Redshift, Google BigQuery, and Snowflake are also supported. This platform is also compatible with Apache Druid for a seamless experience.
Official documentation regarding Apache Superset can be found here.
Key Features of Apache Superset
- Simple yet Effective User Interface: Simple no-code visual builder or the SQL IDE can be used to organise and visualize data according to the requirements.
- Latest Architecture: The platform is lightweight and highly scalable that utilizes the power of existing data infrastructure.
- Custom Visualisations and Dashboards: Various beautiful visualizations are available and the plugin architecture makes it easy to build visualizations and add them directly to the Superset platform.
- Secure Permission Model: It supports user access control and database authentication using various security protocols such as OpenID, LDAP, OAuth, REMOTE_USER using Flask AppBuilder.
- Integration with Modern Databases: Multiple SQL-based databases are supported through SQLAlchemy. Various Cloud Data Warehouses are supported at a petabyte scale.
- User Field Control: Control the way data sources are displayed and processed using a simple semantic layer.
Understanding Installation of Apache Superset Dashboards
The following process is required to be followed for a standard installation of Apache Superset using a Linux-based Command Line Interface.
Superset Dashboards Installation Step 1: Installation of Dependencies
Run the following command to install OS-level dependencies:
sudo apt-get install build-essential libssl-dev libffi-dev python-dev
If you have python3.5 installed alongside python2.7, run the following commands also:
sudo apt-get install build-essential libssl-dev libffi-dev python3.5-dev python-pip libsasl2-dev libldap2-dev
Superset Dashboards Installation Step 2: Installation of Setuptools
Run the following command to install the latest version of pip and setup tools libraries,
pip install --upgrade setuptools pip
Superset Dashboards Installation Step 3: Installation on Apache Superset
Run the following commands to install and initialize Apache Superset:
# Install superset
pip install superset
# Create an admin user (you will be prompted to configure username, first and last name before setting up a password)
fabmanager create-admin --app superset
# Initialize the database
superset db upgrade
# Load some data to play with
# Create default roles and permissions
# To start a development web server on port 8088, use -p to bind to another port
superset runserver -d
Superset Dashboards Installation Step 4: Creation of Admin Account
After successful installation, from your browser go to
http://localhost:8088 and create an admin account using the credentials.
For connecting with databases such as MySQL, you will be required to install the database connector using the following command,
install pip install mysqlclient
Superset Dashboards Installation Step 5: Addition of Connection
To add a new connection, firstly login to Apache Superset, then click on “Sources” and subsequently select “Databases”.
Then you need to click the “+ Database” button in the top right corner.
Provide the credentials such as Connection Name and SQLAlchemy URI [Uniform Resource Identifier]. A URI should look similar to this:
Click on the “Test Connection” button to confirm if the connection can be established. On success, click on the “Add” button to save the connection.
Hevo is a No-code Data Pipeline that offers a fully managed solution to set up data integration from 100+ data sources (including 40+ Free Data Sources) and will let you directly load data to a Data Warehouse and visualize it in a BI tool of your choice such as Apache Superset. It will automate your data flow in minutes without writing any line of code. Its fault-tolerant architecture makes sure that your data is secure and consistent. Hevo provides you with a truly efficient and fully automated solution to manage data in real-time and always have analysis-ready data.
Get Started with Hevo for Free
Check out what makes Hevo amazing:
Sign up here for a 14-Day Free Trial!
- Secure: Hevo has a fault-tolerant architecture that ensures that the data is handled in a secure, consistent manner with zero data loss.
- Schema Management: Hevo takes away the tedious task of schema management & automatically detects schema of incoming data and maps it to the destination schema.
- Minimal Learning: Hevo, with its simple and interactive UI, is extremely simple for new customers to work on and perform operations.
- Hevo Is Built To Scale: As the number of sources and the volume of your data grows, Hevo scales horizontally, handling millions of records per minute with minimal latency.
- Incremental Data Load: Hevo allows the transfer of data that has been modified in real-time. This ensures efficient utilization of bandwidth on both ends.
- Live Support: The Hevo team is available round the clock to extend exceptional support to its customers through chat, email, and support calls.
- Live Monitoring: Hevo allows you to monitor the data flow and check where your data is at a particular point in time.
Creating Apache Superset Dashboards
The process for the creation of Apache Superset Dashboards starts with the Data Sources [Databases and Tables] then they proceed with Data Slices and lastly proceed with Dashboards.
Each Dashboard is associated with multiple Slices and Each Slice can be associated with multiple Dashboards.
After a connection with the data source has been established, the following steps are required to be taken for creating a Dashboard:
Superset Dashboards Creation Step 1: Registration of a New Table
You need to start with specific tables [termed as Datasets in Superset] that are supposed to be accessed in Superset Dashboards for querying. Proceed by selecting “Data” then “Datasets” and lastly the “+Dataset” button.
Choose the Database Schema and then the Table of your choice from the options being displayed.
Superset Dashboards Creation Step 2: Customisation of Column Properties
Configuration of the Column Properties is to control how the column should be accessed in the Explore Workflow.
Below you can observe how the access of Temporal, Filterable, and Dimension are controlled in the dataset.
Superset Semantic Layer provides functionality to store 2 types of data as follows:
- Virtual Metrics: SQL queries that use Aggregate values from numerous columns come in this category. Example:
SUM(recovered) / SUM(confirmed);
- Virtual Calculated Columns: SQL queries that customize the appearance and behaviour of a specific column come in this category. Example:
CAST(recovery_rate as FLOAT);
Superset Dashboards Creation Step 3: Creation of Charts on Explore View
There are two main interfaces to explore data in Superset:
- Explore: It is a no-code visual builder in which the user can select a chart and customize the appearance.
- SQL Lab: It is an SQL IDE that can be used for cleaning, joining, and preparing data for Explore Workflow.
Using the Explore builder, you can check the Dataset view on the left side of the window, which displays the columns and metrics scoped to the currently selected dataset.
A Time-series Bar Chart can be easily created by clicking the options in the drop-down menu.
Superset Dashboards Creation Step 4: Creation of Slice and Dashboard
You can either save your chart and add it to the existing Dashboard or to a new Dashboard of your choice.
The chart can be published by clicking “Save & Go To Dashboard”
The chart can be resized using the “Pencil” button, found in the top right corner.
You can also click and drag on the bottom right corner of the chart until it snaps into the position that you desire depending upon the underlying grid.
Click on the “Save” button to save your progress.
Understanding Various Apache Superset Dashboards Operations
You may need to enable the functionality to upload a CSV or Excel file to your database. The following section explains how to enable this functionality for a sample database.
Superset Dashboards Operations: Pivot Table
Pivot Table is a powerful tool to calculate, summarize, and analyze data allowing you to identify patterns and trends in your data and set up comprehensive comparisons. After selecting a Chart from the top right corner, choose the data source and click on the “visualization” type to see the “Visualisation” menu. Then proceed with selecting the Pivot Table visualization.
Enter the parameters on the window according to your requirement keeping into consideration the context of the data being handled.
After addition of the Query conditions in the window, the Pivot Table will be generated like the one shown below.
Superset Dashboards Operations: Line Chart
The line chart is a type of chart which helps display information as a series of data points connected by straight line segments. In Superset, you are required to select the Data Range from the Dataset of your choice. You can also implement certain queries of the Data Range and plot the results on the Line chart.
Next, select Run Query to show the data on the chart.
After you select the Data Range and enter the required Query, click on the “Run Query” button to show the data on the chart.
Superset Dashboards Operations: Markup
Users can incorporate Markups and annotations on the content depending on the requirements of the user. You can add text to your dashboard by navigating to the dashboard using the top menu and then going to edit mode by selecting “Edit Dashboard”.
Then in the “Insert Components” pane, you will be required to drag and drop a Markdown box on the dashboard.
You can edit the text in the textbox in the Markdown format, and also toggle between Edit and Preview mode using the menu located on the top of the window.
Superset Dashboards Operations: Time Comparison
Progress evaluation using Time comparison can be exceptionally crucial for any organization. You need to navigate to the Time Comparison option in the Advanced Analytics section of the Visualization menu. Subsequently, you can run a query to build/generate a new chart that will have additional series with the same values shifted only a week back in time.
You can then change the calculator type to Absolute difference and select Run Query, to see one series at a time showing the difference between the two series.
In this article, you learned about the Apache Superset platform and its key features. You also learned about the process of installing Apache Superset Dashboards using a command-line interface and the steps that need to be followed for creating Superset Dashboards after establishing a connection with a database. Various Operations that can be performed on the Dashboard Data were also discussed. If you are interested in carrying out a similar analysis on the Looker platform, you can find the guide here.
Visit our Website to Explore Hevo
Integrating and analyzing data from a huge set of diverse sources can be challenging, this is where Hevo comes into the picture. Hevo Data, a No-code Data Pipeline helps you transfer data from a source of your choice in a fully automated and secure manner without having to write the code repeatedly. Hevo with its strong integration with 100+ sources & BI tools such as Apache Superset, allows you to not only export & load Data but also transform & enrich your Data & make it analysis-ready in a jiffy.
Get started with Hevo today! Sign Up for a 14-day free trial! and experience the feature-rich Hevo suite first hand. Check out the pricing details to get a better understanding of which plan suits you the most.