Connecting GitHub to Metabase: A Comprehensive Guide

By: Ofem Eteng | Published: July 14, 2020

The purpose of this article is to help you understand how to connect GitHub to Metabase. To do that, you will first be introduced to the concepts of data analytics and business intelligence, and the roles they can play within your organization. A brief overview will also be provided for the two main tools you will use in this article: GitHub and Metabase.

By the end of this article, you will have a firm understanding of how to use Metabase as a business intelligence tool to monitor changes in your GitHub code repositories through the GitHub API and Hevo. You will also understand some of the benefits of using an easily accessible business intelligence tool to lower the barrier to data insights, so that non-technical staff can also reap its benefits. This article assumes that the reader is familiar with the basic usage of Git and the command line.

What is Metabase?

Metabase is an open-source business intelligence tool that gives anyone access to data (typically in a database), the ability to ask questions of that data, and the means to draw insights that summarize it. You can then present the results in a new and digestible form, or use graphs and charts to plot the statistics associated with that data. The main use of Metabase is as a data analytics and presentation layer over data stored in databases.

With Metabase, users can use a simple, intuitive user interface (UI) to run queries that return clear results, which can be formatted in several ways. Some of the presentation options available are tables, line graphs, pie charts, bar charts, and maps. Another benefit of Metabase is that users do not need to be overly technical or skilled in interacting with databases using a declarative language like SQL to run queries. However, there is provision for power users who may want to run complex SQL queries, as Metabase includes a SQL query editor that can be accessed directly from the UI.

For organizations that want to give all members access to data insights, Metabase is a particularly good choice, as dashboards can be created that show Key Performance Indicators (KPIs), metrics, and other information that the organization may want to track periodically. Once such a dashboard is built, the data it displays does not need to be manually updated to show the most current version, as changes in the database to which Metabase is connected are reflected automatically. This means that the data team will not need to answer the same kinds of questions over and over again.

As Metabase is easy to use, members across the organization can be encouraged to explore their questions using Metabase and share their findings with their colleagues and teams. Metabase is open-source and freely available for anyone to use and host. The team behind Metabase also offers an enterprise solution for organizations that may want to host Metabase on-premise to safeguard proprietary data, or to limit access to certain data and dashboards based on access rights within the enterprise. This solution also provides support services. However, it is important to note that Metabase can also be hosted on cloud platforms like AWS, GCP, and Azure, with the attendant technical know-how required to run and scale such deployments.

2 Easy Methods to Connect GitHub to Metabase

There are two convenient methods to connect your data from GitHub to Metabase:

  1. Connect GitHub to Metabase Manually
    To connect GitHub to Metabase manually, you will use an experimental Metabase HTTP driver that allows the use of a RESTful API as a data source. This method requires engineering skills and expertise.
  2. Connect GitHub to Metabase Using Hevo
    You can automate your data flow using Hevo. It is a fully automated platform and requires zero engineering skills from your side. It efficiently transfers GitHub data to Metabase for free.

What is GitHub?

In the world of software development, GitHub occupies a prominent position as the market leader in code hosting. GitHub is a code hosting platform that offers built-in version control using Git, an open-source distributed version control system that is used for projects large and small. Git is used by major software companies, startups, and any organization or individual that wants to track source code changes. It is by far the most popular version control system in use today.

GitHub wraps Git and provides some additional features such as a hosted environment, advanced collaboration tools, and agile-style process implementations that make working on software development with a distributed team very efficient. Many open-source and enterprise projects are hosted on GitHub, as it has both free and paid tiers. Just as Git is the most popular version control system in the world, GitHub is the most popular Git-enabled collaboration platform.

In this article, you will make use of GitHub’s API, which provides access to information about GitHub projects, users, repositories, issues, pull requests, teams, etc. For this specific example, you will use the users endpoint, which gives information about the repositories owned by a user. A user, in this case, is an organization or individual that owns a GitHub account.
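As a rough illustration of the users endpoint described above, the following Python sketch builds (but does not send) the request for a user's repositories. The helper name `build_repos_request` is hypothetical, and the `Accept` media type is an assumption based on GitHub's REST API conventions; only the standard library is used.

```python
from urllib.parse import quote
from urllib.request import Request

GITHUB_API = "https://api.github.com"


def build_repos_request(user: str) -> Request:
    """Build (without sending) a GET request for a user's public repositories.

    Hypothetical helper for illustration; it is not part of GitHub or Metabase.
    """
    # Escape the user name so it is safe to place in a URL path segment.
    url = f"{GITHUB_API}/users/{quote(user, safe='')}/repos"
    # Assumed media type: ask GitHub for its JSON representation.
    return Request(url, headers={"Accept": "application/vnd.github+json"})


request = build_repos_request("metabase")
print(request.get_method(), request.full_url)
# prints: GET https://api.github.com/users/metabase/repos
```

Passing the returned object to `urllib.request.urlopen` would send the request; the response body is a JSON array with one object per repository.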

What is Data Analytics?

Data analytics can be defined as the process of analyzing data in its raw form so that insights or conclusions can be drawn, which can then inform decision making. The importance of data analytics has grown in today’s world, which is increasingly data-driven. Data is being produced at phenomenal levels, and the information gleaned from this data can be the differentiator between a company and its competitors.

There are several kinds of data analytics, namely descriptive, diagnostic, predictive, and prescriptive analytics. Descriptive analytics mainly deals with the summarization of historical data to better understand changes that have occurred over time. This article deals mainly with descriptive analytics using Metabase as the tool of choice. The other forms of analytics are beyond the scope of this article.

Benefits of Visualizing Data

As described earlier, descriptive analytics is about summarizing historical data so that trends can be detected. This is done most effectively through visualization. Humans are visual creatures, and showing data using visualization techniques like charts, maps, or tables makes it easier to reason about the data. Trends that are not visible when looking at raw numbers suddenly pop off the page when they are put into visualizations. Therefore, an integral part of descriptive analytics is the visual presentation of findings.

Metabase makes visualizing the answers to questions about your data simple and straightforward. You can choose, from a wide selection of visualization types, the one that best describes your metric of interest. Below is an example of a dashboard created using Metabase that contains various visualizations.

GitHub to Metabase: Data visualization
Source: https://tentacode.dev/easy-charting-with-metabase

Connect GitHub to Metabase Manually

Metabase was primarily built to support traditional databases like MySQL, PostgreSQL, and SQL Server, in addition to cloud offerings like Amazon Athena, Google BigQuery, and Snowflake, as its data sources. However, there are community extensions that provide access to other types of data sources. In this section of the tutorial, you are going to use the experimental Metabase HTTP Driver that allows the use of a RESTful API as a data source.

To connect GitHub to Metabase, you will use the HTTP driver to access the GitHub API and feed that in as a data source to Metabase. This section assumes that you have access to an installed Metabase instance, be that a local installation or a cloud-hosted installation. It is also assumed that you have a GitHub account because you will be querying the GitHub API as an authenticated user. The GitHub API permits unauthenticated requests, but you should be aware that the limit on the number of requests per hour is significantly lower than for authenticated requests.
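To make the rate-limit point concrete: GitHub reports the current quota in response headers. The sketch below assumes the `X-RateLimit-Limit`, `X-RateLimit-Remaining`, and `X-RateLimit-Reset` headers that GitHub documents for its REST API, and turns them into a small summary. The header values are made-up sample data, and `summarize_rate_limit` is a hypothetical helper.

```python
from datetime import datetime, timezone


def summarize_rate_limit(headers):
    """Summarize GitHub rate-limit headers (hypothetical helper).

    `headers` is a plain dict here, so lookups are case-sensitive; a real
    HTTP response object usually matches header names case-insensitively.
    """
    limit = int(headers["X-RateLimit-Limit"])
    remaining = int(headers["X-RateLimit-Remaining"])
    # The reset header is a Unix timestamp (seconds since the epoch, UTC).
    resets_at = datetime.fromtimestamp(
        int(headers["X-RateLimit-Reset"]), tz=timezone.utc
    )
    return {
        "limit": limit,
        "remaining": remaining,
        "used": limit - remaining,
        "resets_at": resets_at.isoformat(),
    }


# Illustrative values only; they were not captured from a real response.
sample_headers = {
    "X-RateLimit-Limit": "60",
    "X-RateLimit-Remaining": "57",
    "X-RateLimit-Reset": "1594728000",
}
summary = summarize_rate_limit(sample_headers)
print(summary)
```

Comparing the `limit` value with and without a token is a quick way to confirm that a request is actually being treated as authenticated.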

The first step in connecting GitHub to Metabase is to install the HTTP Driver. Since this is a community-supported solution and not an official Metabase driver, the installation involves several manual steps.

Step 1: You will first clone the Metabase repository. It is assumed that you have Git installed. Fire up a terminal, and issue the following command:

git clone https://github.com/metabase/metabase.git

Step 2: Next, change the directory to the folder in which you cloned the Metabase Git repository.

cd /path/to/cloned_metabase_repository

Step 3: Run the following command to install prerequisites for building drivers: 

lein install-for-building-drivers

Step 4: Now, clone the HTTP Driver repository like so:

git clone https://github.com/tlrobinson/metabase-http-driver.git

Step 5: Next, change the directory to the HTTP driver repository that you just cloned and run the following commands to build the HTTP driver:

lein clean
DEBUG=1 LEIN_SNAPSHOTS_IN_RELEASE=true lein uberjar

Step 6: The last step of the setup process is to copy the output of the build step, which is a .jar file, to a plugins folder that you will create within the Metabase directory, and then restart Metabase. The commands to do so are shown below:

mkdir -p /path/to/metabase/plugins/
cp target/uberjar/http.metabase-driver.jar /path/to/metabase/plugins/
java -jar /path/to/metabase/metabase.jar
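If the driver does not show up after the restart, a quick check is whether the .jar file actually landed in the plugins folder. The following is a minimal Python sketch of such a check; `driver_installed` is a hypothetical helper, and the demo runs against a temporary directory rather than a real Metabase installation.

```python
import tempfile
from pathlib import Path

# File name produced by the build step above.
DRIVER_JAR = "http.metabase-driver.jar"


def driver_installed(plugins_dir):
    """Return True if the HTTP driver jar exists in the given plugins folder."""
    return (Path(plugins_dir) / DRIVER_JAR).is_file()


# Demo against a throwaway directory standing in for /path/to/metabase/plugins.
with tempfile.TemporaryDirectory() as tmp:
    before = driver_installed(tmp)     # nothing has been copied yet
    (Path(tmp) / DRIVER_JAR).touch()   # simulate the cp command
    after = driver_installed(tmp)

print(before, after)
```

Against a real installation, the call would be `driver_installed("/path/to/metabase/plugins")`.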

Step 7: The setup is now complete, and you are ready to start using the HTTP driver within Metabase. The HTTP driver currently supports only REST APIs that utilize JSON. To issue a query, within Metabase, use the “native” query editor. A sample query is shown below:

{
  "url": "your_api_endpoint",
  "method": "POST",
  "headers": {
    "Authorization": "YOUR_AUTH_TOKEN"
  },
  "body": {
    "foo": "bar"
  }
}

Step 8: Let us now construct a simple query that will hit the GitHub API for a particular user and list the public repositories associated with that user. For fun, we will use Metabase as the user. The full query is shown below:

{
  "url": "https://api.github.com/users/metabase/repos",
  "method": "GET",
  "headers": {
    "Authorization": "token YOUR_GITHUB_TOKEN"
  }
}
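If you prefer to generate such queries rather than type them by hand, a small script can assemble the JSON. The sketch below is not part of the HTTP driver: `build_driver_query` is a hypothetical name, it emits only the `url`, `method`, and `headers` keys shown above, and it assumes GitHub's `Authorization: token ...` header scheme.

```python
import json


def build_driver_query(url, method="GET", token=None):
    """Return a JSON query string for the native query editor (hypothetical helper)."""
    query = {"url": url, "method": method}
    if token:
        # Assumption: GitHub accepts personal access tokens in this form.
        query["headers"] = {"Authorization": f"token {token}"}
    return json.dumps(query, indent=2)


query_text = build_driver_query(
    "https://api.github.com/users/metabase/repos",
    token="YOUR_GITHUB_TOKEN",  # placeholder, not a real token
)
print(query_text)
```

Because the output comes from `json.dumps`, it is always valid JSON, which avoids the stray-comma and quoting mistakes that are easy to make when editing a query by hand.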

If you do not have a GitHub account or do not want to issue the query as an authenticated GitHub user, you can use the simplified query below:

{"url": "https://api.github.com/users/metabase/repos"}

The sample results from the above query can be seen below:

GitHub to Metabase: Query Result
Source: https://github.com/metabase/metabase/pull/7047

With this setup, you can try out other GitHub API endpoints, ask more questions of your GitHub data, and play around with the presentation styles.
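To give a feel for the kind of descriptive question you might ask of this data, here is a minimal Python sketch that summarizes a list of repository records by language. The records are invented sample data, not real API output, although the field names (`name`, `language`, `stargazers_count`, `fork`) follow the repository objects the GitHub API returns; `summarize_repos` is a hypothetical helper.

```python
from collections import Counter

# Invented sample records shaped like GitHub repository objects.
sample_repos = [
    {"name": "repo-a", "language": "Clojure", "stargazers_count": 120, "fork": False},
    {"name": "repo-b", "language": "JavaScript", "stargazers_count": 45, "fork": False},
    {"name": "repo-c", "language": "Clojure", "stargazers_count": 8, "fork": True},
    {"name": "repo-d", "language": None, "stargazers_count": 0, "fork": False},
]


def summarize_repos(repos):
    """Count non-fork repositories and stars, grouped by primary language."""
    own = [repo for repo in repos if not repo["fork"]]
    # GitHub reports a null language for repositories it cannot classify.
    by_language = Counter(repo["language"] or "Unknown" for repo in own)
    return {
        "repositories": len(own),
        "stars": sum(repo["stargazers_count"] for repo in own),
        "by_language": dict(by_language),
    }


summary = summarize_repos(sample_repos)
print(summary)
```

In Metabase itself, the same summary would typically be a grouped count displayed as a bar or pie chart rather than a script.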

Connect GitHub to Metabase Using Hevo

Hevo automates your data flow in minutes. It connects with a range of marketing applications and pulls data in easily. It provides a single source of truth for your marketing data. It makes sure that you have access to the most accurate and real-time data without any coding. It transfers GitHub data to any other destination for free.

You can connect GitHub to Metabase using the following steps:

  • Connect: Connect your GitHub data with Hevo just by providing the credentials.
  • Integrate: Integrate your data from multiple sources, store it in our No-Code Data Integration Platform, and make it analytics-ready.
  • Analyze: Connect and visualize your unified data in Metabase and derive actionable insights in minutes.

Let’s talk about some amazing features of Hevo:

  • Simple: Hevo offers a simple and intuitive user interface. It can be set up in minutes and has a minimal learning curve.
  • Fully Automated: Hevo can automate your data flow without you writing any custom code. It will detect any errors in the data.
  • Zero Maintenance: Hevo requires no maintenance from your side. Set it up once, and you are ready to go.
  • Data Transformations: Hevo provides a simple interface to clean, transform, and enrich your data before moving it to your desired destination.
  • Real-Time: Hevo provides data migration in real time so that your data is always analysis-ready.
  • Secure: Hevo makes sure that your data is safe and secure by offering two-factor authentication and end-to-end encryption.

Excited to use Hevo? If yes, then Sign Up for a 14-day free trial today.

Conclusion

You have come to the end of this article. In this article, you were introduced to two prominent tools used extensively by software development teams and organizations around the world – GitHub and Metabase.

You were also introduced to data analytics, particularly descriptive analytics, how it can be used to analyze data, and the visualization techniques adopted to present that data. Thereafter, you took a whirlwind tour of the steps involved in setting up Metabase to allow it to consume APIs as a data source. You then used the GitHub API to bring GitHub data into Metabase for analysis. That is a lot of steps to take in, especially the portion that dealt with adding an HTTP driver to Metabase. You may be wondering whether there is a more convenient way of achieving all that has been discussed above, and the answer is that there is: Hevo, an integrated analytics platform that can act as a data warehouse through which you can load your GitHub data and send it to Metabase for free.

Give Hevo a try: sign up for a 14-day free trial today.

Share your experience of connecting GitHub to Metabase in the comment section below.

Ofem Eteng
Freelance Technical Content Writer, Hevo Data

Ofem is a freelance writer specializing in data-related topics, with expertise in translating complex concepts and a focus on data science, analytics, and emerging technologies.
