PostgreSQL, also known as Postgres, is an RDBMS (this stands for advanced Relational Database Management System); it allows you to collect any data effortlessly and securely. 

  • It’s a mainstream choice for many projects regardless of the size and enjoys the benefit of being consistent and having great features such as reliable transaction and concurrency with no read locks, which makes it popular.
  • Some of the reasons PostgreSQL went mainstream are due to the proven Architecture, Data Integrity, Reliability, Extensibility, Robust feature set, and, not to forget, its dedication to the Open-Source core platform development community.
  • This made it easy to deliver continuous performance development and innovative solutions. 
  • GitHub is an App Development platform that allows you to host and review program Codes, manage Projects, and design Software alongside millions of Developers.
  • Developers choose GitHub since it makes available the significant DevOps features that organizations need for their projects, regardless of the size. 
  • This article teaches two easy methods to perform a GitHub Postgres ETL. So let’s start!

Methods to Set up GitHub Postgres ETL

Let us start with the manual method of ETL

Method 1: Manual GitHub Postgres ETL

Step 1: PostgreSQL Installation

Before you begin setting up GitHub Postgres ETL, you will have to install it. Follow these steps to get started.

Step 1: There are several ways to install PostgreSQL. One of the easier and more popular ways among programmers to get started is with Postgres.app

PostgreSQL Github: app download | Hevo Data

Step 2: Once you’ve downloaded it on your system, double-tap to unzip the file and then move the PostgreSQL elephant symbol into your app folder. Double-tap the symbol to launch the app.

Step 3: You should see another window launched that shows “Welcome to Postgres“. If it says that it can’t connect with the Postgres server, the DEFAULT port is presumably already being used. Ensure you don’t have some other instances of Postgres on your PC. Uninstall them if you do so, and continue with these steps. Click the “Open psql” button:

PostgreSQL Github: opening app | Hevo Data

Step 4: Postgres.app will create a role and database by default that match your current macOS username. Then proceed to connect straight away by running psql.

Step 5: You will then see something in your terminal looking like this (with your macOS username in front of the prompt rather than ‘Postgres):

PostgreSQL Github: cmd window | Hevo Data

Step 6: You can start using PostgreSQL now. For documentation on command-line tools etc. visit http://postgresapp.com/documentation/

2) Installing PostgreSQL on Ubuntu

Ubuntu’s default repositories contain Postgres packages, so we can install these easily using the apt packaging system. Since this is our first time using apt in this session, you need to refresh your local package index. You can then install the Postgres package and a -contrib package that adds some additional utilities and functionalities:

sudo apt-get update
sudo apt-get install PostgreSQL PostgreSQL-contrib

By default the only role created is the default ‘Postgres’, so PostgreSQL will only respond to connections from an Ubuntu user called ‘Postgres’

Now that your software is installed, you can go over how it works and how it may be different from similar Database Management Systems you may have used.

You’ve come across both operating systems to integrate PostgreSQL and GitHub in this post. However, you may use any of the environments available. For continuous integration, you can use GitHub Actions (PowerShell (pwsh)). GitHub Actions are an excellent mechanism for continuous integration or other automated tasks on your repo and a perfect fit for integrating PostgreSQL. 

Besides, you will be using the latest GitHub Actions Virtual Environments that already has PostgreSQL installed by default after installation. 

Each of the GitHub Actions in this guide will:

  • Start PostgreSQL and check if PostgreSQL GitHub is integrated and running.
  • Create a special user scheduler.
  • Create a test database timetable.

Of course, you may want to add more steps in real life, e.g., import test data, checkout, build, test, gather coverage, release, etc.

Step 2: Integrating GitHub and PostgreSQL

1) PostgreSQL GitHub Actions for MacOS

To integrate GitHub and PostgreSQL using GitHub Actions on MacOS, write the following lines of code on the PostgreSQL interface of the recently installed application.

Setup-PostgreSQL-macOS:
  If: true # false to skip job during debug
  name: Setup PostgreSQL on MacOS
  runs-on: macos-latest
  steps:
  - name: Start PostgreSQL on MacOS
    run: |
      brew services start postgresql
      echo "Check PostgreSQL service is running"
      i=10
      COMMAND='pg_isready'
      while [ $i -gt 0 ]; do
          echo "Check PostgreSQL service status"
          eval $COMMAND && break
          ((I--))
          If [$i == 0]; then
              echo "PostgreSQL service not ready, all attempts exhausted"
              exit 1
          fi
          echo "PostgreSQL service not ready, wait 10 more sec, attempts left: $i"
          sleep 10
      done
  # Homebrew creates an account with the same name as the installing user, but no password
  - name: Create scheduler user
    run: |
      psql --command="CREATE USER scheduler PASSWORD 'somestrong'" --command="du" postgres
  - name: Create timetable database
    run: |
      createdb --owner=scheduler timetable
      PGPASSWORD=somestrong psql --username=scheduler --host=localhost --list timetable 

There are only a few differences between Ubuntu and macOS:

  • Use brew services to start the server.
  • The pg_is ready procedure is more complicated.

As you probably noticed, you may skip sudo -u Postgres prefixes since the current user has all the rights needed in this environment.

2) PostgreSQL GitHub Actions for Ubuntu
setup-postgresql-ubuntu:
  if: true # false to skip job during debug
  name: Setup PostgreSQL on Ubuntu
  runs-on: ubuntu-latest
  steps:
  - name: Start PostgreSQL on Ubuntu
    run: |
      sudo systemctl start postgresql.service
      pg_isready
  - name: Create scheduler user
    run: |
      sudo -u postgres psql --command="CREATE USER scheduler PASSWORD 'somestrong'" --command="du"
  - name: Create timetable database
    run: |
      sudo -u postgres createdb --owner=scheduler timetable
      PGPASSWORD=somestrong psql --username=scheduler --host=localhost --list timetable

Nothing unusual here for Ubuntu users. You can use systemctl to start PostgreSQL and the pg_is ready utility to check if the server is running. To create a scheduler user, you can use a psql client in non-interactive mode. You can then send two commands to it:

  • CREATE USER …;
  • du — list users.

First, create the user. Second, output the list of users for control.

To create a timetable database, you can use the createdb utility. Pay attention to the fact that sudo -u Postgres allows you to not specify connection credentials because a system user is allowed to connect locally without any restrictions. Then, just like in the previous step, list the databases with psql for control. You have successfully integrated PostgreSQL GitHub.

3) PostgreSQL GitHub Action for Windows

To integrate PostgreSQL GitHub on the Windows operating system, please write the following commands on the PostgreSQL Windows OS interface. 

Setup-postgresql-windows:
  if: true # false to skip job during debug
  name: Setup PostgreSQL on Windows
  runs-on: windows-latest
  steps:
  - name: Start PostgreSQL on Windows
    run: |
      $pgService = Get-Service -Name postgresql*
      Set-Service -InputObject $pgService -Status running -StartupType automatic
      Start-Process -FilePath "$env:PGBINpg_isready" -Wait -PassThru
  - name: Create scheduler user on Windows
    run: |
      & $env:PGBINpsql --command="CREATE USER scheduler PASSWORD 'somestrong'" --command="du"
  - name: Create timetable database
    run: |
      & $env:PGBINcreatedb --owner=scheduler timetable
      $env:PGPASSWORD = 'somestrong'
      & $env:PGBINpsql --username=scheduler --host=localhost --list timetable 

With Windows, everything is different, but not as scary as haters usually paint it to be:

  • The Windows environment uses PowerShell (pwsh) by default.
  • Windows administers services.

The code is straightforward. The only thing we want to highlight is the ampersand sign (& aka call operator) before some commands. The call operator (&) allows you to execute a command, script, or function.

Since the PostgreSQL utilities psql and createdb are not in the PATH, you need to specify their full path. You use the environmental variable PGBIN for that. But after the concatenation operation, you’ll have a regular string. Therefore, a call operator will allow you to execute a resulting command.

Challenges of Setting up GitHub Postgres ETL Manually

  • For PostgreSQL, remembering psql commands can be challenging. You can try to decode them. For instance, dt – describe tables, du – describe users, etc.
  • Make sure you are using the recent virtual environment for your PostgreSQL installation.

Method 2: Using Hevo to Set up PostgreSQL GitHub Integration

Hevo helps you directly transfer data from various sources to PostgreSQL, Business Intelligence tools, Data Warehouses, or a destination of your choice in a completely hassle-free & automated manner.

The following steps can be implemented to set up GitHub Postgres Integration using Hevo:

Configure Source: Hevo can bring data from your GitHub account into your Destination for Free. Hevo connects to GitHub through Webhooks. Add Webhook URL to your GitHub Account, copy the generated webhook URL, and add it to your GitHub account. 

Integrate Data: Complete GitHub Postgres Integration by providing your PostgreSQL database credentials, such as your authorized Username and Password, along with information about your Host IP Address and Port Number value. You will also need to provide a name for your database and a unique name for this destination.

Conclusion

  • You have now carried out a continuous integration between PostgreSQL GitHub using GitHub (PowerShell (pwsh) actions on the latest GitHub Actions Virtual Environments, with the latest PostgreSQL installed by default.
  • Be that as it may, there is still a lot more to learn with Postgres and there’s a much simpler method of integrating PostgreSQL GitHub using the Hevo platform. 

Want to take Hevo for a spin?

[hevoButton text=”SIGN UP and experience the feature-rich Hevo suite first hand.

You can also look at pricing that is adaptable to your needs. Clear, simple, and predictable. Pick the ideal plan for your team based on your requirements.

Share your experience of learning the methods of GitHub Postgres ETL in the comments!

Samuel Salimon
Technical Content Writer, Hevo Data

Samuel is a versatile writer specializing in the data industry. With over seven years of experience, he excels in data science, data integration, and data analysis, crafting engaging content on these topics. He is also adept at WordPress development. Samuel holds a Bachelor's degree in Computer Science from Lagos State University.

No-code Data Pipeline For PostgreSQL