Are you trying to derive deeper insights from PostgreSQL by moving the data into a Data Warehouse like Google BigQuery? You have landed on the right article. Replicating data from Postgres to BigQuery has become easier than ever. This article gives you a brief overview of PostgreSQL and Google BigQuery, and then shows you how to set up your Postgres to BigQuery integration using two methods.

The limitations of the manual method are also discussed in later sections. Read along to decide which method of connecting PostgreSQL to BigQuery is best for you.

What Is PostgreSQL?


PostgreSQL, primarily used as an OLTP database, is also a popular tool for analyzing data at scale. Its novel architecture, reliability at scale, robust feature set, and extensibility give it an advantage over other databases.

What Is Google BigQuery?


Google BigQuery is a serverless, cost-effective, highly scalable data warehouse with built-in machine learning capabilities and an in-memory BI Engine for fast analysis. It combines speedy SQL queries with the processing capacity of Google's infrastructure, letting you analyze data from several databases while controlling which users can view and query it.

Simplify Postgres to BigQuery Migrations with Hevo

Take advantage of PostgreSQL’s novel architecture, reliability at scale, and robust feature set by seamlessly connecting it with various destinations like BigQuery using Hevo. Hevo’s no-code platform empowers teams to:

  1. Integrate data from 150+ sources (60+ free sources).
  2. Simplify data mapping and transformations using features like drag-and-drop.
  3. Easily migrate different data types like CSV, JSON, etc., with the auto-mapping feature. 

Join 2000+ happy customers like Whatfix and Thoughtspot, who’ve streamlined their data operations. See why Hevo is the #1 choice for building modern data stacks.

Get Started with Hevo for Free

How To Set Up a PostgreSQL to BigQuery Connection?

Method 1: Using Hevo Data

The steps to load data from PostgreSQL to BigQuery using Hevo Data are as follows:

Step 1: Configure PostgreSQL As Source

  • Connect your PostgreSQL account to Hevo’s platform.
    PostgreSQL Source Connection

    Step 2: Configure BigQuery As Destination

    • Select Google BigQuery as your destination and start moving your data.
BigQuery Destination

    With this, you have successfully set up Postgres to BigQuery replication using Hevo Data.

    Here are more reasons to try Hevo:

• Schema Management: Hevo takes away the tedious task of schema management; it automatically detects the schema of incoming data and maps it to the destination schema.
• Incremental Data Load: Hevo transfers only the data that has been modified, in real time, ensuring efficient utilization of bandwidth on both ends.
• Data Transformation: It provides a simple interface to clean, modify, and enrich the data you want to transfer.

    Method 2: Manual ETL Process to Set Up PostgreSQL to BigQuery Integration

    To execute the following steps, you need a pre-existing database and a table populated with PostgreSQL records. 

    Let’s take a detailed look at each step.

    Step 1: Extract Data From PostgreSQL

The data from PostgreSQL needs to be extracted and exported into a CSV file. To do that, run the following command in psql or your SQL client:

COPY your_table_name TO '/path/to/your_file.csv' CSV HEADER;
    
COPY command output

After the data is successfully exported to a CSV file, you should see a message like the one above on your console.
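Note that COPY ... TO writes the file on the database server and needs the appropriate privileges there. If you only have client access, psql's \copy meta-command produces the same CSV on your local machine. A minimal sketch, assuming a hypothetical table named employees:

\copy employees TO 'employees.csv' CSV HEADER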

    Step 2: Clean and Transform Data

To upload the data to Google BigQuery, the tables and the data need to be compatible with the BigQuery format. Keep the following things in mind while migrating data to BigQuery:

    • BigQuery expects CSV data to be UTF-8 encoded. 
    • BigQuery doesn’t enforce Primary Key and unique key constraints. Your ETL process must do so.
    • Postgres and BigQuery have different column types. However, most of them are convertible.

You can visit the official BigQuery documentation to learn more about its data types.

• DATE values must be dash (-) separated and in the form YYYY-MM-DD (year-month-day).
• Fortunately, the default date format in Postgres is the same, YYYY-MM-DD, so if you are simply selecting date columns they will already be in the correct format.
• The TO_DATE function in PostgreSQL helps in converting string values into dates.
• If the data is stored as a string in the table for any reason, it can be converted while selecting data (see the export sketch after this list).
Syntax : TO_DATE(text, format)
Example : SELECT TO_DATE('31,12,1999', 'DD,MM,YYYY');
Result : 1999-12-31
• In the TIMESTAMP type, the hh:mm:ss (hour-minute-second) portion must use a colon (:) separator.
• Similar to the DATE type, the TO_TIMESTAMP function in PostgreSQL is used to convert strings into timestamps.
Syntax : TO_TIMESTAMP(text, format)
Example : SELECT TO_TIMESTAMP('2017-03-31 9:30:20', 'YYYY-MM-DD HH:MI:SS');
Result : 2017-03-31 09:30:20-07
• Make sure text columns are quoted if they can potentially contain delimiter characters.
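Putting these conversions together, you can apply them directly inside the COPY statement at export time. A minimal sketch, assuming a hypothetical employees table whose hire_date is stored as text and whose last_login is a timestamp; adjust column names, formats, and the output path to your own schema:

COPY (
  SELECT
    emp_id,
    emp_name,
    TO_DATE(hire_date, 'DD,MM,YYYY') AS hire_date,               -- text -> DATE (exported as YYYY-MM-DD with the default ISO DateStyle)
    TO_CHAR(last_login, 'YYYY-MM-DD HH24:MI:SS') AS last_login   -- timestamp -> BigQuery-friendly string
  FROM employees
) TO '/tmp/employees.csv' WITH (FORMAT CSV, HEADER, ENCODING 'UTF8');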

Step 3: Upload to a Google Cloud Storage (GCS) bucket

If you haven't already, you need to create a storage bucket in Google Cloud for the next step.

Step 3.1: Go to your Google Cloud console and select Cloud Storage → Buckets.

    Google Cloud Platform

Step 3.2: Select a bucket from your existing list of buckets. If you do not already have one, create a new bucket.

     GCP buckets

    Step 3.3: Upload your .csv file into the bucket by clicking the upload file option. Select the file that you want to upload.

    GCP file upload
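If you prefer the command line, the gsutil tool bundled with the Google Cloud SDK can create the bucket and upload the file. A minimal sketch, assuming a hypothetical bucket name and the CSV exported earlier:

gsutil mb -l US gs://your-bucket-name                  # create the bucket (skip if it already exists)
gsutil cp /tmp/employees.csv gs://your-bucket-name/    # upload the exported CSV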

    Step 4: Upload to BigQuery table from GCS

Step 4.1: Go to the Google Cloud console and select BigQuery from the dropdown. Select the Project ID you want to work with and select Create Dataset.

    BigQuery console

    Step 4.2: Provide the configuration per your requirements and create the dataset.

     Creating Dataset in BigQuery

    Your dataset should be successfully created after this process.
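Alternatively, the same dataset can be created from the command line with the bq tool. A minimal sketch; the project ID, dataset name, and location below are placeholders:

bq --location=US mk --dataset your_project_id:your_dataset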

    Step 4.3: To create a table, select the project ID of the dataset, choose the newly created dataset, and click ‘Create Table’ from the side menu.

    Creating Table in BigQuery

Step 4.4: To create the table, select Google Cloud Storage as the source. Next, select the correct GCS bucket and the .csv file in it. Then, choose the file format that matches the uploaded file, which in this case is CSV. Provide a name for your table in the BigQuery dataset. Enable schema auto-detection (auto mapping) if you want to migrate the data as is.

     Configuration for creating table
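If you would rather skip the console, the bq command-line tool can load the CSV from GCS into a new table in one step. A minimal sketch, with placeholder project, dataset, table, and bucket names, and schema auto-detection standing in for the console's auto mapping:

bq load --source_format=CSV --skip_leading_rows=1 --autodetect your_project_id:your_dataset.emp gs://your-bucket-name/employees.csv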

Step 4.5: Your table should now be created and loaded with the data from PostgreSQL.

    Table Preview

    Step 5: Query the table in BigQuery

After loading the table into BigQuery, you can query it by selecting the QUERY option above the table and writing standard SQL.

    Query table option

    Note: Mention the correct project ID, dataset name, and table name.

    Query

    The above query extracts records from the emp table where the job is manager.
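For reference, a query of that shape looks like the following (the project ID and dataset name are placeholders based on this walkthrough's emp example):

SELECT * FROM `your_project_id.your_dataset.emp` WHERE job = 'MANAGER';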

    Limitations of the Manual Method:

    1. The manual migration process can be time-consuming, requiring significant effort to export, transform, and load data, especially if the dataset is large or complex.
    2. Manual processes are susceptible to human errors, such as incorrect data export settings, file handling mistakes, or misconfigurations during import.
    3. If the migration needs to be performed regularly or involves multiple tables and datasets, the repetitive nature of manual processes can lead to inefficiency and increased workload.
    4. Manual migrations can be resource-intensive, consuming significant computational and human resources, which could be utilized for other critical tasks.

    Conclusion

Migrating data from PostgreSQL to BigQuery manually can be complex, but automated data pipeline tools can significantly simplify the process. We've discussed two methods for moving data from PostgreSQL to BigQuery: the manual process, which requires considerable configuration and effort, and automated tools like Hevo Data. Whichever approach you choose, following the steps outlined in this guide will help ensure a successful migration.

    With Hevo Data, you can seamlessly automate the entire migration process, eliminating the need for complex setups. Sign up for a 14-day free trial and experience the feature-rich Hevo suite firsthand.

      FAQ on PostgreSQL to BigQuery

      How do you transfer data from Postgres to BigQuery?

      To transfer data from PostgreSQL to BigQuery, export your PostgreSQL data to a format like CSV or JSON, then use BigQuery’s data import tools or APIs to load the data into BigQuery tables.

      Can I use PostgreSQL in BigQuery?

      No, BigQuery does not natively support PostgreSQL as a database engine. It is a separate service with its own architecture and SQL dialect optimized for large-scale analytics and data warehousing.

      Can PostgreSQL be used for Big Data?

      Yes, PostgreSQL can handle large datasets and complex queries effectively, making it suitable for big data applications.

      How do you migrate data from Postgres to Oracle?

To migrate data from PostgreSQL to Oracle, export your PostgreSQL data as CSV files or SQL scripts (for example, with COPY or pg_dump), then load them into Oracle using SQL*Loader or SQL Developer.

      Skand Agrawal
      Customer Experience Engineer, Hevo Data

      Skand is a dedicated Customer Experience Engineer at Hevo Data, specializing in MySQL, Postgres, and REST APIs. With three years of experience, he efficiently troubleshoots customer issues, contributes to the knowledge base and SOPs, and assists customers in achieving their use cases through Hevo's platform.