Are you trying to derive deeper insights from PostgreSQL by moving the data into a Data Warehouse like Google BigQuery? Well, you have landed on the right article. Now, it has become easier to replicate data from Postgres to BigQuery. This article will give you a brief overview of PostgreSQL and Google BigQuery. You will also get to know how you can set up your Postgres to BigQuery integration using 2 methods.
Moreover, the limitations in the case of the manual method will also be discussed in further sections. Read along to decide which method of connecting PostgreSQL to BigQuery is best for you.
What Is PostgreSQL?
PostgreSQL, primarily used as an OLTP database, is also a popular tool for analyzing data at scale. Its novel architecture, reliability at scale, robust feature set, and extensibility give it an advantage over other databases.
What Is Google BigQuery?
Google BigQuery is a serverless, cost-effective, highly scalable data warehouse platform with built-in machine learning capabilities; fast in-memory analysis is handled by its built-in BI Engine. BigQuery combines speedy SQL queries with the processing power of Google’s infrastructure, letting you manage data from several databases and set access controls that determine which users can view and query it.
Take advantage of PostgreSQL’s novel architecture, reliability at scale, and robust feature set by seamlessly connecting it with various destinations like BigQuery using Hevo. Hevo’s no-code platform empowers teams to:
- Integrate data from 150+ sources (60+ free sources).
- Simplify data mapping and transformations using features like drag-and-drop.
- Easily migrate different data types like CSV, JSON, etc., with the auto-mapping feature.
Join 2000+ happy customers like Whatfix and Thoughtspot, who’ve streamlined their data operations. See why Hevo is the #1 choice for building modern data stacks.
Get Started with Hevo for Free
How To Set Up a PostgreSQL to BigQuery Connection?
Method 1: Using Hevo Data
The steps to load data from PostgreSQL to BigQuery using Hevo Data are as follows:
Step 1: Configure PostgreSQL As Source
- Connect your PostgreSQL account to Hevo’s platform.
Step 2: Configure BigQuery As Destination
- Select Google BigQuery as your destination and start moving your data.
With this, you have successfully set up Postgres to BigQuery replication using Hevo Data.
Here are more reasons to try Hevo:
- Schema Management: Hevo takes away the tedious task of schema management & automatically detects the schema of incoming data and maps it to the destination schema.
- Incremental Data Load: Hevo allows the transfer of data that has been modified in real-time. This ensures efficient utilization of bandwidth on both ends.
- Data Transformation: It provides a simple interface to perfect, modify, and enrich the data you want to transfer.
Move Data from PostgreSQL to BigQuery
Move Data from MySQL to BigQuery
Move Data from Google Ads to BigQuery
Move Data from MongoDB to BigQuery
Method 2: Manual ETL Process to Set Up PostgreSQL to BigQuery Integration
To execute the following steps, you need a pre-existing database and a table populated with PostgreSQL records.
Let’s take a detailed look at each step.
Step 1: Extract Data From PostgreSQL
The data from PostgreSQL needs to be extracted and exported into a CSV file. To do that, run the following command in a PostgreSQL client such as psql:
COPY your_table_name TO '/path/to/your_file.csv' WITH (FORMAT CSV, HEADER);
Note that COPY writes to the server’s filesystem; if you are connecting from a remote psql session, use the \copy meta-command instead to write the file to your local machine. After the command completes, the client prints a row count (e.g., COPY 1000) confirming how many rows were exported.
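The same export can be scripted. Below is a minimal Python sketch using psycopg2 (the connection string, table name, and output path are hypothetical placeholders); `COPY ... TO STDOUT` streams rows to a local file, which sidesteps the server-filesystem restriction of a plain `COPY ... TO 'file'`:

```python
# Sketch: export a PostgreSQL table to a local CSV file with psycopg2.
# The DSN and table name are illustrative placeholders, not real endpoints.
EXPORT_SQL = (
    "COPY (SELECT * FROM your_table_name) TO STDOUT "
    "WITH (FORMAT CSV, HEADER, ENCODING 'UTF8')"
)

def export_table_to_csv(dsn, out_path, sql=EXPORT_SQL):
    """Stream query results to a local CSV file via COPY ... TO STDOUT."""
    import psycopg2  # deferred import: requires `pip install psycopg2-binary`
    with psycopg2.connect(dsn) as conn:
        with conn.cursor() as cur, open(out_path, "wb") as f:
            cur.copy_expert(sql, f)  # psycopg2 streams the CSV into the file

# Example call (needs a live server):
# export_table_to_csv("postgresql://user:pass@localhost:5432/mydb", "export.csv")
```

Exporting through a query (`COPY (SELECT ...)`) rather than a bare table name also lets you apply the date and type transformations from the next step directly in SQL.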
Step 2: Clean and Transform Data
To upload the data to Google BigQuery, your tables and data need to be compatible with the BigQuery format. Keep the following points in mind while migrating data to BigQuery:
- BigQuery expects CSV data to be UTF-8 encoded.
- BigQuery doesn’t enforce Primary Key and unique key constraints. Your ETL process must do so.
- Postgres and BigQuery have different column types. However, most of them are convertible.
You can visit the official BigQuery documentation to learn more about its data types.
- DATE values must be dash (-) separated and in the form YYYY-MM-DD (year-month-day).
- Fortunately, the default date format in Postgres is the same, YYYY-MM-DD. So if you simply select date columns, they will already be in the correct format.
- The TO_DATE function in PostgreSQL helps convert string values into dates.
- If the data is stored as a string in the table for any reason, it can be converted while selecting data.
Syntax : TO_DATE(str, format)
Example : SELECT TO_DATE('31,12,1999','DD,MM,YYYY');
Result : 1999-12-31
- In TIMESTAMP type, the hh:mm:ss (hour-minute-second) portion must use a colon (:) separator.
- Similar to the DATE type, the TO_TIMESTAMP function in PostgreSQL is used to convert strings into timestamps.
Syntax : TO_TIMESTAMP(str, format)
Example : SELECT TO_TIMESTAMP('2017-03-31 9:30:20','YYYY-MM-DD HH24:MI:SS');
Result : 2017-03-31 09:30:20-07 (the offset shown depends on your session time zone)
- Make sure text columns are quoted if they can potentially have delimiter characters.
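A checklist like this can be enforced client-side before loading. Below is a stdlib-only Python sketch; the column names, source encoding, and the type map’s choices (e.g., mapping timestamp without time zone to DATETIME) are illustrative assumptions to adapt to your own schema:

```python
import csv
import io
from datetime import datetime

# Common PostgreSQL -> BigQuery type mappings (not exhaustive; verify per column).
PG_TO_BQ = {
    "smallint": "INT64", "integer": "INT64", "bigint": "INT64",
    "real": "FLOAT64", "double precision": "FLOAT64", "numeric": "NUMERIC",
    "boolean": "BOOL", "varchar": "STRING", "text": "STRING",
    "date": "DATE", "timestamptz": "TIMESTAMP", "timestamp": "DATETIME",
    "bytea": "BYTES",
}

def normalize_date(value, in_fmt="%d,%m,%Y"):
    """Rewrite a date string into the YYYY-MM-DD form BigQuery expects."""
    return datetime.strptime(value, in_fmt).date().isoformat()

def clean_csv(raw_bytes, date_columns=(), in_fmt="%d,%m,%Y", src_encoding="latin-1"):
    """Re-encode CSV bytes as UTF-8 and normalize the named date columns."""
    reader = csv.reader(io.StringIO(raw_bytes.decode(src_encoding)))
    out = io.StringIO()
    # QUOTE_MINIMAL quotes any field containing the delimiter, per the checklist.
    writer = csv.writer(out, quoting=csv.QUOTE_MINIMAL)
    header = next(reader)
    writer.writerow(header)
    date_idx = [header.index(c) for c in date_columns]
    for row in reader:
        for i in date_idx:
            row[i] = normalize_date(row[i], in_fmt)
        writer.writerow(row)
    return out.getvalue().encode("utf-8")
```

For example, `clean_csv(raw, date_columns=["joined"])` would turn a Latin-1 file whose `joined` column holds `"31,12,1999"` into a UTF-8 file with `1999-12-31` in that column.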
Step 3: Upload to Google Cloud Storage (GCS) Bucket
If you haven’t already, you need to create a storage bucket in Google Cloud for the next step.
Step 3.1: Go to your Google Cloud console and select Cloud Storage → Buckets.
Step 3.2: Select a bucket from your existing list of buckets. If you do not have a previously existing bucket, you must create a new one.
Step 3.3: Upload your .csv file into the bucket by clicking the upload file option. Select the file that you want to upload.
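If you prefer to script this step, here is a small Python sketch using the google-cloud-storage client library (the bucket and object names are hypothetical placeholders; actually running the upload requires `pip install google-cloud-storage` and application-default credentials):

```python
def gcs_uri(bucket_name, blob_name):
    """Build the gs:// URI that the BigQuery load step will reference."""
    return f"gs://{bucket_name}/{blob_name}"

def upload_csv_to_gcs(bucket_name, local_path, blob_name):
    """Upload a local CSV file into a GCS bucket (needs credentials to run)."""
    from google.cloud import storage  # deferred import: needs google-cloud-storage
    client = storage.Client()
    client.bucket(bucket_name).blob(blob_name).upload_from_filename(local_path)
    return gcs_uri(bucket_name, blob_name)

# Example call (hypothetical names):
# upload_csv_to_gcs("my-bucket", "export.csv", "exports/employees.csv")
```

Keep the returned gs:// URI handy; it is what you point BigQuery at in the next step.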
Step 4: Upload to BigQuery table from GCS
Step 4.1: Go to the Google Cloud console and select BigQuery from the dropdown. Select the Project ID you want to work with and select Create Dataset.
Step 4.2: Provide the configuration per your requirements and create the dataset.
Your dataset should be successfully created after this process.
Step 4.3: To create a table, select the project ID of the dataset, choose the newly created dataset, and click ‘Create Table’ from the side menu.
Step 4.4: To create a table, select Google Cloud Storage as the source. Next, select the GCS bucket that contains the .csv file, and set the file format to CSV to match the uploaded file. Provide a name for the table in your BigQuery dataset. Select the auto-mapping option if you want to migrate the data as it is.
Step 4.5: Your table should be created next and loaded with the same data from PostgreSQL.
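Steps 4.1–4.5 can also be done programmatically. Below is a sketch using the google-cloud-bigquery client library (the project, dataset, and table names are illustrative, and running the load requires `pip install google-cloud-bigquery` plus credentials); schema auto-detection here mirrors the console’s auto-mapping option:

```python
def table_id(project, dataset, table):
    """BigQuery tables are addressed as project.dataset.table."""
    return f"{project}.{dataset}.{table}"

def load_csv_from_gcs(gcs_uri, full_table_id):
    """Load a UTF-8 CSV from GCS into BigQuery with an auto-detected schema."""
    from google.cloud import bigquery  # deferred import: needs google-cloud-bigquery
    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        skip_leading_rows=1,  # the first row of the export is the CSV header
        autodetect=True,      # let BigQuery infer column types
    )
    # Kick off the load job and block until it finishes.
    client.load_table_from_uri(gcs_uri, full_table_id, job_config=job_config).result()

# Example call (hypothetical names):
# load_csv_from_gcs("gs://my-bucket/exports/employees.csv",
#                   table_id("my-project", "my_dataset", "emp"))
```

If you applied explicit type mappings in Step 2, you can pass a `schema` list in `LoadJobConfig` instead of `autodetect=True` for tighter control.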
Step 5: Query the table in BigQuery
After loading the table into BigQuery, you can query it by selecting the QUERY option above the table and writing standard SQL. For example:
SELECT * FROM `your_project_id.your_dataset.emp` WHERE job = 'manager';
Note: Mention the correct project ID, dataset name, and table name.
This query extracts records from the emp table where the job is manager.
Say Goodbye to these Long steps and Migrate Data within Minutes!
No credit card required
Limitations of the Manual Method:
- The manual migration process can be time-consuming, requiring significant effort to export, transform, and load data, especially if the dataset is large or complex.
- Manual processes are susceptible to human errors, such as incorrect data export settings, file handling mistakes, or misconfigurations during import.
- If the migration needs to be performed regularly or involves multiple tables and datasets, the repetitive nature of manual processes can lead to inefficiency and increased workload.
- Manual migrations can be resource-intensive, consuming significant computational and human resources, which could be utilized for other critical tasks.
Conclusion
Migrating data from PostgreSQL to BigQuery manually can be complex, but automated data pipeline tools can significantly simplify the process. We’ve discussed two methods for moving data from PostgreSQL to BigQuery: the manual process, which requires a lot of configuration and effort, and automated tools like Hevo Data. Whether you choose a manual approach or leverage data pipeline tools like Hevo Data, following the steps outlined in this guide will help ensure a successful migration.
With Hevo Data, you can seamlessly automate the entire migration process, eliminating the need for complex setups. Sign up for a 14-day free trial and experience the feature-rich Hevo suite firsthand.
FAQ on PostgreSQL to BigQuery
How do you transfer data from Postgres to BigQuery?
To transfer data from PostgreSQL to BigQuery, export your PostgreSQL data to a format like CSV or JSON, then use BigQuery’s data import tools or APIs to load the data into BigQuery tables.
Can I use PostgreSQL in BigQuery?
No, BigQuery does not natively support PostgreSQL as a database engine. It is a separate service with its own architecture and SQL dialect optimized for large-scale analytics and data warehousing.
Can PostgreSQL be used for Big Data?
Yes, PostgreSQL can handle large datasets and complex queries effectively, making it suitable for big data applications.
How do you migrate data from Postgres to Oracle?
To migrate data from PostgreSQL to Oracle, export your PostgreSQL data as CSV files or SQL scripts, then import them into Oracle using SQL*Loader or SQL Developer, which also provides a dedicated migration workflow.
Skand is a dedicated Customer Experience Engineer at Hevo Data, specializing in MySQL, Postgres, and REST APIs. With three years of experience, he efficiently troubleshoots customer issues, contributes to the knowledge base and SOPs, and assists customers in achieving their use cases through Hevo's platform.