This article presents two step-by-step methods to help you migrate data from Amazon Redshift to Google BigQuery. You can use either method to achieve the desired results. Let’s begin…
A data warehouse is an organization’s single source of data truth: subject-oriented, integrated, time-variant, non-volatile, and summarized. To truly embrace digital transformation, having a data warehouse is a must. Amazon Redshift and Google BigQuery are two of the world’s most popular data warehouses, and both are built to scale with an enterprise’s growing data needs.
Data warehouses are a critical component of the modern data stack, satisfying an organization’s general data and analytics needs. Choosing the right cloud-based data warehouse is a key decision for any enterprise looking to support its analysis, reporting, and BI functions. Data warehouses deliver enhanced business intelligence through on-demand, scalable computing infrastructure that improves consistency and provides a competitive advantage in speed and enablement.
Both Amazon Redshift and Google BigQuery are used widely across a company’s business functions, from BI to trend forecasting and budgeting. Both are fully managed, petabyte-scale cloud data warehouses; each has strengths and weaknesses, and the right balance of needs and expectations will determine which delivers the best results.
Table of Contents:
- What is Amazon Redshift?
- What is Google BigQuery?
- Amazon Redshift to Google BigQuery Migration
What is Amazon Redshift?
Amazon Redshift is built and designed for today’s data scientists, data analysts, data administrators, and software developers. It lets users perform operations on billions of rows, making Redshift well suited to analyzing large quantities of data. Its architecture is based on communication between client applications and the data warehouse cluster. To learn more about the Amazon Redshift infrastructure, see the Redshift documentation.
Important Amazon Redshift Features:
- AWS’s Integrated Analytics Ecosystem: AWS offers a wide range of built-in ecosystem services, making it easier to handle end-to-end analytics workflows without compliance and operational roadblocks. Notable examples include AWS Lake Formation, AWS Glue, AWS EMR, AWS DMS, the AWS Schema Conversion Tool, and many more.
- Redshift ML: A must for today’s data professionals, Redshift ML allows users to create and train Amazon SageMaker models through data from Redshift for predictive analytics.
- Machine Learning For Maximum Performance: Amazon Redshift offers advanced ML capabilities that deliver high throughput and performance. Its advanced algorithms predict incoming queries based on certain factors to help prioritize critical workloads.
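To make the Redshift ML feature concrete, the sketch below assembles a CREATE MODEL statement in Python. The statement shape follows Redshift ML’s documented syntax (train a SageMaker model from a table and expose a prediction function); the model, table, column, IAM role, and bucket names are illustrative placeholders, not values from this article.

```python
# Sketch: assembling a Redshift ML CREATE MODEL statement. All identifiers
# below (model, table, role ARN, bucket) are illustrative placeholders.

def redshift_create_model_sql(model, table, target, function, iam_role, s3_bucket):
    """Return the SQL that asks Redshift ML to train a SageMaker model."""
    return (
        f"CREATE MODEL {model}\n"
        f"FROM {table}\n"            # training data: a table or a (SELECT ...) query
        f"TARGET {target}\n"         # the column the model should predict
        f"FUNCTION {function}\n"     # SQL function exposed for predictions
        f"IAM_ROLE '{iam_role}'\n"   # role Redshift uses to talk to SageMaker/S3
        f"SETTINGS (S3_BUCKET '{s3_bucket}');"
    )

sql = redshift_create_model_sql(
    model="customer_churn_model",
    table="customer_activity",
    target="churned",
    function="predict_churn",
    iam_role="arn:aws:iam::123456789012:role/RedshiftML",
    s3_bucket="my-redshift-ml-bucket",
)
print(sql)
```

Once trained, the generated `predict_churn` function can be called from ordinary Redshift SQL, e.g. `SELECT predict_churn(...) FROM customer_activity;`.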
What is Google BigQuery?
Google BigQuery is a fully managed, serverless data warehouse. Similar to Amazon Redshift, it allows users to run analysis over petabytes of data in real-time. It’s cost-effective and only requires users to understand and write standard SQL.
Important Google BigQuery Features:
- BigQuery Omni: BigQuery Omni lets users run queries against data stored in other clouds, such as AWS and Microsoft Azure, providing a consistent data experience, breaking down data silos, and adding agility to cross-cloud analysis.
- BigQuery ML: Using simple SQL queries, users can create and execute ML models directly in BigQuery. Some of the widely used model types are linear regression, binary and multiclass logistic regression, matrix factorization, time series, and deep neural network models.
- BigQuery Data Transfer Service: It automates the data movement into BigQuery, which comes in handy when multiple data sources, including data warehouses, are involved.
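As a concrete example of BigQuery ML, the sketch below builds the standard SQL statement that trains a linear regression model. The dataset, model, and table names are illustrative placeholders; the statement follows the documented `CREATE OR REPLACE MODEL ... OPTIONS(model_type='linear_reg')` form.

```python
# Sketch: a BigQuery ML linear regression model is trained with standard SQL.
# Dataset, model, and table names are illustrative placeholders.

def bqml_linear_reg_sql(model_path, label_col, training_table):
    """Return the BigQuery ML statement that trains a linear regression model."""
    return (
        f"CREATE OR REPLACE MODEL `{model_path}`\n"
        f"OPTIONS (model_type = 'linear_reg', input_label_cols = ['{label_col}']) AS\n"
        f"SELECT * FROM `{training_table}`;"
    )

sql = bqml_linear_reg_sql(
    "mydataset.taxi_fare_model", "fare_amount", "mydataset.taxi_trips")
print(sql)
```

In practice you would pass this string to the google-cloud-bigquery client’s `query()` method; predictions then come from `ML.PREDICT` over the trained model.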
Hevo, A Simpler, Faster Alternative: Migrate Data From Amazon Redshift To Google BigQuery in a Hassle-free Manner
Hevo Data is a no-code data pipeline platform that helps new-age businesses integrate their data from multiple source systems to a data warehouse and plug this unified data into any BI tool or Business Application. The platform provides 100+ ready-to-use integrations with a range of data sources and is trusted by thousands of data-driven organizations from 30+ countries. Visit our website to explore Hevo.
Check out some of Hevo’s interesting features:
- Completely Automated: The Hevo platform can be set up in just a few minutes and requires minimal maintenance.
- Real-time Data Transfer: Hevo provides real-time data migration, so you can have analysis-ready data always.
- 100% Complete & Accurate Data Transfer: Hevo’s robust infrastructure ensures reliable data transfer with zero data loss.
- Scalable Infrastructure: Hevo has in-built integrations for 100+ sources that can help you scale your data infrastructure as required.
- 24/7 Live Support: The Hevo team is available round the clock to extend exceptional support to you through chat, email, and support calls.
- Schema Management: Hevo takes away the tedious task of schema management & automatically detects schema of incoming data and maps it to the destination schema.
- Live Monitoring: Hevo allows you to monitor the data flow so you can check where your data is at a particular point in time.
Amazon Redshift to Google BigQuery Migration
Method 1: Amazon Redshift to Google BigQuery Migration Using the Console
Before you begin the migration process from Amazon Redshift to Google BigQuery, you must satisfy a few requirements and obtain certain permissions.
For Google Cloud, you first have to meet the prerequisites and obtain the necessary permissions. To gain access to the Amazon Redshift cluster, you will need an AWS access key pair, which is used in a later step. To obtain the AWS access key pair, follow the steps in the Amazon Redshift documentation.
Before setting up the migration, you must also obtain the JDBC URL, the username and password of your Amazon Redshift database, and the URI of the Amazon S3 bucket.
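The prerequisites above can be collected ahead of time. The sketch below assembles the Redshift JDBC URL, which follows the format `jdbc:redshift://<cluster-endpoint>:<port>/<database>` (default port 5439) per AWS documentation; the cluster endpoint, database, and bucket names are illustrative placeholders.

```python
# Sketch: assembling the connection details the transfer needs.
# Endpoint, database, and bucket values are illustrative placeholders.

def redshift_jdbc_url(endpoint, database, port=5439):
    """Build a Redshift JDBC URL from the cluster endpoint and database name."""
    return f"jdbc:redshift://{endpoint}:{port}/{database}"

jdbc_url = redshift_jdbc_url(
    "examplecluster.abc123xyz789.us-west-2.redshift.amazonaws.com", "dev")
s3_uri = "s3://my-migration-staging-bucket"  # staging bucket for the transfer
print(jdbc_url)
```

The cluster endpoint and port are shown on the cluster’s details page in the Amazon Redshift console.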
The required permissions for Google BigQuery are as follows:
- Permissions to create the transfer: bigquery.transfers.update
- Permissions to the target dataset: bigquery.datasets.get and bigquery.datasets.update
Note: Both permissions are included in the predefined IAM (Identity and Access Management) role bigquery.admin. To learn more about IAM roles, see the Access control reference section of the BigQuery Data Transfer Service guide. To ensure you have the required configurations and permissions to enable the transfer, have a look at the AWS managed policy.
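A quick pre-flight check can confirm that a principal’s granted permissions cover the ones listed above. The sketch below is pure set logic; in practice you would fetch the granted list from the IAM API (for example via a testIamPermissions-style call), which is not shown here.

```python
# Sketch: verifying that granted permissions cover what the transfer requires.
# The required set comes from the BigQuery Data Transfer Service docs.

REQUIRED_PERMISSIONS = {
    "bigquery.transfers.update",   # create the transfer
    "bigquery.datasets.get",       # read the target dataset
    "bigquery.datasets.update",    # write to the target dataset
}

def missing_permissions(granted):
    """Return the required permissions not present in the granted list."""
    return sorted(REQUIRED_PERMISSIONS - set(granted))

print(missing_permissions(["bigquery.datasets.get", "bigquery.transfers.update"]))
```

An empty result means the principal is ready; otherwise, granting the bigquery.admin role covers all three at once.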
Moving forward, it is also important that you comply with the requirements stated by Google Cloud. Follow the points below:
- To store your migration data, choose or create a Google Cloud project: go to the project selector, then choose ‘SELECT PROJECT’ or ‘CREATE PROJECT.’
- Enable the BigQuery Data Transfer Service API in the Google Cloud Console by clicking on the ‘Enable’ button (A green checkmark will indicate that the API is enabled).
- To store data, create a BigQuery dataset.
Note: For the Amazon Redshift cluster, you have to allowlist the IP addresses that correspond to your dataset’s location. The list of IP addresses is provided in the ‘Grant access to your Amazon Redshift cluster’ section of Google Cloud’s documentation. After this step, gain access to the Amazon S3 bucket by creating an AWS access key, as discussed above.
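Creating the destination dataset can also be scripted. The sketch below assumes the google-cloud-bigquery Python package and valid credentials; the project, dataset, and location values are illustrative placeholders, and the client import is kept inside the function so the rest of the file runs without it.

```python
# Sketch: creating the destination BigQuery dataset programmatically.
# Project, dataset, and location values are illustrative placeholders.

def dataset_ref(project_id, dataset_id):
    """Fully qualified dataset id, in the 'project.dataset' form the client expects."""
    return f"{project_id}.{dataset_id}"

def create_dataset(project_id, dataset_id, location="US"):
    """Create the dataset (requires google-cloud-bigquery and credentials)."""
    from google.cloud import bigquery  # imported lazily; optional dependency
    client = bigquery.Client(project=project_id)
    dataset = bigquery.Dataset(dataset_ref(project_id, dataset_id))
    # The location must match the region whose IPs you allowlisted above.
    dataset.location = location
    return client.create_dataset(dataset, exists_ok=True)

print(dataset_ref("my-migration-project", "redshift_migration"))
```

Picking the dataset location first matters here, since the IP allowlist for the Redshift cluster depends on it.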
Let’s start with the migration process…
Step 1: Open the BigQuery page on the Google Cloud Console
Step 2: On the left, under the Analysis section, click on ‘Data transfers’
Step 3: Click on ‘CREATE A TRANSFER’
Step 4: Select ‘Migration: Amazon Redshift’ as the Source, and enter a name for the migration in the ‘Transfer config name’ box. Under ‘Schedule options,’ choose between the ‘Start now’ and ‘Start at set time’ radio buttons
Step 5: Enter the dataset ID in the ‘Destination settings’ box, then fill in the ‘Data source details’ (the JDBC URL, database username and password, AWS access key pair, and Amazon S3 URI obtained earlier)
Step 6: This step is optional. You can enable notifications such that you can receive email notifications if the transfer run fails. Next, click on ‘Save’ to continue
Step 7: After successful execution, the Cloud Console will display all the transfer setup details, including the resource name
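The console steps above can also be scripted. The sketch below builds the parameter set the Data Transfer Service expects for a Redshift migration and shows, in a separate function, how it would be submitted with the google-cloud-bigquery-datatransfer client. The parameter field names follow Google’s Redshift migration documentation but should be verified against the current API reference; every value is a placeholder.

```python
# Sketch: scripting Steps 1-7 with the BigQuery Data Transfer Service.
# Param field names follow Google's Redshift migration docs (verify against
# the current API reference); all values below are placeholders.

def build_redshift_migration_params(jdbc_url, username, password,
                                    access_key_id, secret_access_key,
                                    s3_bucket, schema="public",
                                    table_patterns=".*"):
    """Assemble the params dict expected by the 'redshift' data source."""
    return {
        "jdbc_url": jdbc_url,
        "database_username": username,
        "database_password": password,
        "access_key_id": access_key_id,
        "secret_access_key": secret_access_key,
        "s3_bucket": s3_bucket,              # staging bucket for the transfer
        "redshift_schema": schema,           # schema to migrate
        "table_name_patterns": table_patterns,
    }

def create_transfer(project_id, dataset_id, params):
    """Create the transfer config (requires google-cloud-bigquery-datatransfer)."""
    from google.cloud import bigquery_datatransfer  # imported lazily
    client = bigquery_datatransfer.DataTransferServiceClient()
    config = bigquery_datatransfer.TransferConfig(
        destination_dataset_id=dataset_id,
        display_name="redshift-to-bigquery-migration",
        data_source_id="redshift",
        params=params,
    )
    return client.create_transfer_config(
        parent=f"projects/{project_id}", transfer_config=config)

params = build_redshift_migration_params(
    jdbc_url="jdbc:redshift://examplecluster.us-west-2.redshift.amazonaws.com:5439/dev",
    username="awsuser", password="placeholder-password",
    access_key_id="placeholder-key-id", secret_access_key="placeholder-secret",
    s3_bucket="s3://my-migration-staging-bucket")
print(sorted(params))
```

Scripting the transfer this way is useful when the same migration must be repeated across environments, since the console steps reduce to one reviewable function call.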
Method 2: Amazon Redshift to Google BigQuery Migration Using Hevo
Hevo provides Google BigQuery as a Destination for loading/transferring data from any Source system, which also includes Amazon Redshift. You can refer to Hevo’s documentation for Permissions, User Authentication, and Prerequisites for Google BigQuery as a destination here.
Step 1: Setup Amazon Redshift as a Source by following these four steps:
- In the Asset Palette on the left-hand side of the Hevo UI, click on ‘PIPELINES’
- Under Pipelines List View, click on ‘+CREATE’
- Select Amazon Redshift as the source on the ‘Select Source Type’ page
- The ‘Configure your Amazon Redshift Source’ page will appear; fill in the required connection details
Step 2: Setup Google BigQuery as a Destination by following these five steps:
- In the Asset Palette on the left-hand side of the Hevo UI, click on ‘Destinations’
- Under ‘Destination List View’ click on ‘+CREATE’
- Select Google BigQuery as the Destination type on the ‘Add Destination’ page
- Select between the two authentication methods presented on the ‘Configure your Google BigQuery Account’ page
- After choosing one and granting the required permissions, continue to the ‘Configure your Google BigQuery Warehouse’ page and specify the required details
In this article, we waded through some basics of data warehouses and discussed two methods to migrate data from Amazon Redshift to Google BigQuery:
In the first method, we discussed in detail the manual way to migrate data. This approach requires users to have a sound understanding of Redshift, BigQuery, and the migration workflow, leaving the door open for a new user to make mistakes.
In the second method, we used Hevo Data to achieve the desired results. With Hevo, the process is much faster, fully automated, and requires no code. Hevo also provides a pre-built native REST API connector that lets you integrate data from a plethora of custom and non-native sources, without writing a single line of code.