As data volumes continue exploding, many organizations find their Amazon RDS databases constrained by storage limits, scale challenges, and rising costs. Migrating from Amazon RDS to Google BigQuery offers a tempting path to reap benefits like scalable storage, faster queries, and lower costs. However, moving terabytes of production data from RDS into BigQuery can seem daunting.

This post covers two methods for easy AWS RDS to BigQuery migration: an automated approach using a tool like Hevo, and a manual method. You can choose the one that suits you best.

Methods to Set up Amazon RDS to BigQuery Migration

Method 1: Amazon RDS to BigQuery Migration Using Hevo Data

Hevo Data provides a hassle-free solution that helps you transfer data from Amazon RDS to BigQuery directly and effortlessly, without manual intervention. Hevo is fully managed and completely automates not only loading data from your desired source but also enriching the data and transforming it into an analysis-ready form, all without writing a single line of code.

Get Started with Hevo for Free

Method 2: Manual Amazon RDS to BigQuery Migration

This method involves manually extracting data from your Amazon RDS database as CSV files and then loading them into Google BigQuery.

Method 1: Amazon RDS to BigQuery Migration Using Hevo Data


Hevo’s pre-built integration with databases in Amazon RDS and 150+ Sources (including 50+ free Data Sources) will take full charge of the data transfer process, allowing you to set up Amazon RDS to BigQuery Migration seamlessly and focus solely on key business activities. 

Sign up here for a 14-Day Free Trial!

Step 1: Set up Amazon RDS as a Source

There are various connectors through which you can connect Amazon RDS as a Source. In Hevo, for example, you can use the PostgreSQL connector to configure Amazon RDS PostgreSQL as a Source.


Step 2: Connect Google BigQuery as Destination

You can then set up Google BigQuery as the Destination by configuring it in Hevo.

To know more about AWS RDS to BigQuery migration, refer to the Hevo Documentation.

Method 2: Manual Amazon RDS to BigQuery Migration

Users can set up manual Amazon RDS to BigQuery Migration by implementing the following steps:

Step 1: Exporting Data from Amazon RDS

This section shows how this step can be implemented for Amazon RDS MySQL. If you are using a different database engine, execute the corresponding queries for that engine.

You can export the required data from Amazon RDS MySQL using the mysql command-line client. Execute the following command to export the required data as CSV:

mysql --password=XXX --user=XXX --host=XXX --port=XXX --database=DataBaseName -e "select * from TableName" | sed 's/\t/","/g;s/^/"/;s/$/"/' > /usr/home/TableName.csv

The required data will now be downloaded as a CSV.
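The sed-based approach above can break if field values contain embedded commas, quotes, or newlines. A more robust alternative is to write the rows out with Python's csv module, which handles quoting and escaping automatically. The sketch below assumes the rows have already been fetched (in practice they would come from a MySQL cursor); the function name and sample data are hypothetical:

```python
import csv
import io

def rows_to_csv(rows, header, out):
    """Write database rows to a CSV stream with proper quoting."""
    writer = csv.writer(out, quoting=csv.QUOTE_ALL)
    writer.writerow(header)   # header row (pair with skip_leading_rows=1 on load)
    writer.writerows(rows)    # csv handles embedded commas/quotes/newlines

# In practice, rows would come from a MySQL cursor, e.g.
# cursor.execute("SELECT * FROM TableName"); rows = cursor.fetchall()
sample_rows = [(1, 'Alice, Inc.'), (2, 'Bob "The Builder"')]
buf = io.StringIO()
rows_to_csv(sample_rows, ["id", "name"], buf)
print(buf.getvalue())
```

Writing to a file object opened with `newline=""` instead of the `StringIO` buffer produces the CSV file to load into BigQuery.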

Step 2: Importing CSV Data into Google BigQuery

Once the data has been exported from Amazon RDS as CSV, it has to be imported into Google BigQuery. Users can perform a batch-load job in Python with the google-cloud-bigquery client library by running the following code:

from google.cloud import bigquery

# Construct a BigQuery client object.
client = bigquery.Client()

# TODO(developer): Set table_id to the ID of the table to create.
# table_id = "your-project.your_dataset.your_table_name"

# TODO(developer): Set file_path to the CSV file exported from Amazon RDS.
# file_path = "/usr/home/TableName.csv"

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,
    autodetect=True,
)

with open(file_path, "rb") as source_file:
    job = client.load_table_from_file(source_file, table_id, job_config=job_config)

job.result()  # Waits for the job to complete.

table = client.get_table(table_id)  # Make an API request.
print(
    "Loaded {} rows and {} columns to {}".format(
        table.num_rows, len(table.schema), table_id
    )
)

Step 3: Validating the Data Load

You can now use the Query Editor to query the loaded data after the manual AWS RDS to BigQuery migration and validate the process.

Additionally, you can compare the record counts in the Amazon RDS tables with those in the imported Google BigQuery tables to cross-check whether all the required records were imported.
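That cross-check can be sketched as a small helper. This assumes you have already gathered per-table row counts from both systems (e.g. via SELECT COUNT(*) on RDS and client.get_table(...).num_rows on BigQuery); the function name and sample counts below are hypothetical:

```python
def find_count_mismatches(rds_counts, bq_counts):
    """Compare per-table row counts from RDS and BigQuery.

    Returns a dict of table -> (rds_count, bq_count) for every table
    whose counts differ or that is missing on the BigQuery side.
    """
    mismatches = {}
    for table, rds_count in rds_counts.items():
        bq_count = bq_counts.get(table)  # None if the table wasn't loaded
        if bq_count != rds_count:
            mismatches[table] = (rds_count, bq_count)
    return mismatches

# Hypothetical counts gathered from both systems.
rds = {"orders": 120000, "customers": 4500}
bq = {"orders": 120000, "customers": 4499}
print(find_count_mismatches(rds, bq))  # → {'customers': (4500, 4499)}
```

An empty result means every table's count matched; any entry points you at a table that needs to be re-exported or re-loaded.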

Download the Cheatsheet on How to Set Up High-performance ETL to BigQuery
Learn the best practices and considerations for setting up high-performance ETL to BigQuery

Best Practices for Manual Amazon RDS to BigQuery Migration

The best practices that all users should follow while implementing manual Amazon RDS to BigQuery Migration are as follows:

  • While using mysqldump, you can use command options such as –compress and –compact. Note that these are distinct from the similarly named InnoDB row formats: the COMPACT row format (supported by the Antelope file format) stores the first 768 bytes of a BLOB value in the page when the value doesn’t fit, while the COMPRESSED row format is used for compressed tables.
  • If you select Auto-Detect schema while creating a table in the BigQuery UI, you should still validate the detected schema to be on the safe side. If you need to modify, add, or remove columns, you can do that with the Edit schema option available in Google BigQuery.
  • If you don’t choose Auto-Detect schema, you will need to create the schema manually, adding columns one by one with the Add Field option.
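Rather than adding fields one at a time in the UI, you can also define the schema explicitly in a JSON file and pass it to the bq command-line tool when loading. A sketch, with hypothetical field names:

```json
[
  {"name": "id", "type": "INTEGER", "mode": "REQUIRED"},
  {"name": "name", "type": "STRING", "mode": "NULLABLE"},
  {"name": "created_at", "type": "TIMESTAMP", "mode": "NULLABLE"}
]
```

Saved as schema.json, this can be used with a load command such as `bq load --source_format=CSV --skip_leading_rows=1 your_dataset.your_table /usr/home/TableName.csv ./schema.json`, which avoids relying on auto-detection entirely.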

Limitations of Manual Amazon RDS to BigQuery Migration

  • Brittle Processes: The entire mechanism to move data from Amazon RDS to BigQuery is set up manually. This setup is brittle and prone to error as it requires multiple coherent steps to be executed one after the other. Failure of any single process part can cause the entire data pipeline to fail. This can eventually result in data loss. 
  • Inability to Transform Data: If your use case demands you to clean, transform and enrich data when moving it from Amazon RDS to BigQuery, then you will need to write additional code to accomplish that. This makes the overall process very cumbersome. 
  • Constant Monitoring and Maintenance: More often than not, source schemas change, scripts break, and the result is data inconsistencies. Since you are moving critical transactional data from Amazon RDS to BigQuery, you will need dedicated engineering resources to constantly monitor both the infrastructure and the flow of data. This adds to the maintenance overhead.

Use Cases of Amazon RDS to BigQuery Migration

  • Real-Time Analytics: Connecting AWS RDS to BigQuery helps businesses perform real-time data analytics. Support for streaming data processing, quick query processing, flexible data modeling, etc., gives BigQuery powerful analytics capabilities. 
  • Data Security: BigQuery provides security features such as identity management, access restriction, and data encryption. Thus, loading data from AWS into BigQuery can help ensure the security of your data. 
  • Machine Learning: BigQuery’s built-in ML features enable Data Scientists and Data Analysts to build and operationalize ML models on structured, semi-structured, and unstructured data. So, an AWS RDS to BigQuery data pipeline can be used to build and train ML models.


This article provided a step-by-step guide on how you can set up Amazon RDS to BigQuery Migration manually or using Hevo. However, there are certain limitations associated with the manual method. If those limitations are not a concern for your operations, the manual method is a good option; otherwise, you should consider using an automated Data Integration platform like Hevo. This article also walked you through common use cases for moving AWS data into BigQuery.

Visit our Website to Explore Hevo

Hevo helps you directly transfer data from a source of your choice to a Data Warehouse, Business Intelligence tool, or other desired destination in a fully automated and secure manner, without having to write any code. It makes your life easier and data migration hassle-free. It is User-Friendly, Reliable, and Secure.

Want to try Hevo? Sign Up for a 14-day free trial and experience the feature-rich Hevo suite firsthand. Have a look at our unbeatable pricing here, which will help you choose the right plan for you.

What are your thoughts on moving data from Amazon RDS to Google BigQuery? Let us know in the comments.

Aniket Renuse
Freelance Technical Content Writer, Hevo Data

Aniket is an experienced freelance writer, skilled in creating engaging content focused on the data industry, covering areas such as data analytics, machine learning, AI, big data, and business intelligence.

Associate Customer Experience, Hevo Data

Parthiv, proficient in MongoDb, Mysql, Rest API, and Snowflake, elevates Hevo's customer experience by prioritizing feedback, proactive support, and feature advocacy. Committed to continuous improvement, Parthiv ensures every interaction reflects Hevo's dedication to excellence.
