In today’s data-driven world, data storage and analysis are essential to derive deeper insights for smarter decision-making. As data volumes grow, organizations consider shifting transactional data from Oracle databases on Amazon RDS to a powerful analytics platform like Google BigQuery. Common reasons include AWS RDS Oracle’s storage limits, high query costs, and scalability constraints.

When migrating to BigQuery, your organization can utilize its optimized, durable, and encrypted storage component to execute analytical queries on large datasets. You can successfully move data from AWS RDS Oracle to BigQuery with meticulous planning and appropriate tools.

Why Integrate AWS RDS Oracle with Google BigQuery?

When integrating AWS RDS Oracle with BigQuery, you can enjoy many benefits. Some of them are:

  • Faster Query Performance: BigQuery uses slots to refer to the CPU and RAM resources allocated for query processing. These slots can execute numerous queries concurrently. Increased parallel processing leads to faster query completion.
  • Scalable Storage: BigQuery stores data in a highly compressed and columnar storage format. This storage architecture allows auto-scaling, enabling you to adjust to workload demands automatically without additional costs.

AWS RDS Oracle: A Brief Overview

Amazon RDS for Oracle is the Oracle-compatible offering within Amazon Relational Database Service (RDS). Using a lift-and-shift approach, you can move an existing Oracle database to Amazon RDS while minimizing the risk of data loss.

AWS RDS Oracle enables you to quickly configure, operate, and scale Oracle deployments on the AWS cloud, according to your needs.

When using AWS RDS for Oracle, you can choose between the License Included and Bring-Your-Own-License (BYOL) options. In addition, you can buy Reserved DB Instances with a small upfront payment for a one- or three-year reservation period.

BigQuery: A Brief Overview

BigQuery is a serverless, fully managed analytics data warehouse that requires no infrastructure management or coding skills to get started. It provides a unified workspace to easily store and analyze your organization’s operational data in one place.

By incorporating its AI capabilities, you can maximize productivity and optimize costs. In addition, BigQuery offers real-time analytics with built-in business intelligence.

BigQuery lets you quickly develop and run machine learning models using simple SQL queries. It also offers a free tier that allows you to store up to 10 GiB of data and process up to 1 TiB of queries per month.
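For example, BigQuery ML lets you train and query a model entirely in SQL. The following is a minimal sketch with hypothetical dataset, table, and column names, not a complete workflow:

-- Hypothetical sketch: train a linear regression model with BigQuery ML.
CREATE OR REPLACE MODEL my_dataset.sales_forecast_model
OPTIONS (model_type = 'linear_reg', input_label_cols = ['sales_amount']) AS
SELECT store_id, day_of_week, promo_flag, sales_amount
FROM my_dataset.daily_sales;

-- Generate predictions from the trained model.
SELECT *
FROM ML.PREDICT(MODEL my_dataset.sales_forecast_model,
                (SELECT store_id, day_of_week, promo_flag FROM my_dataset.new_days));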

Methods for AWS RDS Oracle to BigQuery Migration

To export data from AWS RDS Oracle to BigQuery, you can use Hevo Data or a combination of Amazon S3 and Java.

Method 1: Data Migration from AWS RDS Oracle to BigQuery Using Hevo Data

Hevo Data is a real-time ELT data pipeline platform with a no-code interface that allows you to create automated data pipelines cost-effectively according to your needs. With integrations to over 150 data sources, Hevo Data lets you export, load, and transform data for detailed analysis.

Get Started with Hevo for Free

Below are some key features of Hevo Data:

  • Data Transformation: Hevo Data simplifies data transformation by offering the option to use Python-based Transformation scripts or Drag-and-Drop Transformation blocks. These analyst-friendly options will help you clean, prepare, and standardize data before loading it to your desired destination.
  • Incremental Data Load: Hevo Data enables transferring updated data, which helps to use bandwidth on both sides of the pipeline efficiently.
  • Auto-Schema Mapping: Hevo Data’s Auto Mapping feature eliminates the need for manual schema management. This functionality automatically identifies the format of incoming data and transfers it to the desired destination schema. Depending on your data replication needs, you can choose either full or incremental mappings.

Let’s see how to integrate AWS RDS Oracle to BigQuery using Hevo Data.

Step 1: Set Up AWS RDS Oracle as Your Source Connector

Before starting the setup, make sure you have the required prerequisites (listed in the Hevo documentation) in place.

Here are the steps to set up your AWS RDS Oracle source in Hevo:

  1. Log into your Hevo account and select PIPELINES from the Navigation Bar.
  2. Go to the Pipelines List View page and click the + CREATE button.
  3. On the Select Source Type page, search for and select Amazon RDS Oracle as the Source type.
  4. In the Configure your Amazon RDS Oracle Source page, specify the following fields:
AWS RDS Oracle to BigQuery: Configuring Amazon RDS Oracle Source Page
  5. To proceed with the source configuration, click the TEST CONNECTION button followed by the TEST & CONTINUE button.

Read Hevo Documentation to learn more about configuring the AWS RDS Oracle source connector. 

Step 2: Set Up BigQuery as Your Destination Connector

Before you begin the configuration, make sure the following prerequisites are met:

  • Access to an existing Google Cloud Platform (GCP) project for BigQuery, or permission to create one.
  • The required roles for the GCP project assigned to the connecting Google account.
  • An active billing account linked with your GCP project.
  • A Team Collaborator or any administrator role in Hevo, except the Billing Administrator role.

Here are the steps to set up your BigQuery destination in Hevo:

  1. From the Navigation Bar, click the DESTINATIONS option.
  2. In the Destinations List View page, click the + CREATE button.
  3. Navigate to the Add Destination page and select Google BigQuery as the Destination type.
  4. Specify the following details on the Configure your Google BigQuery Destination page. 
  5. You can enable Advanced settings, including Populate Loaded Timestamp and Sanitize Table/Column Names.
  6. Click the TEST CONNECTION button, followed by the SAVE & CONTINUE button, to finish the destination configuration.
AWS RDS Oracle to BigQuery: Configuring your Google BigQuery Destination page

Once the destination is created, only a few settings can be modified. Read Hevo Documentation to learn more about configuring the BigQuery destination connector.

SIGN UP HERE FOR A 14-DAY FREE TRIAL

Method 2: Data Migration from AWS RDS Oracle to BigQuery Using Amazon S3 and Java

For this migration, you export the data from AWS RDS Oracle to an Amazon S3 bucket and then load it into BigQuery using Java code.

Perform the following steps to migrate data from AWS RDS Oracle to BigQuery:

Step 1: Export Data from AWS RDS Oracle to Amazon S3 Bucket

Before exporting the data, ensure the required prerequisites are in place.

Let’s look into the steps to export data from AWS RDS Oracle to an Amazon S3 bucket.

  1. Execute the Schema Creation Script 

Use the following command to run the schema creation script:

$ORACLE_HOME/perl/bin/perl $ORACLE_HOME/rdbms/admin/catcon.pl -u sys/<your_sys_password> --force_pdb_mode 'READ WRITE' -b dbms_cloud_install -d /home/rdsdb -l /home/rdsdb dbms_cloud_install.sql
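To confirm that the DBMS_CLOUD schema objects were created successfully, you can query the data dictionary. This is just one hedged way to check, assuming you have access to the DBA views:

-- Optional check: list the DBMS_CLOUD objects and their status.
SELECT owner, object_name, object_type, status
FROM   dba_objects
WHERE  object_name LIKE 'DBMS_CLOUD%';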
  2. Configure the Access Control Entries (ACEs) for DBMS_CLOUD

Connect to the CDB root container as SYSDBA using the following command, and then run the ACE configuration:

sqlplus sys/<your_sys_password> as sysdba
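The ACE configuration script itself is not reproduced here. As a rough illustration only, an ACE of roughly the following shape is appended so that the DBMS_CLOUD service user can reach the S3 endpoint over HTTPS; the host pattern and principal name below are assumptions, so adjust them to your environment:

-- Hedged sketch: allow the assumed DBMS_CLOUD service user to reach S3 over HTTPS.
BEGIN
  DBMS_NETWORK_ACL_ADMIN.APPEND_HOST_ACE(
    host       => '*.amazonaws.com',
    lower_port => 443,
    upper_port => 443,
    ace        => xs$ace_type(privilege_list => xs$name_list('http'),
                              principal_name => 'C##CLOUD$SERVICE',
                              principal_type => xs_acl.ptype_db));
END;
/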
  3. Connect with the Database User to the Pluggable Database
  • Run the following command to connect, and then create your database table with the required data (a hypothetical example follows).
sqlplus <your db user name>/<your_password>@<your RDS endpoint>/<your PDB name>
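As a purely hypothetical illustration, matching the emp.csv file written in the next step, a small sample table could look like this:

-- Hypothetical sample table and rows, for illustration only.
CREATE TABLE emp (
  emp_id    NUMBER,
  emp_name  VARCHAR2(50),
  dept_name VARCHAR2(50),
  salary    NUMBER
);

INSERT INTO emp VALUES (1, 'Alice', 'Finance', 70000);
INSERT INTO emp VALUES (2, 'Bob', 'Engineering', 85000);
COMMIT;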
  4. Move Data to the CSV File in the DATA_PUMP_DIR Directory

Execute the following command to write the data in your database table to a CSV file:

declare
  -- Cursor over the table columns to be exported.
  cursor c is select <column_name1>, <column_name2>, <column_name3>, <column_name4> from <table_name>;
  filename varchar2(50) default 'emp.csv';
  f utl_file.file_type;
begin
  -- Open the CSV file in the DATA_PUMP_DIR directory for writing.
  f := utl_file.fopen ('DATA_PUMP_DIR', filename, 'w');
  for c1 in c loop
    utl_file.put_line(f, c1.<column_name1>||','||c1.<column_name2>||','||c1.<column_name3>||','||c1.<column_name4>);
  end loop;
  utl_file.fclose(f);
end;
/
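The <credential_name> referenced in the next step must already exist in the database. A minimal sketch, assuming your AWS access key pair has write access to the bucket:

-- Hedged sketch: create the DBMS_CLOUD credential used by put_object.
BEGIN
  DBMS_CLOUD.CREATE_CREDENTIAL(
    credential_name => '<credential_name>',
    username        => '<your_AWS_access_key_id>',
    password        => '<your_AWS_secret_access_key>');
END;
/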
  5. Export the Data to the S3 Bucket

Run the following command to load the data from the AWS RDS Oracle database to the S3 bucket:

begin
  -- Copy the CSV file from DATA_PUMP_DIR to the target S3 object.
  dbms_cloud.put_object (
    credential_name => '<credential_name>',
    object_uri      => 'https://s3.ap-southeast-1.amazonaws.com/<your_bucket_name>/orcl/emp.csv',
    directory_name  => 'DATA_PUMP_DIR',
    file_name       => 'emp.csv');
end;
/
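To verify the upload, you can list the bucket contents from within the database. This is a rough sketch, assuming the same credential and bucket URI as above:

-- Hedged check: list the uploaded objects using the same credential.
SELECT object_name, bytes
FROM   DBMS_CLOUD.LIST_OBJECTS(
         '<credential_name>',
         'https://s3.ap-southeast-1.amazonaws.com/<your_bucket_name>/orcl/');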

Step 2: Import Data into BigQuery from Amazon S3 Bucket

Before importing the data, ensure the required prerequisites are in place.

Execute the following Java code to load the data into a BigQuery table:

import com.google.api.gax.rpc.ApiException;
import com.google.cloud.bigquery.datatransfer.v1.CreateTransferConfigRequest;
import com.google.cloud.bigquery.datatransfer.v1.DataTransferServiceClient;
import com.google.cloud.bigquery.datatransfer.v1.ProjectName;
import com.google.cloud.bigquery.datatransfer.v1.TransferConfig;
import com.google.protobuf.Struct;
import com.google.protobuf.Value;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Class to create an Amazon S3 transfer configuration in the BigQuery Data Transfer Service.
public class AmazonS3TransferConfig {

  public static void main(String[] args) throws IOException {
    // Replace the project ID, dataset ID, and table ID with your own values.
    final String projectId = "Your_ProjectID";
    String datasetId = "Your_DatasetID";
    String tableId = "Your_TableID";

    // Amazon S3 bucket URI; the AWS credentials must have read access to it.
    String sourceUri = "s3://YourBucketName/*";
    String awsAccessKeyId = "Your_AWS_AccessKeyID";
    String awsSecretAccessKey = "Your_AWS_SecretAccessKey";
    String sourceFormat = "CSV";
    String fieldDelimiter = ",";
    String skipLeadingRows = "1";

    // Transfer parameters describing the S3 source and the CSV layout.
    Map<String, Value> params = new HashMap<>();
    params.put("destination_table_name_template", Value.newBuilder().setStringValue(tableId).build());
    params.put("data_path", Value.newBuilder().setStringValue(sourceUri).build());
    params.put("access_key_id", Value.newBuilder().setStringValue(awsAccessKeyId).build());
    params.put("secret_access_key", Value.newBuilder().setStringValue(awsSecretAccessKey).build());
    params.put("source_format", Value.newBuilder().setStringValue(sourceFormat).build());
    params.put("field_delimiter", Value.newBuilder().setStringValue(fieldDelimiter).build());
    params.put("skip_leading_rows", Value.newBuilder().setStringValue(skipLeadingRows).build());

    TransferConfig transferConfig =
        TransferConfig.newBuilder()
            .setDestinationDatasetId(datasetId)
            .setDisplayName("Your_AmazonS3_ConfigurationName")
            .setDataSourceId("amazon_s3")
            .setParams(Struct.newBuilder().putAllFields(params).build())
            .setSchedule("every 24 hours")
            .build();

    createAmazonS3TransferConfig(projectId, transferConfig);
  }

  public static void createAmazonS3TransferConfig(String projectId, TransferConfig transferConfig)
      throws IOException {
    try (DataTransferServiceClient client = DataTransferServiceClient.create()) {
      ProjectName parent = ProjectName.of(projectId);
      CreateTransferConfigRequest request =
          CreateTransferConfigRequest.newBuilder()
              .setParent(parent.toString())
              .setTransferConfig(transferConfig)
              .build();
      TransferConfig config = client.createTransferConfig(request);
      System.out.println("Data loading is successful: " + config.getName());
    } catch (ApiException ex) {
      System.out.println("Data loading is unsuccessful: " + ex.toString());
    }
  }
}

With the above steps, you can manually migrate the AWS RDS Oracle table to the BigQuery table.

Limitations of AWS RDS Oracle to BigQuery Migration Using Amazon S3 and Java

  • High Complexity: The AWS RDS Oracle to BigQuery migration using Amazon S3 and Java can be complex and confusing due to the numerous configurations involved. Identifying errors during configurations can also be challenging, making the process labor-intensive. 
  • Coding Skills: To transfer data from AWS RDS Oracle to BigQuery via Amazon S3, you need strong working knowledge of Oracle SQL, PL/SQL, and Java.

Use Cases for AWS RDS Oracle to BigQuery Migration

  • Partitioning and Clustering: AWS RDS Oracle to BigQuery migration enables cost-efficient data analysis with optimized query costs through data partitioning and clustered tables (see the sketch after this list).
  • Reduced Processing Issues: Your organization may face challenges in processing large files on a regular basis. By migrating to BigQuery, only relevant data can be processed by configuring Scheduled Jobs as BigQuery Scripts. This will minimize resource consumption and improve query performance. 
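As a rough illustration of the partitioning and clustering mentioned above, a migrated table could be rebuilt as a partitioned, clustered table in BigQuery. The dataset, table, and column names are hypothetical:

-- Hypothetical sketch: rebuild a migrated table with partitioning and clustering.
CREATE TABLE my_dataset.orders_optimized
PARTITION BY DATE(order_ts)
CLUSTER BY customer_id
AS
SELECT *
FROM my_dataset.orders_raw;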

Conclusion

This article explains how to migrate your AWS RDS Oracle databases to BigQuery using Hevo Data or manually using Amazon S3 and Java code.

However, the manual method has some limitations that may impact a smooth migration. For optimal migration performance, it is recommended to use an automated real-time ELT platform like Hevo.

In addition, this article offers a few use cases where migrating from AWS RDS Oracle to BigQuery can benefit from BigQuery’s partitioning and clustering features. These features enable the analysis of operational data within your organization, and using BigQuery Scripts helps minimize processing issues.

To learn more about Partitioning in BigQuery, read this Hevo documentation.

Want to take Hevo for a spin? Sign Up for a 14-day free trial and experience the feature-rich Hevo suite firsthand. Also, check out our unbeatable pricing to choose the best plan for your organization.

Share your experience of AWS RDS Oracle to BigQuery integration in the comments section below!

Frequently Asked Questions (FAQ)

Q. For AWS RDS Oracle to BigQuery, I use AWS Lambda to read changes from RDS Oracle to Kinesis Data Streams. How do I use the Lambda function to insert data into BigQuery from the Kinesis stream?

A.  You can use the BigQuery streaming API, specifically the tabledata.insertAll method, to insert data into BigQuery. This method is more suitable for small datasets. If the dataset size is large, you can utilize automated data integration platforms like Hevo Data.

Sony Saji
Technical Writer, Hevo Data

Sony is a former Computer Science teacher turned technical content writer, specializing in crafting blogs and articles on topics such as ML, AI, Python Frameworks, and other emerging trends in Data Science and Analytics.
