Quick Takeaway

There are three reliable methods to replicate data from Oracle to BigQuery, each designed for different technical requirements:

  • Using Hevo’s Automated Replication: Perfect for seamless data flow with zero coding. Hevo handles the entire process, manages schema mapping, and keeps your Oracle data synchronized with BigQuery in real-time.
  • Using Cloud Data Fusion Web Interface: Great for teams preferring Google’s native visual tools. You get a user-friendly replication job setup, connection configuration, and monitoring through Google’s managed environment.
  • Using Custom ETL Scripts: Ideal for technical teams wanting complete control over the migration process. You design exactly how data exports, transforms, and loads through Google Cloud Storage with full optimization flexibility.

Moving your data from Oracle to BigQuery brings key advantages: a stronger analytics foundation, flexible custom reporting, seamless Google Cloud integration, and fast performance on the complex queries that drive business decisions.

This tutorial demonstrates three effective methods to move your data from Oracle to BigQuery. Whether you’re a small business wanting a hassle-free setup or an enterprise needing precise control over the migration, we’ll help you pinpoint which method works for your situation. You’ll get clear step-by-step walkthroughs, solutions to common obstacles, and hands-on examples to ensure a smooth data transfer.

By the end of this guide, you’ll have your enterprise data flowing smoothly into BigQuery, ready for advanced analytics. Let’s dive in.

Methods to Connect Oracle to BigQuery

Method 1: Using Hevo Data to Set up Oracle to BigQuery Integration

Prerequisites

  • Oracle database (version 11 or above) with SYSDBA privileges for user creation.
  • Active Google Cloud Project with BigQuery API enabled and billing account attached.
  • Assigned Team Administrator, Collaborator, or Pipeline Administrator role in Hevo.

Step 1: Create an Oracle Database User and Grant Privileges

  • Connect to the Oracle server as a DBA using SQL Developer:
CREATE USER <username> IDENTIFIED BY <password>;
GRANT SELECT ANY DICTIONARY TO <username>;
GRANT CREATE SESSION, ALTER SESSION TO <username>;
GRANT SELECT ON ALL_VIEWS TO <username>;
GRANT SELECT ON <schema_name>.<table_name> TO <username>;
  • Grant LogMiner permissions for real-time replication:
GRANT LOGMINING TO <username>;
GRANT SELECT ON SYS.V_$DATABASE TO <username>;
GRANT EXECUTE ON SYS.DBMS_LOGMNR TO <username>;
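
Before moving on, it can help to confirm that the new user can log in and that the grants took effect. Below is a minimal sketch, assuming the cx_Oracle driver (also used in Method 3 of this guide) and placeholder connection details; it connects as the new user and prints the session privileges Oracle reports.

import cx_Oracle

# Hypothetical connection details -- substitute your own host, port, service name,
# and the credentials created above.
dsn = cx_Oracle.makedsn("oracle-host.example.com", 1521, service_name="ORCLPDB1")
conn = cx_Oracle.connect(user="hevo_user", password="secret", dsn=dsn)

cursor = conn.cursor()
# SESSION_PRIVS lists the system privileges available to the current session,
# so CREATE SESSION, LOGMINING, etc. should appear here.
cursor.execute("SELECT privilege FROM SESSION_PRIVS ORDER BY privilege")
for (privilege,) in cursor:
    print(privilege)
conn.close()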

Step 2: Configure Redo Logs and Archive Settings

  • Enable Archive Log mode for data replication:
SELECT LOG_MODE FROM V$DATABASE;
SHUTDOWN IMMEDIATE;
STARTUP MOUNT;
ALTER DATABASE ARCHIVELOG;
ALTER DATABASE OPEN;
  • Set the retention policy to a minimum of 3 days (run the CONFIGURE command below in RMAN) and enable supplemental logging:
CONFIGURE RETENTION POLICY TO RECOVERY WINDOW OF 3 DAYS;
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;
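
To double-check these settings before handing the source over to Hevo, you can read the relevant flags back from V$DATABASE. A minimal sketch, again assuming cx_Oracle and placeholder DBA credentials:

import cx_Oracle

# Hypothetical DBA credentials and connection details -- adjust for your environment.
conn = cx_Oracle.connect(user="system", password="secret",
                         dsn="oracle-host.example.com:1521/ORCLPDB1")
cursor = conn.cursor()

# LOG_MODE should now read ARCHIVELOG, and the supplemental logging flags should be YES.
cursor.execute(
    "SELECT log_mode, supplemental_log_data_min, supplemental_log_data_all FROM v$database"
)
log_mode, supp_min, supp_all = cursor.fetchone()
print(f"Log mode: {log_mode}, supplemental (min): {supp_min}, supplemental (all): {supp_all}")
conn.close()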

Step 3: Configure Oracle as a Source in Hevo

    • In the Hevo dashboard, go to Pipelines → + CREATE PIPELINE → Select Oracle.
    • Configure connection settings:
      • Pipeline Name (max 255 characters).
      • Database Host (IP or DNS) and Port (default 1521).
      • Database User and Password (created in Step 1).
      • Select Ingestion Mode (RedoLog recommended).
      • Service Name: retrieve it with SELECT NAME FROM V$DATABASE;
    • Click TEST CONNECTION to verify settings.
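
    Before filling in the form, it can save a round trip to verify that the host, port, service name, and credentials from Step 1 work from a machine that can reach the database. A minimal sketch with placeholder values, assuming the cx_Oracle driver:

    import cx_Oracle

    # Hypothetical placeholders -- use the same values you plan to enter in Hevo.
    dsn = cx_Oracle.makedsn("oracle-host.example.com", 1521, service_name="ORCLPDB1")
    try:
        conn = cx_Oracle.connect(user="hevo_user", password="secret", dsn=dsn)
        print("Connected, server version:", conn.version)
        conn.close()
    except cx_Oracle.DatabaseError as exc:
        print("Connection failed:", exc)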

    Step 4: Set Up BigQuery as Destination

    Prerequisites

    • Service Account created in Google Cloud with BigQuery Data Editor and Job User roles.
    • JSON key file downloaded for authentication.

    BigQuery Configuration

    • In Hevo, go to Destinations → + CREATE DESTINATION → Select Google BigQuery.
    • Configure destination:
      • Destination Name (max 255 characters).
      • Account Type: Service Account (upload JSON key file).
      • Project ID from Google Cloud Console.
      • Dataset: Auto-create or select an existing dataset.
      • GCS Bucket: Auto-create or select existing bucket.
    • Enable optional settings like Sanitize Table/Column Names.
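
    If you are unsure whether the key file and roles are correct, a quick check with the google-cloud-bigquery client (the same library used in Method 3) can confirm the credentials before you upload them to Hevo. A minimal sketch with a hypothetical key path and project ID:

    from google.cloud import bigquery
    from google.oauth2 import service_account

    # Hypothetical key path and project ID -- replace with your own.
    credentials = service_account.Credentials.from_service_account_file("hevo-sa-key.json")
    client = bigquery.Client(credentials=credentials, project="my-gcp-project")

    # Listing datasets is a cheap way to prove the key authenticates and can reach BigQuery.
    for dataset in client.list_datasets():
        print(dataset.dataset_id)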

    Step 5: Complete Pipeline Setup and Start Replication

    • Review Oracle source and BigQuery destination configurations.
    • Configure data objects:
      • Select specific tables/schemas to replicate.
      • Set field mappings and transformation rules (optional).
      • Choose Historical Data loading preference.
    • Click TEST & CONTINUE after successful validation.
    • Activate the pipeline to start automatic Oracle to BigQuery replication.

    You can now create the pipeline to start replicating your Oracle data to BigQuery automatically and effortlessly with Hevo!


    Method 2: Using Cloud Data Fusion Web Interface to Set up Oracle to BigQuery Integration

    Prerequisites

    • Active Google Cloud Project with Cloud Data Fusion API enabled.
    • Oracle database (version 11 or above) with network connectivity to Google Cloud.
    • BigQuery dataset created with appropriate permissions.
    • Cloud Data Fusion instance provisioned and running.

    Step 1: Set Up Cloud Data Fusion Instance

    • In Google Cloud Console, go to Data Fusion → Create Instance.
    • Configure instance settings:
      • Instance name and region selection.
      • Choose Basic or Enterprise edition based on requirements.
      • Configure network settings and enable private IP if needed.
    • Wait for instance provisioning to complete (5-10 minutes).
    • Click “View Instance” to access the Data Fusion web interface.

    Step 2: Create Oracle Source Connection

    • In the Data Fusion interface, go to System Admin → Connections → Add Connection.
    • Select Oracle as the database type and configure:
      • Connection name and description.
      • Oracle hostname, port (1521), and service name.
      • Database username and password with SELECT privileges.
      • Test connection to verify Oracle accessibility.
    • Save the connection for reuse across multiple pipelines.

    Step 3: Configure BigQuery Destination Connection

    • Add a new connection and select BigQuery as the destination type.
    • Configure BigQuery settings:
      • Project ID where the BigQuery dataset exists.
      • Service account key or use the default Compute Engine service account.
      • Dataset name where Oracle data will be replicated.
      • Enable automatic table creation if required.
    • Test BigQuery connection and save configuration.

    Step 4: Build Replication Pipeline

    • Create new Data Pipeline (Batch or Realtime, based on needs).
    • Drag the Oracle source connector to canvas and configure:
      • Select the Oracle connection created in Step 2.
      • Choose tables/schemas to replicate.
      • Set query mode (Table or Custom SQL).
    • Add BigQuery sink connector and configure:
      • Select the BigQuery connection from Step 3.
      • Map source tables to destination tables.
      • Configure write mode (Append/Overwrite).

    Step 5: Deploy and Monitor Pipeline

    • Validate pipeline configuration and resolve any errors.
    • Set pipeline schedule (real-time streaming or batch intervals).
    • Deploy the pipeline to start Oracle to BigQuery replication.
    • Monitor pipeline execution through the Data Fusion dashboard.
    • Set up alerts for pipeline failures or data quality issues.

    Limitations of Using Google Cloud Data Fusion

    • As a Google Cloud service, Data Fusion ties you to the Google Cloud ecosystem, which can mean a degree of vendor lock-in.
    • Although it supports CDC, its capabilities are limited compared with dedicated third-party tools.
    • For large-scale migrations or high-volume data streams, performance can be a challenge; careful tuning and optimization may be necessary to avoid delays.
    • It can get very expensive for high-volume usage.

    If Google Cloud Data Fusion is not the right fit for your organization, check out the 5 Best Oracle Replication Tools.

    Method 3: Using Custom ETL Scripts

    Prerequisites

    • An Oracle database with export privileges and network connectivity.
    • Google Cloud Storage bucket with write permissions.
    • A BigQuery dataset created for data loading.
    • Python/Java development environment with required libraries.

    Step 1: Set Up Development Environment and Authentication

    • Install required libraries (cx_Oracle, google-cloud-bigquery, google-cloud-storage).
    • Configure Google Cloud authentication:
      • Create a service account with BigQuery Data Editor and Storage Admin roles.
      • Download the JSON key file and set GOOGLE_APPLICATION_CREDENTIALS.
    • Set up Oracle client libraries and test database connectivity.
    • Create a project structure with separate modules for extract, transform, load.
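
    The snippet below is one way to sanity-check the environment once the libraries and credentials are in place. The key path, Oracle DSN, and project ID are placeholders:

    import os
    import cx_Oracle
    from google.cloud import bigquery

    # Hypothetical key path -- point this at the JSON key downloaded above.
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/path/to/service-account-key.json"

    # Verify Oracle connectivity (placeholder credentials and DSN).
    oracle_conn = cx_Oracle.connect("etl_user", "secret", "oracle-host.example.com:1521/ORCLPDB1")
    print("Oracle server version:", oracle_conn.version)
    oracle_conn.close()

    # Verify BigQuery access with a trivial query.
    bq_client = bigquery.Client(project="my-gcp-project")
    print("BigQuery check:", [row[0] for row in bq_client.query("SELECT 1").result()])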

    Step 2: Develop a Data Export Script from Oracle

    • Create a Python script to connect to the Oracle database:
    import cx_Oracle
    import pandas as pd

    # Connect to Oracle and extract a table into a DataFrame.
    # Replace the placeholders with your own credentials and DSN (host:port/service_name).
    connection = cx_Oracle.connect(username, password, f"{hostname}:{port}/{service_name}")
    query = "SELECT * FROM schema.table_name"
    df = pd.read_sql(query, connection)
    • Implement incremental data extraction using timestamps or sequence numbers (see the sketch after this list).
    • Export data to CSV format with proper encoding and delimiter handling.
    • Add error handling and logging for robust data extraction.
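
    A common incremental pattern, sketched below, is to keep a high-water mark from the previous run and pull only rows changed since then. It assumes the source table has a LAST_UPDATED timestamp column and reuses the placeholder connection variables from the snippet above; the column and file names are hypothetical:

    import cx_Oracle
    import pandas as pd

    WATERMARK_FILE = "last_extracted_at.txt"   # hypothetical local high-water-mark store

    def read_watermark(default="1970-01-01 00:00:00"):
        try:
            with open(WATERMARK_FILE) as f:
                return f.read().strip()
        except FileNotFoundError:
            return default

    connection = cx_Oracle.connect(username, password, f"{hostname}:{port}/{service_name}")
    query = """
        SELECT * FROM schema.table_name
        WHERE last_updated > TO_TIMESTAMP(:watermark, 'YYYY-MM-DD HH24:MI:SS')
    """
    df = pd.read_sql(query, connection, params={"watermark": read_watermark()})

    # Persist the new high-water mark for the next run.
    if not df.empty:
        with open(WATERMARK_FILE, "w") as f:
            f.write(df["LAST_UPDATED"].max().strftime("%Y-%m-%d %H:%M:%S"))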

    Step 3: Upload Data to Google Cloud Storage

    • Create a script to upload CSV files to the GCS bucket:
    from google.cloud import storage

    # Upload the exported CSV to a GCS bucket (bucket and object names are examples).
    client = storage.Client()
    bucket = client.bucket('your-bucket-name')
    blob = bucket.blob('oracle-data/table_name.csv')
    blob.upload_from_filename(csv_file_path)
    • Organize files with a partitioning strategy (date-based folders).
    • Implement file compression and cleanup after successful upload.
    • Add retry logic for failed uploads and network issues (a sketch combining compression and retries follows below).
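
    One way to combine the compression and retry suggestions above, assuming the same bucket and a hypothetical local file name (recent versions of google-cloud-storage accept a Retry object on upload calls):

    import gzip
    import shutil
    from google.api_core import retry
    from google.cloud import storage

    def compress_csv(csv_path):
        # Gzip the CSV alongside the original and return the compressed path.
        gz_path = csv_path + ".gz"
        with open(csv_path, "rb") as src, gzip.open(gz_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
        return gz_path

    def upload_with_retry(bucket_name, blob_name, local_path):
        client = storage.Client()
        blob = client.bucket(bucket_name).blob(blob_name)
        # Retry transient failures for up to five minutes before giving up.
        blob.upload_from_filename(local_path, retry=retry.Retry(deadline=300))

    gz_file = compress_csv("table_name.csv")                       # hypothetical exported file
    upload_with_retry("your-bucket-name", f"oracle-data/{gz_file}", gz_file)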

    Step 4: Load Data from GCS to BigQuery

    • Create BigQuery loading script:
    from google.cloud import bigquery

    # Load the CSV from GCS into the target table and wait for the job to finish.
    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(source_format=bigquery.SourceFormat.CSV)
    load_job = client.load_table_from_uri(gcs_uri, table_ref, job_config=job_config)
    load_job.result()

    • Configure schema auto-detection or provide an explicit schema definition.
    • Handle data type conversions and null value processing.
    • Implement upsert logic for handling duplicate records (see the sketch after this list).
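
    Plain load jobs only append or overwrite; one common upsert approach, sketched here with hypothetical dataset, table, and column names, is to load into a staging table and then MERGE into the target:

    from google.cloud import bigquery

    client = bigquery.Client()

    # Hypothetical names -- adjust the dataset, tables, and key/value columns to your schema.
    staging_table = "my_dataset.table_name_staging"
    target_table = "my_dataset.table_name"

    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,  # rebuild staging each run
    )
    client.load_table_from_uri(gcs_uri, staging_table, job_config=job_config).result()

    # Update rows whose key already exists in the target; insert the rest.
    merge_sql = f"""
        MERGE `{target_table}` T
        USING `{staging_table}` S
        ON T.id = S.id
        WHEN MATCHED THEN UPDATE SET T.value = S.value, T.updated_at = S.updated_at
        WHEN NOT MATCHED THEN INSERT ROW
    """
    client.query(merge_sql).result()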

    Step 5: Orchestrate and Schedule ETL Process

    • Create a monitoring dashboard for tracking ETL job success rates and performance.
    • Create a master script orchestrating the extract, upload, and load steps (a sketch follows after this list).
    • Implement a comprehensive logging and error notification system.
    • Set up scheduling using cron jobs, Airflow, or Cloud Scheduler.
    • Add data validation and quality checks between each step.
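
    How the pieces are wired together depends on your scheduler; the sketch below assumes hypothetical extract/upload/load helpers wrapping the scripts from Steps 2-4 and simply chains them with logging:

    import logging
    import sys

    # Hypothetical local modules wrapping the scripts from Steps 2-4.
    from extract import extract_table_to_csv
    from upload import upload_with_retry
    from load import load_csv_to_bigquery

    logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")

    def run_pipeline(table_name):
        try:
            csv_path = extract_table_to_csv(table_name)
            logging.info("Extracted %s to %s", table_name, csv_path)

            blob_name = f"oracle-data/{csv_path}"
            upload_with_retry("your-bucket-name", blob_name, csv_path)
            gcs_uri = f"gs://your-bucket-name/{blob_name}"
            logging.info("Uploaded to %s", gcs_uri)

            load_csv_to_bigquery(gcs_uri, f"my_dataset.{table_name}")
            logging.info("Loaded %s into BigQuery", table_name)
        except Exception:
            logging.exception("Pipeline failed for %s", table_name)
            sys.exit(1)

    if __name__ == "__main__":
        # Schedule this entry point with cron, Airflow, or Cloud Scheduler.
        run_pipeline("table_name")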

    Limitations of Using Custom ETL Scripts to Connect Oracle to BigQuery

    • You often need to transform the data (e.g., mask Personally Identifiable Information) before loading it into BigQuery, which adds time and engineering effort to the process.
    • Writing custom code adds value mainly when you need to move data from Oracle to BigQuery once.
    • When data must be synced on an ongoing basis or in real time, you have to move it incrementally, a process known as Change Data Capture (CDC). The custom code above does not handle this out of the box, so you would have to write and maintain additional code.
    • Custom SQL scripts that extract a subset of the Oracle data can break as the source schema changes or evolves.

    In a nutshell, custom ETL scripts are fragile and likely to break. We therefore strongly recommend using a dedicated ETL tool; check out our blog on the 10 Best Oracle ETL Tools to see which one suits your needs.

    Enhance your BigQuery ETL with Hevo!

    Leverage BigQuery’s features like machine learning, search, geospatial analysis, and business intelligence by migrating your data to it with Hevo. Skip the long and tedious manual setup and choose Hevo’s no-code platform to:

    1. Effortlessly extract data from Oracle and 150+ other connectors.
    2. Transform and map data easily with drag-and-drop features.
    3. Migrate data in real time to leverage BigQuery’s AI/ML features.

    Still not sure? See how Postman, the world’s leading API platform, used Hevo to save 30-40 hours of developer effort monthly and found a one-stop solution for all its data integration needs.

    Get Started with Hevo for Free

    Why is Oracle the Backbone of Many Enterprise Applications?

    Oracle Database is a relational database management system that helps businesses store and retrieve data. Oracle DB (as it’s fondly called) combines high-end technology with integrated business solutions, a non-negotiable requirement for businesses that store and access huge amounts of data. This makes it one of the world’s most trusted database management systems.

    Key Features of Oracle that Make it Popular

    • Recovery Operations: Oracle supports redo logs for ensuring data integrity and enabling robust recovery mechanisms. 
    • Database Management: Oracle Database provides robust and secure data management for data-driven decision-making in the enterprise.
    • High Performance and Scalability: It can handle high transaction rates and scale up for growing data and users.
    • CDC Support: Oracle supports CDC for real-time data integration.
    • Integration: Oracle products integrate with many applications and technologies to simplify workflows across departments. It also integrates seamlessly with data warehouses such as Snowflake and Redshift.

    What makes BigQuery a Go-To Data Warehouse?

    Google BigQuery is a fully managed, serverless enterprise cloud data warehouse. It uses Dremel technology, which transforms SQL queries into execution trees. BigQuery provides outstanding query performance owing to its column-based storage system.

    Key Features:

    • Machine Learning: BigQuery ML allows users to train and run machine learning models in BigQuery using only SQL syntax.
    • High Scalability: It scales seamlessly to handle petabytes of data.
    • Serverless Architecture: BigQuery manages servers and storage in the background, so a user does not need to.
    • SQL Compatibility: It supports ANSI SQL, which is useful for people who already know SQL and want to write and run queries. This also allows users to connect various BI tools for data visualization.

    Conclusion

    This blog talks about three methods you can use to connect Oracle to BigQuery in a seamless fashion. If you only rarely need to transfer data, the custom ETL script method will work fine. But if you require real-time data replication and are looking for an automated data pipeline solution, then Hevo is the right choice for you!

    Sign up for a 14-day free trial with Hevo to experience seamless data integration. Also, check out Hevo’s pricing to choose a plan that is best for you.

    FAQs on Oracle to BigQuery

    How to Convert an Oracle Query to BigQuery?

    Map Oracle data types and functions to their BigQuery equivalents (e.g., VARCHAR2 to STRING, NVL() to IFNULL()). Adjust syntax for differences in SQL functions and features.

    How to connect Oracle to BigQuery?

    Use a data migration tool like Google Cloud Dataflow, Informatica, or Fivetran, or set up an ETL pipeline to extract data from Oracle and load it into BigQuery.

    What is the difference between Oracle and BigQuery?

    Oracle is an on-premise, traditional relational database, while BigQuery is a fully managed, serverless data warehouse optimized for large-scale analytics on Google Cloud.

    Freelance Technical Content Writer, Hevo Data

    Bukunmi is curious about complex concepts and the latest trends in data science, and he combines this curiosity with his flair for writing to curate content that helps data teams solve business challenges.