Oracle Change Data Capture (CDC): Best Methods to Set Up Real-time Sync

on Tutorials • March 10th, 2020 • Write for Hevo

Oracle CDC - Change Data Capture

What is Oracle Change Data Capture or Oracle CDC?

Oracle change data capture or Oracle CDC paradigm deals with methods to efficiently identify and capture data that has been inserted, updated or deleted from the database and making it available for applications to consume.

Oracle explains change data capture based on the concept of publishers and subscribers. Publishers are oracle users who publish the changes and subscribers are individuals or applications that subscribe to the change data.

Oracle CDC – Use Case

Oracle is a proprietary database management system that can be used to run transaction processing, warehousing or mixed loads. Even though there are many open-source counterparts that can do the same job, Oracle’s powerful querying layer with advanced analytical functions, enterprise-grade security, access controls, and support has made it a favorite for highly critical enterprise transactional workloads.

The rise in popularity of open-source distributed data warehouse systems has led to a scenario where a combination of Oracle-based transactional database and a separate data warehouse is a common sight in enterprise ETL stack.

A common challenge found in such cases is in implementing continuous real-time sync between Oracle and the data warehouse so that reports always have the most recent data possible. This is accomplished using the change data capture paradigm. This post details how to implement Oracle CDC (Oracle change data capture) to copy data to various target databases in real-time.

Methods to Implement Oracle Change Data Capture (CDC)

You can implement Oracle CDC in one of the following ways:

  1. Synchronous change data capture using triggers
  2. Asynchronous change data capture using redo logs
  3. Using Hevo Data – A No Code Data Pipeline that supports Oracle CDC out of the box

Oracle CDC: Synchronous Change Data Capture

A synchronous Oracle change data capture is implemented using triggers that inserts entries to a change-table every time a data modification operation is executed. Triggers are hooks that get executed as part of the transactions. The first step is creating a user to serve as the change data publisher. This user must have all access to the namespace and the table from which change data is to be captured.

Assuming this user is created, let’s go to the core part of this exercise – creating a change set, change table, and subscribing to changes. A change set is nothing but a logical group of change data that can be handled as a unit. 

Steps to Perform Synchronous Change Data Capture on Oracle

  1. Create a Change Set
  2. Create a Change Table
  3. Create a Subscription to Capture Data
  4. Subscribe to the Changes on Source Table
  5. Activate the Subscription
  6. Get Data from Subscribed Table
  7. Using Select Query to get new Changes

Step 1: Create a Change Set

To create a change set, use the below command in Oracle shell.

BEGIN
    DBMS_CDC_PUBLISH.CREATE_CHANGE_SET( 
    change_set_name    => 'PRODUCT_SET', 
    description        => 'Change set for product change info', 
    change_source_name => 'SYNC_SOURCE');
END;

In the above command, ‘SYNC_SOURCE’ is a predefined source defined by Oracle to implement synchronous change data capture.

Step 2: Create a Change Table

Using the changeset defined above, create a changing table that will capture information from the triggers.

BEGIN
   DBMS_CDC_PUBLISH.CREATE_CHANGE_TABLE(

  owner             => 'cdcuser',
   change_table_name => 'products_ct', 
   change_set_name   => 'PRODUCT_SET',
   source_schema     => 'SH',
   source_table      => 'PRODUCTS',
   column_type_list  => 'ID NUMBER(6), 
                         NAME VARCHAR2(50),
                         PRICE NUMBER(8,2)', 
   capture_values    => 'both',
   rs_id             => 'y', 
   row_id            => 'n',
   user_id           => 'n',
   timestamp         => 'n',
   object_id         => 'n',
   source_colmap     => 'y',
   target_colmap     => 'y',
   options_string    => 'TABLESPACE TS_PRODUCT');
END;

In the above command, column_type_list attribute captures the list of columns for which change will be captured.

Step 3: Create a Subscription to Capture Data

The next step is to create a subscription to capture the data that is being inserted to the change table. For that execute the below command.

BEGIN
       DBMS_CDC_SUBSCRIBE.CREATE_SUBSCRIPTION(
       change_set_name   => 'PRODUCT_SET',
       description       => 'Change data for PRODUCTS',
       subscription_name => 'PRODUCT_SUB');
END;

Step 4: Subscribe to the Changes on Source Table

Subscribe to the source table changes using the subscription object created in the previous step.

BEGIN
       DBMS_CDC_SUBSCRIBE.SUBSCRIBE(
       subscription_name => 'PRODUCT_SUB',
       source_schema     => 'SH',
       source_table      => 'PRODUCTS',
       column_list       => 'ID, NAME,PRICE',
       subscriber_view   => 'PRODUCT_VIEW');
END;

Step 5: Activate the Subscription

Run the following query to activate the subscription.

BEGIN
   DBMS_CDC_SUBSCRIBE.ACTIVATE_SUBSCRIPTION(
       subscription_name => 'PRODUCT_SUB');
END;

Step 6: Get Data from Subscribed Table

Getting data from the subscribed table is done based on a collection window for the next available set of change data.

BEGIN
   DBMS_CDC_SUBSCRIBE.EXTEND_WINDOW(
       subscription_name => 'PRODUCT_SUB');
END;

Every time the above command is called, new change data that was created since the last call of this command is added to the subscription window.

Step 7: Using Select Query to get new Changes

Use a select query to get the new changes and process them as per the target database.

SELECT ID, NAME, PRICE FROM SALES_VIEW;

To copy data to another destination database, the above action will have to done using a script that has the ability to process these records and insert to the target database.

The biggest disadvantage with the above approach is that triggers affect the performance of the source database. For cases where this cannot be afforded, Oracle provides another alternative to implement change data capture. 

Oracle CDC: Asynchronous Change Data Capture

Asynchronous change data capture is implemented using redo logs. Redo logs are logs that keep a log of all activities in a database. As the name suggests, this is an asynchronous process and does not cause any performance hit.  To enable logging, execute the following steps.

Steps to Perform Asynchronous Change Data Capture on Oracle

  1. Set Database to FORCE LOGGING Mode
  2. Enable Supplemental Logging
  3. Enable Logging for all Columns in the required Table
  4. Prepare Source Tables for Log Data Capture
  5. Create a Change Set
  6. Create a Change table
  7. Enable Change Set

Step 1: Set Database to FORCE LOGGING Mode

The first step is to enable FORCE LOGGING mode on your database. Run the following query to achieve that.

ALTER DATABASE FORCE LOGGING;

Step 2: Enable Supplemental Logging

This is to enable the database to log in case of UPDATE statements.

ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;

Step 3: Enable Logging for all Columns in the required Table

First, you would need to identify the columns that you need to capture changes from. If you choose to capture changes from all columns, you can run the following query.

ALTER TABLE ts.products ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;

Step 4: Prepare the Source Tables for Log Data Capture

BEGIN DBMS_CAPTURE_ADM.PREPARE_TABLE_INSTANTIATION(TABLE_NAME => 'ts.products'); END;

Step 5: Create Change Set

Create a change set with the source as HOTLOG_SOURCE.

BEGIN
       DBMS_CDC_PUBLISH.CREATE_CHANGE_SET(
       change_set_name => 'PRODUCT_SET',
       description => 'Change set for product info',
       change_source_name => 'HOTLOG_SOURCE',
       stop_on_ddl => 'y',
       begin_date => sysdate,
       end_date => sysdate+5);
END;

HOTLOG_SOURCE is a predefined source defined by oracle for asynchronous change data capture.

Step 6: Create Change Table

BEGIN
   DBMS_CDC_PUBLISH.CREATE_CHANGE_TABLE(
   owner              => 'cdcpub',
   change_table_name  => 'products_ct', 
   change_set_name    => 'PRODUCT_SET',
   source_schema      => 'TS',
   source_table       => 'PRODUCTS',
   column_type_list   => 'ID NUMBER(6), NAME VARCHAR2(50),
      PRICE NUMBER(8,2)',
   capture_values     => 'both',
   rs_id              => 'y',
   row_id             => 'n',
   user_id            => 'n',
   timestamp          => 'n',
   object_id          => 'n',
   source_colmap      => 'n',
   target_colmap      => 'y',
   options_string     => 'TABLESPACE TS_PRODUCT_SET');
END;

Step 7: Enable Change Set

BEGIN
   DBMS_CDC_PUBLISH.ALTER_CHANGE_SET(
      change_set_name => 'PRODUCT_SET',
      enable_capture => 'y');
END;

That’s it, we now have all the change data being captured in the changing table. But, our job is not complete. We still need to implement a script that creates a subscription to this table and process accordingly. A subscription can be by following the same steps that we followed in case of synchronous change data capture.

Oracle CDC: Limitations of Building Custom Code

As obvious from the above set of steps, configuring change data capture in Oracle is a tedious process. And, the problems don’t end there. Some of the other challenges that can happen in the above approach are listed below.

  1. There is an umpteen number of configuration changes and user permission changes that need to be done to initialize and complete this process. 
  2. The process is only about enabling and capturing change data. The developers still need to implement the logic for processing the change data and insert into their target databases.
  3. Implementing this complete pipeline needs expert oracle admin skills, target database skills and programming skills to implement the logic. This would consume a considerable amount of cost and time from precious developer resources.

Considering the above challenges, it would be wise to invest the developer’s time on the core business problems and use a Data Integration tool like Hevo to perform the Oracle CDC to your destination in real-time.

An Easier Approach – Oracle CDC Using Hevo

Hevo is a completely self-serve, fully managed, No Code Data Pipeline that can help you implement Oracle CDC without having to write any code. Hevo’s point and click interface ensures the lowest time to production possible.

Here’s how simple it to set up Oracle CDC with Hevo:

  • Authenticate and connect your Oracle data source
  • Select Change Data Capture as your replication mode
  • Point to the destination where you want to move data

Hevo supports Oracle CDC out of the box and will ensure that your data is loaded from Oracle to your target database or data warehouse in real-time. Experience the power and simplicity of implementing Oracle CDC by signing up for a 14-day free trial with Hevo.

What are your thoughts about setting up Change Data Capture on Oracle? Let us know in the comments?

All your customer data in one place.