What is Oracle Change Data Capture or Oracle CDC?
Oracle change data capture or Oracle CDC paradigm deals with methods to efficiently identify and capture data that has been inserted, updated or deleted from the database and making it available for applications to consume.
Oracle explains change data capture based on the concept of publishers and subscribers. Publishers are oracle users who publish the changes and subscribers are individuals or applications that subscribe to the change data.
Oracle CDC – Use Case
Oracle is a proprietary database management system that can be used to run transaction processing, warehousing or mixed loads. Even though there are many open-source counterparts that can do the same job, Oracle’s powerful querying layer with advanced analytical functions, enterprise-grade security, access controls, and support has made it a favorite for highly critical enterprise transactional workloads.
The rise in popularity of open-source distributed data warehouse systems has led to a scenario where a combination of Oracle-based transactional database and a separate data warehouse is a common sight in enterprise ETL stack.
A common challenge found in such cases is in implementing continuous real-time sync between Oracle and the data warehouse so that reports always have the most recent data possible. This is accomplished using the change data capture paradigm. This post details how to implement Oracle CDC (Oracle change data capture) to copy data to various target databases in real-time.
Oracle CDC: Synchronous Change Data Capture
A synchronous Oracle change data capture is implemented using triggers that inserts entries to a change-table every time a data modification operation is executed. Triggers are hooks that get executed as part of the transactions. The first step is creating a user to serve as the change data publisher. This user must have all access to the namespace and the table from which change data is to be captured.
Assuming this user is created, let’s go to the core part of this exercise – creating a change set, change table, and subscribing to changes. A change set is nothing but a logical group of change data that can be handled as a unit.
Steps to Perform Synchronous Change Data Capture on Oracle
- Create a Change Set
- Create a Change Table
- Create a Subscription to Capture Data
- Subscribe to the Changes on Source Table
- Activate the Subscription
- Get Data from Subscribed Table
- Using Select Query to get new Changes
Step 1: Create a Change Set
To create a change set, use the below command in Oracle shell.
BEGIN DBMS_CDC_PUBLISH.CREATE_CHANGE_SET( change_set_name => 'PRODUCT_SET', description => 'Change set for product change info', change_source_name => 'SYNC_SOURCE'); END;
In the above command, ‘SYNC_SOURCE’ is a predefined source defined by Oracle to implement synchronous change data capture.
Step 2: Create a Change Table
Using the changeset defined above, create a changing table that will capture information from the triggers.
BEGIN DBMS_CDC_PUBLISH.CREATE_CHANGE_TABLE( owner => 'cdcuser', change_table_name => 'products_ct', change_set_name => 'PRODUCT_SET', source_schema => 'SH', source_table => 'PRODUCTS', column_type_list => 'ID NUMBER(6), NAME VARCHAR2(50), PRICE NUMBER(8,2)', capture_values => 'both', rs_id => 'y', row_id => 'n', user_id => 'n', timestamp => 'n', object_id => 'n', source_colmap => 'y', target_colmap => 'y', options_string => 'TABLESPACE TS_PRODUCT'); END;
In the above command, column_type_list attribute captures the list of columns for which change will be captured.
Step 3: Create a Subscription to Capture Data
The next step is to create a subscription to capture the data that is being inserted to the change table. For that execute the below command.
BEGIN DBMS_CDC_SUBSCRIBE.CREATE_SUBSCRIPTION( change_set_name => 'PRODUCT_SET', description => 'Change data for PRODUCTS', subscription_name => 'PRODUCT_SUB'); END;
Step 4: Subscribe to the Changes on Source Table
Subscribe to the source table changes using the subscription object created in the previous step.
BEGIN DBMS_CDC_SUBSCRIBE.SUBSCRIBE( subscription_name => 'PRODUCT_SUB', source_schema => 'SH', source_table => 'PRODUCTS', column_list => 'ID, NAME,PRICE', subscriber_view => 'PRODUCT_VIEW'); END;
Step 5: Activate the Subscription
Run the following query to activate the subscription.
BEGIN DBMS_CDC_SUBSCRIBE.ACTIVATE_SUBSCRIPTION( subscription_name => 'PRODUCT_SUB'); END;
Step 6: Get Data from Subscribed Table
Getting data from the subscribed table is done based on a collection window for the next available set of change data.
BEGIN DBMS_CDC_SUBSCRIBE.EXTEND_WINDOW( subscription_name => 'PRODUCT_SUB'); END;
Every time the above command is called, new change data that was created since the last call of this command is added to the subscription window.
Step 7: Using Select Query to get new Changes
Use a select query to get the new changes and process them as per the target database.
SELECT ID, NAME, PRICE FROM SALES_VIEW;
To copy data to another destination database, the above action will have to done using a script that has the ability to process these records and insert to the target database.
The biggest disadvantage with the above approach is that triggers affect the performance of the source database. For cases where this cannot be afforded, Oracle provides another alternative to implement change data capture.
Oracle CDC: Asynchronous Change Data Capture
Asynchronous change data capture is implemented using redo logs. Redo logs are logs that keep a log of all activities in a database. As the name suggests, this is an asynchronous process and does not cause any performance hit. To enable logging, execute the following steps.
Steps to Perform Asynchronous Change Data Capture on Oracle
- Set Database to FORCE LOGGING Mode
- Enable Supplemental Logging
- Enable Logging for all Columns in the required Table
- Prepare Source Tables for Log Data Capture
- Create a Change Set
- Create a Change table
- Enable Change Set
Step 1: Set Database to FORCE LOGGING Mode
The first step is to enable FORCE LOGGING mode on your database. Run the following query to achieve that.
ALTER DATABASE FORCE LOGGING;
Step 2: Enable Supplemental Logging
This is to enable the database to log in case of UPDATE statements.
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;
Step 3: Enable Logging for all Columns in the required Table
First, you would need to identify the columns that you need to capture changes from. If you choose to capture changes from all columns, you can run the following query.
ALTER TABLE ts.products ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;
Step 4: Prepare the Source Tables for Log Data Capture
BEGIN DBMS_CAPTURE_ADM.PREPARE_TABLE_INSTANTIATION(TABLE_NAME => 'ts.products'); END;
Step 5: Create Change Set
Create a change set with the source as HOTLOG_SOURCE.
BEGIN DBMS_CDC_PUBLISH.CREATE_CHANGE_SET( change_set_name => 'PRODUCT_SET', description => 'Change set for product info', change_source_name => 'HOTLOG_SOURCE', stop_on_ddl => 'y', begin_date => sysdate, end_date => sysdate+5); END;
HOTLOG_SOURCE is a predefined source defined by oracle for asynchronous change data capture.
Step 6: Create Change Table
BEGIN DBMS_CDC_PUBLISH.CREATE_CHANGE_TABLE( owner => 'cdcpub', change_table_name => 'products_ct', change_set_name => 'PRODUCT_SET', source_schema => 'TS', source_table => 'PRODUCTS', column_type_list => 'ID NUMBER(6), NAME VARCHAR2(50), PRICE NUMBER(8,2)', capture_values => 'both', rs_id => 'y', row_id => 'n', user_id => 'n', timestamp => 'n', object_id => 'n', source_colmap => 'n', target_colmap => 'y', options_string => 'TABLESPACE TS_PRODUCT_SET'); END;
Step 7: Enable Change Set
BEGIN DBMS_CDC_PUBLISH.ALTER_CHANGE_SET( change_set_name => 'PRODUCT_SET', enable_capture => 'y'); END;
That’s it, we now have all the change data being captured in the changing table. But, our job is not complete. We still need to implement a script that creates a subscription to this table and process accordingly. A subscription can be by following the same steps that we followed in case of synchronous change data capture.
Oracle CDC: Limitations of Building Custom Code
As obvious from the above set of steps, configuring change data capture in Oracle is a tedious process. And, the problems don’t end there. Some of the other challenges that can happen in the above approach are listed below.
- There is an umpteen number of configuration changes and user permission changes that need to be done to initialize and complete this process.
- The process is only about enabling and capturing change data. The developers still need to implement the logic for processing the change data and insert into their target databases.
- Implementing this complete pipeline needs expert oracle admin skills, target database skills and programming skills to implement the logic. This would consume a considerable amount of cost and time from precious developer resources.
Considering the above challenges, it would be wise to invest the developer’s time on the core business problems and use a Data Integration tool like Hevo to perform the Oracle CDC to your destination in real-time.
Hevo is a completely self-serve, fully managed, No Code Data Pipeline that can help you implement Oracle CDC without having to write any code. Hevo’s point and click interface ensures the lowest time to production possible.
Here’s how simple it to set up Oracle CDC with Hevo:
- Authenticate and connect your Oracle data source
- Select Change Data Capture as your replication mode
- Point to the destination where you want to move data
Hevo supports Oracle CDC out of the box and will ensure that your data is loaded from Oracle to your target database or data warehouse in real-time. Experience the power and simplicity of implementing Oracle CDC by signing up for a 14-day free trial with Hevo.
What are your thoughts about setting up Change Data Capture on Oracle? Let us know in the comments?