The data replication process is needed to make a consistent copy of the data available across all the database nodes. Oracle databases can integrate with the MongoDB database that can accept and stream data from the Oracle database. With the Oracle database, you can detect the changes to the databases using CDC (Change Data Capture). CDC refers to the process that identifies and tracks changes to data in a database. CDC can detect changes in operations in any databases like delete, update and delete events, which are sent to the other databases during ETL (Extract, Transform and Load), database migrations, and replications.
The data replication process is needed to make a consistent copy of the data available across all the database nodes. And the Oracle database allows you to simplify the process of replication changes of the MongoDB database with a few lines of code.
In this article, you will learn about the Oracle to MongoDB replication process.
Prerequisites
Fundamental understanding of the integration between databases
What is Oracle?
Developed in 1979, Oracle is a relational database management system (RDBMS) from Oracle corporation. Oracle is the best choice for cost-effective applications and data management solutions in organizations. To interact with the databases, Oracle supports SQL as a query language. Oracle is also compatible with operating systems like Windows, Unix, Linus, and macOS.
Features of Oracle
1) Availability
Oracle’s Oracle Data Guard capability ensures excellent database availability. This functionality enables the use of the secondary database as a copy of the primary database during any failure. As a result, all typical operations, such as backups and partial failures, do not restrict the database from being utilized.
2) Security
Oracle consists of a mechanism for controlling and accessing the database to prevent unauthorized access. It provides high security due to its Oracle Advanced Security features. It comprises two solutions that protect the Oracle databases – Transparent Data Encryption (TDE) and Data Redaction.
3) Scalability
Oracle database consists of features like Real Application Cluster (RAC) and Portability that make Oracle Database scalable based on usage. Oracle includes capabilities like rolling instance migrations, maintaining application continuity, quality of service management, performing upgrades, and more in a clustered environment.
4) Performance
Oracle provides high performance due to its optimization tools like Oracle Advanced Compression, Oracle Database in Memory, Oracle Real Application Testing, and Oracle TimesTen Application-Tier Database Cache.
Hevo Data, a Fully-managed Data Pipeline platform, can help you automate, simplify & enrich your data replication process in a few clicks. With Hevo’s wide variety of connectors and blazing-fast Data Pipelines, you can extract & load data from 150+ Data Sources such as Oracle & MongoDB to the Data Warehouses of your choice. To further streamline and prepare your data for analysis, you can process and enrich raw granular data using Hevo’s robust & built-in Transformation Layer without writing a single line of code!
Check out some of the cool features of Hevo:
- 100% Complete & Accurate Data Transfer: Hevo’s robust infrastructure ensures reliable data transfer with zero data loss.
- Completely Automated: The Hevo platform can be set up in just a few minutes and requires minimal maintenance.
- Connectors: Hevo supports 150+ integrations to SaaS platforms, files, databases, analytics, and BI tools. It supports various destinations, including Amazon Redshift, Snowflake Data Warehouses, Amazon S3 Data Lakes, and MySQL, SQL Server, TokuDB, DynamoDB, and PostgreSQL databases to name a few.
- Real-Time Data Transfer: Hevo provides real-time data migration, so you can have analysis-ready data always.
GET STARTED WITH HEVO FOR FREE
What is MongoDB?
Developed in 2009, MongoDB is an open-source document-oriented database designed to store large data volumes and allow users to work with data efficiently. MongoDB is classified as NoSQL since its data is not kept in tables.
MongoDB provides official driver support for all the popular languages like C, C++, Net, Go, Java, Node.js, Perl, PHP, Python, Motor, Ruby, and more. Therefore, you can create an application using these languages.
Records in MongoDB are called documents, consisting of data in the pairs of key and value. Documents are fundamental units in MongoDB. These documents are similar to JavaScript Object Notation (JSON) but use a variant called Binary JSON (BSON). Documents consist of fields identical to the columns in a relational database. A set of documents is called collection in MongoDB, which can have any data, but the data cannot be spread across different databases.
Features of MongoDB
1) Schema-Less Database
Schema-less Database means that a collection can hold different types of documents in it. In other words, a single collection can hold multiple documents, and such documents might consist of different numbers of fields, content, and size. Since every document is not similar to another document in the MongoDB database, MongoDB provides scalability to the databases.
2) Replication
MongoDB provides high availability and redundancy due to its replication. It creates multiple copies of the data and sends such data to the different servers. Therefore, if any server fails, data is retrieved from the other servers.
3) Document-Oriented
All databases in MongoDB are stored in documents instead of tables like RDBMS. Documents hold data in pairs of keys and values, making it more versatile than RDBMS.
4) Scalability
MongoDB provides horizontal scalability in databases using Sharding. Sharding distributes data on multiple servers, where large volumes of data are partitioned into chunks using the shard key. Such data chunks are then distributed across shards that evenly reside on many physical servers.
Oracle to MongoDB Replication
Using the Apache Kafka, the Confluent Oracle CDC Connector, and the MongoDB connector for Apache Kafka, users can easily stream database changes from Oracle to MongoDB.
For connecting Oracle to MongoDB, you need a self-contained docker-compose environment that consists of the below requirements
- Oracle Database
- MongoDB
- Apache Kafka
- Confluent KSQL
1) Oracle to MongoDB Replication: Prepare the Oracle Docker Image
You can remove the section “database” from the docker-compose file if you have an Oracle database. You can also pull the Oracle Database Enterprise Edition from the Docker hub. Accept the Oracle terms and conditions and log in to your docker account with docker login. Use pull
store/oracle/database-enterprise:12.2.0.1-slim
to download the docker image locally.
2) Oracle to MongoDB Replication: Launch the Docker environment
The docker-compose file will launch the below software
- Apache Kafka, including ZooKeeper, REST API, KSQL, Schema Registry
- Apache Kafka Connect
- MongoDB Connector for Apache Kafka
- Confluent Oracle CDC Connector
- Oracle Database Enterprise
Ensure that you have your Oracle environment ready to launch the environment and then clone the Github repository. Build the docker-compose file.
docker-compose up -d --build
After that, you need to configure your Oracle environment to be used by the Confluent CDC Connector.
3) Oracle to MongoDB Replication: Configuring Oracle instance
If you are using Oracle within the docker environment, use the docker exec command, as shown below.
docker exec -it oracle bash -c "source /home/oracle/.bashrc; sqlplus /nolog"
connect / as sysdba
4) Oracle to MongoDB Replication: Configuring Oracle for CDC Connector
Check if the database is in archive log mode.
select log_mode from v$database;
If the database is not in archive log mode, use the below command.
SHUTDOWN IMMEDIATE;
STARTUP MOUNT;
ALTER DATABASE ARCHIVELOG;
ALTER DATABASE OPEN;
Verify the archive mode.
select log_mode from v$database
Enable supplemental logging for all columns. Supplemental logging in Oracle refers to all columns or selected columns specified for extra logging. It is needed for generating changelogs for the update and delete statements.
ALTER SESSION SET CONTAINER=cdb$root;
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;
The below output should appear on the Oracle CDB.
Create some objects in the database using the below commands.
CREATE TABLE C##MYUSER.emp
(
i INTEGER GENERATED BY DEFAULT AS IDENTITY,
name VARCHAR2(100),
lastname VARCHAR2(100),
PRIMARY KEY (i)
) tablespace sysaux;
insert into C##MYUSER.emp (name, lastname) values ('Bob', 'Perez');
insert into C##MYUSER.emp (name, lastname) values ('Jane','Revuelta');
insert into C##MYUSER.emp (name, lastname) values ('Mary','Kristmas');
insert into C##MYUSER.emp (name, lastname) values ('Alice','Cambio');
Commit;
Integrate Oracle to Amazon S3
Integrate Oracle to Redshift
Integrate MongoDB to MySQL
5) Oracle to MongoDB Replication: Creating a Kafka Topic
Open a new terminal for connecting to your Kafka Server.
docker exec -it broker /bin/bash
After connecting to Kafka, create the Kafka topic.
kafka-topics --create --topic SimpleOracleCDC-ORCLCDB-redo-log
--bootstrap-server broker:9092 --replication-factor 1
--partitions 1 --config cleanup.policy=delete
--config retention.ms=120960000
6) Oracle to MongoDB Replication: Configuring the Oracle CDC Connector
The oracle-cdc-source.json file in the Github repository contains all information about the configuration of Confluent Oracle CDC Connector. Use the below command to execute the CDC Connector.
curl -X POST -H "Content-Type: application/json" -d @oracle-cdc-source.json http://localhost:8083/connectors
7) Oracle to MongoDB Replication: Setting up KSQL Data Flows in Kafka
Since the Oracle CRUD events arrive in the Kafka topic, use KSQL to stream these events into a new topic such that the MongoDB connector can consume them.
docker exec -it ksql-server bin/bash
ksql http://127.0.0.1:8088
Enter the below commands to create streams in KSQL. KSQL is a Java library build on top of Kafka Streams to enrich, transform and process real-time streams of data.
CREATE STREAM CDCORACLE (I DECIMAL(20,0), NAME varchar, LASTNAME varchar, op_type VARCHAR) WITH ( kafka_topic='ORCLCDB-EMP', PARTITIONS=1, REPLICAS=1, value_format='AVRO');
CREATE STREAM WRITEOP AS
SELECT CAST(I AS BIGINT) as "_id", NAME , LASTNAME , OP_TYPE from CDCORACLE WHERE OP_TYPE!='D' EMIT CHANGES;
CREATE STREAM DELETEOP AS
SELECT CAST(I AS BIGINT) as "_id", NAME , LASTNAME , OP_TYPE from CDCORACLE WHERE OP_TYPE='D' EMIT CHANGES;
Verify the Streams using the below command.
SHOW STREAMS;
Output:
8) Oracle to MongoDB Replication: Configuring MongoDB Kafka sink connector
The MongoDB Kafka sink connector is a connector that reads data from Apache Kafka and writes to MongoDB.
The below code consists of the configuration of the MongoDB Connector for the Apache Kafka.
{
"name": "Oracle",
"config": {
"connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
"topics": "WRITEOP",
"connection.uri": "mongodb://mongo1",
"writemodel.strategy": "com.mongodb.kafka.connect.sink.writemodel.strategy.UpdateOneBusinessKeyTimestampStrategy",
"database": "kafka",
"collection": "oracle",
"document.id.strategy": "com.mongodb.kafka.connect.sink.processor.id.strategy.PartialValueStrategy",
"document.id.strategy.overwrite.existing": "true",
"document.id.strategy.partial.value.projection.type": "allowlist",
"document.id.strategy.partial.value.projection.list": "_id",
"errors.log.include.messages": true,
"errors.deadletterqueue.context.headers.enable": true,
"value.converter":"io.confluent.connect.avro.AvroConverter",
"value.converter.schema.registry.url":"http://schema-registry:8081",
"key.converter":"org.apache.kafka.connect.storage.StringConverter"
}
}
The sink process consumes records from the WRITEOP topic and saves the data to MongoDB. UpdateOneBusinessKeyTimestampStrategy is the writemodel strategy that performs the upsert operation using the filter defined on PartialValueStrategy property. Upsert is used for insert and update operations in MongoDB, which is a combination of update and insert. This configuration script is written in mongodb-sink.json file in the Github repository. To configure the configuration, use the below command.
curl -X POST -H "Content-Type: application/json" -d @mongodb-sink.json http://localhost:8083/connectors
The delete events are written in the DELETEOP topic and are sent to MongoDB using the below sink configuration.
{
"name": "Oracle-Delete",
"config": {
"connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
"topics": "DELETEOP",
"connection.uri": "mongodb://mongo1”,
"writemodel.strategy": "com.mongodb.kafka.connect.sink.writemodel.strategy.DeleteOneBusinessKeyStrategy",
"database": "kafka",
"collection": "oracle",
"document.id.strategy": "com.mongodb.kafka.connect.sink.processor.id.strategy.PartialValueStrategy",
"document.id.strategy.overwrite.existing": "true",
"document.id.strategy.partial.value.projection.type": "allowlist",
"document.id.strategy.partial.value.projection.list": "_id",
"errors.log.include.messages": true,
"errors.deadletterqueue.context.headers.enable": true,
"value.converter":"io.confluent.connect.avro.AvroConverter",
"value.converter.schema.registry.url":"http://schema-registry:8081"
}
}
curl -X POST -H "Content-Type: application/json" -d @mongodb-sink-delete.json http://localhost:8083/connectors
Such sink process uses the DeleteOneBusinessKeyStrategy writemodel strategy. The write model strategy is a class that defines how your sink connector should write data using write models. In the above configuration, the sink process reads from the DELETEOP topic and deletes documents in MongoDB based on the filter defined on PartialValueStrategy property.
9) Oracle to MongoDB Replication: Writing Data to Oracle
After your environment is set up and configured, go to the Oracle database and insert the below data.
insert into C##MYUSER.emp (name, lastname) values ('Juan','Soto');
insert into C##MYUSER.emp (name, lastname) values ('Robert','Walters');
insert into C##MYUSER.emp (name, lastname) values ('Ruben','Trigo');
commit;
Notice the data arrived in MongoDB by accessing the MongoDB shell.
docker exec -it mongo1 /bin/mongo
The inserted data is now available in MongoDB, as shown below.
If you update any record in Oracle, you will see the updated record in the MongoDB document.
For example,
UPDATE C##MYUSER.emp SET name=’Rob’ WHERE name=’Robert’;
COMMIT;
Then, the document is updated in MongoDB, as below.
{
"_id" : NumberLong(11),
"LASTNAME" : "Walters",
"NAME" : "Rob",
"OP_TYPE" : "U",
"_insertedTS" : ISODate("2021-07-27T10:25:08.867Z"),
"_modifiedTS" : ISODate("2021-07-27T10:25:08.867Z")
}
If you delete the data in Oracle, as below:
DELETE FROM C##MYUSER.emp WHERE name=’Rob’; COMMIT;.
The documents having the name ‘Rob’ are deleted in MongoDB.
Integrate your data in minutes!
Conclusion
In this article, you learned to connect Oracle to MongoDB replication. This article focused on streaming databases from Oracle to MongoDB using Apache Kafka, the Confluent Oracle CDC Connector, and the MongoDB Connector for Apache Kafka. Generally, organizations dealing with massive data need a reliable, robust, and secure data management system. Therefore, Oracle provides organizations with a versatile data management system that can manage data, prevent any security threats, and provide seamless access to applications.
Hevo Data, a No-code Data Pipeline provides you with a consistent and reliable solution to manage data transfer between a variety of sources and a wide variety of Desired Destinations, with a few clicks. Hevo Data with its strong integration with 150+ sources (including 60+ free sources) such as Oracle & MongoDB allows you to not only export data from your desired data sources & load it to the destination of your choice. Try a 14-day free trial and experience the feature-rich Hevo suite firsthand. Also, check out our unbeatable pricing to choose the best plan for your organization.
FAQ on Oracle to MongoDB Replication
1. How to migrate data from Oracle to MongoDB?
To migrate your data from Oracle to MongoDB, You can use a data migration tool like Hevo, or Informatica or write a custom ETL script using Python with cx_Oracle and PyMongo libraries.
2. How to connect MongoDB to Oracle?
You can use automated pipelines like Hevo or write custom scripts to connect MongoDB to Oracle.
3. How to replicate a database in MongoDB?
To replicate a database in mongodb, you can use MongoDB’s built-in replication features like Replica Sets for data redundancy and high availability.
4. Why is MongoDB better than Oracle?
MongoDB has a schemaless structure and offers schema flexibility, horizontal scalability, and faster performance for unstructured data, while Oracle is better for complex transactions and structured data.
Manjiri is a proficient technical writer and a data science enthusiast. She holds an M.Tech degree and leverages the knowledge acquired through that to write insightful content on AI, ML, and data engineering concepts. She enjoys breaking down the complex topics of data integration and other challenges in data engineering to help data professionals solve their everyday problems.