Debezium is an open-source database monitoring platform that mainly allows you to implement the process of CDC (Change Data Capture). With Debezium connectors, you can seamlessly perform CDC to continuously monitor, capture, and stream row-level changes or updates made on your external database systems like Oracle and MySQL.
Usually, to run Debezium, you have to set up a Kafka environment with ZooKeeper, Kafka brokers, and Kafka Connect. However, not every application needs the scalability and fault tolerance that the Kafka architecture provides. In such cases, you can embed the Debezium engine in a Spring Boot application to capture real-time changes on your database. Running CDC this way removes the need to operate a Kafka environment alongside the application, which lowers infrastructure cost and complexity.
In this article, you will learn about the Debezium Spring Boot application, and how to perform CDC using the embedded Debezium Spring Boot engine.
Prerequisites for Running Debezium Spring Boot Application
Before starting out with the Debezium Spring Boot application, you are required to know the basics of real-time event streaming and databases.
What is Debezium?
Developed by Red Hat, Debezium is an open-source distributed platform that allows you to monitor and capture modifications on your database systems. In other words, Debezium performs CDC by continuously tracking and streaming the changes that occur on your database in real time.
With Debezium, you can easily track the row-level changes whenever operations like update, insert, and delete are performed on an external database application. In addition, Debezium provides you with a collection of Debezium Connectors that help you monitor and capture real-time changes from external database systems like MySQL, PostgreSQL, SQL Server, Oracle, Db2, and many more.
Debezium can also record the history of data changes in durable, replicated logs. Because of this, your application can be stopped and restarted at any time and still consume all of the events it missed while it was not running, ensuring that every event is processed correctly and completely.
If you would like to learn more about the Debezium SQL Server integration, see our guide, Setting Up Debezium SQL Server Integration: 3 Easy Steps.
For an overview of the best features Debezium offers Data Engineers, consider reading Debezium Features for Data Engineers: 5 Best Features.
Debezium Spring Boot Application Architecture
Spring Boot is an open-source Java-based framework that allows you to build and run stand-alone applications. The basic workflow for performing CDC with the embedded Debezium Spring Boot database connector involves a source database, the Spring Boot application, and a target database.
In this workflow, the MySQL database at port 3305 is your source database, while the MySQL database at port 3306 is the target database. The embedded Debezium Spring Boot database connector configured in the Spring Boot application captures changes whenever operations like insert, update, and delete are made on the MySQL source database. The application then applies those changes to the MySQL target database, keeping it in sync in real time.
Using an embedded Debezium Spring Boot database connector gives you a straightforward way of capturing changes from the source database and writing them to the target database. Since the Debezium engine runs inside the Spring Boot application, you need not rely on the Kafka Connect platform to manage external database system modifications.
Furthermore, with the Spring Boot application's simple infrastructure, the embedded engine can continuously monitor and capture row-level changes made on any supported external database system.
Hevo Data, a No-Code Data Pipeline, helps you load data from any data source such as Databases, SaaS applications, Cloud Storage, SDKs, and Streaming Services, and simplifies the ETL process.
Hevo supports 100+ data sources (including 40+ free data sources) like Kafka, and setup is a 3-step process: select the data source, provide valid credentials, and choose the destination. Hevo not only loads the data into the desired Data Warehouse/destination but also enriches the data and transforms it into an analysis-ready form without you having to write a single line of code.
Its completely automated pipeline delivers data in real time without any loss from source to destination. Its fault-tolerant and scalable architecture ensures that the data is handled in a secure, consistent manner with zero data loss and supports different forms of data. The solutions provided are consistent and work with different BI tools as well.
Check out why Hevo is the Best:
- Secure: Hevo has a fault-tolerant architecture that ensures that the data is handled in a secure, consistent manner with zero data loss.
- Schema Management: Hevo takes away the tedious task of schema management & automatically detects the schema of incoming data and maps it to the destination schema.
- Minimal Learning: Hevo, with its simple and interactive UI, is extremely simple for new customers to work on and perform operations.
- Hevo Is Built To Scale: As the number of sources and the volume of your data grows, Hevo scales horizontally, handling millions of records per minute with very little latency.
- Incremental Data Load: Hevo allows the transfer of data that has been modified in real-time. This ensures efficient utilization of bandwidth on both ends.
- Live Support: The Hevo team is available round the clock to extend exceptional support to its customers through chat, email, and support calls.
- Live Monitoring: Hevo allows you to monitor the data flow and check where your data is at a particular point in time.
Steps to Run CDC Using Debezium Spring Boot Database Connector
To run CDC (Change Data Capture) and record real-time changes, you will use the Debezium Spring Boot database connector in the steps below. Since Debezium can also run without Kafka, either as a stand-alone server or as an engine embedded in your application, you can capture and record changes made on the external database without setting up a Kafka environment.
With the stand-alone Debezium Server, you can use any Debezium source connector to record modifications on external database systems like MySQL, PostgreSQL, and Oracle, and send the resulting change events to external messaging services like Google Cloud Pub/Sub and Amazon Kinesis.
In the steps below, you will create a Debezium Spring Boot application that runs the Debezium engine to capture data modifications from the external database systems. In addition, you should also configure the respective Debezium Spring Boot database connector within the application to implement the CDC process.
Step 1: Add Maven Dependencies
Before developing a Debezium Spring Boot application, you have to install and configure specific dependencies necessary for performing CDC operations. Initially, you must add Maven dependencies to your application’s pom.xml file.
POM (Project Object Model) is nothing but an XML file that contains all configuration and project details required to run an end-to-end application. Such configuration information includes project versions, software dependencies, a source directory, a build directory, plugins, and much more.
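Add the Debezium engine dependencies, typically the debezium-api and debezium-embedded artifacts from the io.debezium group, to the respective Spring Boot project's pom.xml file.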
Now, you should add dependencies for the embedded Debezium Spring Boot database connector that you would use in the Spring Boot application for performing CDC. You should add the connector dependencies based on your external database. In this case, since the Spring Boot application reads changes from the MySQL database for performing CDC, you must add the necessary dependencies for the MySQL connector.
Add the dependency for the Debezium MySQL connector (the debezium-connector-mysql artifact, also in the io.debezium group) to the same pom.xml file.
In the next step, you should install and configure the external database environment. You can implement the database configuration manually or use a docker-compose file that sets up and configures the databases required to execute CDC using Spring Boot.
Once the docker-compose services are up, the source and target databases are running at ports 3305 and 3306, respectively. Now, create a sample table named "customer" in the source database. Execute the following SQL command to create the new table.
CREATE TABLE customer
(
    id integer NOT NULL,
    fullname character varying(255),
    email character varying(255),
    CONSTRAINT customer_pkey PRIMARY KEY (id)
);
In the above command, you created a new customer table with id, fullname, and email columns, using id as the primary key.
Step 2: Configure the Debezium Spring Boot Database Connector
Now that you are done adding Debezium Spring Boot database connector dependencies, the next step is to configure the Debezium MySQL connector by creating a Debezium configuration bean. In Spring, @Bean is a method-level annotation: the object returned by the annotated method is registered as a bean and managed by the Spring container.
In your Spring Boot application, define the configuration for the Debezium MySQL connector.
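Since the original listing is not reproduced here, the following is only a minimal sketch of what such a configuration might contain, expressed as plain java.util.Properties so that it runs without any extra libraries. The class name, connector name, credentials, offset file path, and server id are hypothetical placeholders; the ports and the customerdb.customer table come from this article. Property keys can differ between Debezium versions (for example, older releases use database.whitelist instead of database.include.list), so check them against the version you use.

```java
import java.util.Properties;

final class CustomerConnectorConfig {

    /** Builds the connector settings; every value here is an illustrative placeholder. */
    static Properties customerConnector() {
        Properties props = new Properties();
        // Arbitrary, hypothetical name for this connector instance.
        props.setProperty("name", "customer-mysql-connector");
        // Connector implementation that reads the MySQL binlog.
        props.setProperty("connector.class", "io.debezium.connector.mysql.MySqlConnector");
        // Class that persists offsets, and the local file it writes to (path is a placeholder).
        props.setProperty("offset.storage", "org.apache.kafka.connect.storage.FileOffsetBackingStore");
        props.setProperty("offset.storage.file.filename", "/tmp/offsets.dat");
        props.setProperty("offset.flush.interval.ms", "60000");
        // Source database from this article, listening on port 3305.
        props.setProperty("database.hostname", "localhost");
        props.setProperty("database.port", "3305");
        props.setProperty("database.user", "root");        // placeholder credentials
        props.setProperty("database.password", "root");    // placeholder credentials
        props.setProperty("database.server.id", "10181");  // any id unique among replication clients
        // Restrict capture to the customer table in customerdb.
        props.setProperty("database.include.list", "customerdb");
        props.setProperty("table.include.list", "customerdb.customer");
        // The MySQL connector also requires a logical server name and schema history
        // settings; their key names vary by Debezium version, so they are omitted here.
        return props;
    }

    public static void main(String[] args) {
        customerConnector().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

In a Spring Boot application, a method like customerConnector() would typically live in a @Configuration class and be annotated with @Bean so that the engine can receive the settings by injection.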
This configuration sets up the Debezium MySQL connector to track changes from the MySQL database. When the connector runs, it reads and records every modification made on the configured database. The offset.storage property names the class that persists offsets, that is, the position in the source's change log (the MySQL binlog) up to which the connector has already read.
The class named FileOffsetBackingStore stores these offsets in a local file, whose path you set with the offset.storage.file.filename property. On restart, the connector reads the file and resumes from the last recorded offset instead of starting over.
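To illustrate why a file-backed offset makes restarts safe, here is a simplified sketch. It is not how FileOffsetBackingStore is implemented (the real class stores serialized connector offsets rather than a single number); the FileOffsetSketch class and its methods are hypothetical and exist only to show the resume-from-last-position idea.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class FileOffsetSketch {
    private final Path file;

    FileOffsetSketch(Path file) {
        this.file = file;
    }

    /** Creates a store backed by a fresh temporary file. */
    static FileOffsetSketch onTempFile() {
        try {
            return new FileOffsetSketch(Files.createTempFile("offsets", ".dat"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the last saved position, or 0 when nothing has been stored yet. */
    long load() {
        try {
            if (!Files.exists(file) || Files.size(file) == 0) {
                return 0L;
            }
            return Long.parseLong(new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Overwrites the file with the position of the next unread event. */
    void save(long position) {
        try {
            Files.write(file, Long.toString(position).getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Handles every event after the stored position and returns how many were handled. */
    int consume(List<String> changeLog) {
        int handled = 0;
        for (long i = load(); i < changeLog.size(); i++) {
            // A real application would process changeLog.get((int) i) here.
            save(i + 1);
            handled++;
        }
        return handled;
    }

    public static void main(String[] args) {
        List<String> changeLog = new ArrayList<>(Arrays.asList("insert", "update"));
        FileOffsetSketch first = onTempFile();
        System.out.println(first.consume(changeLog)); // prints 2

        changeLog.add("delete"); // a change made while the application is stopped
        FileOffsetSketch restarted = new FileOffsetSketch(first.file);
        System.out.println(restarted.consume(changeLog)); // prints 1
    }
}
```

The restarted instance shares nothing with the first one except the file, yet it handles only the event it missed.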
Step 3: Run the Debezium Engine
After configuring the Debezium MySQL connector, you are all set to create the Debezium engine. The engine is typically built with the DebeziumEngine builder API, which takes the connector configuration and a callback method, here named "handleChangeEvent", that receives each change event.
Once built, the engine is configured but not yet running. Whenever a data change occurs on the external database, the Debezium engine calls "handleChangeEvent" with the captured change, so every update from the respective database system reaches your application.
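What a handler such as handleChangeEvent does with each event can be sketched as follows. This is an assumption-laden illustration, not the article's original method: a real handler receives Debezium's event object and writes to the target MySQL database, whereas this sketch takes the operation code and row images directly and uses an in-memory map as a stand-in for the target table. The operation codes "c" (create), "u" (update), "d" (delete), and "r" (snapshot read) are the ones Debezium uses in its change events; the class and helper names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

final class ChangeEventHandlerSketch {
    // In-memory stand-in for the target customer table, keyed by id.
    private final Map<Integer, Map<String, Object>> target = new HashMap<>();

    /**
     * Applies one change to the target.
     * op is the Debezium operation code; before and after are the row images
     * (before is null for inserts, after is null for deletes).
     */
    void handleChangeEvent(String op, Map<String, Object> before, Map<String, Object> after) {
        switch (op) {
            case "c": // row inserted
            case "r": // row read during the initial snapshot
            case "u": // row updated
                target.put((Integer) after.get("id"), new HashMap<>(after));
                break;
            case "d": // row deleted
                target.remove((Integer) before.get("id"));
                break;
            default:
                // Other operation codes are ignored in this sketch.
                break;
        }
    }

    /** Returns the row stored for the given id, or null if there is none. */
    Map<String, Object> find(int id) {
        return target.get(id);
    }

    /** Convenience helper that builds a customer row image. */
    static Map<String, Object> row(int id, String fullname, String email) {
        Map<String, Object> row = new HashMap<>();
        row.put("id", id);
        row.put("fullname", fullname);
        row.put("email", email);
        return row;
    }

    public static void main(String[] args) {
        ChangeEventHandlerSketch handler = new ChangeEventHandlerSketch();
        Map<String, Object> inserted = row(1, "John Doe", "email@example.com");
        handler.handleChangeEvent("c", null, inserted);
        System.out.println(handler.find(1));

        handler.handleChangeEvent("d", inserted, null);
        System.out.println(handler.find(1));
    }
}
```

Treating inserts, snapshot reads, and updates as the same upsert keeps the handler safe to re-run if an event is delivered more than once.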
Now, you have to start the engine by submitting it to an executor from Java's ExecutorService API, so that it runs on a background thread.
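The start and stop lifecycle can be sketched like this. DebeziumEngine implements Runnable, so it can be handed to an executor directly; to keep the example runnable without the Debezium library, a plain Runnable stands in for the engine here. The EngineRunnerSketch class and its method names are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

final class EngineRunnerSketch {
    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    private final Runnable engine; // stand-in for a DebeziumEngine instance

    EngineRunnerSketch(Runnable engine) {
        this.engine = engine;
    }

    /** Submits the engine to a background thread so the caller is not blocked. */
    void start() {
        executor.execute(engine);
    }

    /** Stops accepting new work and waits briefly; returns true if the executor terminated. */
    boolean stop() {
        executor.shutdown();
        try {
            return executor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        AtomicBoolean ran = new AtomicBoolean(false);
        EngineRunnerSketch runner = new EngineRunnerSketch(() -> ran.set(true));
        runner.start();
        boolean terminated = runner.stop();
        System.out.println("terminated=" + terminated + ", ran=" + ran.get());
    }
}
```

In a Spring Boot application, start() and stop() would usually be tied to the bean lifecycle, for example through @PostConstruct and @PreDestroy methods. A real engine keeps running until it is closed, so the stop method would also need to close the engine before shutting down the executor.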
Once the engine is running, it is ready to capture changes from the database. To check whether the Debezium engine is configured correctly, you can make row-level changes to the customer table created previously in the MySQL source database.
To demonstrate this, navigate back to your MySQL workspace, and execute the following command.
INSERT INTO customerdb.customer (id, fullname, email) VALUES (1, 'John Doe', 'email@example.com')
With this, you have inserted a new row with values for the id, fullname, and email columns. After the insert runs on the MySQL source table, the Debezium Spring Boot application logs the captured change event.
When the Debezium engine tracks any change from the source database, you can notice that the new record is inserted in your target database as well.
You can further update the newly inserted records to check whether the Debezium engine is tracking the data changes from the external database. Execute the following command to update the record in the customer table.
UPDATE customerdb.customer t SET t.email = 'firstname.lastname@example.org' WHERE t.id = 1
On executing the above command, the application logs the captured event, including the type of operation made on the customer table.
The updated row also appears in the target database.
You can also perform a delete operation in your source database by executing the following command.
DELETE FROM customerdb.customer WHERE id = 1
The application logs the delete event together with its operation type.
The deletion is also reflected in the target database, which you can verify by executing the following command there.
SELECT * FROM customerdb.customer WHERE id = 1
If you get the output as “0 rows retrieved,” you can confirm that the change is reflected in the target database.
Following the above-given steps, you have now successfully performed a CDC operation using the embedded Debezium Spring Boot database connector.
In this article, you learned about Debezium, Spring Boot, Debezium Spring Boot application infrastructure, and how to perform CDC using Debezium Spring Boot database connector. This article mainly covered how to develop a Spring Boot application that captures real-time data changes from the MySQL database system by configuring the Debezium MySQL connector within the application.
However, you can also use other Debezium connectors, like the PostgreSQL and Oracle connectors, to capture row-level changes from PostgreSQL and Oracle database systems, respectively. If you are using such a diverse set of database sources, it's difficult to carry out insightful analysis, given the complexity of data types and different sources.
Hevo offers a faster way to move data from 100+ Data Sources, including Databases or SaaS applications like Kafka, into your Data Warehouse to be visualized in a BI tool. Hevo is fully automated and hence does not require you to code.
Want to take Hevo for a spin? Sign up and experience the feature-rich Hevo suite first hand. You can also have a look at the pricing to choose the right plan for your business needs.
Share your experience of setting up the Debezium Spring Boot database connector in the comments below.