Modern applications continuously make changes in the Databases, store and retrieve information. Companies want to feed their applications with up-to-date data. Slow data processing delays business decisions. Organizations streamline various business activities by monitoring the changes in the Database and performing different tasks based on the change in data. 

Debezium SQL Server Integration enables users to develop applications dependent on changes in the Database such as insert, update, and delete events. In this article, you will learn the steps to set up Integration. You will also read about the benefits of using the Integration and how it helps companies make better business decisions in real time by getting up-to-date data into their applications.

Prerequisites 

  • SQL Server installed on your local machine.
  • Apache ZooKeeper, Apache Kafka, and Kafka Connect on your local machine.
  • A brief knowledge of Kafka.

Introduction to Debezium

Debezium Logo

Debezium is an open source low latency data streaming and distributed platform for change data capture (CDC). It is used by organizations and Developers to monitor any data change in the Database and then generate triggers for every row-based data change. Debezoum uses Kafka Connect for managing CDC in the Databases and streaming it to the applications. 

Key Features of Debezium

Debezium helps companies immediately respond to changes in the Database with low latency. Some of the main features of Debezium are listed below.

  • Masking: Debezium offers users an option to mask sensitive information such as card numbers, bank details, etc. To ensure data privacy while streaming data.
  • Message Transformations: Debezium comes with many ready-to-use message transformations that allow users to handle message routing, filtering, event flattening, etc.
  • Low Latency Streaming: Debezium makes it possible for companies to stay up to date with the data even in the low latency environment. It uses Kafka Connect to immediately respond to the change data capture.

Introduction to SQL Server

SQL Server Logo

SQL Server is a Relational Database Management System developed by Microsoft to help individuals and organizations to store data in rows and columns. It is widely used by companies as it offers support for transaction processing and Business Analytics applications. SQL Server allows users to store and retrieve data using applications and APIs requests. The structured data present in SQL Server can be accessed and manipulated using the SQL language.

Key Features of SQL Server

Some of the main features of the SQL Server are listed below.

  • Polybase: SQL Server comes with Polybase that is a part of SQL module to enable users to run fast and parallel T-SQL queries and it can even be used to go out into external storage.
  • Data Recovery: SQL Server offers the ADR (Accelerated Data Recovery) technology which is a new way of performing Database recovery in an event of system rollback or any transaction rollback.
  • Resumable Index Build: SQL Server can stop the index building operations and start it at some other point in time.

Setting Up Debezium SQL Server Integration

Debezium SQL Server Integration
Image Source

Now that you have understood about Debezium and SQL Server. In this section, you will learn about the steps to set up Debezium SQL Server Integration. The Integration uses connectors and before that, it is essential to enable the Change Data Capture feature in the SQL Server. The following steps for Debezium SQL Server Integration are listed below.

Step 1: Enabling CDC on the SQL Server

  • To enable the CDC, you have to run a system stored procedure. It can be done using SQL Server Management Studio or using Transact-SQL.
  • Make sure that SQL Server Agent is running.
  • Here, SQL Server Management Studio will be used for Debezium SQL Server Integration.
  • Open the SQL Server Management Studio, click on the “View” menu and select the “Template Explorer” option.
  • Then expand the “SQL Server Templates” option by clicking on it.
  • Here, from the “Change Data Capture” option select the “Configuration” option and click on the “Enable Database for CDC” option.
  • Now in the template, replace the Database name that you want to enable for CDC, next to the USE command.
  • The following example code is given below to enable CDC for Debezium SQL Server Integration.
USE MyDB
GO
EXEC sys.sp_cdc_enable_db
GO
  • Now, let’s enable the CDC for a particular table in the SQL Server Database.
  • For this the stored procedure “sys.sp_cdc_enable_table” will be used.
  • Click on the “View” menu and select the “Template Explorer” option.
  • Then expand the “SQL Server Templates” option by clicking on it.
  • Here, from the “Change Data Capture” option select the “Configuration” option and click on the “Enable Table Specifying Filegroup Option“.
  • Now in the template, replace the table name that you want to enable for CDC, next to the USE command.
  • The following example code is given below to enable CDC in a table for Debezium SQL Server Integration.
USE MyDB
GO

EXEC sys.sp_cdc_enable_table
@source_schema = N'dbo',
@source_name = N'MyTable', 
@role_name = N'MyRole',  
@filegroup_name = N'MyDB_CT',
@supports_net_changes = 0
GO

Step 2: Deploying the Debezium SQL Server Connector 

  • For this make sure you have Apache ZooKeeper, Apache Kafka, and Kafka Connect installed.
  • Download the Debezium SQL Server connector plug-in archive.
  • Extract the files of the connector into your Kafka Connect environment.
  • Now, add the directory with the JAR files into Kafka Connect’s plugin.path.
  • Configure the Connector and add the configurations such as port, address.
  • You need to configure the connector in a JSON file by setting the configuration properties.
  • The sample of a JSON file for configurations of Debezium SQL Server Connector is given below.
{
    "name": "inventory-connector", 
    "config": {
        "connector.class": "io.debezium.connector.sqlserver.SqlServerConnector", 
        "database.hostname": "192.168.99.100", 
        "database.port": "1433", 
        "database.user": "sa", 
        "database.password": "Password!", 
        "database.dbname": "testDB", 
        "database.server.name": "fullfillment", 
        "table.include.list": "dbo.customers", 
        "database.history.kafka.bootstrap.servers": "kafka:9092", 
        "database.history.kafka.topic": "dbhistory.fullfillment" 
    }
}
  • In the above JSON file, the address and port of the SQL Server instance are “192.168.99.100” and “1433” respectively.
  • The user of SQL Server instance is “sa”.
  • Now, you can send the configuration of this Debezium SQL Server connector using the POST command to the running Kafka Connect services. It will record the configuration and start one connector task which will perform the following tasks listed below.
    • Connects to the SQL Server database.
    • Reads the transaction log.
    • Records change events to Kafka topics.

Step 3: Running the Debezium SQL Server Connector 

  • To run the Debezium SQL Server connector, you need to create a connector configuration.
  • For that, you have to use the Kafka Connect REST API so that you can add the connector configuration to the Kafka Cluster.
  • It can be configured using the listener’s configuration option in Kafka Connect.
  • The format for the list of listeners is given below.
protocol://host:port,protocol2://host2:port2
  • Here, the protocols are “HTTPS” or “HTTP”.
  • Once you start the Debezium SQL Server connector, it will start taking snapshots of the SQL Server Database. Then it will start generating data change events to stream the change event records to Kafka topics.

That’s it! You have completed the Debezium SQL Server Integration.

Benefits of Debezium SQL Server Integration

Some of the main features of Debezium SQL Server Integration are listed below.

  • Integration allows users to push the data streams for every event data change in the SQL Server Database.
  •  Debezium SQL Server Connector publishes many metrics about the connector’s activities that can be monitored through JMX. 
  • Integration helps in moving the change stream data into a data lake to allow for disaster recovery or replaying part of the data.

Conclusion 

In this article, you learnt about Debezium, SQL Server, and the steps to set up Integration using the Debezium SQL Server connector. You also read about the benefits of using the connector and how it helps companies manage the change data capture events and trigger events. Debezium helps organizations keep the data up to date in applications and other systems that can help them automatically trigger particular events.

Share your experience of learning about Integration in the comments section below!

Aditya Jadon
Research Analyst, Hevo Data

Aditya Jadon is a data science enthusiast with a passion for decoding the complexities of data. He leverages his B. Tech degree, expertise in software architecture, and strong technical writing skills to craft informative and engaging content. Aditya has authored over 100 articles on data science, demonstrating his deep understanding of the field and his commitment to sharing knowledge with others.

No-code Data Pipeline For your Microsoft SQL Server