Setting Up Debezium SQL Server Integration: 3 Easy Steps

on Data Integration, Data Streaming, Database Management Systems, Debezium, Kafka, Microsoft SQL Server, SQL, SQL Server • February 16th, 2022 • Write for Hevo

Debezium SQL Server

Modern applications continuously make changes in the Databases, store and retrieve information. Companies want to feed their applications with up-to-date data. Slow data processing delays business decisions. Organizations streamline various business activities by monitoring the changes in the Database and performing different tasks based on the change in data. 

Change Data Capture (CDC) keeps track of all the row-based changes that occur in the Database and helps the organization to quickly act on the changes to offer better services in real-time. Debezium is a platform designed to monitor changes in the Database. SQL Server is widely used by companies to store and retrieve data and run analytics on it. Any change happening in SQL Server can be detected by Debezium and send triggers to specific applications. Debezium SQL Server Integration allows Developers to capture low latency data in real-time.

Debezium SQL Server Integration enables users to develop applications dependent on changes in the Database such as insert, update, and delete events. In this article, you will learn about the steps to set up Debezium SQL Server Integration. You will also read the benefits of using the Debezium SQL Server Integration and how it helps companies make better business decisions in real-time by getting up-to-date data into their applications.

Table of Contents

Prerequisites 

  • SQL Server installed on your local machine.
  • Apache ZooKeeper, Apache Kafka, and Kafka Connect on your local machine.
  • A brief knowledge of Kafka.

Introduction to Debezium

Debezium Logo
Image Source

Debezium is an open source low latency data streaming and distributed platform for change data capture (CDC). It is used by organizations and Developers to monitor any data change in the Database and then generate triggers for every row-based data change. Debezoum uses Kafka Connect for managing CDC in the Databases and streaming it to the applications. 

Developers can set up any trigger on events of data add, delete, and update so that the application can get the response immediately as the change occurs in the Database by other applications or requests. To deliver high performance, durability, and reliability, Debezium reuses Kafka where each connector deployed to Kafka Connect monitors a single upstream Database. Debezium comes with the modules to ensure that all the changes are seen and processed minimally impacting the Database.

Key Features of Debezium

Debezium helps companies immediately respond to changes in the Database with low latency. Some of the main features of Debezium are listed below.

  • Masking: Debezium offers users an option to mask sensitive information such as card numbers, bank details, etc. To ensure data privacy while streaming data.
  • Message Transformations: Debezium comes with many ready-to-use message transformations that allow users to handle message routing, filtering, event flattening, etc.
  • Low Latency Streaming: Debezium makes it possible for companies to stay up to date with the data even in the low latency environment. It uses Kafka Connect to immediately respond to the change data capture.

To know more about Debezium, click here.

Introduction to SQL Server

SQL Server Logo
Image Source

SQL Server is a Relational Database Management System developed by Microsoft to help individuals and organizations to store data in rows and columns. It is widely used by companies as it offers support for transaction processing and Business Analytics applications. SQL Server allows users to store and retrieve data using applications and APIs requests. The structured data present in SQL Server can be accessed and manipulated using the SQL language.

Microsoft offers multiple editions of SQL Server that can handle workloads ranging from small machine applications to scalable applications with many concurrent users. Tabular Data Stream (TDS) is used to communicate for all the operations such as update, delete, etc. that can be invoked on SQL Server.

Key Features of SQL Server

Some of the main features of the SQL Server are listed below.

  • Polybase: SQL Server comes with Polybase that is a part of SQL module to enable users to run fast and parallel T-SQL queries and it can even be used to go out into external storage.
  • Data Recovery: SQL Server offers the ADR (Accelerated Data Recovery) technology which is a new way of performing Database recovery in an event of system rollback or any transaction rollback.
  • Resumable Index Build: SQL Server can stop the index building operations and start it at some other point in time.

To know more about SQL Server, click here.

Simplify Data Analysis with Hevo’s No-code Data Pipeline

Hevo Data, a No-code Data Pipeline helps to load data from any data source such as Databases, SaaS applications, Cloud Storage, SDK,s, and Streaming Services and simplifies the ETL process. It supports 100+ data sources (including 30+ free data sources) and is a 3-step process by just selecting the data source, providing valid credentials, and choosing the destination. Hevo not only loads the data onto the desired Data Warehouse/destination but also enriches the data and transforms it into an analysis-ready form without having to write a single line of code.

Get Started with Hevo for Free

Its completely automated pipeline offers data to be delivered in real-time without any loss from source to destination. Its fault-tolerant and scalable architecture ensures that the data is handled in a secure, consistent manner with zero data loss and supports different forms of data. The solutions provided are consistent and work with different BI tools as well.

Check out why Hevo is the Best:

  1. Secure: Hevo has a fault-tolerant architecture that ensures that the data is handled in a secure, consistent manner with zero data loss.
  2. Schema Management: Hevo takes away the tedious task of schema management & automatically detects the schema of incoming data and maps it to the destination schema.
  3. Minimal Learning: Hevo, with its simple and interactive UI, is extremely simple for new customers to work on and perform operations.
  4. Hevo Is Built To Scale: As the number of sources and the volume of your data grows, Hevo scales horizontally, handling millions of records per minute with very little latency.
  5. Incremental Data Load: Hevo allows the transfer of data that has been modified in real-time. This ensures efficient utilization of bandwidth on both ends.
  6. Live Support: The Hevo team is available round the clock to extend exceptional support to its customers through chat, E-Mail, and support calls.
  7. Live Monitoring: Hevo allows you to monitor the data flow and check where your data is at a particular point in time.
Sign up here for a 14-Day Free Trial!

Setting Up Debezium SQL Server Integration

Debezium SQL Server Integration
Image Source

Now that you have understood about Debezium and SQL Server. In this section, you will learn about the steps to set up Debezium SQL Server Integration. The Debezium SQL Server Integration uses connectors and before that, it is essential to enable the Change Data Capture feature in the SQL Server. The following steps for Debezium SQL Server Integration are listed below.

Step 1: Enabling CDC on the SQL Server

  • To enable the CDC, you have to run a system stored procedure. It can be done using SQL Server Management Studio or using Transact-SQL.
  • Make sure that SQL Server Agent is running.
  • Here, SQL Server Management Studio will be used for Debezium SQL Server Integration.
  • Open the SQL Server Management Studio, click on the “View” menu and select the “Template Explorer” option.
  • Then expand the “SQL Server Templates” option by clicking on it.
  • Here, from the “Change Data Capture” option select the “Configuration” option and click on the “Enable Database for CDC” option.
  • Now in the template, replace the Database name that you want to enable for CDC, next to the USE command.
  • The following example code is given below to enable CDC for Debezium SQL Server Integration.
USE MyDB
GO
EXEC sys.sp_cdc_enable_db
GO
  • Now, let’s enable the CDC for a particular table in the SQL Server Database.
  • For this the stored procedure “sys.sp_cdc_enable_table” will be used.
  • Click on the “View” menu and select the “Template Explorer” option.
  • Then expand the “SQL Server Templates” option by clicking on it.
  • Here, from the “Change Data Capture” option select the “Configuration” option and click on the “Enable Table Specifying Filegroup Option“.
  • Now in the template, replace the table name that you want to enable for CDC, next to the USE command.
  • The following example code is given below to enable CDC in a table for Debezium SQL Server Integration.
USE MyDB
GO

EXEC sys.sp_cdc_enable_table
@source_schema = N'dbo',
@source_name = N'MyTable', 
@role_name = N'MyRole',  
@filegroup_name = N'MyDB_CT',
@supports_net_changes = 0
GO

Step 2: Deploying the Debezium SQL Server Connector 

  • For this make sure you have Apache ZooKeeper, Apache Kafka, and Kafka Connect installed.
  • Download the Debezium SQL Server connector plug-in archive, here.
  • Extract the files of the Debezium SQL Server connector into your Kafka Connect environment.
  • Now, add the directory with the JAR files into Kafka Connect’s plugin.path.
  • Configure the Debezium SQL Server Connector and add the configurations such as port, address.
  • You need to configure the Debezium SQL Server connector in a JSON file by setting the configuration properties.
  • The sample of a JSON file for configurations of Debezium SQL Server Connector is given below.
{
    "name": "inventory-connector", 
    "config": {
        "connector.class": "io.debezium.connector.sqlserver.SqlServerConnector", 
        "database.hostname": "192.168.99.100", 
        "database.port": "1433", 
        "database.user": "sa", 
        "database.password": "Password!", 
        "database.dbname": "testDB", 
        "database.server.name": "fullfillment", 
        "table.include.list": "dbo.customers", 
        "database.history.kafka.bootstrap.servers": "kafka:9092", 
        "database.history.kafka.topic": "dbhistory.fullfillment" 
    }
}
  • In the above JSON file, the address and port of the SQL Server instance are “192.168.99.100” and “1433” respectively.
  • The user of SQL Server instance is “sa”.
  • Now, you can send the configuration of this Debezium SQL Server connector using the POST command to the running Kafka Connect services. It will record the configuration and start one connector task which will perform the following tasks listed below.
    • Connects to the SQL Server database.
    • Reads the transaction log.
    • Records change events to Kafka topics.

Step 3: Running the Debezium SQL Server Connector 

  • To run the Debezium SQL Server connector, you need to create a connector configuration.
  • For that, you have to use the Kafka Connect REST API so that you can add the connector configuration to the Kafka Cluster.
  • It can be configured using the listener’s configuration option in Kafka Connect.
  • The format for the list of listeners is given below.
protocol://host:port,protocol2://host2:port2
  • Here, the protocols are “HTTPS” or “HTTP”.
  • Once you start the Debezium SQL Server connector, it will start taking snapshots of the SQL Server Database. Then it will start generating data change events to stream the change event records to Kafka topics.

That’s it! You have completed the Debezium SQL Server Integration.

Benefits of Debezium SQL Server Integration

Some of the main features of Debezium SQL Server Integration are listed below.

  •  Debezium SQL Server Integration allows users to push the data streams for every event data change in the SQL Server Database.
  •  Debezium SQL Server Connector publishes many metrics about the connector’s activities that can be monitored through JMX. 
  • Debezium SQL Server Integration helps in moving the change stream data into a data lake to allow for disaster recovery or replaying part of the data.

Conclusion 

In this article, you learnt about Debezium, SQL Server, and the steps to set up Debezium SQL Server Integration using the Debezium SQL Server connector. You also read about the benefits of using the Debezium SQL Server connector and how it helps companies manage the change data capture events and trigger events. Debezium helps organizations keep the data up to date in applications and other systems that can help them automatically trigger particular events.

Visit our Website to Explore Hevo

Companies need to analyze their business data stored in multiple data sources. The data needs to be loaded to the Data Warehouse to get a holistic view of the data. Hevo Data is a No-code Data Pipeline solution that helps to transfer data from 100+ sources to desired Data Warehouse. It fully automates the process of transforming and transferring data to a destination without writing a single line of code.

Want to take Hevo for a spin? Sign Up here for a 14-day free trial and experience the feature-rich Hevo suite first hand.

Share your experience of learning about Debezium SQL Server Integration in the comments section below!

No-code Data Pipeline For your Microsoft SQL Server