Do you want to transfer your Microsoft SQL Server data using Apache Kafka? Are you finding it challenging to connect Apache Kafka to SQL Server? If yes, then you’ve landed at the right place! This article will answer all your queries & relieve you of the stress of finding a truly efficient solution.

Follow our easy step-by-step guide to help you successfully set up a connection between Apache Kafka & Microsoft SQL Server. You can seamlessly transfer data to/from Microsoft SQL Server for a fruitful real-time analysis. It will also help you build a customized ETL pipeline for your organization. With Kafka Connect SQL Server, you can easily configure data pipelines to stream data in real-time from Kafka topics to SQL Server tables.  

Let’s jump right into it!

Methods to Connect Apache Kafka to SQL Server

There are multiple ways in which you can connect Apache Kafka to SQL Server:

Method 1: Using Hevo to Connect Apache Kafka to SQL Server


Hevo, a No-code Data Pipeline, helps you transfer data from Apache Kafka (among 150+ Data sources) to Microsoft SQL Server & lets you visualize it in a BI tool. Hevo is fully managed and completely automates the process of not only loading data from your desired source but also enriching the data and transforming it into an analysis-ready form without having to write a single line of code.

Its fault-tolerant architecture ensures that the data is handled in a secure, consistent manner with zero data loss, in real time.

Sign up here for a 14-Day Free Trial!

Steps to use Hevo:

Hevo focuses on two simple steps to get you started:

  • Configure Source: Connect Hevo with a data source of your choice such as Apache Kafka by providing a unique name for your Data Pipeline, along with information about your Bootstrap Servers. You can also select the specific topics from where you want to ingest data.
Kafka to SQL Server: Source Configuration
Image Source: Hevo Docs
  • Integrate Data: Load data from Apache Kafka to SQL Server by providing a unique name for your destination along with your database credentials such as username and password. To help Hevo connect with your Microsoft SQL Server database, you will also have to provide information such as the host IP, port number and the name & schema of your database.
Kafka to SQL Server: Destination Configuration
Image Source: Hevo Docs

If you’re looking to expand your knowledge about Kafka, this guide offers insights into how Kafka can be used for Change Data Capture, allowing you to capture and process data changes in real time. Additionally, if you’re interested in learning about Kafka’s role in data ingestion, read our helpful guide Kafka for Data Ingestion.

Check out what makes Hevo amazing:

  • Secure: Discover peace with end-to-end encryption and compliance with all major security certifications including HIPAA, GDPR, SOC-2.
  • Schema Management: Hevo takes away the tedious task of schema management & automatically detects the schema of incoming data and maps it to the destination schema.
  • Transformations: Hevo provides preload transformations through Python code. It also allows you to run transformation code for each event in the pipelines you set up. You need to edit the properties of the event object received as a parameter of the transform method to carry out the transformation. Hevo also offers a drag-and-drop interface for transformations such as Date and Control Functions, JSON, and Event Manipulation, to name a few. These can be configured and tested before putting them to use.
  • Transparent Pricing: Say goodbye to complex and hidden pricing models. Hevo’s Pricing brings complete visibility to your ELT spend. Choose a plan based on your business needs. Stay in control with spend alerts and configurable credit limits for unforeseen spikes in data flow.
Get started for Free with Hevo!

Method 2: Using the Debezium SQL Server Connector to Connect Apache Kafka to SQL Server

This method uses the Debezium SQL Server connector to establish the Kafka Connect SQL Server integration.

Apache Kafka supports connecting with Microsoft SQL Server and numerous other databases/data warehouses with the help of Kafka Connect and its ecosystem of source and sink connectors. Source connectors bring in data from a source of your choice to Apache Kafka, and sink connectors stream it from Kafka Topics to the destination of your choice.

Kafka to SQL Server using Debezium
Image Source: Medium
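The steps below use Debezium as a source connector, streaming changes from SQL Server into Kafka. The opposite leg, streaming Kafka topics into SQL Server tables, is handled by a sink connector instead. As a rough illustration, here is a minimal sketch of a JDBC sink registration, assuming the Confluent JDBC connector is installed on your Connect worker; the topic name, credentials, and addresses are placeholders you would replace:

curl -i -X POST -H "Accept:application/json" -H "Content-Type:application/json" localhost:8083/connectors/ -d '{
  "name": "sqlserver-sink-connector",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
    "topics": "your_topic",
    "connection.url": "jdbc:sqlserver://{Private IP Address}:1433;databaseName=Test_DB",
    "connection.user": "testuser",
    "connection.password": "{your password}",
    "auto.create": "true",
    "insert.mode": "insert"
  }
}'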

The Debezium-based method can be implemented using the following steps:

Step 1: Configuring Microsoft SQL Server to Enable CDC

To configure your Microsoft SQL Server instance, launch Microsoft SQL Server Management Studio on your system, right-click on the server in your object explorer, and select the Properties option.

Selecting the Properties Option.
Image Source: Medium

Once you’ve clicked on it, a new window will open up on your screen, showing the server properties. To allow Debezium’s SQL Server connector to fetch data in real-time from your Microsoft SQL Server instance, click on the “Security” page and choose “SQL Server and Windows Authentication mode” as the desired authentication mechanism.

Configuring Server Authentication.
Image Source: Medium

Now restart your Microsoft SQL Server instance to bring the changes into effect. With your Microsoft SQL Server Management Studio now up and running, expand the “Security” folder in the object explorer and right-click on the “Logins” folder.

You now need to create a new login for your Microsoft SQL Server. To do this, add a new username as “testuser” along with a password of your choice. 

Setting up the User in SQL Server.
Image Source: Medium

Click on the “Server Roles” option in the panel on the left, select “sysadmin” from the list of options available on your screen, and click OK.

Selecting the Sysadmin Server Roles.
Image Source: Medium
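If you prefer T-SQL over the Management Studio UI, a sketch like the following (with a placeholder password) achieves the same login creation and role assignment:

-- Create the login used by the Debezium connector and grant it sysadmin
CREATE LOGIN testuser WITH PASSWORD = '{your password}';
GO
ALTER SERVER ROLE sysadmin ADD MEMBER testuser;
GO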

Once you’ve made all the necessary configurations, you now need to enable the CDC functionality. To do this, open Notepad (from Windows PowerShell, if you like) and create a new SQL file named “Test_DB.sql”. Add the following lines of code to your file and save it:

-- Creating a temporary test database
CREATE DATABASE Test_DB;
GO
USE Test_DB;
GO
-- Enabling CDC at the database level
EXEC sys.sp_cdc_enable_db;
GO
ALTER DATABASE Test_DB
SET CHANGE_TRACKING = ON
(CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON);
GO

-- Creating a temporary table & enabling CDC for the same
CREATE TABLE prospects (
  id INTEGER IDENTITY(1001,1) NOT NULL PRIMARY KEY,
  first_name VARCHAR(255) NOT NULL,
  last_name VARCHAR(255) NOT NULL,
  email VARCHAR(255) NOT NULL UNIQUE
);
INSERT INTO prospects(first_name,last_name,email)
  VALUES ('Jeff','Thomas','thomas.jeff@afer.com');
INSERT INTO prospects(first_name,last_name,email)
  VALUES ('Sasha','Bailey','s.bailey@foobar.com');
INSERT INTO prospects(first_name,last_name,email)
  VALUES ('Edward','Hart','edhart@walker.com');

EXEC sys.sp_cdc_enable_table @source_schema = 'dbo', @source_name = 'prospects', @role_name = NULL, @supports_net_changes = 0;
GO
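Before moving on, you can verify that CDC is actually enabled. As a quick check, the following queries should report is_cdc_enabled = 1 for the database and list a capture instance for the prospects table:

SELECT name, is_cdc_enabled FROM sys.databases WHERE name = 'Test_DB';
GO
USE Test_DB;
EXEC sys.sp_cdc_help_change_data_capture;
GO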

With your SQL file now ready, restart your Microsoft SQL Server instance using the following lines of code:

net stop MSSQLSERVER
net start MSSQLSERVER

Start the SQL Server Agent service and execute your “Test_DB.sql” file using the following lines of code:

net start SQLSERVERAGENT
sqlcmd -U testuser -P {your password} -i {location of Test_DB.sql}
Executing the SQL File using Windows PowerShell.
Image Source: Medium

This is how you can configure Microsoft SQL Server and enable the CDC functionality to connect Apache Kafka to SQL Server.

Get insights and step-by-step instructions on how to export data from MySQL to CSV format, enabling you to manage and analyze your data effectively.

Step 2: Installing Apache Kafka on your Workstation

With your Microsoft SQL Server now ready, you need to download and install Apache Kafka, either in standalone or distributed mode. You can follow Apache Kafka’s official documentation to get started with the installation process.
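As a quick sketch, assuming a Linux machine and the Confluent Platform distribution (the version number and URL here are only examples), the download and extraction look like this:

curl -O http://packages.confluent.io/archive/5.3/confluent-5.3.0-2.12.tar.gz
tar -xzf confluent-5.3.0-2.12.tar.gz
cd confluent-5.3.0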

Step 3: Installing the Debezium Microsoft SQL Server Connector for Apache Kafka

Confluent provides users with a diverse set of connectors that act as a source or a sink and allow users to transfer data via Apache Kafka from their desired data source, such as Microsoft SQL Server, to the destination of their choice. One such connector that lets users connect Apache Kafka to SQL Server is the Debezium SQL Server connector for Apache Kafka.

To install the Debezium SQL Server connector, go to Confluent Hub’s official website and search for Microsoft SQL Server using the search bar found at the top of your screen. In case you’re finding it challenging to locate the correct connector, you can click here to go to the connector’s page directly.

Installing the Debezium SQL Server Connector.
Image Source: Medium

Once you’ve found the desired Debezium SQL Server connector, you can now download the connector by executing the following command on the Confluent CLI:

confluent-hub install debezium/debezium-connector-sqlserver:1.2.2

In case you don’t have the Confluent CLI installed on your system, you can click here and follow Confluent’s official documentation that will help you install and configure the Confluent CLI. 

You can also click on the download button to install the connector. Once you’ve clicked on it, a zip file will start downloading on your system. Extract the zip file and copy all the jar files found in the lib folder into your Confluent installation, making sure the target directory is on your Connect worker’s plugin.path.
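As a sketch, assuming your Confluent installation lives at $CONFLUENT_HOME and the zip follows Confluent Hub’s owner-name-version naming, the manual copy could look like this:

# Unpack the connector and place its jars where the Connect worker can find them
unzip debezium-debezium-connector-sqlserver-1.2.2.zip -d debezium-connector
mkdir -p $CONFLUENT_HOME/share/java/debezium-connector-sqlserver
cp debezium-connector/lib/*.jar $CONFLUENT_HOME/share/java/debezium-connector-sqlserver/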

This is how you can install the Debezium SQL Server connector that will help you connect Apache Kafka to SQL Server.

Solve your data replication problems with Hevo’s reliable, no-code, automated pipelines with 150+ connectors.
Get your free trial right away!

Step 4: Starting the Apache Kafka, Zookeeper & Connect Server

Once you’ve installed the Debezium SQL Server connector for Apache Kafka, you now need to start the Zookeeper, Kafka, Schema Registry, and Kafka Connect servers. To do this, execute the following lines of code on different terminals:

./bin/zookeeper-server-start ./etc/kafka/zookeeper.properties
./bin/kafka-server-start ./etc/kafka/server.properties
./bin/schema-registry-start ./etc/schema-registry/schema-registry.properties
./bin/connect-distributed ./etc/kafka/connect-distributed.properties

In case you’re running Confluent Platform v5.3 or above, you can also use the following line of code:

confluent local start connect

Once your Apache Kafka servers are up, you will have to create a JSON configuration for Kafka Connect that you can register using a POST call to its REST API. It instructs the Debezium connector to monitor your Microsoft SQL Server instance at the given host IP and port number. You can make the API call using the following lines of code:

curl -i -X POST -H "Accept:application/json" -H "Content-Type:application/json" localhost:8083/connectors/ -d '{
  "name": "test-connector",
  "config": {
    "connector.class": "io.debezium.connector.sqlserver.SqlServerConnector",
    "database.hostname": "{Private IP Address}",
    "database.port": "1433",
    "database.user": "testuser",
    "database.password": "password!",
    "database.dbname": "Test_DB",
    "database.server.name": "Test_DB",
    "table.whitelist": "dbo.prospects",
    "database.history.kafka.bootstrap.servers": "localhost:9092",
    "database.history.kafka.topic": "dbhistory.fulfillment"
  }
}'
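Equivalently, you can save the JSON payload to a file (say, register-sqlserver.json, a name chosen here purely for illustration) and post it with:

curl -i -X POST -H "Accept:application/json" -H "Content-Type:application/json" localhost:8083/connectors/ -d @register-sqlserver.json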

You can now verify whether the Debezium Connector is running properly using the following curl command:

curl -s localhost:8083/connectors/test-connector/status

This is how you can start the Apache Kafka, Zookeeper & Connect server.

Step 5: Integrating Apache Kafka to SQL Server to Start Ingesting Data

With the Apache Kafka servers now up & running on your system, you can start ingesting data from Microsoft SQL Server. To do this, you will have to create a Consumer for your Apache Kafka server that will read data from the topic Debezium created for the prospects table of Test_DB. You can use the following line of code to create a Kafka Consumer:

./bin/kafka-console-consumer --bootstrap-server localhost:9092 --topic Test_DB.dbo.prospects --from-beginning

You will now be able to see that your Kafka Consumer starts fetching data from your Kafka Topic in the JSON format.

Kafka Consumer Fetching Data.
Image Source: Medium
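For reference, each message is a Debezium change event. Abridged, and with the schema portion omitted, a create event for one of the rows inserted earlier looks roughly like this:

{
  "payload": {
    "before": null,
    "after": {
      "id": 1001,
      "first_name": "Jeff",
      "last_name": "Thomas",
      "email": "thomas.jeff@afer.com"
    },
    "source": { "db": "Test_DB", "schema": "dbo", "table": "prospects" },
    "op": "c"
  }
}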

You can also verify whether Apache Kafka is consuming your data correctly by modifying your database and then checking your Kafka Consumer output. To do this, launch Windows PowerShell on your system and create a new SQL file named “new_insert.sql” with the following lines of code:

USE Test_DB;
INSERT INTO prospects(first_name,last_name,email)
VALUES ('Helen','Keyler','keyler.helen@wr.com');

Save your file and execute it using the following command:

sqlcmd -U testuser -P password! -i {location of new_insert.sql}

You will now be able to see Apache Kafka consuming new records from your Microsoft SQL Server table as follows:

Kafka Consuming New Records.
Image Source: Medium

This is how you can connect Apache Kafka to SQL Server using the Debezium SQL Server connector to start streaming your data seamlessly.

Limitations of using the Debezium SQL Server Connector:

  • Using the Debezium SQL Server connector requires you to write custom code and make API calls, hence you must have strong technical knowledge.
  • Apache Kafka Topics are case-sensitive, so you must be careful while making API calls to connect the Debezium SQL Server connector to avoid any failures/exceptions while streaming data.
  • The Debezium SQL Server connector monitors and streams any data changes made to your source table; however, it cannot replicate structural changes automatically. Users must therefore adjust the capture table manually whenever the source table structure changes (see the sketch after this list).
  • If any component in this system fails, there is a risk of data loss or compromised transactional integrity. It’s the responsibility of the development team to ensure data integrity and minimize the risk of any potential issues.
  • Although Debezium offers multiple connectors, some of them have scalability issues. Even with Postgres, plugins such as wal2json can cause out-of-memory exceptions when converting write-ahead log output to JSON. Another problem is the inability to snapshot tables while still open for incoming events. For large tables, this means that there may be extended periods when the table is unavailable while snapshotting.
  • Certain data connectors have limitations in supporting specific data types. For instance, the Oracle connector by Debezium has a limitation in handling BLOB data types as they are still in an incubation state. Hence, using BLOB data types in Debezium-based modules meant for production is not advisable.
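To illustrate the schema-change limitation above: after an ALTER TABLE on the source, you typically create a fresh capture instance reflecting the new structure and retire the old one once the connector has caught up. A sketch, with the capture instance names chosen here for illustration:

-- Create a second capture instance after the source table structure changes
EXEC sys.sp_cdc_enable_table @source_schema = 'dbo', @source_name = 'prospects', @role_name = NULL, @capture_instance = 'dbo_prospects_v2', @supports_net_changes = 0;
GO
-- Later, once the connector has caught up, drop the old capture instance
EXEC sys.sp_cdc_disable_table @source_schema = 'dbo', @source_name = 'prospects', @capture_instance = 'dbo_prospects';
GO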

Conclusion

This article teaches you how to connect Apache Kafka to SQL Server with ease. It provides in-depth knowledge about the concepts behind every step to help you understand and implement them efficiently. The Kafka Connect SQL Server integration supports various data formats, provides connectors for seamless integration with different data sources, and exposes configuration options for fine-tuning the data ingestion parameters. These methods, however, can be challenging, especially for a beginner, & this is where Hevo saves the day.

Visit our Website to Explore Hevo

Hevo, a No-code Data Pipeline, helps you transfer data from a source of your choice in a fully-automated and secure manner without having to write any code. Hevo, with its strong integration with 150+ Data sources & BI tools, allows you to not only export & load data but also transform & enrich your data & make it analysis-ready in a jiffy.

Want to take Hevo for a spin? Sign Up for the 14-day free trial and experience the feature-rich Hevo suite first hand. You can also have a look at our unbeatable Hevo Pricing that will help you choose the right plan for your business needs!

Tell us about your experience of connecting Apache Kafka to SQL Server! Share your thoughts in the comments section below!

Divij Chawla
Former Marketing Operations and Analytics Manager, Hevo Data

Divij Chawla worked with Hevo in driving the Marketing Operations and Analytics team. He has a keen interest in data analysis, data, software architecture, and the creation of technical content related to the realm.
