Kafka to MySQL ETL: Easy Steps

Q: 1. How to send data from Kafka to database?

-Use Kafka Connect with a JDBC Sink Connector to send data from Kafka to databases like MySQL, PostgreSQL, etc. -Write a custom consumer application using Kafka's consumer API to read messages from Kafka and insert them into the database.

Q: 2. What is the difference between MySQL and Kafka?

MySQL is a relational database used to store structured data with SQL queries. Kafka is a distributed event streaming platform designed for real-time data ingestion, processing, and messaging.

Q: 3. Can you use Kafka as a database?

Kafka is not a traditional database but can be used as an event store for streaming data. However, it lacks the transactional consistency and query capabilities of relational databases like MySQL. It's best used for real-time data streams rather than long-term data storage.

Are you looking to perform Kafka to MySQL ETL? Streaming data from Kafka to MySQL enables real-time data transfer to your MySQL database, reflecting changes as they occur within the Kafka event stream. Do not worry, we have you covered.

In this blog, you will not only learn about the MySQL database and Kafka but also how to establish a connection from Kafka to MySQL.

Table of Contents

What is Kafka?

Apache Kafka is an open-source, distributed event streaming platform designed for handling real-time data feeds. It enables the publication, storage, and processing of data streams at scale and is widely used for building real-time streaming applications, message brokering, and distributed logging.

Key features

Kafka horizontally scales and runs as a distributed system where multiple Kafka brokers work together, thereby efficiently handling massive loads of data.
Kafka allows hundreds of thousands of reads and writes per second and has very low latency; hence, it is appropriate for application in high-throughput systems such as real-time data pipelines and event processing.
Kafka stores data persistently to disk and thereby is durable. Data is not lost in the event of a broker restart; rather, it can be processed at any later time.
Kafka is designed to scale out seamlessly by adding more brokers to a cluster without downtime and, therefore, be able to handle ever-increasing volumes of data.

What is MySQL?

MySQL is an open-source relational database management system (RDBMS) based on Structured Query Language (SQL). It is widely used for managing and storing structured data in various applications, from web-based applications to large-scale enterprise systems.

Key Features of MySQL

MySQL follows the relational model; hence, it stores data in table format, comprising rows and columns. The relationships between different tables are set by the foreign keys linking those tables.
MySQL uses SQL for defining and manipulating data. Operating the data allows insertion, updating, deletion, and retrieval of data through powerful query functionalities.
MySQL can handle small to large-scale applications while scaling up based on data and usage. It also provides partitioning, sharding, and replication for horizontal scaling.
MySQL supports ACID attributes for Atomicity, Consistency, Isolation, and Durability to guarantee secure and reliable transaction processing in mission-critical applications.

Looking for the best ETL tools to connect your MySQL account? Rest assured, Hevo’s no-code platform seamlessly integrates with Snowflake streamlining your ETL process. Try Hevo and equip your team to:

Integrate data from 150+ sources(60+ free sources).
Simplify data mapping with an intuitive, user-friendly interface.
Instantly load and sync your transformed data into MySQL.

Choose Hevo and see why Deliverr says- “The combination of Hevo and Snowflake has worked best for us. ”

Get Started with Hevo for Free

Understanding the Methods to Connect Kafka to MySQL

Here are the methods you can use to connect Kafka to MySQL in a seamless fashion:

Method 1: Using Hevo to Connect Kafka to MySQL

Hevo helps you directly transfer data from Apache Kafka and various other sources to a Database, such as MySQL, or a destination of your choice in a completely hassle-free & automated manner. Hevo is the only real-time ELT No-code Data Pipeline platform that cost-effectively automates data pipelines that are flexible to your needs. Its fault-tolerant architecture ensures that the data is handled securely and consistently with zero data loss.

Step 1: Connect hevo to Apache Kafka by providing the Pipeline Name, Bootstrap Server(s), Use SSL, and Ingest Data Form fields.

Apache Kafka to MySQL: Source Configuration

Step 2: Connect Kafka to MySQL by providing the Destination Name, Database Host, Database Port, Database Username, password, and Database Name.

Kafka to MySQL: Destination Configuration

Click here to learn more about how to set up Kafka as the Source.
Click here to learn more about how to set up MySQL as the Destination.

As can be seen, you are simply required to enter the corresponding credentials to implement this fully automated data pipeline without using any code.

Check out what makes Hevo amazing:

Data Transformation: It provides a simple interface to perfect, modify, and enrich the data you want to transfer.
Transformations: Hevo provides preload transformations through Python code. It also allows you to run transformation code for each event in the Data Pipelines you set up. Hevo also offers drag-and-drop interface like Date and Control Functions, JSON, and Event Manipulation to name a few.
Secure: Discover peace with end-to-end encryption and compliance with all major security certifications including HIPAA, GDPR, SOC-2.
Transparent Pricing – Hevo’s Transparent Pricing brings complete visibility to your ELT spend. Stay in control with spend alerts and configurable credit limits for unforeseen spikes in data flow.

Connect Kafka to MySQL

Get a Demo Try it

Connect Kafka to PostgreSQL

Get a Demo Try it

Connect Kafka to MS SQL Server

Get a Demo Try it

Method 2: Using Kafka Connector to Connect Kafka to MySQL

Follow the given steps to set up your Kafka to MySQL Connector:

Step 1: Downloading Confluence and MySQL for Java
Step 2: Copy MySQL Connector Jar and Adjust Data Source Properties
Step 3: Start Zookeeper, Kafka, and Schema Registry
Step 4: Start the Standalone Connector
Step 5: Start a Console Consumer
Step 6: Add a Row to the MySQL Table

Step 1: Downloading Confluence and MySQL for Java

First you need to download and install Confluent Open Source Platform.
Then download the MySQL connector for Java. You can download it from here.

Step 2: Copy MySQL Connector Jar and Adjust Data Source Properties

Copy the MySQL Connector Jar. Then add the jar to existing Kafka Connect JDBC Jars. [Location in Ubuntu /usr/share/java/kafka-connect-jdbc].
Then you need to adjust the Data Source Properties. Create a file, /etc/kafka-connect-jdbc/source-quickstart-mysql.properties with following content.

source-quickstart-mysql.properties
name=test-source-mysql-jdbc-autoincrement
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
tasks.max=1
connection.url=jdbc:mysql://127.0.0.1:3306/studentsDB?user=arjun&password=password
mode=incrementing
incrementing.column.name=rollno
topic.prefix=test-mysql-jdbc-

The given configuration values must be adjusted for your MySQL databaseconnection.url connection.url=jdbc:mysql://127.0.0.1:3306/<DatabaseName>?user=<username>&password=<password> username and password are the user credentials with which you login to MySQL Database.incrementing.column.name
The name of the column that is strictly incrementing in the tables of your database is used to detect new rows.

Empty values points to the column that must be autodetected by looking for an auto-incrementing column. This column could not be nullable. If a column with these properties does not exist, one of the columns can be updated with the following SQL Commands.

ALTER TABLE <table_name> MODIFY COLUMN <column_name> INT auto_increment
ALTER TABLE <table_name> ADD PRIMARY KEY (<column_name>)

Topic.prefix: This is used to prepend table names to get the name of the Kafka topic to publish data to, or in the case of a custom query, the full name of the topic to publish to.

Step 3: Start Zookeeper, Kafka, and Schema Registry

To start Zookeeper, Kafka and Schema Registry, use the following command:

$ confluent start schema-registry

Step 4: Start the Standalone Connector

The following command can be used to start standalone connector:

$ /usr/bin/connect-standalone /etc/schema-registry/connect-avro-standalone.properties/etc/kafka-connect-jdbc/source-quickstart-mysql.properties

This should start the Connector successfully.

Step 5: Start a Console Consumer

You can verify any message that is posted to the topic. Do this by starting a consumer that subscribes to a topic named test-mysql-jdbc-students. [students is the table name and test-mysql-jdbc is topic.prefix]. To start a consumer, use the given command:

/usr/bin/kafka-avro-console-consumer –topictest-mysql-jdbc-students –zookeeper localhost:2181 –from-beginning

You must replace test-mysql-jdbc-students with the name of your configuration and tables in the MySQL Database.

Step 6: Add a Row to the MySQL Table

The next step is to add a row to MySQL Table students and check if the Console Consumer receives this message.

mysql> use studentsDB;
mysql> INSERT INTO students (name, marks) VALUES ('Sastri',88);

The consumer receives the message.
You will get the following output:

root@tutorialkart:~# /usr/bin/kafka-avro-console-consumer --topic test-mysql-jdbc-students --zookeeper localhost:2181 --from-beginning
Using the ConsoleConsumer with old consumer is deprecated and will be removed in a future major release. Consider using the new consumer by passing [bootstrap-server] instead of [zookeeper].
{"name":{"string":"John"},"rollno":1,"marks":{"int":84}}
{"name":{"string":"Arjun"},"rollno":2,"marks":{"int":84}}
{"name":{"string":"Prasanth"},"rollno":3,"marks":{"int":77}}
{"name":{"string":"Adarsh"},"rollno":4,"marks":{"int":78}}
{"name":{"string":"Raja"},"rollno":5,"marks":{"int":94}}
{"name":{"string":"Sai"},"rollno":6,"marks":{"int":84}}
{"name":{"string":"Ross"},"rollno":7,"marks":{"int":54}}
{"name":{"string":"Monica Gellar"},"rollno":8,"marks":{"int":86}}
{"name":{"string":"Lee"},"rollno":9,"marks":{"int":98}}
{"name":{"string":"Bruce Wane"},"rollno":10,"marks":{"int":92}}
{"name":{"string":"Jack"},"rollno":11,"marks":{"int":82}}
{"name":{"string":"Priya"},"rollno":12,"marks":{"int":88}}
{"name":{"string":"Amy"},"rollno":13,"marks":{"int":84}}
{"name":{"string":"Sastri"},"rollno":14,"marks":{"int":88}}

Limitations

You should ensure that the connector can access your service based on the network limitations of the service environment.
It is crucial to ensure that the database and Kafka cluster are situated in the same region.
To delete tombstone records, make sure you set delete.enabled to true.

Before wrapping up, let’s cover some basics.

Here are some other migrations that you may need:

Conclusion

In this blog post, you have learned how to perform MySQL to Kafka ETL using the Kafka connector. This method involves a number of technical steps and may not be suitable for all users. This is where Hevo comes to the rescue!

Visit our Website to Explore Hevo

If you use a lot of cloud-based services for running your business, a cloud-based ETL tool like Hevo can help you get the most out of your data by integrating various sources. Hevo supports many cloud-based software services and on-premise databases. Hevo can help you integrate them in a few clicks without using any code at all.

Frequently Asked Questions

1. How to send data from Kafka to database?

-Use Kafka Connect with a JDBC Sink Connector to send data from Kafka to databases like MySQL, PostgreSQL, etc.
-Write a custom consumer application using Kafka’s consumer API to read messages from Kafka and insert them into the database.

2. What is the difference between MySQL and Kafka?

MySQL is a relational database used to store structured data with SQL queries.
Kafka is a distributed event streaming platform designed for real-time data ingestion, processing, and messaging.

3. Can you use Kafka as a database?

Kafka is not a traditional database but can be used as an event store for streaming data. However, it lacks the transactional consistency and query capabilities of relational databases like MySQL. It’s best used for real-time data streams rather than long-term data storage.

Shruti Garg Technical Content Writer, Hevo Data

Shruti brings a wealth of experience to the data industry, specializing in solving critical business challenges for data teams. With a keen analytical perspective and a strong problem-solving approach, she delivers meticulously researched content that is indispensable for data practitioners. Her work is instrumental in driving innovation and operational efficiency within the data-driven landscape, making her a valuable asset in today's competitive market.

What is Kafka?

Key features

What is MySQL?

Key Features of MySQL

Understanding the Methods to Connect Kafka to MySQL

Method 1: Using Hevo to Connect Kafka to MySQL

Method 2: Using Kafka Connector to Connect Kafka to MySQL

Step 1: Downloading Confluence and MySQL for Java

Step 2: Copy MySQL Connector Jar and Adjust Data Source Properties

Step 3: Start Zookeeper, Kafka, and Schema Registry

Step 4: Start the Standalone Connector

Step 5: Start a Console Consumer

Step 6: Add a Row to the MySQL Table

Limitations

Here are some other migrations that you may need:

Conclusion

Frequently Asked Questions

1. How to send data from Kafka to database?

2. What is the difference between MySQL and Kafka?

3. Can you use Kafka as a database?

Related Articles

Optimize your data integration with Hevo!

Related articles