Are you looking to perform Kafka to MySQL ETL? Streaming data from Kafka to MySQL enables real-time data transfer to your MySQL database, reflecting  changes as they occur within the Kafka event stream. Do not worry, we have you covered.

In this blog, you will not only learn about the MySQL database and Kafka but also how to establish a connection from Kafka to MySQL.

Understanding the Methods to Connect Kafka to MySQL

Here are the methods you can use to connect Kafka to MySQL in a seamless fashion:

Method 1: Using Hevo to Connect Kafka to MySQL

Hevo Logo
Image Source

Hevo helps you directly transfer data from Apache Kafka and various other sources to a Database, such as MySQL, or a destination of your choice in a completely hassle-free & automated manner. Hevo is the only real-time ELT No-code Data Pipeline platform that cost-effectively automates data pipelines that are flexible to your needs. Its fault-tolerant architecture ensures that the data is handled securely and consistently with zero data loss.

Sign up here for a 14-Day Free Trial!
  • Step 1: Connect hevo to Apache Kafka by providing the Pipeline Name, Bootstrap Server(s), Use SSL, and Ingest Data Form fields.
Apache Kafka to MySQL: Source Configuration
Image Source
  • Step 2: Connect Kafka to MySQL by providing the Destination Name, Database Host, Database Port, Database Username, password, and Database Name.
Kafka to MySQL: Destination Configuration
Image Source
  • Click here to learn more about how to set up Kafka as the Source.
  • Click here to learn more about how to set up MySQL as the Destination.

As can be seen, you are simply required to enter the corresponding credentials to implement this fully automated data pipeline without using any code.

Check out what makes Hevo amazing:

  • Data Transformation: It provides a simple interface to perfect, modify, and enrich the data you want to transfer. 
  • Transformations: Hevo provides preload transformations through Python code. It also allows you to run transformation code for each event in the Data Pipelines you set up. Hevo also offers drag-and-drop interface like Date and Control Functions, JSON, and Event Manipulation to name a few.
  • Secure: Discover peace with end-to-end encryption and compliance with all major security certifications including HIPAA, GDPR, SOC-2.
  • Transparent Pricing – Hevo’s Transparent Pricing brings complete visibility to your ELT spend. Stay in control with spend alerts and configurable credit limits for unforeseen spikes in data flow.
Get started for Free with Hevo!

Method 2: Using Kafka Connector to Connect Kafka to MySQL

Follow the given steps to set up your Kafka to MySQL Connector:

Step 1: Downloading Confluence and MySQL for Java

  • First you need to download and install Confluent Open Source Platform.
  • Then download the MySQL connector for Java. You can download it from here.

Step 2: Copy MySQL Connector Jar and Adjust Data Source Properties

  • Copy the MySQL Connector Jar. Then add the jar to existing Kafka Connect JDBC Jars. [Location in Ubuntu /usr/share/java/kafka-connect-jdbc].
  • Then you need to adjust the Data Source Properties. Create a file, /etc/kafka-connect-jdbc/source-quickstart-mysql.properties with following content.
source-quickstart-mysql.properties
name=test-source-mysql-jdbc-autoincrement
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
tasks.max=1
connection.url=jdbc:mysql://127.0.0.1:3306/studentsDB?user=arjun&password=password
mode=incrementing
incrementing.column.name=rollno
topic.prefix=test-mysql-jdbc-

The given configuration values must be adjusted for your MySQL databaseconnection.url connection.url=jdbc:mysql://127.0.0.1:3306/<DatabaseName>?user=<username>&password=<password> username and password are the user credentials with which you login to MySQL Database.incrementing.column.name
The name of the column that is strictly incrementing in the tables of your database is used to detect new rows.

Empty values points to the column that must be autodetected by looking for an auto-incrementing column. This column could not be nullable. If a column with these properties does not exist, one of the columns can be updated with the following SQL Commands.

ALTER TABLE <table_name> MODIFY COLUMN <column_name> INT auto_increment
ALTER TABLE <table_name> ADD PRIMARY KEY (<column_name>)

Topic.prefix: This is used to prepend table names to get the name of the Kafka topic to publish data to, or in the case of a custom query, the full name of the topic to publish to. 

Step 3: Start Zookeeper, Kafka, and Schema Registry

  • To start Zookeeper, Kafka and Schema Registry, use the following command:
$ confluent start schema-registry

Step 4: Start the Standalone Connector

  • The following command can be used to start standalone connector:
$ /usr/bin/connect-standalone /etc/schema-registry/connect-avro-standalone.properties/etc/kafka-connect-jdbc/source-quickstart-mysql.properties

This should start the Connector successfully.

Step 5: Start a Console Consumer

  • You can verify any message that is posted to the topic. Do this by starting a consumer that subscribes to a topic named test-mysql-jdbc-students. [students is the table name and test-mysql-jdbc  is topic.prefix]. To start a consumer, use the given command:
/usr/bin/kafka-avro-console-consumer –topictest-mysql-jdbc-students –zookeeper localhost:2181 –from-beginning

You must replace test-mysql-jdbc-students with the name of your configuration and tables in the MySQL Database.

Step 6: Add a Row to the MySQL Table

  • The next step is to add a row to MySQL Table students and check if the Console Consumer receives this message.
mysql> use studentsDB;
mysql> INSERT INTO students (name, marks) VALUES ('Sastri',88);

The consumer receives the message.
You will get the following output:

root@tutorialkart:~# /usr/bin/kafka-avro-console-consumer --topic test-mysql-jdbc-students --zookeeper localhost:2181 --from-beginning
Using the ConsoleConsumer with old consumer is deprecated and will be removed in a future major release. Consider using the new consumer by passing [bootstrap-server] instead of [zookeeper].
{"name":{"string":"John"},"rollno":1,"marks":{"int":84}}
{"name":{"string":"Arjun"},"rollno":2,"marks":{"int":84}}
{"name":{"string":"Prasanth"},"rollno":3,"marks":{"int":77}}
{"name":{"string":"Adarsh"},"rollno":4,"marks":{"int":78}}
{"name":{"string":"Raja"},"rollno":5,"marks":{"int":94}}
{"name":{"string":"Sai"},"rollno":6,"marks":{"int":84}}
{"name":{"string":"Ross"},"rollno":7,"marks":{"int":54}}
{"name":{"string":"Monica Gellar"},"rollno":8,"marks":{"int":86}}
{"name":{"string":"Lee"},"rollno":9,"marks":{"int":98}}
{"name":{"string":"Bruce Wane"},"rollno":10,"marks":{"int":92}}
{"name":{"string":"Jack"},"rollno":11,"marks":{"int":82}}
{"name":{"string":"Priya"},"rollno":12,"marks":{"int":88}}
{"name":{"string":"Amy"},"rollno":13,"marks":{"int":84}}
{"name":{"string":"Sastri"},"rollno":14,"marks":{"int":88}}

Limitations

  • You should ensure that the connector can access your service based on the network limitations of the service environment. 
  • It is crucial to ensure that the database and Kafka cluster are situated in the same region.
  • To delete tombstone records, make sure you set delete.enabled to true.

Before wrapping up, let’s cover some basics.

Introduction to MySQL

MySQL stores structured data in the form of a relational database. Structured data refers to the data that is in the form of rows and columns (has a defined structure). Columns may be referred to as fields while rows are considered instances of a specific record. MySQL is considered an RDBMS because the data stored in tables can be related to each other through the concept of primary and foreign keys. Data is easily retrieved or queried from the database using SQL syntax.

Migrating data from Kafka to MySQL Database allows users to benefit from the additional power and capabilities of MySQL in gaining insights from said data. MySQL provides greater capacity in bulk editing, the volume of data, as well as ease in creating custom data sets that answer specific questions using SQL.

Introduction to Apache Kafka

Apache Kafka is a distributed streaming platform which is used to develop streaming data pipelines. It is also used to build real-time applications that rely on a constant flow of data from a source.

Written in Scala, Kafka supports bringing in data from a large variety of sources and stores them in the form of “topics” by processing the information stream. It uses two functions, namely Producers, which act as an interface between the data source and Kafka Topics, and Consumers, which allow users to read and transfer the data stored in Kafka.

Key features of Kafka:

  • Scalability: Kafka has exceptional scalability and can be scaled easily without downtime.
  • Data Transformation: Kafka offers KStream and KSQL (in case of Confluent Kafka) for on the fly data transformation.
  • Fault-Tolerant: Kafka uses brokers to replicate data and persists the data to make it a fault-tolerant system.
  • Security: Kafka can be combined with various security measures like Kerberos to stream data securely.
  • Performance: Kafka is distributed, partitioned, and has very high throughput for publishing and subscribing to the messages.

In this case, the data source is MySQL and changes to records in the database will be streamed as events into Apache Kafka. This will be done using the Kafka connector which is an interface between Apache Kafka and other systems in the data pipeline setup and can be used to move data from Kafka to MySQL.

Conclusion

In this blog post, you have learned how to perform MySQL to Kafka ETL using the Kafka connector. This method involves a number of technical steps and may not be suitable for all users. This is where Hevo comes to the rescue!

Visit our Website to Explore Hevo

If you use a lot of cloud-based services for running your business, a cloud-based ETL tool like Hevo can help you get the most out of your data by integrating various sources. Hevo supports many cloud-based software services and on-premise databases. Hevo can help you integrate them in a few clicks without using any code at all.

Sign Up for a free trial to try Hevo for free!

Have any further queries about moving data from Kafka to MySQL? Let us know in the comments section below!

Shruti Garg
Freelance Technical Content Writer, Hevo Data

With an analytical perspective and a problem-solving approach, Shruti has extensive experience in solving business problems of data teams by delivering thoroughly researched content pivotal for data practitioners.

No-Code Data Pipeline for MySQL