Are you looking to perform Kafka to MySQL ETL? Do not worry, we have you covered. In this blog, you will not only learn about the MySQL database and Kafka but also how to establish a connection from Kafka to MySQL.

Introduction to MySQL

MySQL stores structured data in the form of a relational database. Structured data refers to the data that is in the form of rows and columns (has a defined structure). Columns may be referred to as fields while rows are considered instances of a specific record. MySQL is considered an RDBMS because the data stored in tables can be related to each other through the concept of primary and foreign keys. Data is easily retrieved or queried from the database using SQL syntax.

Migrating data from Kafka to MySQL Database allows users to benefit from the additional power and capabilities of MySQL in gaining insights from said data. MySQL provides greater capacity in bulk editing, the volume of data, as well as ease in creating custom data sets that answer specific questions using SQL.

Solve your data replication problems with Hevo’s reliable, no-code, automated pipelines with 150+ connectors.
Get your free trial right away!

Introduction to Apache Kafka

Apache Kafka is a distributed streaming platform which is used to develop streaming data pipelines. It is also used to build real-time applications that rely on a constant flow of data from a source.

Written in Scala, Kafka supports bringing in data from a large variety of sources and stores them in the form of “topics” by processing the information stream. It uses two functions, namely Producers, which act as an interface between the data source and Kafka Topics, and Consumers, which allow users to read and transfer the data stored in Kafka.

Key features of Kafka:

  • Scalability: Kafka has exceptional scalability and can be scaled easily without downtime.
  • Data Transformation: Kafka offers KStream and KSQL (in case of Confluent Kafka) for on the fly data transformation.
  • Fault-Tolerant: Kafka uses brokers to replicate data and persists the data to make it a fault-tolerant system.
  • Security: Kafka can be combined with various security measures like Kerberos to stream data securely.
  • Performance: Kafka is distributed, partitioned, and has very high throughput for publishing and subscribing to the messages.

In this case, the data source is MySQL and changes to records in the database will be streamed as events into Apache Kafka. This will be done using the Kafka connector which is an interface between Apache Kafka and other systems in the data pipeline setup and can be used to move data from Kafka to MySQL.

Understanding the Methods to Connect Kafka to MySQL

Here are the methods you can use to connect Kafka to MySQL in a seamless fashion:

Method 1: Using Hevo to Connect Kafka to MySQL

Hevo Logo
Image Source

Hevo helps you directly transfer data from Apache Kafka and various other sources to a Database, such as MySQL, or a destination of your choice in a completely hassle-free & automated manner. Hevo is fully managed and completely automates the process of not only loading data from your desired source but also enriching the data and transforming it into an analysis-ready form without having to write a single line of code. Its fault-tolerant architecture ensures that the data is handled securely and consistently with zero data loss.

Sign up here for a 14-Day Free Trial!

Hevo takes care of all your data preprocessing to set up Kafka MySQL migration and lets you focus on key business activities and draw a much more powerful insight on how to generate more leads, retain customers, and take your business to new heights of profitability. It provides a consistent & reliable solution to manage data in real-time and always have analysis-ready data in your desired destination. 

  • Step 1: Connect hevo to Apache Kafka by providing the Pipeline Name, Bootstrap Server(s), Use SSL, and Ingest Data Form fields.
Apache Kafka to MySQL: Source Configuration
Image Source
  • Step 2: Connect Kafka to MySQL by providing the Destination Name, Database Host, Database Port, Database Username, password, and Database Name.
Kafka to MySQL: Destination Configuration
Image Source

As can be seen, you are simply required to enter the corresponding credentials to implement this fully automated data pipeline without using any code.

Check out what makes Hevo amazing:

  • Near Real-Time Replication: Get access to near real-time replication for all database sources with log-based replication. For SaaS applications, near real-time replication is subject to API limits.
  • Data Transformation: It provides a simple interface to perfect, modify, and enrich the data you want to transfer. 
  • Transformations: Hevo provides preload transformations through Python code. It also allows you to run transformation code for each event in the Data Pipelines you set up. You need to edit the event object’s properties received in the transform method as a parameter to carry out the transformation. Hevo also offers drag-and-drop interface like Date and Control Functions, JSON, and Event Manipulation to name a few. These can be configured and tested before putting them to use.
  • Connectors: Instantly connect and read data from 150+ sources including SaaS apps and databases, and precisely control pipeline schedules down to the minute.  
  • Secure: Discover peace with end-to-end encryption and compliance with all major security certifications including HIPAA, GDPR, SOC-2.
  • Tremendous Connector Availability: Hevo houses a large variety of connectors and lets you bring in data from numerous Marketing & SaaS applications, databases, etc. such as Google Analytics 4, Google Firebase, Airflow, HubSpot, Marketo, MongoDB, Oracle, Salesforce, Redshift, etc. in an integrated and analysis-ready form.
  • Simplicity: Using Hevo is easy and intuitive, ensuring that your data is exported in just a few clicks. 
  • Completely Managed Platform: Hevo is fully managed. You need not invest time and effort to maintain or monitor the infrastructure involved in executing codes.
  • Live Support: The Hevo team is available round the clock to extend exceptional support to its customers through chat, email, and support calls.
  • Transparent Pricing – Say goodbye to complex and hidden pricing models. Hevo’s Transparent Pricing brings complete visibility to your ELT spend. Choose a plan based on your business needs. Stay in control with spend alerts and configurable credit limits for unforeseen spikes in data flow.
Get started for Free with Hevo!

Method 2: Using Kafka Connector to Connect Kafka to MySQL

Follow the given steps to set up your Kafka to MySQL Connector:

Step 1: Downloading Confluence and MySQL for Java

  • First you need to download and install Confluent Open Source Platform.
  • Then download the MySQL connector for Java. You can download it from here.

Step 2: Copy MySQL Connector Jar and Adjust Data Source Properties

  • Copy the MySQL Connector Jar. Then add the jar to existing Kafka Connect JDBC Jars. [Location in Ubuntu /usr/share/java/kafka-connect-jdbc].
  • Then you need to adjust the Data Source Properties. Create a file, /etc/kafka-connect-jdbc/source-quickstart-mysql.properties with following content.
source-quickstart-mysql.properties
name=test-source-mysql-jdbc-autoincrement
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
tasks.max=1
connection.url=jdbc:mysql://127.0.0.1:3306/studentsDB?user=arjun&password=password
mode=incrementing
incrementing.column.name=rollno
topic.prefix=test-mysql-jdbc-

The given configuration values must be adjusted for your MySQL databaseconnection.url connection.url=jdbc:mysql://127.0.0.1:3306/<DatabaseName>?user=<username>&password=<password> username and password are the user credentials with which you login to MySQL Database.incrementing.column.name
The name of the column that is strictly incrementing in the tables of your database is used to detect new rows.

Empty values points to the column that must be autodetected by looking for an auto-incrementing column. This column could not be nullable. If a column with these properties does not exist, one of the columns can be updated with the following SQL Commands.

ALTER TABLE <table_name> MODIFY COLUMN <column_name> INT auto_increment
ALTER TABLE <table_name> ADD PRIMARY KEY (<column_name>)

Topic.prefix: This is used to prepend table names to get the name of the Kafka topic to publish data to, or in the case of a custom query, the full name of the topic to publish to. 

Step 3: Start Zookeeper, Kafka, and Schema Registry

  • To start Zookeeper, Kafka and Schema Registry, use the following command:
$ confluent start schema-registry

Step 4: Start the Standalone Connector

  • The following command can be used to start standalone connector:
$ /usr/bin/connect-standalone /etc/schema-registry/connect-avro-standalone.properties/etc/kafka-connect-jdbc/source-quickstart-mysql.properties

This should start the Connector successfully.

Step 5: Start a Console Consumer

  • You can verify any message that is posted to the topic. Do this by starting a consumer that subscribes to a topic named test-mysql-jdbc-students. [students is the table name and test-mysql-jdbc  is topic.prefix]. To start a consumer, use the given command:
/usr/bin/kafka-avro-console-consumer –topictest-mysql-jdbc-students –zookeeper localhost:2181 –from-beginning

You must replace test-mysql-jdbc-students with the name of your configuration and tables in the MySQL Database.

Step 6: Add a Row to the MySQL Table

  • The next step is to add a row to MySQL Table students and check if the Console Consumer receives this message.
mysql> use studentsDB;
mysql> INSERT INTO students (name, marks) VALUES ('Sastri',88);

The consumer receives the message.
You will get the following output:

root@tutorialkart:~# /usr/bin/kafka-avro-console-consumer --topic test-mysql-jdbc-students --zookeeper localhost:2181 --from-beginning
Using the ConsoleConsumer with old consumer is deprecated and will be removed in a future major release. Consider using the new consumer by passing [bootstrap-server] instead of [zookeeper].
{"name":{"string":"John"},"rollno":1,"marks":{"int":84}}
{"name":{"string":"Arjun"},"rollno":2,"marks":{"int":84}}
{"name":{"string":"Prasanth"},"rollno":3,"marks":{"int":77}}
{"name":{"string":"Adarsh"},"rollno":4,"marks":{"int":78}}
{"name":{"string":"Raja"},"rollno":5,"marks":{"int":94}}
{"name":{"string":"Sai"},"rollno":6,"marks":{"int":84}}
{"name":{"string":"Ross"},"rollno":7,"marks":{"int":54}}
{"name":{"string":"Monica Gellar"},"rollno":8,"marks":{"int":86}}
{"name":{"string":"Lee"},"rollno":9,"marks":{"int":98}}
{"name":{"string":"Bruce Wane"},"rollno":10,"marks":{"int":92}}
{"name":{"string":"Jack"},"rollno":11,"marks":{"int":82}}
{"name":{"string":"Priya"},"rollno":12,"marks":{"int":88}}
{"name":{"string":"Amy"},"rollno":13,"marks":{"int":84}}
{"name":{"string":"Sastri"},"rollno":14,"marks":{"int":88}}

Conclusion

In this blog post, you have learned how to perform MySQL to Kafka ETL using the Kafka connector. This method involves a number of technical steps and may not be suitable for all users. This is where Hevo comes to the rescue!

Visit our Website to Explore Hevo

If you use a lot of cloud-based services for running your business, a cloud-based ETL tool like Hevo can help you get the most out of your data by integrating various sources. Hevo supports many cloud-based software services and on-premise databases. Hevo can help you integrate them in a few clicks without using any code at all.

Sign Up for a free trial to try Hevo for free!

Have any further queries about moving data from Kafka to MySQL? Let us know in the comments section below!

No-Code Data Pipeline for MySQL