Apache Kafka is an open-source streaming platform that ingests and processes real-time messages, which can then be used to build event-driven applications. Kafka servers can handle billions of real-time messages collected from Kafka producers, and Kafka consumers can then read those messages for further processing.
Users can run the default Kafka scripts from the CLI (Command Line Interface) to produce and consume messages. Installing Apache Kafka on your local machine provides several scripts for working with the cluster from the command line, including utilities for producing data to and consuming data from Kafka servers. One such utility is the Kafka console consumer, which lets you fetch data from Kafka servers.
On running the Kafka console consumer utility, your terminal window or command prompt becomes a consumer client from which you can execute commands to fetch data directly from Kafka servers.
In this article, you will learn about Kafka, Kafka consumer console, and different ways of processing real-time messages from Kafka topics.
Table of Contents
- What is Kafka?
- What is Kafka Console Consumer?
- Efficient processing using the Kafka console consumer
Prerequisites
- A fundamental understanding of data streaming.
What is Kafka?
Developed at LinkedIn in 2010 and later open-sourced, Kafka is a distributed platform that handles, stores, and streams real-time data for building event-driven applications. In other words, Kafka is a distributed set of servers that stores and processes real-time or continuous data collected from producers. The Kafka ecosystem consists of three main components: producers, servers (brokers), and consumers. Kafka producers publish or write data into Kafka servers, while Kafka consumers fetch or read messages from those servers. You can also read our in-depth article about Kafka Python.
What is Kafka Console Consumer?
The Kafka console consumer is a utility that reads or consumes real-time messages from Kafka topics hosted on Kafka servers. In other words, it is a default utility that ships with the Kafka package for reading Kafka messages from the command prompt or command-line interface. The script file called “kafka-console-consumer.sh” in the bin directory turns your command prompt into a consumer or receiving terminal. The Kafka console consumer utility not only consumes or receives messages from Kafka servers; it can also format and filter messages depending on end users’ or applications’ preferences.
Efficient processing using the Kafka console consumer
Set up the Kafka environment
- To fetch data using the Kafka console consumer, you first have to set up a Kafka environment that streams real-time messages from a Kafka producer to a consumer.
- Initially, download Apache Kafka from the official website and configure the necessary files to set up the Kafka environment. To run a Kafka cluster, you should also have Java 8+ pre-installed and running on your local machine. Ensure that you set the PATH and JAVA_HOME environment variables so that your operating system can locate the Java utilities.
- In the below steps, you will set up the Zookeeper instance and Kafka server necessary to run the Kafka environment.
- Firstly, you can start and run the Zookeeper instance. Open a new command prompt and run the following command.
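Assuming a standard Kafka download extracted locally (paths may differ on your system, and Windows users should use the equivalent .bat scripts in bin\windows), the bundled ZooKeeper can be started from the installation directory with:

```shell
# Start the ZooKeeper instance using the default config shipped with Kafka
bin/zookeeper-server-start.sh config/zookeeper.properties
```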
- After the Zookeeper instance is started, execute the following command to start the Kafka server.
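From the same installation directory, the Kafka broker can then be started with:

```shell
# Start the Kafka broker using the default server config (connects to ZooKeeper on localhost:2181)
bin/kafka-server-start.sh config/server.properties
```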
- On executing the above commands, both the ZooKeeper instance and the Kafka server start running on your local machine.
- The above-mentioned method starts your ZooKeeper instance and Kafka server from your local Apache Kafka installation. However, you can also use docker images to start the Kafka environment, including ZooKeeper, Kafka brokers, and a Schema Registry.
- For starting a Kafka application in a docker container, you can use the docker-compose.yml file script, as shown below.
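A minimal docker-compose.yml along these lines can be used. This sketch assumes the Confluent community images; the image tags and ports are illustrative and can be adjusted to your setup:

```yaml
version: '3'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.0.1
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
  kafka:
    image: confluentinc/cp-kafka:7.0.1
    depends_on:
      - zookeeper
    ports:
      - "9092:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
```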
- Save the above configuration as docker-compose.yml and run docker-compose up -d --build to launch the docker images.
- After a successful launch, two docker containers will boot up: one container runs the Kafka server, while the other runs the ZooKeeper instance.
The above-given steps are two easy ways to start the Kafka server and ZooKeeper instance.
Hevo Data is a No-code Data Pipeline that offers a fully managed solution to set up data integration from Apache Kafka and 100+ Data Sources (including 30+ Free Data Sources) and will let you directly load data to a Data Warehouse or the destination of your choice.
Hevo will automate your data flow in minutes without writing any line of code. Its fault-tolerant architecture makes sure that your data is secure and consistent. Hevo provides you with a truly efficient and fully automated solution to manage data in real-time and always have analysis-ready data.
Get started with Hevo for free
Let’s look at some of the salient features of Hevo:
Sign up here for a 14-day free trial!
- Fully Managed: It requires no management and maintenance as Hevo is a fully automated platform.
- Data Transformation: It provides a simple interface to perfect, modify, and enrich the data you want to transfer.
- Real-Time: Hevo offers real-time data migration. So, your data is always ready for analysis.
- Schema Management: Hevo can automatically detect the schema of the incoming data and map it to the destination schema.
- Scalable Infrastructure: Hevo has in-built integrations for hundreds of sources that can help you scale your data infrastructure as required.
- Live Monitoring: Advanced monitoring gives you a one-stop view to watch all the activities that occur within Data Pipelines.
- Live Support: The Hevo team is available round the clock to extend exceptional support to its customers through chat, email, and support calls.
Creating Kafka Topics
- To create Kafka topics inside Kafka servers, you have to use the script file named kafka-topics.sh that is present in the bin directory of your Kafka installation.
- Execute the following commands to create Kafka topics.
- Topic with a single partition and a replication factor of 1
./kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic kafka_test_topic
- Topic with multiple partitions and a replication factor of 1
./kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 3 --topic kafka_test_topic_withpartition
- On executing the above commands, you create two topics named kafka_test_topic and kafka_test_topic_withpartition.
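Note that the --zookeeper option used above was deprecated in Kafka 2.2 and removed in Kafka 3.0. On newer releases, the same topic can be created by pointing the script at a broker instead:

```shell
# Create the topic via a broker (Kafka 2.2+); assumes a broker listening on localhost:9092
./kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic kafka_test_topic
```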
Starting Kafka producer
- After creating Kafka topics, you have to start a Kafka producer console that publishes messages into previously created Kafka topics.
- For starting the Kafka producer console, you will use the kafka-console-producer script file.
- Execute the following command to run the producer console.
./kafka-console-producer.sh --broker-list localhost:9092 --topic kafka_test_topic
- After running the above command, the terminal will start acting like a producer console, prompting you to enter messages according to your preferences.
- In the producer console, you can enter messages as shown below.
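For example, each line you type is sent as one message; the > prompt is printed by the producer script, and the message text here is purely illustrative:

```
> Hello Kafka
> This is a test message
> Streaming from the console producer
```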
- You can also publish text files containing messages ready to be produced inside Kafka topics. Execute the following command to publish the text file into a topic named kafka_test_topic_withpartition.
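Assuming the messages are stored one per line in a text file (the filename sample_messages.txt here is hypothetical), shell input redirection can be used to pipe the file into the producer:

```shell
# Publish every line of the file as a separate message to the topic
./kafka-console-producer.sh --broker-list localhost:9092 --topic kafka_test_topic_withpartition < sample_messages.txt
```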
- On executing the above command, you will witness an output that is similar to the image below.
- Now, you have successfully created the Kafka producer console and published messages into Kafka topics.
Processing messages with Kafka console consumer
- In the below steps, you will consume or fetch messages present inside the respective Kafka topics.
- Execute the following commands to fetch all messages from the Kafka topics named kafka_test_topic and kafka_test_topic_withpartition.
./kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic kafka_test_topic --from-beginning
./kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic kafka_test_topic_withpartition --from-beginning
- The output of both the above commands will resemble the images given below.
- The command executed previously fetched all messages present in both Kafka topics named kafka_test_topic and kafka_test_topic_withpartition.
- Using the Kafka console consumer, you can also customize the fetching limit of messages from the Kafka topics. Execute the following command to limit the number of messages that are to be consumed from the respective topics.
./kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic kafka_test_topic_withpartition --from-beginning --max-messages 2
- On executing the above command, you have limited the Kafka console consumer to only fetch two messages from the Kafka topic. The output will resemble the following image.
- In the next step, you will fetch messages from a particular offset of a Kafka topic. In Kafka servers, messages received from producers are stored in an ordered sequence inside a topic partition. Each message’s position in that sequence is called its offset.
- Execute the following command to consume messages from a particular offset and partition.
./kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic kafka_test_topic --offset 2 --partition 0
- On executing the above command, you fetch messages from offset 2 in partition 0. The output will resemble the following image.
- The Kafka platform provides several techniques that ensure safety and security while transferring data from one end to another. By implementing SSL security for both consumers and producers, users can stream real-time data with strong security and privacy.
- Security details like the Keystore file (kafka_client_keystore.jks), Keystore password, Key password, Truststore file (kafka_client_truststore.jks), and Truststore password have to be stored in a properties file. Later, while executing commands for producing or consuming messages, this properties file is passed on the command line to secure the data streaming operation.
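A minimal client properties file for SSL might look like the following; the file paths and the <...> placeholders are illustrative and must be replaced with your own values:

```properties
# Client-side SSL settings for Kafka producers/consumers
security.protocol=SSL
ssl.truststore.location=/path/to/kafka_client_truststore.jks
ssl.truststore.password=<truststore-password>
ssl.keystore.location=/path/to/kafka_client_keystore.jks
ssl.keystore.password=<keystore-password>
ssl.key.password=<key-password>
```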
- For example, execute the following command to fetch messages from Kafka topics with SSL security implementation.
./kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic kafka_ssl_test_topic --from-beginning --consumer.config ~/files/ssl_detail.properties
- In the above command, the --consumer.config option is used to pass the SSL properties file to the consumer.
- In the below steps, you will create a consumer group to fetch messages from the different partitions of a Kafka topic. A consumer group is a set of consumers that work together to consume messages from specific topics. The partitions of a Kafka topic are divided among the consumers in the group, so each consumer fetches messages only from its assigned partitions.
- For defining the consumer group, execute the following command.
./kafka-console-consumer.sh --bootstrap-server localhost:9092 --consumer-property group.id=cg_name_1 --topic kafka_test_topic --from-beginning
- As shown in the above command, the consumer group can be defined by specifying the key-value pair. In the above command group.id is the key, cg_name_1 is the value, and group.id=cg_name_1 is collectively known as the key-value pair.
- For fetching messages from the Kafka topics using the consumer group, execute the following command.
./kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic kafka_test_topic --consumer-property group.id=cg_name_3 --from-beginning
- For listing all the previously defined consumer groups, execute the following command.
./kafka-consumer-groups.sh --bootstrap-server localhost:9092 --list
- Kafka allows users to describe a ‘Consumer Group’ and retrieve the metadata information. Such metadata information is internally maintained by Kafka to keep track of messages that have been previously consumed.
- Execute the following command to see all the metadata information about the particular consumer group named cg_name_1.
./kafka-consumer-groups.sh --bootstrap-server localhost:9092 --group cg_name_1 --describe
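The describe command prints one row per partition, including the group's committed offset, the partition's latest offset, and the resulting lag. The layout looks like the following; the values shown here are purely illustrative:

```
GROUP      TOPIC             PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG  CONSUMER-ID  HOST  CLIENT-ID
cg_name_1  kafka_test_topic  0          5               5               0    -            -     -
```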
These are the steps you can follow to consume and process data from Kafka servers effectively.
All image & code credits: DBMS tutorials
In this article, you learned about Kafka, the Kafka console consumer, and different ways to consume and process real-time messages from Kafka topics. Since this article focused on the data-consuming side, it covered how to fetch real-time messages from Kafka servers using different consuming techniques.
However, in businesses, extracting complex data from a diverse set of Data Sources can be a challenging task, and this is where Hevo saves the day!
Visit our website to explore Hevo
Hevo Data, with its strong integration with 100+ Sources & BI tools such as Apache Kafka, allows you to not only export data from sources & load data into destinations, but also transform & enrich your data and make it analysis-ready, so that you can focus on your key business needs and perform insightful analysis using BI tools.
Give Hevo Data a try and sign up for a 14-day free trial today. Hevo offers plans & pricing for different use cases and business needs, check them out!
Share your experience of working with Kafka Console Consumer in the comments section below.