The Kafka CLI is a set of command-line tools bundled with Apache Kafka that gives you terminal access for managing your Kafka resources. You can use the Kafka CLI to type in text commands that perform specific tasks within your Kafka environment, making it one of the fastest and most lightweight ways to interact with a Kafka cluster.

This post illustrates how to perform crucial Kafka tasks using the Kafka CLI commands. However, before diving in, let’s first understand what Kafka is and how it is used in the industry.

What is Kafka?


Kafka is a popular open-source Publish/Subscribe messaging system that is used for building streaming analytics platforms and data integration pipelines. Kafka is both a queue for parallelizing tasks and a messaging-oriented middleware for service integration. The Kafka message broker (cluster) ingests and stores streams of messages (records) from event producers, which are later distributed to consumer services asynchronously when requested. 

Producers publish events to the Kafka instance or instances, since a Kafka cluster can be either single or multi-node. Kafka stores the messages in a topic, where they are organized as an immutable, append-only sequence (a commit log) ordered by the time they were created. Downstream applications can then subscribe to the topics that they are interested in.

Revolutionize Kafka ETL Using Hevo’s No-code Data Pipeline

Hevo Data, a No-code Data Pipeline, is your one-stop-shop solution for all your Apache Kafka ETL needs! Hevo offers a built-in and robust native integration with Apache Kafka or Kafka on Confluent Cloud to help you replicate data in a matter of minutes! Check out what makes Hevo amazing:

  • Live Support: The Hevo team is available round the clock to extend exceptional customer support through chat, email, and support calls.
  • Schema Management: Hevo takes away the tedious task of schema management by automatically detecting the schema of incoming data and mapping it to the destination schema.

Understanding Core Kafka Concepts

Figure: Kafka architecture
  • Topic: A named resource to which a particular stream of messages is published and in which those messages are stored.
  • Producer: A client application that creates and publishes records/messages to a Kafka topic(s).
  • Consumer: A client application that subscribes to a topic(s) to receive data from it.
  • Message: The combination of data and associated metadata that a producer application writes to a topic and that consumers eventually read.

Common Kafka Use Cases

Event sourcing and stream processing architecture is a data strategy that is gaining popularity. Currently, over 60% of the Fortune 100 use Kafka in their tech stack. Some of these organizations include Cisco, Goldman Sachs, Target, and Intuit. There are 1000+ Kafka use cases but let’s just highlight the most common ones:

  • Ingestion of User Interaction and Server Events: To make use of user interaction events from end-user apps or server events from your system, you can forward the events to Kafka, process them, and then deliver them to analytical databases. Kafka can ingest events from many clients simultaneously.
  • Data Replication: Kafka can be used to stream incremental changes in databases and forward those to a destination such as a data lake or data warehouse for data replication and analysis.
  • ESB (Enterprise Service Bus): Companies are using Kafka as an ESB and this is helping them to transition from a monolithic to a microservices architecture. Data is made available to various applications and services across the entire organization in real-time.
  • Fraud Detection: The ability to collect, process and distribute financial events and database updates in real-time is enabling teams to do real-time threat/fraud detection. 
  • Data Streaming from Applications and IoT Devices: Applications can publish a stream of real-time user interaction events. On the other hand, IoT sensors can stream data to Kafka for use in other downstream systems.
  • Load Balancing and Failover: A service can be deployed on servers that span multiple data centers but subscribe to a common topic (data stream). If an instance goes down in one data center, the others can automatically take over.

Working with Kafka CLI Commands

Before you begin working with Kafka CLI, make sure you meet the following requirements:

  • Install Java 8+ on your workstation.
  • Download and extract Kafka on your workstation.

In this section, you will learn how to perform the following tasks using Kafka CLI commands:

1. Spin Up a Kafka Environment

  • Open your terminal and navigate to the directory where you extracted the Kafka zip or tar file:
$ cd kafka_2.13-3.1.0
  • Run the following command to launch the ZooKeeper service:
$ bin/zookeeper-server-start.sh config/zookeeper.properties
  • In another terminal, run the following command to start the Kafka broker service:
$ bin/kafka-server-start.sh config/server.properties

After successfully executing these commands, you will have an active Kafka environment. Let’s now look at some basic commands that you can execute using the Kafka CLI.
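As an optional sanity check (not part of the original steps), you can confirm that the broker is reachable by querying it with the kafka-broker-api-versions.sh script that ships in the same bin/ directory; it simply prints the API versions supported by the broker listening on localhost:9092:

$ bin/kafka-broker-api-versions.sh --bootstrap-server localhost:9092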

2. Create a Topic

A topic in Kafka is akin to a directory or folder on your computer. The only difference is that a topic stores events (records/messages) rather than files and these events are normally distributed across multiple nodes/brokers.

Events can be application logs, web clickstreams, data emitted by IoT sensors, and much more. So before creating events, the first step is to create a topic or topics that will store and organize these records.

To create a Kafka topic using the Kafka CLI, you will use the bin/kafka-topics.sh shell script that’s bundled in the downloaded Kafka distribution. Launch another terminal session and execute the following command:

$ bin/kafka-topics.sh --create --topic my-topic --bootstrap-server localhost:9092 --partitions 3 --replication-factor 1

You have successfully created a topic called my-topic.
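As a variation, you can also set per-topic configuration overrides at creation time with the --config flag. The sketch below assumes a hypothetical topic called my-logs and sets its retention to 7 days (604800000 ms); adjust the name and values to your own needs:

$ bin/kafka-topics.sh --create --topic my-logs --bootstrap-server localhost:9092 --partitions 3 --replication-factor 1 --config retention.ms=604800000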

3. View Details About a Topic

Kafka exposes a command that you can take advantage of to view metadata about the topics in your Kafka cluster. The following Kafka CLI command provides information on the number of partitions and replicas, among other topic details:

$ bin/kafka-topics.sh --bootstrap-server=localhost:9092 --describe --topic my-topic

In Kafka versions older than 3.0, you could instead pass the ZooKeeper address to get the same result:

$ bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic my-topic

Note that the --zookeeper option was removed from kafka-topics.sh in Kafka 3.0, so with the 3.1.0 distribution used in this post you must stick with --bootstrap-server.

4. List Topics

You can use the bin/kafka-topics.sh shell script along with the Kafka cluster URL and the --list option to display a list of all the topics in the Kafka cluster.

$ bin/kafka-topics.sh --bootstrap-server=localhost:9092 --list

On Kafka versions older than 3.0, you can instead pass the ZooKeeper service URL:

$ bin/kafka-topics.sh --list --zookeeper localhost:2181
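On clusters with many topics, you can combine the list command with ordinary shell tools to filter the output. For example, assuming your topic names share the my- prefix, the following shows only the matching topics:

$ bin/kafka-topics.sh --bootstrap-server=localhost:9092 --list | grep "^my-"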

5. Publish Events to a Topic

Kafka clients normally communicate via the Kafka broker to read and write events. The broker is responsible for persisting these events in a distributed, partitioned, commit log (topic).

You can use Kafka CLI’s console producer client to emit new events to your Kafka topic. To do this, run the following command:

$ bin/kafka-console-producer.sh --topic my-topic --bootstrap-server localhost:9092

After running this command, a prompt will open. Type your messages and press Enter to publish them to the Kafka topic. Each time you press Enter, a new message is submitted.

My first event
My second event
My third event

You have now published 3 events to the my-topic topic. Press Ctrl+C to stop the producer client.
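The console producer can also send keyed events, which is handy when you want related messages to land on the same partition. The sketch below is an optional variation: it assumes a colon as the key separator, so each line you type is interpreted as key:value (for example, user42:My keyed event):

$ bin/kafka-console-producer.sh --topic my-topic --bootstrap-server localhost:9092 --property parse.key=true --property key.separator=: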

6. View Events

To view the events stored in the Kafka topic, you will use Kafka CLI’s consumer client. Open a new terminal session and run the command shown below:

$ bin/kafka-console-consumer.sh --topic my-topic --from-beginning --bootstrap-server localhost:9092
My first event
My second event
My third event

You can use the Kafka CLI’s producer in a new terminal to publish more events using the command from the previous step; they will appear in the consumer client’s terminal session almost instantly.
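If you want the consumer to print message keys as well, or to read as part of a named consumer group so that offsets are tracked, you can pass a few extra options. The group name my-group below is just an example:

$ bin/kafka-console-consumer.sh --topic my-topic --from-beginning --bootstrap-server localhost:9092 --group my-group --property print.key=true --property key.separator=: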

7. Change Message Retention Period

When the producer client sends an event to the Kafka broker, that event is appended to the end of one of the topic’s partition logs. By default, the event is retained for 168 hours (7 days), after which it’s deleted to free up disk space.

Based on your application’s requirements, you might want to override this behavior. In Kafka 3.x, per-topic configuration changes are made with the kafka-configs.sh script (the older kafka-topics.sh --zookeeper --alter --config form was removed in Kafka 3.0). For example, consider the following command:

$ bin/kafka-configs.sh --bootstrap-server localhost:9092 --entity-type topics --entity-name my-topic --alter --add-config retention.ms=300000

This sets a retention period of 5 minutes (300,000 ms) for all messages appended to the topic “my-topic“.
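To confirm that the override took effect, you can describe the topic’s configuration with the same script; the retention.ms override you just set should appear in the output:

$ bin/kafka-configs.sh --bootstrap-server localhost:9092 --entity-type topics --entity-name my-topic --describe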

8. Add Partitions to a Topic

In a multi-node cluster, you might consider splitting your topics into multiple partitions to achieve higher throughput. This is because if you simply constrain a topic to a single node, it limits the ability to scale out. Instead of limiting yourself to a single node, you should take advantage of the extra CPU and RAM by distributing your topics across multiple machines. To add partitions to your Kafka topic, use the following command:

$ bin/kafka-topics.sh --bootstrap-server localhost:9092 --alter --topic my-topic --partitions 10

You can increase the number of partitions at any time, but you can never decrease it, and adding partitions changes how keyed messages map to partitions. Choose a partition count that matches your expected throughput rather than adding partitions arbitrarily.
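After altering the topic, you can re-run the describe command from Section 3 to confirm that the partition count is now 10:

$ bin/kafka-topics.sh --bootstrap-server=localhost:9092 --describe --topic my-topic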

9. Delete a Topic

You can delete a topic from the Kafka broker using the following Kafka CLI command:

$ bin/kafka-topics.sh --bootstrap-server localhost:9092 --delete --topic my-topic

Alternatively, if you only want to revert the custom retention policy that you set earlier rather than delete the entire topic, you can remove the configuration override with kafka-configs.sh:

$ bin/kafka-configs.sh --bootstrap-server localhost:9092 --entity-type topics --entity-name my-topic --alter --delete-config retention.ms
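Whether you removed only the configuration override or the whole topic, you can confirm the current state of the cluster by listing the topics again; once the deletion completes, my-topic will no longer appear:

$ bin/kafka-topics.sh --bootstrap-server=localhost:9092 --list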

Conclusion

In this article, you learned how to set up a simple single-node Kafka cluster. Along the way, you also learned how to use various Kafka CLI commands to produce and consume messages among other tasks. You can use the commands showcased on this page as a reference when developing and managing your own Kafka applications.

However, as a Developer, streaming complex data from a diverse set of data sources like Databases, CRMs, Project Management Tools, Streaming Services, and Marketing Platforms using Kafka can be quite challenging. This is where a simpler alternative like Hevo can save your day!

Hevo Data is a No-Code Data Pipeline that offers a faster way to move data from 150+ Data Sources such as Kafka (including 60+ Free Sources) into your Data Warehouse to be visualized in a BI tool. Hevo is fully automated and hence does not require you to code. Try a 14-day free trial and experience the feature-rich Hevo suite firsthand. Also, check out our unbeatable pricing to choose the best plan for your organization.

Frequently Asked Questions

1. How can you check Kafka broker logs using CLI?

The Kafka CLI does not have a command for this. You can view broker logs directly on the server running the broker by opening the Kafka log directory and inspecting its log files (for example, /var/log/kafka/server.log, or the logs/ directory inside your Kafka installation).

2. How can I check consumer group offsets using CLI?

The kafka-consumer-groups.sh --bootstrap-server <server> --describe --group <group_name> command displays the current offsets and lag of a consumer group.
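For example, assuming a consumer group named my-group on the local broker:

$ bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group my-group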

3. How can you list all active producers using CLI?

The Kafka CLI does not provide a command to list active producers directly. However, you can monitor activity on your topics using the kafka-topics.sh --describe command, broker metrics, or log data.

Jeremiah
Technical Content Writer, Hevo Data

Jeremiah is a specialist in crafting insightful content for the data industry, and his writing offers informative and simplified material on the complexities of data integration and analysis. He enjoys staying updated on the latest trends and uses his knowledge to help businesses succeed.