Important Kafka CLI Commands to Know in 2023

• February 28th, 2022

The Kafka CLI is a set of command-line scripts, bundled with the Kafka distribution, that give you terminal access for managing your Kafka resources. You run text commands that perform specific tasks within your Kafka environment, such as creating topics or publishing messages. It is often the quickest interface for interacting with a Kafka cluster.

This post illustrates how to perform crucial Kafka tasks using the Kafka CLI commands. However, before diving in, let’s first understand what Kafka is and how it is used in the industry.

What is Kafka?

Kafka is a popular open-source Publish/Subscribe messaging system that is used for building streaming analytics platforms and data integration pipelines. Kafka is both a queue for parallelizing tasks and a messaging-oriented middleware for service integration. The Kafka message broker (cluster) ingests and stores streams of messages (records) from event producers, which are later distributed to consumer services asynchronously when requested. 

Producers publish events to the Kafka instance or instances, since a Kafka cluster can have a single node or many. Kafka stores the messages in a topic. Within each partition of a topic, the messages form an immutable, append-only sequence ordered by when the broker received them, and each message is identified by its offset. Downstream applications can then subscribe to the topics that they are interested in.

This design is quite agile and robust when compared to the pattern of broadcasting messages by synchronous remote procedure calls (RPCs), where producers must wait for consumers to receive the events.
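
To make the idea of an append-only topic concrete, here is a toy sketch in plain shell. It is not Kafka: a temporary file stands in for a single partition, a zero-based line index stands in for an offset, and the function names log_append and log_read_from are made up for this illustration.

```shell
# Toy model of one topic partition: an append-only file where the
# zero-based line index plays the role of a Kafka offset.
LOG_FILE="$(mktemp)"

# Append one record to the end of the log (records are never edited in place).
log_append() {
  printf '%s\n' "$1" >> "$LOG_FILE"
}

# Print every record starting at the given zero-based offset.
log_read_from() {
  tail -n "+$(( $1 + 1 ))" "$LOG_FILE"
}

log_append "order-created"
log_append "order-paid"
log_append "order-shipped"

# Two independent "consumers" read the same log from different offsets.
log_read_from 0   # a new consumer replays everything
log_read_from 2   # a consumer that already processed offsets 0 and 1
```

Because readers only track a position in the log, the producer never has to wait for them, which is the contrast with synchronous RPCs drawn above.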

Understanding Core Kafka Concepts

  • Topic: A named resource to which a particular stream of messages is published and in which it is stored.
  • Producer: A client application that creates and publishes records/messages to a Kafka topic(s).
  • Consumer: A client application that subscribes to a topic(s) to receive data from it.
  • Message: The combination of data and associated metadata that a producer application writes to a topic and is eventually consumed by consumers.

Common Kafka Use Cases

Event sourcing and stream processing architecture is a data strategy that is gaining popularity. Currently, over 60% of the Fortune 100 use Kafka in their tech stack. Some of these organizations include Cisco, Goldman Sachs, Target, and Intuit. Kafka has many use cases, but let’s highlight the most common ones:

  • Ingestion of User Interaction and Server Events: To make use of user interaction events from end-user apps or server events from your system, you can forward the events to Kafka, process them, and then deliver them to analytical databases. Kafka can ingest events from many clients simultaneously.
  • Data Replication: Kafka can be used to stream incremental changes in databases and forward those to a destination such as a data lake or data warehouse for data replication and analysis.
  • ESB (Enterprise Service Bus): Companies are using Kafka as an ESB and this is helping them to transition from a monolithic to a microservices architecture. Data is made available to various applications and services across the entire organization in real-time.
  • Fraud Detection: The ability to collect, process and distribute financial events and database updates in real-time is enabling teams to do real-time threat/fraud detection. 
  • Data Streaming from Applications and IoT Devices: Applications can publish a stream of real-time user interaction events. On the other hand, IoT sensors can stream data to Kafka for use in other downstream systems.
  • Load Balancing: A service can be deployed on servers that span multiple data centers but subscribe to a common topic (data stream). If one instance of the service goes down in any data center, the others can automatically take over.

Simplify Kafka ETL and Data Analysis with Hevo’s No-code Data Pipeline

Hevo Data, a No-code Data Pipeline, helps load data from any data source such as Databases, SaaS applications, Cloud Storage, SDKs, and Streaming Services and simplifies the ETL process. It supports 100+ Data Sources such as Apache Kafka, including 40+ Free Sources. Setup is a 3-step process: select the data source, provide valid credentials, and choose the destination.

Hevo loads the data onto the desired Data Warehouse/destination in real-time and enriches the data and transforms it into an analysis-ready form without having to write a single line of code. Its completely automated pipeline, fault-tolerant, and scalable architecture ensure that the data is handled in a secure, consistent manner with zero data loss and supports different forms of data. The solutions provided are consistent and work with different BI tools as well.

Check out why Hevo is the Best:

  • Secure: Hevo has a fault-tolerant architecture that ensures that the data is handled securely and consistently with zero data loss.
  • Schema Management: Hevo takes away the tedious task of schema management & automatically detects the schema of incoming data and maps it to the destination schema.
  • Minimal Learning: Hevo, with its simple and interactive UI, is extremely simple for new customers to work on and perform operations.
  • Hevo Is Built To Scale: As the number of sources and the volume of your data grows, Hevo scales horizontally, handling millions of records per minute with very little latency.
  • Incremental Data Load: Hevo allows the transfer of data that has been modified in real-time. This ensures efficient utilization of bandwidth on both ends.
  • Live Support: The Hevo team is available round the clock to extend exceptional support to its customers through chat, email, and support calls.
  • Live Monitoring: Hevo allows you to monitor the data flow and check where your data is at a particular point in time.

Simplify your Data Analysis with Hevo today! 

Working with Kafka CLI Commands

Before you begin working with Kafka CLI, make sure you meet the following requirements:

  • Install Java 8+ on your workstation.
  • Download and extract Kafka on your workstation.
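
A quick way to confirm the Java requirement is to look at the version string that java -version reports. The sketch below assumes the two common formats (1.8.0_301 for Java 8 and 17.0.2 for newer releases); java_major is a hypothetical helper name, not part of Kafka or the JDK.

```shell
# Extract the major Java version from a version string.
# Legacy format "1.8.0_301" -> 8, modern format "17.0.2" -> 17.
java_major() {
  case "$1" in
    1.*) echo "$1" | cut -d. -f2 ;;
    *)   echo "$1" | sed 's/[^0-9].*//' ;;
  esac
}

# Only inspect the local JDK if one is installed.
if command -v java >/dev/null 2>&1; then
  # `java -version` prints to stderr; the version is the quoted token on line 1.
  version="$(java -version 2>&1 | head -n 1 | cut -d'"' -f2)"
  if [ "$(java_major "$version")" -ge 8 ]; then
    echo "Java $version is new enough for Kafka"
  else
    echo "Java $version is too old; install Java 8 or later"
  fi
else
  echo "No java executable found on PATH"
fi
```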

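In this section, you will learn the Kafka CLI commands for the following tasks: spinning up a Kafka environment, creating, describing, and listing topics, publishing and viewing events, changing the message retention period, adding partitions, and deleting a topic.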

Kafka CLI Commands: Spin Up a Kafka Environment

  • Open your terminal and navigate to the directory where you extracted the Kafka zip or tar file:
$ cd kafka_2.13-3.1.0
  • Run the following command to launch the ZooKeeper service:
$ bin/zookeeper-server-start.sh config/zookeeper.properties
  • In another terminal, run the following command to start the Kafka broker service:
$ bin/kafka-server-start.sh config/server.properties

After successfully executing these commands, you will have an active Kafka environment. Let’s now look at some basic commands that you can execute using the Kafka CLI.

Kafka CLI Commands: Create a Topic

A topic in Kafka is akin to a directory or folder on your computer. The only difference is that a topic stores events (records/messages) rather than files, and these events are typically spread across multiple nodes/brokers.

Events can be application logs, web clickstreams, data emitted by IoT sensors, and much more. So before creating events, the first step is to create a topic or topics that will store and organize these records.

To create a Kafka topic using the Kafka CLI, you will use the bin/kafka-topics.sh shell script that’s bundled in the downloaded Kafka distribution. Launch another terminal session and execute the following command:

$ bin/kafka-topics.sh --create --topic my-topic --bootstrap-server localhost:9092 --partitions 3 --replication-factor 1

You have successfully created a topic called my-topic.
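
Topic creation fails if the name is not acceptable to Kafka, so it can help to check the name first. The sketch below encodes Kafka's naming rules (letters, digits, '.', '_' and '-', at most 249 characters, and not "." or ".."). The helper name valid_topic_name is hypothetical, and the script only prints the create command rather than running it.

```shell
# Return 0 if the name follows Kafka's topic naming rules:
# only letters, digits, '.', '_' and '-', at most 249 characters,
# and not the reserved names "." or "..".
valid_topic_name() {
  case "$1" in
    ""|.|..)            return 1 ;;
    *[!A-Za-z0-9._-]*)  return 1 ;;
  esac
  [ "${#1}" -le 249 ]
}

topic="my-topic"
if valid_topic_name "$topic"; then
  # Print the create command from this post for review before running it.
  echo "bin/kafka-topics.sh --create --topic $topic --bootstrap-server localhost:9092 --partitions 3 --replication-factor 1"
else
  echo "Invalid topic name: $topic" >&2
fi
```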

Kafka CLI Commands: View Details About a Topic

Kafka exposes a command that you can take advantage of to view metadata about the topics in your Kafka cluster. The following Kafka CLI command provides information on the number of partitions and replicas, among other topic details:

$ bin/kafka-topics.sh --bootstrap-server=localhost:9092 --describe --topic my-topic

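On Kafka versions before 3.0, you can also pass the ZooKeeper address instead of the cluster information, with the same results. The --zookeeper option was removed from kafka-topics.sh in Kafka 3.0, so this form does not work with the 3.1.0 distribution used in this post:

$ bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic my-topic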
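
If you only need one field from the describe output, for example in a script, you can filter it. The sketch below assumes the summary line contains a "PartitionCount: N" field, as recent Kafka releases print; the helper name partition_count and the sample line (including its TopicId) are illustrative, not real cluster output.

```shell
# Read `kafka-topics.sh --describe` output on stdin and print the partition count.
partition_count() {
  sed -n 's/.*PartitionCount:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Illustrative summary line (the TopicId value here is made up).
sample='Topic: my-topic  TopicId: abc123  PartitionCount: 3  ReplicationFactor: 1  Configs: segment.bytes=1073741824'
printf '%s\n' "$sample" | partition_count

# Against a live cluster, pipe the real command into the helper instead:
#   bin/kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic my-topic | partition_count
```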

Kafka CLI Commands: List Topics

You can use the bin/kafka-topics.sh shell script with the --list option and the Kafka cluster URL to display a list of all the topics in the Kafka cluster.

$ bin/kafka-topics.sh --bootstrap-server=localhost:9092 --list

On Kafka versions before 3.0, you can also pass the ZooKeeper service URL instead. The --zookeeper option was removed from kafka-topics.sh in Kafka 3.0:

$ bin/kafka-topics.sh --list --zookeeper localhost:2181

Kafka CLI Commands: Publish Events to a Topic

Kafka clients normally communicate via the Kafka broker to read and write events. The broker is responsible for persisting these events in a distributed, partitioned, commit log (topic).

You can use Kafka CLI’s console producer client to emit new events to your Kafka topic. To do this, run the following command:

$ bin/kafka-console-producer.sh --topic my-topic --bootstrap-server localhost:9092

After running this command, a prompt will open. Type your messages and press Enter to publish them to the Kafka topic. Each time you press Enter, a new message is submitted.

My first event
My second event
My third event

You have now published 3 events to the my-topic topic. Press Ctrl+C to stop the producer client.
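
For repeatable runs, the same events can be fed to the console producer from a file instead of being typed at the prompt, since the producer reads one message per line from standard input. This is a minimal sketch: the file name events.txt is an assumption, and the publish step is skipped unless the script is run from the extracted Kafka directory.

```shell
# Write one event per line to a file; the console producer treats each
# line of its standard input as a separate message.
printf '%s\n' "My first event" "My second event" "My third event" > events.txt

# Publish the file only when run from the extracted Kafka directory
# with a broker listening on localhost:9092.
if [ -x bin/kafka-console-producer.sh ]; then
  bin/kafka-console-producer.sh --topic my-topic --bootstrap-server localhost:9092 < events.txt
else
  echo "Run this from the Kafka directory to publish $(wc -l < events.txt | tr -d ' ') events"
fi
```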

Kafka CLI Commands: View Events

To view the events stored in the Kafka topic, you will use Kafka CLI’s consumer client. So, open a new terminal session. Then you can run the command shown below:

$ bin/kafka-console-consumer.sh --topic my-topic --from-beginning --bootstrap-server localhost:9092
My first event
My second event
My third event

You can use the Kafka CLI’s producer in a new terminal to publish new events using the command used in the previous step. It’s interesting to see that the events will instantly appear in the consumer client’s terminal session.

Kafka CLI Commands: Change Message Retention Period

When the producer client sends an event to the Kafka broker, that event is appended to the end of one of the commit logs. By default, the event is retained for 168 hours or 7 days after which it’s deleted to free up disk space.

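Based on your application’s requirements, you might want to override this default. On Kafka 3.x, topic-level settings are changed with the bin/kafka-configs.sh script rather than kafka-topics.sh, which no longer accepts the --zookeeper option. For example, consider the following command:

$ bin/kafka-configs.sh --bootstrap-server localhost:9092 --alter --entity-type topics --entity-name my-topic --add-config retention.ms=300000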

This sets a retention period of 5 minutes (300,000 ms) for all messages appended to the topic “my-topic”.
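
Since retention.ms is expressed in milliseconds, it is easy to get the number of zeros wrong. The small helpers below do the arithmetic; their names (minutes_to_ms, days_to_ms) are made up for this example.

```shell
# Convert a retention period to the milliseconds that retention.ms expects.
minutes_to_ms() { echo $(( $1 * 60 * 1000 )); }
days_to_ms()    { echo $(( $1 * 24 * 60 * 60 * 1000 )); }

echo "retention.ms=$(minutes_to_ms 5)"   # prints retention.ms=300000 (5 minutes)
echo "retention.ms=$(days_to_ms 7)"      # prints retention.ms=604800000 (the 7-day default)
```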

Kafka CLI Commands: Add Partitions to a Topic

In a multi-node cluster, you might consider splitting your topics into multiple partitions to achieve higher throughput. This is because if you simply constrain a topic to a single node, it limits the ability to scale out. Instead of limiting yourself to a single node, you should take advantage of the extra CPU and RAM by distributing your topics across multiple machines. To add partitions to your Kafka topic, use the following command:

$ bin/kafka-topics.sh --bootstrap-server localhost:9092 --alter --topic my-topic --partitions 10

You can add as many partitions as you would like, but the partition count can only be increased, never reduced. As a rule of thumb, it’s recommended here to stay at around 10 partitions per topic.
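
One side effect of adding partitions is worth seeing in miniature: messages with a key are assigned to a partition by hashing the key modulo the partition count, so changing the count can move a key to a different partition. The sketch below only illustrates that idea. Kafka's default partitioner uses a murmur2 hash; cksum is used here as a stand-in, and partition_for is a hypothetical helper, so the partition numbers will not match a real cluster.

```shell
# Map a message key to a partition as hash(key) mod partition_count.
# cksum (a CRC) stands in for Kafka's real hash function here.
partition_for() {
  key_hash="$(printf '%s' "$1" | cksum | cut -d' ' -f1)"
  echo $(( key_hash % $2 ))
}

# Compare where each key lands with 3 partitions versus 10.
for key in user-1 user-2 user-3; do
  echo "$key: partition $(partition_for "$key" 3) of 3, partition $(partition_for "$key" 10) of 10"
done
```

If your consumers rely on all events for a key arriving in order from one partition, plan the partition count up front rather than growing it later.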

Kafka CLI Commands: Delete a Topic

You can delete a topic from the Kafka broker using the following Kafka CLI command:

$ bin/kafka-topics.sh --bootstrap-server localhost:9092 --delete --topic my-topic

You do not need to remove a custom retention policy before deleting a topic. However, if you would rather keep the topic and revert it to the default retention period, delete the custom setting we added earlier with the following command:

$ bin/kafka-configs.sh --bootstrap-server localhost:9092 --alter --entity-type topics --entity-name my-topic --delete-config retention.ms

Conclusion

In this article, you learned how to set up a simple single-node Kafka cluster. Along the way, you also learned how to use various Kafka CLI commands to produce and consume messages among other tasks. You can use the commands showcased on this page as a reference when developing and managing your own Kafka applications.

In this setup, Apache ZooKeeper is used to manage the Kafka cluster’s metadata. This is changing: Apache Kafka 2.8.0 introduced an early-access mode (KRaft) that replaces ZooKeeper with a built-in consensus layer, and later releases continue to phase out the ZooKeeper dependency.

However, as a Developer, streaming complex data from a diverse set of data sources like Databases, CRMs, Project management Tools, Streaming Services, Marketing Platforms using Kafka can seem to be quite challenging. This is where a simpler alternative like Hevo can save your day! 

Hevo Data is a No-Code Data Pipeline that offers a faster way to move data from 100+ Data Sources such as Kafka and other 40+ Free Sources, into your Data Warehouse to be visualized in a BI tool. Hevo is fully automated and hence does not require you to code.

Want to take Hevo for a spin?

SIGN UP and experience the feature-rich Hevo suite first hand. You can also have a look at the unbeatable pricing that will help you choose the right plan for your business needs.

Any questions or feedback regarding Kafka CLI Commands? Reach out to us in the comments section below.
