Companies are using data-driven approaches to grow effectively in a fast-paced environment and stay ahead of their competitors. Organizations transform their businesses by adopting event-driven architecture. The modern applications companies rely on require flexibility, high performance, and scalability to run all of their essential services efficiently. Today, Data Pipelining and Data Streaming power most applications and microservices.

Apache Kafka is an event streaming platform widely used by companies to distribute events at high throughput. Its publish-subscribe messaging model lets users read and write data conveniently. The Kafka CLI (Command Line Interface) allows users to interact with Kafka services, develop scalable applications, and manage Data Streams.

The Kafka CLI (Command Line Interface) is an easy-to-use tool, accessed through a terminal window, for seamless event streaming and processing. In this article, you will learn how to get started with the Kafka CLI and how to use it. You will also go through some of the crucial commands you should know to understand the Kafka CLI.


What is Apache Kafka?


Apache Kafka is an open-source distributed event streaming platform written in Scala and Java. It is used by many companies and Developers for creating and managing seamless Data Pipelines, Data Integration, and Streaming Analytics. Kafka aims to deliver a low-latency, high-throughput platform that can handle trillions of events a day. Kafka was initially developed at LinkedIn as a messaging queue based on the abstraction of a distributed commit log and was later open-sourced under the Apache Software Foundation.

Kafka is a publish-subscribe messaging platform that delivers data feeds to Data Pipelines in real time and lets you stream and replay them. It integrates easily with external systems through Kafka Connect and exposes its data over a TCP-based binary protocol, while the Kafka Streams library provides stream processing on top of it. Companies widely use Kafka to enhance the streaming experience of both operators and Developers in production, at a massive scale. With the help of Kafka, Developers can serve multiple requests from applications with ease.

Key Features of Kafka

Some of the main features of Kafka are listed below:

  • Scalability: Kafka can handle trillions of messages per day and petabytes of data with thousands of partitions. It makes it easier for organizations to scale production clusters up to a thousand brokers.
  • High Availability: Kafka replicates your data efficiently across availability zones, or connects separate clusters across geographic regions, so that your data is available at any time.
  • Integrations: Kafka comes with its out-of-the-box Connect interface that allows Developers to easily connect to hundreds of event sources and event sinks such as AWS S3, MySQL, PostgreSQL, Elasticsearch, etc.
  • Ease of Use: Kafka offers guaranteed ordering and zero message loss. It also allows Developers to learn and develop applications using Kafka CLI with the help of documentation, online training, guided tutorials, videos, sample projects, and other resources.
  • Fault-Tolerant: Kafka offers fault-tolerant clusters that keep organizational data safe and durable in a distributed setup.

To know more about Kafka, click here.

Simplify Kafka ETL and Data Analysis with Hevo’s No-code Data Pipeline

Hevo Data, a No-code Data Pipeline, helps load data from any data source such as Databases, SaaS applications, Cloud Storage, SDKs, and Streaming Services and simplifies the ETL process. It supports 100+ Data Sources including Apache Kafka and 40+ Free Sources. You can use Hevo Pipelines to replicate the data from your Apache Kafka Source to the Destination system. It loads the data onto the desired Data Warehouse/destination and transforms it into an analysis-ready form without having to write a single line of code.

Hevo’s fault-tolerant and scalable architecture ensures that the data is handled in a secure, consistent manner with zero data loss and supports different forms of data. Hevo supports two variations of Kafka as a Source. Both these variants offer the same functionality, with Confluent Cloud being the fully-managed version of Apache Kafka.

Get Started with Hevo for Free

Check out why Hevo is the Best:

  • Secure: Hevo has a fault-tolerant architecture that ensures that the data is handled securely and consistently with zero data loss.
  • Schema Management: Hevo takes away the tedious task of schema management & automatically detects the schema of incoming data and maps it to the destination schema.
  • Minimal Learning: Hevo, with its simple and interactive UI, is extremely simple for new customers to work on and perform operations.
  • Hevo Is Built To Scale: As the number of sources and the volume of your data grows, Hevo scales horizontally, handling millions of records per minute with very little latency.
  • Incremental Data Load: Hevo allows the transfer of data that has been modified in real-time. This ensures efficient utilization of bandwidth on both ends.
  • Live Support: The Hevo team is available round the clock to extend exceptional support to its customers through chat, email, and support calls.
  • Live Monitoring: Hevo allows you to monitor the data flow and check where your data is at a particular point in time.
Sign up here for a 14-Day Free Trial!

Crucial Kafka CLI Commands


Now that you have a good understanding of Kafka, this section introduces the Kafka CLI (Command Line Interface), how to use it, and some basic commands you should know. Follow the steps and commands below to get started with the Kafka CLI.

Installing Kafka CLI

  • Download the Apache Kafka release, which includes the CLI tools, here.
  • Once the download is complete, open a terminal in the directory containing the downloaded file.
  • Extract the archive and change into the extracted directory using the commands given below.
$ tar -xzf kafka_2.13-3.1.0.tgz
$ cd kafka_2.13-3.1.0
  • In the above commands, change the version numbers to match the release you downloaded.
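  • Optionally, you can sanity-check that the scripts work on your machine by printing the Kafka version. This is just a quick, optional check, and the exact output will vary with the release you downloaded.
$ bin/kafka-topics.sh --version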

1) Starting the Kafka Environment

  • To run the Kafka CLI on your system, your local environment must have Java 8+ installed.
  • Now, start the ZooKeeper service from the terminal using the following command.
$ bin/zookeeper-server-start.sh config/zookeeper.properties
  • Next, start the Kafka broker service.
  • Open a new terminal and run the following command.
$ bin/kafka-server-start.sh config/server.properties
  • Once both services are running successfully, the Kafka environment is ready to use with the Kafka CLI.
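  • Alternatively, newer Kafka 3.x releases can run the broker without ZooKeeper in KRaft mode. The following is a minimal sketch based on the KRaft configuration file that ships with the download; if you use it, skip the ZooKeeper step above.
$ KAFKA_CLUSTER_ID="$(bin/kafka-storage.sh random-uuid)"
$ bin/kafka-storage.sh format -t $KAFKA_CLUSTER_ID -c config/kraft/server.properties
$ bin/kafka-server-start.sh config/kraft/server.properties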

2) Creating the Topic

  • With the help of the Kafka CLI, you can read, write, store, and process events across many machines.
  • All events are organized and stored in topics, which are similar to folders in a directory, with the events being the files stored inside them.
  • Now, open a new terminal window and create a new topic using the following Kafka CLI command.
$ bin/kafka-topics.sh --create --topic sample-topic --bootstrap-server localhost:9092
  • Here, the new topic is named “sample-topic” and the broker is reachable at localhost:9092, the default port set in config/server.properties.
  • If you need to view the details of the newly created topic, such as its partition count, use the following command.
$ bin/kafka-topics.sh --describe --topic sample-topic --bootstrap-server localhost:9092
  • The above command produces output similar to the following.
Topic:sample-topic PartitionCount:1 ReplicationFactor:1 Configs:
    Topic: sample-topic Partition: 0 Leader: 0 Replicas: 0 Isr: 0
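  • By default, the topic is created with a single partition and a replication factor of 1. If you want to set these explicitly, you can pass the corresponding flags when creating the topic, as in the sketch below (3 partitions is just an illustrative value; on a single-broker setup the replication factor must stay at 1).
$ bin/kafka-topics.sh --create --topic sample-topic --partitions 3 --replication-factor 1 --bootstrap-server localhost:9092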

3) Listing Existing Topics

  • To list all the topics in the Kafka environment, run the following command.
$ bin/kafka-topics.sh --list --bootstrap-server localhost:9092
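  • If you omit the --topic flag, the --describe option prints the details of every topic in the cluster, which can be handy for a quick overview.
$ bin/kafka-topics.sh --describe --bootstrap-server localhost:9092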

4) Deleting the Topic

  • You can delete a topic via the Kafka CLI using the following command.
$ bin/kafka-topics.sh --delete --topic sample-topic --bootstrap-server localhost:9092
  • If you have deleted the newly created topic, create it again before going further in this guide.
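  • To confirm the deletion and then recreate the topic, you can rerun the list and create commands from the earlier steps, for example:
$ bin/kafka-topics.sh --list --bootstrap-server localhost:9092
$ bin/kafka-topics.sh --create --topic sample-topic --bootstrap-server localhost:9092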

5) Writing Events into the Topic

  • Now that you have created a topic, you can write events into the topic.
  • Writing events via the Kafka CLI works by having a Kafka client communicate with the Kafka brokers over the network.
  • As events are received, the brokers store them durably in the cluster.
  • Now, run the console producer client via the Kafka CLI to write events into the topic using the following command.
$ bin/kafka-console-producer.sh --topic sample-topic --bootstrap-server localhost:9092
This is sample event one
This is sample event two
  • In the above example, each line you type is sent as a separate event, so the two lines shown produce two events.
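  • Events can also carry keys, which Kafka uses to assign them to partitions. As a rough sketch, the console producer can parse a key from each line when you enable the parse.key and key.separator properties; here the key and value are separated by a colon.
$ bin/kafka-console-producer.sh --topic sample-topic --bootstrap-server localhost:9092 --property parse.key=true --property key.separator=:
user1:This is a keyed event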

6) Reading the Events

  • You have created a new topic and written events into it. Now, let’s see how to read those events with the Kafka CLI.
  • In a fresh terminal, run the Kafka console consumer client using the following command.
$ bin/kafka-console-consumer.sh --topic sample-topic --from-beginning --bootstrap-server localhost:9092
  • Running the above command produces the following output.
This is sample event one
This is sample event two
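  • To also display the key of each event, or to read as part of a consumer group so that partitions are shared between multiple consumers, you can add the print.key property and the --group flag; the group name below is just an example.
$ bin/kafka-console-consumer.sh --topic sample-topic --from-beginning --bootstrap-server localhost:9092 --group sample-group --property print.key=true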

7) Processing Events With Kafka Streams

  • You can process events using Kafka Streams, a client library that lets you implement real-time applications and microservices in Java or Scala, with both input and output stored in Kafka topics.
  • The sample code below sketches the WordCount algorithm to demonstrate how Kafka Streams processes events; it assumes the usual surrounding Kafka Streams boilerplate, such as the required imports and a configured, running KafkaStreams instance.
// Build a processing topology that counts word occurrences from the input topic
StreamsBuilder builder = new StreamsBuilder();
KStream<String, String> textLines = builder.stream("sample-topic");

// Split each line into lower-case words, re-group by word, and count occurrences
KTable<String, Long> wordCounts = textLines
            .flatMapValues(line -> Arrays.asList(line.toLowerCase().split(" ")))
            .groupBy((keyIgnored, word) -> word)
            .count();

// Write the running counts to another topic as <String, Long> records
wordCounts.toStream().to("output-topic", Produced.with(Serdes.String(), Serdes.Long()));
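  • If you want to try Kafka Streams without writing any Java, the Kafka download bundles a runnable version of this WordCount example. As a sketch, it can be started with the command below; note that the bundled demo reads from and writes to its own topics (streams-plaintext-input and streams-wordcount-output), which you would need to create first.
$ bin/kafka-run-class.sh org.apache.kafka.streams.examples.wordcount.WordCountDemo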

8) Terminating the Kafka Environment 

  • Just as you terminate a running terminal process by pressing the CTRL+C key combination, you can shut down the Kafka CLI components in the following sequence:
    1. Stop the producer and consumer clients by pressing the CTRL+C key combination.
    2. Stop the running Kafka broker with CTRL+C.
    3. Then, stop the ZooKeeper service with CTRL+C.
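  • If you also want to remove the data created during this walkthrough, you can delete the local log directories afterwards. The paths below are the defaults used by the stock configuration files; adjust them if you changed log.dirs or dataDir.
$ rm -rf /tmp/kafka-logs /tmp/zookeeper /tmp/kraft-combined-logs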

Conclusion 

In this article, you learnt about Apache Kafka and how to use the Kafka CLI. You also saw how to set up a Kafka environment using the Kafka CLI and some of the basic commands you must know to get started with Kafka. The Kafka CLI allows users to execute various operations for creating, managing, streaming, and processing topics and events. Developers use the Kafka CLI for managing Data Streams and connecting to various scalable applications or microservices for publish-subscribe services.

Visit our Website to Explore Hevo

Kafka streams thousands of events that include valuable business data. It is essential to store this data in Data Warehouses and run Analytics on it to generate insights. Hevo Data is a No-code Data Pipeline solution that helps to transfer data from Kafka and 100+ sources to the desired Data Warehouse. It fully automates the process of transforming and transferring data to a destination without writing a single line of code.

Want to take Hevo for a spin? Sign Up here for a 14-day free trial and experience the feature-rich Hevo suite first hand.

Share your experience of learning about working with Kafka CLI in the comments section below!

Former Research Analyst, Hevo Data

Aditya has a keen interest in data science and is passionate about data, software architecture, and writing technical content. He has experience writing around 100 articles on data science.
