The Apache Kafka architecture comprises producers, brokers, consumers, and, traditionally, Zookeeper. In this architecture, Zookeeper serves as a centralized service for storing and coordinating the cluster metadata, such as information about Kafka brokers and topics. However, you can install and run Kafka without Zookeeper. In this case, instead of storing the metadata inside Zookeeper, Kafka stores it in an internal topic partition within Kafka itself.

Apache Kafka is an open-source distributed streaming platform that collects, processes, stores, and manages real-time data streaming continuously into Kafka servers. Zookeeper is an open-source coordination service for managing distributed applications. But does Kafka require Zookeeper?

In this article, you will learn about Kafka, Zookeeper, and running Apache Kafka without Zookeeper. You will also learn how to install Apache Kafka without Zookeeper.

What is Zookeeper in Kafka?

In Kafka, ZooKeeper is used for managing and coordinating Kafka broker nodes, handling leader elections for partitions, and keeping track of the status of nodes in the Kafka cluster.

Steps to Install Apache Kafka without Zookeeper

This article uses Kafka 2.8.0, specifically the build for Scala 2.12 (kafka_2.12-2.8.0), to install Kafka without Zookeeper.

The steps to be followed to install Kafka without Zookeeper are as follows:

A) Download Apache Kafka

The steps followed to download Apache Kafka are as follows:

Option 1: In Windows Operating System

  • Step 1: Initially, go to the official website of Apache Kafka and click on the “Download Kafka” button.
  • Step 2: On the next page, you will see various Kafka versions. From that, choose the latest Kafka version that removes Zookeeper dependency. You can download the preferred version by clicking on the respective Kafka version. 
  • Step 3: Now, you will be redirected to the new page having the direct download link of Kafka. Click the link to download Kafka to your PC directly. 

Option 2: In Linux Operating System

  • Step 1: If you are using a Linux OS, you can download Kafka directly from your terminal using the “wget” command, a non-interactive network downloader that fetches files from a server without leaving the terminal.
  • Step 2: Open your terminal and enter the command given below.
wget https://apachemirror.wuchna.com/kafka/2.8.0/kafka_2.12-2.8.0.tgz

The URL that follows the “wget” command is the direct download link for the Kafka archive.

  • Step 3: Write the command and press the “Enter” key. Wait for a few minutes until the download completes. 
  • Step 4: After downloading, you can further extract or unzip your files by writing the below command in your terminal.
tar xzf kafka_2.12-2.8.0.tgz
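
The archive name encodes two versions: the Scala version the binaries were built with (2.12) and the Kafka release (2.8.0). As a minimal sketch, the naming and the extraction step can be scripted as follows. The helper name kafka_archive_name is illustrative and not part of Kafka, and the script assumes the archive was already downloaded into the current directory.

```shell
# Illustrative helper (not part of Kafka): build the archive name from
# the Scala build version and the Kafka release version.
kafka_archive_name() {
  scala_version="$1"
  kafka_version="$2"
  echo "kafka_${scala_version}-${kafka_version}.tgz"
}

archive="$(kafka_archive_name 2.12 2.8.0)"
echo "Archive: ${archive}"   # prints "Archive: kafka_2.12-2.8.0.tgz"

# Extract only if the archive is present (for example, after wget finished).
if [ -f "${archive}" ]; then
  tar xzf "${archive}"
  echo "Extracted to ${archive%.tgz}"
fi
```

The extracted folder has the same name as the archive without the .tgz suffix, which is the folder used in the next section.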

B) Run KRaft

The steps to be followed are:

  • Step 1: Navigate to your Kafka folder so that the commands you write from now on will point to the respective Kafka folder.
  • Step 2: Write the below command in the terminal.
cd kafka_2.12-2.8.0
  • Step 3: Now, you are in the Kafka folder. Next, go to the “kraft” folder inside the “config” folder of the Kafka home directory. The below command will help you with the navigation.
cd config/kraft
  • Step 4: Inside the kraft folder, you will see some sample configuration files. Of these, the “server.properties” file helps you start a new Kafka cluster without Zookeeper.
  • Step 5: Now, copy the “server.properties” file three times to create three new configuration files. You will then configure each file to create a three-node Kafka cluster.
  • Step 6: You can name the newly created property files server1.properties, server2.properties, and server3.properties, indicating three different servers.
  • Step 7: Pass the name of each new file as the second argument to the “cp” command, as shown below.
cp server.properties server1.properties
cp server.properties server2.properties
cp server.properties server3.properties

Here, the names of the newly created files are server1.properties, server2.properties, server3.properties.

  • Step 8: After creating new config files, you can start configuring the properties of each file. Initially, configure server1.properties file. For that, you can use the “vi” command that allows you to edit your file within the terminal instead of using any external editor application. 
  • Step 9: To edit server1.properties file, write the command as given below.
vi server1.properties 
  • Step 10: The above command will open server1.properties file to edit and configure further. 
  • Step 11: In the server file, you will find many properties. For a three-node cluster on a single machine, each server needs its own node.id, its own ports in listeners and advertised.listeners, and its own log.dirs directory, while controller.quorum.voters must list all three controllers and be identical in every file.
  • Step 12: Modify only these properties and leave the other properties unchanged.

  • Step 13: After modifying the properties, save the file. You have successfully modified and configured the server1.properties file.
  • Step 14: Now, open the server2.properties file to perform the same configuration process. Write the below command to open the server2.properties file.
vi server2.properties 
  • Step 15: Follow the same procedure that you used while configuring the server1.properties file.
  • Step 16: Give server2.properties its own node.id, listener ports, and log.dirs value, and keep controller.quorum.voters identical to the value in server1.properties.
  • Step 17: After configuring, save the file.
  • Step 18: Finally, modify the server3.properties file in the same way. Edit only the properties that require modification and leave the other properties unchanged.

Now, all the server property files have been modified and updated. 
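
As an alternative to editing each file by hand, the per-node changes can be sketched as a small script. This is only an illustration under stated assumptions: the helper make_node_config is hypothetical, the broker ports (9092 to 9094), controller ports (19092 to 19094), and /tmp/serverN log directories are example values for a single-machine setup, and the sample server.properties is assumed to contain the node.id, controller.quorum.voters, listeners, advertised.listeners, and log.dirs lines.

```shell
# Assumed example quorum: three controllers on one machine.
VOTERS="1@localhost:19092,2@localhost:19093,3@localhost:19094"

# Hypothetical helper: write server<id>.properties from a base file,
# replacing only the properties that must differ per node.
make_node_config() {
  base="$1"; id="$2"; out_dir="$3"
  broker_port=$((9091 + id))        # 9092, 9093, 9094 (assumed)
  controller_port=$((19091 + id))   # 19092, 19093, 19094 (assumed)
  sed \
    -e "s|^node.id=.*|node.id=${id}|" \
    -e "s|^controller.quorum.voters=.*|controller.quorum.voters=${VOTERS}|" \
    -e "s|^listeners=.*|listeners=PLAINTEXT://:${broker_port},CONTROLLER://:${controller_port}|" \
    -e "s|^advertised.listeners=.*|advertised.listeners=PLAINTEXT://localhost:${broker_port}|" \
    -e "s|^log.dirs=.*|log.dirs=/tmp/server${id}/kraft-combined-logs|" \
    "${base}" > "${out_dir}/server${id}.properties"
}

# Run from the config/kraft folder; does nothing if the sample file is absent.
if [ -f server.properties ]; then
  for id in 1 2 3; do
    make_node_config server.properties "${id}" .
  done
fi
```

Every other property is copied from the sample file unchanged, which matches the advice above to modify only what differs between nodes.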

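  • Step 19: Next, return to the Kafka home directory (for example, with cd ../..), because the remaining commands use paths relative to it. Then, create a new UUID that will serve as your cluster ID. For that, enter the command in your terminal as given below.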
./bin/kafka-storage.sh random-uuid
  • Step 20: After executing the above command, you will get a unique UUID for the cluster. Note it down for future reference. 
  • Step 21: Now, you need to format the existing log directories or storage locations so that Kafka can store log files in the respective server’s folder instead of storing them in temporary directories. You have to format the locations separately for each server file.
  • Step 22: The basic command to format locations based on each server property file is given below. 
./bin/kafka-storage.sh format -t <uuid> -c <server_config_location>
  • Step 23: In the above command, replace <uuid> with the UUID you copied before. Replace the <server_config_location> with the respective server property file. 
  • Step 24: Enter the commands given below to format the locations for each server property file, replacing <uuid> with your cluster ID.
  • For server1.properties file 
./bin/kafka-storage.sh format -t <uuid> -c ./config/kraft/server1.properties
  • For server2.properties file 
./bin/kafka-storage.sh format -t <uuid> -c ./config/kraft/server2.properties
  • For server3.properties file 
./bin/kafka-storage.sh format -t <uuid> -c ./config/kraft/server3.properties
  • Step 25: After formatting the locations, you are ready to start the Kafka servers. 
  • Step 26: Before starting the servers, you need to set up the heap properties. For that, you can execute the command given below. 
export KAFKA_HEAP_OPTS="-Xmx200M -Xms100M"
  • Step 27: With this command, you give Kafka an initial heap of 100 MB (-Xms) and a maximum heap of 200 MB (-Xmx). If KAFKA_HEAP_OPTS is not set, kafka-server-start.sh uses a 1 GB heap by default. Based on the requirements and use cases, you can increase the heap size to 3 GB and above.
  • Step 28: In the below steps, you will start the servers.

Starting Server 1: 

./bin/kafka-server-start.sh -daemon ./config/kraft/server1.properties

Starting Server 2: 

./bin/kafka-server-start.sh -daemon ./config/kraft/server2.properties

Starting Server 3: 

./bin/kafka-server-start.sh -daemon ./config/kraft/server3.properties
  • Step 29: Now, you have successfully started Kafka without Zookeeper. To check whether Kafka has started, you can execute the following command.
ps -ef | grep kafka

Once the command is executed, you can see several Kafka processes listed on your terminal, one for each newly started server. With this, you can confirm that the Kafka servers have started successfully and are running.
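
The formatting and start-up steps above can also be sketched as one function. This is a minimal sketch, not an official script: the function name start_kraft_cluster is hypothetical, and it assumes the three property files already exist under config/kraft in the Kafka home directory passed as the argument.

```shell
# Hypothetical helper: format the storage for all three nodes with one
# shared cluster ID, then start each server as a daemon.
start_kraft_cluster() {
  kafka_home="$1"

  # One cluster ID is generated once and reused for every node.
  cluster_id="$("${kafka_home}/bin/kafka-storage.sh" random-uuid)" || return 1

  # Small heap for a local test cluster, as in Step 26.
  export KAFKA_HEAP_OPTS="-Xmx200M -Xms100M"

  for id in 1 2 3; do
    "${kafka_home}/bin/kafka-storage.sh" format -t "${cluster_id}" \
      -c "${kafka_home}/config/kraft/server${id}.properties" || return 1
  done

  for id in 1 2 3; do
    "${kafka_home}/bin/kafka-server-start.sh" -daemon \
      "${kafka_home}/config/kraft/server${id}.properties" || return 1
  done

  echo "Cluster ID: ${cluster_id}"
}
```

From the Kafka home directory, you would call it as start_kraft_cluster "$PWD" and then run ps -ef | grep kafka to check the processes. Formatting all nodes before starting any of them keeps each server config paired with the same cluster ID.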

After starting Kafka, you can create topics to store real-time data. You can then run a Kafka producer and a consumer in separate terminals to start producing and receiving messages. Once you start receiving messages, you can be sure that you have successfully installed Apache Kafka without Zookeeper.
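
A quick way to try this is a small smoke test that creates a topic, sends one message, and reads it back. The sketch below makes several assumptions: the function name smoke_test_topic and the topic name demo-topic are hypothetical, and the bootstrap address must match a PLAINTEXT listener port from your server property files (localhost:9092 is only an example).

```shell
# Hypothetical helper: create a topic, produce one message, consume it.
smoke_test_topic() {
  kafka_home="$1"
  bootstrap="$2"        # for example localhost:9092 (assumed port)
  topic="demo-topic"    # illustrative topic name

  # Three partitions, replicated across the three brokers.
  "${kafka_home}/bin/kafka-topics.sh" --create --topic "${topic}" \
    --partitions 3 --replication-factor 3 \
    --bootstrap-server "${bootstrap}" || return 1

  # The console producer reads messages from standard input.
  echo "hello kraft" | "${kafka_home}/bin/kafka-console-producer.sh" \
    --topic "${topic}" --bootstrap-server "${bootstrap}" || return 1

  # Read one message from the start of the topic, then exit.
  "${kafka_home}/bin/kafka-console-consumer.sh" --topic "${topic}" \
    --from-beginning --max-messages 1 --bootstrap-server "${bootstrap}"
}
```

From the Kafka home directory, a call such as smoke_test_topic "$PWD" localhost:9092 should end with the consumer printing the message that the producer sent.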

Note: In Kafka 2.8.0, running Apache Kafka without Zookeeper is still in its preview or testing phase. Therefore, it is advised not to use this release without Zookeeper in production.

How does Apache Kafka run without Zookeeper? Does Kafka need Zookeeper? 

While running Kafka with Zookeeper is the traditional deployment model, newer Kafka versions can operate without Zookeeper by using KRaft mode, which simplifies management.

Kafka 2.8.0 provides a preview of how to use Kafka without Zookeeper. Traditionally, Kafka uses Zookeeper to store and manage all the metadata information about Kafka clusters. Kafka also uses Zookeeper as a centralized controller that manages and organizes all the Kafka brokers or servers.

In the new Kafka version, however, instead of storing all the server configuration information in Zookeeper, you can store it as a topic partition inside the Kafka servers themselves. To run Kafka without Zookeeper, you start it in Kafka Raft metadata mode, i.e., KRaft.

Using KRaft, Kafka clusters are no longer limited by Zookeeper’s metadata management, enhancing the scalability of Kafka without Zookeeper.

KRaft metadata management often allows for faster recovery times for the Kafka controller, thus positively impacting the performance of Kafka without Zookeeper. Moreover, KRaft enhances the security of Kafka without Zookeeper by eliminating the attack surface and potential vulnerabilities previously associated with Zookeeper.

The KRaft controllers collectively form a KRaft quorum, which stores all the metadata information regarding Kafka clusters. With this method, you remove the Zookeeper dependency from the Kafka architecture. Besides, you can achieve various benefits like eliminating system complexities and data redundancy while running Kafka without Zookeeper. As Kafka plans to discontinue Zookeeper as a centralized configuration service, you will have a simplified Kafka architecture without any third-party service dependencies.

Kafka without Zookeeper

An Apache Kafka setup without Zookeeper has a special server type, the controller, and the controllers together form a cluster quorum. Using the Raft-based KRaft consensus protocol, this quorum elects a leader, which serves requests from the brokers that connect to pull the cluster metadata. Previously, the active controller pushed changes to the brokers; now, brokers pull metadata from the leader controller.
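
To see why this article uses three nodes, it helps to look at the arithmetic behind a majority quorum. The sketch below relies on the general property of Raft-style consensus that a leader needs a majority of the voters, so a quorum of n voters keeps working with at most (n - 1) / 2 of them down. The function names are illustrative and not Kafka commands.

```shell
# Smallest number of voters that forms a majority of n.
quorum_majority() {
  echo $(( $1 / 2 + 1 ))
}

# Number of voters that can fail while a majority remains.
tolerated_failures() {
  echo $(( ($1 - 1) / 2 ))
}

for n in 1 3 5; do
  echo "voters=${n} majority=$(quorum_majority "${n}") tolerated_failures=$(tolerated_failures "${n}")"
done
# voters=1 majority=1 tolerated_failures=0
# voters=3 majority=2 tolerated_failures=1
# voters=5 majority=3 tolerated_failures=2
```

With a single controller there is no tolerance for failure, while three controllers can lose one and still elect a leader.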

Among the many internal changes implemented by the Kafka community within the latest releases, here are the important ones:

  • Limitations of Kafka cluster scaling have been addressed.
  • Setup of system configuration, observability, security, logging, etc. is simpler, since the knowledge and production setup for a single technology, Kafka, will suffice.

Conclusion

In this article, you have learned about running Apache Kafka without Zookeeper. While using Apache Kafka with Zookeeper, you might run into issues such as data redundancy and added system complexity. To avoid such complications, you can install and use Kafka without Zookeeper.

For further information on installing Kafka on Windows, you can visit the former link.

Hevo Data, a No-code Data Pipeline provides you with a consistent and reliable solution to manage data transfer between a variety of sources and a wide variety of Desired Destinations with a few clicks.

Freelance Technical Content Writer, Hevo Data

Ishwarya has experience working with B2B SaaS companies in the data industry, and her passion for data science drives her to produce informative content that helps individuals comprehend the intricacies of data integration and analysis.