Apache Kafka is a message streaming service that allows different applications in a distributed system to communicate and share data through messages.
It functions as a pub/sub system in which producer applications publish messages and consumer applications subscribe to them.
Apache Kafka lets you adopt a loosely coupled architecture between the parts of your system that produce data and those that consume it, which makes the system simpler to design and manage. In the setup described in this guide, Kafka relies on ZooKeeper for metadata management and for synchronizing the different elements of the cluster.
Features of Apache Kafka
Apache Kafka has grown popular for, among other reasons, being:
- Scalable, through clusters and partitions
- Fast, with published benchmarks reporting around 2 million writes per second
- Ordered, preserving the order of messages within each partition
- Reliable, through its system of replicas
- Upgradable with zero downtime
Now, let’s explore some of the common use cases of Kafka.
Common Use Cases of Apache Kafka
Kafka is often used for processing big data, recording and aggregating events such as button clicks for analytics, and combining logs from different parts of a system into one central location.
It is also used to enable communication between different applications in a system and to process data from IoT devices in real time.
Now, let’s check out the detailed steps to install Kafka on Windows and Linux.
Installing Kafka on Windows
To install Apache Kafka on Windows, first check whether Java is installed on your machine. Open the command prompt in Administrator mode and enter the command:
java -version
If Java is installed, you should see the version number of the currently installed JDK.
If you get an error message saying the command was not recognized, Java is not installed and you need to install it. To do so, head to Adoptium.net and click on the download button.
This should download the Java installer file. When downloading is complete, run the installer. This should open up the installation prompt.
Click Next repeatedly to accept the default options. Installation should then begin. Verify the installation by closing the command prompt, opening a new one in Administrator mode, and entering the command:
java -version
This time, you should see the version of the JDK you just installed. After installation is complete, we can begin installing Kafka.
To install Kafka, first go to the Kafka website.
Click on the download link, and it should take you to the Downloads page. Download the latest binaries available.
This will download the Kafka scripts and binaries packaged in a .tgz file. After downloading, you must extract the files from the .tgz archive. To extract, I will use WinZip, which can be downloaded from the WinZip website.
After extracting the archive, move the extracted folder to C:\ so that its path matches the C:\kafka directory used in the commands below.
Then open the command prompt in Administrator mode and start ZooKeeper by first navigating to the Kafka directory and then running the zookeeper-server-start.bat file with zookeeper.properties as the configuration file:
cd C:\kafka
bin\windows\zookeeper-server-start.bat config\zookeeper.properties
With ZooKeeper running, we need to add the wmic executable, which Kafka uses, to our system PATH. On a standard Windows installation, wmic.exe is located in C:\Windows\System32\wbem.
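After this, start the Apache Kafka server by opening another command prompt session in Administrator mode and navigating to the Kafka directory, C:\kafka.
Then start Kafka by running the kafka-server-start.bat file with server.properties as the configuration file:
bin\windows\kafka-server-start.bat config\server.properties
With this, Kafka should be running. You can customize server properties, such as where the logs are written, in the config\server.properties file.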
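As an illustration of such a customization, the fragment below changes the log.dirs property, which controls the directories where the broker stores its log data. The /tmp/kafka-logs value is the default shipped in server.properties; the C:/kafka/kafka-logs path is only an assumed example, not a location this guide prescribes.

```properties
# Default value shipped with Kafka:
# log.dirs=/tmp/kafka-logs

# Example override (hypothetical path). Forward slashes are used because a
# backslash is treated as an escape character in .properties files.
log.dirs=C:/kafka/kafka-logs
```

The broker reads this file at startup, so restart Kafka after editing it.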
Installing Kafka on Linux
First, ensure that your system is up-to-date by updating all packages
sudo apt update && sudo apt upgrade
Next, check if Java is installed on your machine by running
java -version
If Java is installed, you will see the version number. However, if it is not, you can install it using
sudo apt install default-jdk
After this, we can install Apache Kafka by downloading the binaries from the website.
Open your terminal and navigate to the folder where the download was saved. In my case, I have to navigate to the Downloads folder.
Once in the downloads folder, extract the downloaded files using
tar -xvzf kafka_2.13-3.3.1.tgz
Navigate to the extracted folder
List the directories and files.
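The two steps above amount to running cd kafka_2.13-3.3.1 followed by ls. As a slightly more defensive sketch, the hypothetical helper below (list_kafka_dir is not part of Kafka) lists the folder only if it exists. The default folder name is an assumption based on the archive extracted earlier.

```shell
# List the top-level contents of an extracted Kafka folder.
# list_kafka_dir is a hypothetical helper name; the default folder name
# assumes the kafka_2.13-3.3.1 archive extracted in the previous step.
list_kafka_dir() {
  kafka_dir="${1:-kafka_2.13-3.3.1}"
  if [ -d "$kafka_dir" ]; then
    # Use a subshell so the caller's working directory is left unchanged.
    (cd "$kafka_dir" && ls)
  else
    echo "No such directory: $kafka_dir" >&2
    return 1
  fi
}
```

Calling list_kafka_dir from the Downloads folder should show directories such as bin, config and libs.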
Once in the folder, start a ZooKeeper server by running the zookeeper-server-start.sh script located in the bin directory of the extracted folder.
The script requires a ZooKeeper configuration file. The default file is called zookeeper.properties and is located in the config directory.
So to start the server, use the command:
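In plain form, this step is bin/zookeeper-server-start.sh config/zookeeper.properties, run from the extracted folder. The sketch below wraps it in a hypothetical helper, start_zookeeper (not part of Kafka), that first checks that the script and configuration file exist. It assumes the folder layout described above.

```shell
# Start ZooKeeper from an extracted Kafka folder (runs in the foreground).
# start_zookeeper is a hypothetical helper name, not part of Kafka.
start_zookeeper() {
  kafka_home="${1:-.}"   # defaults to the current directory
  script="$kafka_home/bin/zookeeper-server-start.sh"
  config="$kafka_home/config/zookeeper.properties"
  if [ ! -x "$script" ] || [ ! -f "$config" ]; then
    echo "Kafka files not found under: $kafka_home" >&2
    return 1
  fi
  "$script" "$config"
}
```

From inside the extracted folder, start_zookeeper with no arguments is enough. ZooKeeper keeps running in that terminal, so leave it open.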
With ZooKeeper running, we can start the Apache Kafka server. The kafka-server-start.sh script is also located in the bin directory. The command also expects a configuration file. The default one is server.properties, stored in the config directory.
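In plain form, this step is bin/kafka-server-start.sh config/server.properties, run from the extracted folder in a second terminal. The sketch below wraps it in a hypothetical helper, start_kafka (not part of Kafka), that also forwards any extra arguments, such as the script's --override property=value option. The log.dirs value in the usage note is only an example.

```shell
# Start the Kafka broker from an extracted Kafka folder (runs in the foreground).
# start_kafka is a hypothetical helper name, not part of Kafka.
start_kafka() {
  if [ "$#" -lt 1 ]; then
    echo "usage: start_kafka KAFKA_HOME [--override property=value ...]" >&2
    return 2
  fi
  kafka_home="$1"
  shift   # remaining arguments are passed through to the start script
  "$kafka_home/bin/kafka-server-start.sh" "$kafka_home/config/server.properties" "$@"
}
```

For example, start_kafka . starts the broker with the default settings, while start_kafka . --override log.dirs=/tmp/my-kafka-logs changes one property for that run without editing the file.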
This should get Apache Kafka running. Inside the bin directory, you will find many scripts to do things such as create topics, manage producers and manage consumers. You can also customize server properties in the config/server.properties file.
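To give a feel for those scripts, the sketch below uses three of them to create a topic, publish one message, and read it back. The helper name kafka_round_trip and the topic name quickstart-events are assumptions made for illustration, and the sketch assumes a broker listening on the default address localhost:9092.

```shell
# Hypothetical helper: create a topic, publish one message, and read it back.
# Assumes a broker is listening on localhost:9092 (Kafka's default port).
kafka_round_trip() {
  kafka_home="$1"
  topic="${2:-quickstart-events}"   # topic name is only an example
  server="localhost:9092"

  # Create the topic; this fails if the topic already exists.
  "$kafka_home/bin/kafka-topics.sh" --create --topic "$topic" \
    --bootstrap-server "$server" || return 1

  # The console producer sends each line read from standard input as a message.
  echo "hello kafka" | "$kafka_home/bin/kafka-console-producer.sh" \
    --topic "$topic" --bootstrap-server "$server" || return 1

  # Read from the start of the topic and stop after one message.
  "$kafka_home/bin/kafka-console-consumer.sh" --topic "$topic" \
    --from-beginning --max-messages 1 --bootstrap-server "$server"
}
```

With ZooKeeper and Kafka both running, kafka_round_trip . from the extracted folder exercises all three scripts in one go.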
In this guide, we went through how to install Java and Apache Kafka. While you can install and manage Kafka clusters manually, you can also use managed Kafka services from providers such as Amazon Web Services and Confluent.
Next, you can learn data processing with Kafka and Spark.