13 May

How to Install Apache Kafka on Debian 11 (Linux Message Broker)

In this article, we introduce Apache Kafka and its features, then walk through installing it on Debian 11.

What is Apache Kafka

Apache Kafka is a distributed event streaming platform that receives data from distinct sources and shares it with the target system in real time. Written in Scala and Java, the open source distributed publish subscribe messaging system facilitates the asynchronous data exchange between servers and applications.

Today, the adoption of Kafka has enabled businesses to deliver timely experiences to consumers and manage real time data.

Earlier, data processing relied on batch techniques. With periodic batch processing, all the raw data was collected and stored first, then processed later at set intervals. For example, companies would wait until the end of the week or month to analyze the collected information and calculate profits and expenses. The main drawback of batch processing was that it could not provide data in real time.

As businesses grow and expand, analyzing data in real time has become necessary for making better decisions and strategies. Apache Kafka addresses this need by streaming events in real time. Another feature that sets Kafka apart from other messaging systems is that it retains all messages for a configurable period, and consumers are solely responsible for tracking which messages they have read.
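
The retention model described above can be pictured with a toy script. This is only a sketch of the idea, not Kafka code: the "topic" is a plain append-only file, and the function names publish and consume, the scratch directory, and the consumer names are all hypothetical.

```shell
# Toy model only (not Kafka code): a "topic" is an append-only file and
# each consumer stores its own read position (offset) in a separate file.
STATE_DIR="$(mktemp -d)"            # hypothetical scratch directory
LOG_FILE="$STATE_DIR/topic.log"
: > "$LOG_FILE"

publish() {                         # append one message to the log
    printf '%s\n' "$1" >> "$LOG_FILE"
}

consume() {                         # print messages this consumer has not read yet
    consumer="$1"
    offset_file="$STATE_DIR/$consumer.offset"
    start=0
    if [ -f "$offset_file" ]; then
        start="$(cat "$offset_file")"
    fi
    tail -n "+$((start + 1))" "$LOG_FILE"
    # Record the new position; the log itself is left untouched.
    wc -l < "$LOG_FILE" | tr -d ' ' > "$offset_file"
}
```

After a few publish calls, running consume billing twice prints the messages only once, because the second call starts from the stored offset. Reading never deletes anything from the log, so a different consumer name still sees every message from the beginning.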

If you want to build resilient data services and applications, look no further. Kafka is a fast, highly scalable and fault tolerant publish subscribe system. It has five core functions: publish, consume, process, connect, and store. Together, these functions enable the system to deliver high throughput. Kafka also relies on the file system for storing and caching messages.

Today, thousands of companies trust the platform as it stores all the streams safely in a fault tolerant cluster and delivers messages at a network limited throughput. Also, it has an out of the box Connect interface that allows integration with various event sources such as Elasticsearch, AWS S3, Postgres, etc.

Next, let's look at the benefits of Apache Kafka.

Benefits of Apache Kafka

There are various reasons why many high profile companies are investing in Apache Kafka for collecting data in real time. Here are some of the benefits that might convince you to do the same.

Open Source

Kafka is an open source platform: the source code is freely available to all developers and users to inspect and modify, and there are no licensing fees.

Scale and Speed

Kafka delivers data in real time. As a distributed platform, it spreads the processing work across multiple physical and virtual machines, which helps it scale out and return results quickly.

Extensible

Kafka works with Zookeeper to coordinate and synchronize with other services.

Performance

Kafka provides a queue that can handle large volumes of data and move messages from senders to receivers.

Fault tolerant

Kafka is a publish subscribe messaging system built for high throughput and fault tolerance. It supports automatic recovery and is resilient to node failures: if one node goes down, another takes over so that messages are still delivered.

Replication

Kafka automatically keeps copies of topic partitions on multiple brokers. Users can also configure each topic manually, setting its replication factor or turning replication off as their needs require.

Allows message replay

Kafka lets multiple consumers subscribe to the same topic and replay its messages for a specific period of time.

Stream Processing

Apache Kafka allows seamless movement of data in the form of messages, streams, or records. Further, it allows users to inspect, transform and leverage data before moving. The platform is easy to use and supports a native approach for storing and moving data in real time.

Seamless Messaging Functionality

Organizations that use legacy communications models to deal with large volume data often find issues in communications and scalability. However, with the messaging and streaming functionality, Kafka has reduced this issue and users can publish, subscribe, store and process data in real time.

Next we will explain how to install Apache Kafka on Debian 11.

How to Install Apache Kafka on Debian 11

Install Java JDK

Apache Kafka is a Java based application, so Java must be installed on your system. If it is not installed, you can install it by running the following command as root:

				
					apt-get install default-jdk -y

Once Java is installed, verify the Java installation using the following command:

				
					java --version

You will get the following output:

				
					openjdk 11.0.14 2022-01-18
OpenJDK Runtime Environment (build 11.0.14+9-post-Debian-1deb11u1)
OpenJDK 64-Bit Server VM (build 11.0.14+9-post-Debian-1deb11u1, mixed mode, sharing)
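
If you script the installation, you may prefer to check the Java version automatically rather than read the output by eye. The following is a minimal sketch: the helper name java_major and the REQUIRED_MAJOR threshold of 11 are assumptions chosen to match the OpenJDK 11 output shown above, not requirements stated by Kafka.

```shell
# Print the major version from the first line of `java --version` output,
# e.g. "openjdk 11.0.14 2022-01-18" -> "11".
java_major() {
    printf '%s\n' "$1" | awk 'NR == 1 { split($2, v, "."); print v[1] }'
}

REQUIRED_MAJOR=11   # assumption: match the OpenJDK 11 shown above

if command -v java >/dev/null 2>&1; then
    major="$(java_major "$(java --version 2>/dev/null)")"
    if [ "${major:-0}" -ge "$REQUIRED_MAJOR" ] 2>/dev/null; then
        echo "Java $major found"
    else
        echo "Java is installed but older than $REQUIRED_MAJOR (or its version could not be parsed)" >&2
    fi
else
    echo "java not found on PATH; install default-jdk first" >&2
fi
```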

Install Kafka on Debian 11

Before starting, it is recommended to create a dedicated user to run Apache Kafka. You can create it using the following command:

				
					adduser kafka

Add the kafka user to the sudo group with the following command:

				
					adduser kafka sudo

Next, switch to the kafka user and download the Apache Kafka 2.7.2 source archive using the following command:

				
					su - kafka
wget https://archive.apache.org/dist/kafka/2.7.2/kafka-2.7.2-src.tgz

Once the download is completed, extract the downloaded file with the following command:

				
					tar -xvzf kafka-2.7.2-src.tgz
mv kafka-2.7.2-src kafka

Then exit the kafka user's session with the following command:

				
					exit

Because you downloaded the source release, you now need to build the Kafka JAR files. The source tree includes a Gradle wrapper script, so you can build them with the following command:

				
					cd /home/kafka/kafka
./gradlew jar -PscalaVersion=2.13.3

Next, set proper ownership to the Kafka directory:

				
					chown -R kafka:kafka /home/kafka/kafka
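
Before moving on, it can help to confirm that the build actually produced the broker JARs. The sketch below is an illustration only: the helper name list_core_jars is hypothetical, and the core/build/libs output directory is an assumption about where the Gradle build places its artifacts, so adjust the path if your tree differs.

```shell
KAFKA_HOME="${KAFKA_HOME:-/home/kafka/kafka}"

# List the broker JARs produced by the Gradle build.
# Assumption: the build writes them to core/build/libs under the source tree.
list_core_jars() {
    find "$1/core/build/libs" -name 'kafka_*.jar' 2>/dev/null
}

if [ -n "$(list_core_jars "$KAFKA_HOME")" ]; then
    echo "Build output found:"
    list_core_jars "$KAFKA_HOME"
else
    echo "No JARs under $KAFKA_HOME/core/build/libs; re-run ./gradlew jar" >&2
fi
```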

Create Systemd Unit Files for Kafka and Zookeeper

Next, you will need to create a systemd service file for both Zookeeper and Kafka to manage their services.

First, create a Zookeeper service file using the following command:

				
					nano /etc/systemd/system/zookeeper.service

Add the following lines:

				
					[Unit]
Requires=network.target remote-fs.target
After=network.target remote-fs.target

[Service]
Type=simple
User=kafka
ExecStart=/home/kafka/kafka/bin/zookeeper-server-start.sh /home/kafka/kafka/config/zookeeper.properties
ExecStop=/home/kafka/kafka/bin/zookeeper-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target

Save and close the file, then create a systemd service file for Kafka using the following command:

				
					nano /etc/systemd/system/kafka.service

Add the following lines:

				
					[Unit]
Requires=zookeeper.service
After=zookeeper.service

[Service]
Type=simple
User=kafka
ExecStart=/bin/sh -c '/home/kafka/kafka/bin/kafka-server-start.sh /home/kafka/kafka/config/server.properties > /home/kafka/kafka/kafka.log 2>&1'
ExecStop=/home/kafka/kafka/bin/kafka-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target
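
A typo in either unit file only shows up when the service fails to start, so a quick check beforehand can save time. The following sketch assumes the two file paths used above. The helper unit_value is a hypothetical name, and the script simply skips the systemd-analyze step if that tool is not available.

```shell
# Print the value(s) of KEY= lines from a unit file.
unit_value() {
    sed -n "s/^$2=//p" "$1"
}

for unit in /etc/systemd/system/zookeeper.service /etc/systemd/system/kafka.service; do
    if [ ! -f "$unit" ]; then
        echo "missing: $unit" >&2
        continue
    fi
    user="$(unit_value "$unit" User)"
    # The account named in User= must exist, or the service cannot start.
    if ! id "$user" >/dev/null 2>&1; then
        echo "$unit: user '$user' does not exist" >&2
    fi
    # systemd-analyze verify loads the unit and reports problems it finds.
    if command -v systemd-analyze >/dev/null 2>&1; then
        systemd-analyze verify "$unit"
    fi
done
```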

Save and close the file, then reload the systemd daemon using the following command:

				
					systemctl daemon-reload

Next, start the Apache Kafka service and enable it to start at boot. Because kafka.service declares Requires=zookeeper.service, this also starts Zookeeper:

				
					systemctl enable --now kafka

Kafka Zookeeper

You can now check the status of the Apache Kafka and Zookeeper service using the following command:

				
					systemctl status kafka zookeeper

You will get the following output:

				
					● kafka.service
     Loaded: loaded (/etc/systemd/system/kafka.service; enabled; vendor preset: enabled)
     Active: active (running) since Fri 2022-03-25 05:44:23 UTC; 10s ago
   Main PID: 8893 (sh)
      Tasks: 71 (limit: 4679)
     Memory: 333.2M
        CPU: 8.748s
     CGroup: /system.slice/kafka.service
             ├─8893 /bin/sh -c /home/kafka/kafka/bin/kafka-server-start.sh /home/kafka/kafka/config/server.properties > /home/kafka/kafka/kaf>
             └─8894 java -Xmx1G -Xms1G -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+ExplicitGCInvo>

Mar 25 05:44:23 debian11 systemd[1]: Started kafka.service.

● zookeeper.service
     Loaded: loaded (/etc/systemd/system/zookeeper.service; disabled; vendor preset: enabled)
     Active: active (running) since Fri 2022-03-25 05:44:23 UTC; 10s ago
   Main PID: 8892 (java)
      Tasks: 31 (limit: 4679)
     Memory: 81.9M
        CPU: 3.137s
     CGroup: /system.slice/zookeeper.service
             └─8892 java -Xmx512M -Xms512M -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+ExplicitGC>

Mar 25 05:44:25 debian11 zookeeper-server-start.sh[8892]: [2022-03-25 05:44:25,712] INFO Created server with tickTime 3000 minSessionTimeout >
Mar 25 05:44:25 debian11 zookeeper-server-start.sh[8892]: [2022-03-25 05:44:25,759] INFO Using org.apache.zookeeper.server.NIOServerCnxnFacto>
Mar 25 05:44:25 debian11 zookeeper-server-start.sh[8892]: [2022-03-25 05:44:25,770] INFO Configuring NIO connection handler with 10s sessionl>
Mar 25 05:44:25 debian11 zookeeper-server-start.sh[8892]: [2022-03-25 05:44:25,793] INFO binding to port 0.0.0.0/0.0.0.0:2181 (org.apache.zoo>
Mar 25 05:44:25 debian11 zookeeper-server-start.sh[8892]: [2022-03-25 05:44:25,834] INFO zookeeper.snapshotSizeFactor = 0.33 (org.apache.zook>
Mar 25 05:44:25 debian11 zookeeper-server-start.sh[8892]: [2022-03-25 05:44:25,841] INFO Snapshotting: 0x0 to /tmp/zookeeper/version-2/snapsh>
Mar 25 05:44:25 debian11 zookeeper-server-start.sh[8892]: [2022-03-25 05:44:25,848] INFO Snapshotting: 0x0 to /tmp/zookeeper/version-2/snapsh>
Mar 25 05:44:25 debian11 zookeeper-server-start.sh[8892]: [2022-03-25 05:44:25,885] INFO PrepRequestProcessor (sid:0) started, reconfigEnable>
Mar 25 05:44:25 debian11 zookeeper-server-start.sh[8892]: [2022-03-25 05:44:25,911] INFO Using checkIntervalMs=60000 maxPerMinute=10000 (org.>
Mar 25 05:44:26 debian11 zookeeper-server-start.sh[8892]: [2022-03-25 05:44:26,641] INFO Creating new log file: log.1 (org.apache.zookeeper.s>
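
An active (running) status shows that the processes started, but not that the broker accepts messages. A short round trip with the command line tools shipped in the Kafka bin directory can confirm that. This is a sketch under stated assumptions: the broker listens on localhost:9092 (the default when server.properties is left unchanged), and smoke-test is a hypothetical topic name.

```shell
KAFKA_HOME="${KAFKA_HOME:-/home/kafka/kafka}"
BOOTSTRAP="${BOOTSTRAP:-localhost:9092}"   # assumption: default broker listener
TOPIC="${TOPIC:-smoke-test}"               # hypothetical topic name

smoke_test() {
    # Create the topic, write one message, then read it back.
    "$KAFKA_HOME/bin/kafka-topics.sh" --create --if-not-exists \
        --topic "$TOPIC" --partitions 1 --replication-factor 1 \
        --bootstrap-server "$BOOTSTRAP" &&
    echo "hello kafka" | "$KAFKA_HOME/bin/kafka-console-producer.sh" \
        --topic "$TOPIC" --bootstrap-server "$BOOTSTRAP" &&
    "$KAFKA_HOME/bin/kafka-console-consumer.sh" \
        --topic "$TOPIC" --from-beginning --max-messages 1 \
        --timeout-ms 10000 --bootstrap-server "$BOOTSTRAP"
}

if [ -x "$KAFKA_HOME/bin/kafka-topics.sh" ]; then
    smoke_test
else
    echo "Kafka scripts not found under $KAFKA_HOME" >&2
fi
```

If the broker is healthy, the consumer should print the message that the producer sent and then exit. If it times out instead, check /home/kafka/kafka/kafka.log, which is where the kafka.service unit above redirects the broker output.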

Install Cluster Manager for Apache Kafka

CMAK (Cluster Manager for Apache Kafka) is an open source tool developed by Yahoo for managing and monitoring Kafka clusters. First, install Git and download CMAK using the following commands:

				
					apt-get install git -y
git clone https://github.com/yahoo/CMAK.git

Once the download is completed, edit the CMAK configuration file:

				
					nano ~/CMAK/conf/application.conf

Find the zkhosts settings and point cmak.zkhosts at the local Zookeeper instance, so that the lines read as follows:

				
					kafka-manager.zkhosts="kafka-manager-zookeeper:2181"
kafka-manager.zkhosts=${?ZK_HOSTS}
cmak.zkhosts="localhost:2181"
cmak.zkhosts=${?ZK_HOSTS}
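
The ${?ZK_HOSTS} lines are optional substitutions: if a ZK_HOSTS environment variable is set when CMAK starts, it overrides the quoted value on the line above. As a rough sketch, you could therefore supply the Zookeeper address from the shell instead of editing the file. The CMAK_BIN variable and its default path are hypothetical and only make sense once CMAK has been built and unpacked as described in the next steps.

```shell
# Assumption: Zookeeper runs on this host on its default port.
ZK_HOSTS="${ZK_HOSTS:-localhost:2181}"
export ZK_HOSTS

CMAK_BIN="${CMAK_BIN:-bin/cmak}"   # hypothetical location of the cmak launcher

if [ -x "$CMAK_BIN" ]; then
    "$CMAK_BIN"                    # picks up ZK_HOSTS from the environment
else
    echo "ZK_HOSTS=$ZK_HOSTS is set; start CMAK once it has been built" >&2
fi
```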

Save and close the file, then navigate to the CMAK directory and create a zip file for deploying the application:

				
					cd ~/CMAK
./sbt clean dist

You will get the following output:

				
					[info] Main Scala API documentation to /root/CMAK/target/scala-2.12/api...
[info] Non-compiled module 'compiler-bridge_2.12' for Scala 2.12.10. Compiling...
[info] Compiling 136 Scala sources and 2 Java sources to /root/CMAK/target/scala-2.12/classes ...
[info]   Compilation completed in 17.571s.
model contains 645 documentable templates
[info] Main Scala API documentation successful.
[info] LESS compiling on 1 source(s)
[success] All package validations passed
[info] Your package is ready in /root/CMAK/target/universal/cmak-3.0.0.6.zip
[success] Total time: 192 s (03:12), completed Mar 25, 2022, 5:51:09 AM
Graal diagnostic output saved in /root/CMAK/dumps/1648187271178/graal_diagnostics_10375.zip

Next, navigate to the ~/CMAK/target/universal directory and unzip the zip file:

				
					cd ~/CMAK/target/universal
unzip cmak-3.0.0.6.zip

Change to the extracted directory and run the cmak binary:

				
					cd cmak-3.0.0.6
bin/cmak

If everything is fine, you will get the following output:

				
					2022-03-25 05:52:48,313 - [INFO] k.m.a.KafkaManagerActor - Started actor akka://kafka-manager-system/user/kafka-manager
2022-03-25 05:52:48,315 - [INFO] k.m.a.KafkaManagerActor - Starting delete clusters path cache...
2022-03-25 05:52:48,326 - [INFO] k.m.a.DeleteClusterActor - Started actor akka://kafka-manager-system/user/kafka-manager/delete-cluster
2022-03-25 05:52:48,329 - [INFO] k.m.a.DeleteClusterActor - Starting delete clusters path cache...
2022-03-25 05:52:48,367 - [INFO] k.m.a.DeleteClusterActor - Adding kafka manager path cache listener...
2022-03-25 05:52:48,371 - [INFO] k.m.a.DeleteClusterActor - Scheduling updater for 10 seconds
2022-03-25 05:52:48,380 - [INFO] k.m.a.KafkaManagerActor - Starting kafka manager path cache...
2022-03-25 05:52:48,411 - [INFO] k.m.a.KafkaManagerActor - Adding kafka manager path cache listener...
2022-03-25 05:52:48,946 - [INFO] play.api.Play - Application started (Prod)
2022-03-25 05:52:49,443 - [INFO] k.m.a.KafkaManagerActor - Updating internal state...
2022-03-25 05:52:50,130 - [INFO] p.c.s.AkkaHttpServer - Listening for HTTP on /0:0:0:0:0:0:0:0:9000

At this point, CMAK is started and listening on port 9000.
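
Since bin/cmak keeps running in the foreground, you can confirm from a second terminal that the web interface responds. The sketch below assumes curl is installed and that you are checking from the server itself. The helper name http_up and the CMAK_URL variable are hypothetical.

```shell
# Return success if an HTTP request to URL gets any response within 5 seconds.
http_up() {
    curl --silent --output /dev/null --max-time 5 "$1"
}

CMAK_URL="${CMAK_URL:-http://localhost:9000/}"

if http_up "$CMAK_URL"; then
    echo "CMAK is answering on $CMAK_URL"
else
    echo "No response from $CMAK_URL" >&2
fi
```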

Access Kafka Cluster Manager

You can now access the Kafka Cluster Manager using the URL http://your-server-ip:9000. You should see the following page:

Add Cluster

Click on the Cluster => Add Cluster to add the cluster. You should see the following page:

CMAK Cluster

Provide your cluster information and click on the Save button. You should see the following page:

Kafka cluster view

Now, click on the Go to cluster view. You should see the following page:

How to Install Apache Kafka on Debian 11 (Linux Message Broker) Conclusion

In this guide, we explained how to install Apache Kafka on Debian 11 and how to install the Kafka Cluster Manager (CMAK) to manage it. I hope you can now deploy Apache Kafka in a production environment.

Hitesh Jethva
