Apache Kafka is an open source, distributed, high-throughput publish-subscribe messaging system.

If you are approaching Kafka for the first time, this post will help you get a distributed Kafka cluster running on your system with minimal steps. In this guide, we will walk through setting up Kafka on Ubuntu 16.04.

The basic architecture of Kafka is organized around a few key terms:

Zookeeper: a coordination service for the brokers in a cluster.

Topic: a category to which messages are published by producers.

Brokers: server instances that handle reads and writes of messages.

Producers: clients that write data into the cluster.

Consumers: clients that read data from the cluster.

Step 1: Install Java

Kafka needs a Java runtime environment:

$sudo apt-get update
$sudo apt-get install default-jre
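
To verify that Java installed correctly, you can check the version; the exact version string will vary with your system:

$java -version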

Step 2: Install Zookeeper

Zookeeper is a key-value store used to maintain server state, and it is mandatory for running Kafka.

It is a centralized service for maintaining configuration, and it also handles leader election among the brokers.

$sudo apt-get install zookeeperd

Let’s check whether it is running.

$telnet localhost 2181

At the prompt, enter this:

ruok

If everything is okay, the telnet session will reply with:

imok
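
If you prefer a non-interactive check, the same four-letter command can be piped through netcat (this assumes the nc utility is available, which it usually is on Ubuntu 16.04):

$echo ruok | nc localhost 2181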

Step 3: Create a service user for Kafka

Kafka is a network application; creating a non-sudo user minimizes the damage if the machine is compromised. Let’s create a user for it and name it “kafka”:

$sudo adduser --system --no-create-home --disabled-password --disabled-login kafka
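
As a quick sanity check, you can confirm the account exists:

$id kafka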

Step 4: Install Kafka

Download Kafka and unpack it in a convenient location, typically /opt:

$cd ~
$wget http://www-eu.apache.org/dist/kafka/1.1.0/kafka_2.11-1.1.0.tgz
$sudo mkdir /opt/kafka
$sudo tar -xvzf kafka_2.11-1.1.0.tgz --directory /opt/kafka --strip-components 1
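
A quick listing should now show the usual Kafka layout (bin, config, libs and so on):

$ls /opt/kafka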

Step 5: Configure the Kafka servers

Since Kafka stores its data on disk, we will create a directory for it:

$sudo mkdir /var/lib/kafka
$sudo mkdir /var/lib/kafka/data

Since we are setting up a distributed Kafka cluster, let’s configure 3 brokers.

If you open /opt/kafka/config/server.properties you will see many properties, but we will be dealing with only 3 of them. These three properties must be unique for each instance:

broker.id=0

listeners=PLAINTEXT://:9092

log.dirs=/tmp/kafka-logs

As we have 3 brokers, we will create a properties file for each one. Let’s copy the /opt/kafka/config/server.properties file and create 3 files, one per instance:

$sudo cp /opt/kafka/config/server.properties /opt/kafka/config/server-1.properties
$sudo cp /opt/kafka/config/server.properties /opt/kafka/config/server-2.properties
$sudo cp /opt/kafka/config/server.properties /opt/kafka/config/server-3.properties

Create the log directories for each server.

$sudo mkdir /var/lib/kafka/data/server-1
$sudo mkdir /var/lib/kafka/data/server-2
$sudo mkdir /var/lib/kafka/data/server-3

We will use these directories in the configuration.

Now, make the configuration changes for each Kafka server. Open each file in a text editor; I am using nano.

server-1.properties

$sudo nano /opt/kafka/config/server-1.properties

broker.id=1

listeners=PLAINTEXT://:9093

log.dirs=/var/lib/kafka/data/server-1

Save the changes and move on to the next server.

server-2.properties

$sudo nano /opt/kafka/config/server-2.properties

broker.id=2

listeners=PLAINTEXT://:9094

log.dirs=/var/lib/kafka/data/server-2

server-3.properties

$sudo nano /opt/kafka/config/server-3.properties

broker.id=3

listeners=PLAINTEXT://:9095

log.dirs=/var/lib/kafka/data/server-3
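
If you would rather script these edits than open each file in nano, a few sed one-liners can make the same three changes. This is just a convenience sketch for server-1, assuming the file still holds the stock values shown earlier (the listeners line is commented out by default); repeat with the matching values for server-2 and server-3:

$sudo sed -i 's|^broker.id=.*|broker.id=1|' /opt/kafka/config/server-1.properties
$sudo sed -i 's|^#\?listeners=.*|listeners=PLAINTEXT://:9093|' /opt/kafka/config/server-1.properties
$sudo sed -i 's|^log.dirs=.*|log.dirs=/var/lib/kafka/data/server-1|' /opt/kafka/config/server-1.properties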

If you would like to be able to delete topics, you also need to edit the delete.topic.enable setting. By default, Kafka does not allow topics to be deleted; it has to be enabled in the configuration. Find the line in each properties file and change it:

delete.topic.enable = true
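
With the flag enabled, a topic can later be removed with the same kafka-topics.sh tool used below, for example:

$bin/kafka-topics.sh --delete --topic topic-1 --zookeeper localhost:2181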

Step 6: Set permissions on the Kafka directories

We will give the kafka user (created in step 3) ownership of the Kafka directories:

$sudo chown -R kafka:nogroup /opt/kafka
$sudo chown -R kafka:nogroup /var/lib/kafka

Step 7: Start the brokers

Now we can start our brokers. Run these three commands in different terminal sessions:

$cd /opt/kafka
$bin/kafka-server-start.sh config/server-1.properties
$bin/kafka-server-start.sh config/server-2.properties
$bin/kafka-server-start.sh config/server-3.properties

You should see a startup message when the brokers start successfully.
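
If you would rather not keep three terminal sessions open, kafka-server-start.sh also accepts a -daemon flag that runs the broker in the background:

$bin/kafka-server-start.sh -daemon config/server-1.properties
$bin/kafka-server-start.sh -daemon config/server-2.properties
$bin/kafka-server-start.sh -daemon config/server-3.properties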

Test the installation

Create a topic

We need to create a topic first.

$bin/kafka-topics.sh --create --topic topic-1 --zookeeper localhost:2181 --partitions 3 --replication-factor 3

You should see a confirmation message after you create a topic.

The partitions option controls how many pieces the topic is split into across brokers. As we have 3 brokers, we can set this to 3.

The replication factor controls how many copies of the data are kept. This is helpful because if any broker goes down, the other brokers can take over its work.
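
You can inspect how the partitions and replicas were assigned with the --describe option; the leader and replica ids in the output will differ from run to run:

$bin/kafka-topics.sh --describe --topic topic-1 --zookeeper localhost:2181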

The Producer instance

The producer feeds data into the Kafka cluster. This command will push data into the cluster:

$bin/kafka-console-producer.sh --broker-list localhost:9093,localhost:9094,localhost:9095 --topic topic-1

The broker-list option takes the list of brokers we have configured.

The topic option specifies which topic the data is pushed to. In our case, we push the data to topic-1.

Once you execute this command, you will see a prompt where you can type messages. Hit Enter after each line to publish it as a new message.
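
For example, typing a few lines at the > prompt publishes one message per line:

>message one
>message two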

Consumers

We’ve produced messages; now let’s consume them. Run this command to consume the messages:

$bin/kafka-console-consumer.sh --bootstrap-server localhost:9093 --topic topic-1 --from-beginning

bootstrap-server is the broker used for the initial connection. It can be any of our 3 brokers.

from-beginning tells the consumer to read messages from the beginning of the topic.

This command shows all the messages that have been produced so far. You can also watch new messages arrive as they are produced if the producer is running in a separate terminal.
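
If you produced the two example messages shown above, the consumer output would simply be:

message one
message two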

Hope this helps you set up and configure Kafka on Ubuntu 16.04. Please try it and experiment.
