Apache Kafka: Message Queuing and Distributed Stream Processing

What is Apache Kafka?

Apache Kafka is an open-source distributed event streaming platform used by thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications.

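What is a Kafka Cluster?

Kafka is a distributed system that runs as a cluster. A Kafka cluster consists of a set of brokers. A cluster can run with a single broker, but production deployments typically use at least three so that data can be replicated across brokers.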

What is a Kafka Broker?

The broker is the Kafka server. It is simply a meaningful name given to the Kafka server, and the name makes sense because all Kafka does is act as a message broker between producer and consumer.

The producer and consumer don’t interact directly. They use the Kafka server as an agent, or broker, to exchange messages.

What is a Producer?

A producer is an application that sends messages. It does not send messages directly to the recipient; it sends them only to the Kafka server.

What is a Consumer?

A consumer is an application that reads messages from the Kafka server.

If producers are sending data, they must be sending it to someone, right? The consumers are the recipients. But remember that producers don’t send data to a recipient address; they just send it to the Kafka server. Anyone who is interested in that data can come forward and take it from the Kafka server. So any application that requests data from a Kafka server is a consumer, and it can ask for data sent by any producer, provided it has permission to read it.

What is a Kafka Topic?

We learned that the producer sends data to the Kafka broker, and that a consumer can then ask the broker for data. But the question is, which data?
We need some identification mechanism to request data from a broker. This is where the notion of the topic comes in.

A topic is like a table in a database or a folder in a file system.
A topic is identified by a name.
You can have any number of topics.
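
To make the idea of named topics concrete, here is a minimal in-memory sketch in Python. It is a toy model only: ToyBroker, send, and read are hypothetical names invented for this illustration and are not part of any Kafka client API. It simply shows a producer writing to a topic by name and a consumer asking for that topic's data.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for a broker (illustration only, not the Kafka API)."""

    def __init__(self):
        # topic name -> list of messages, kept in arrival order
        self._topics = defaultdict(list)

    def send(self, topic, message):
        """Called by a producer: append a message to the named topic."""
        self._topics[topic].append(message)

    def read(self, topic, start=0):
        """Called by a consumer: return the topic's messages from position `start`."""
        return list(self._topics[topic][start:])


broker = ToyBroker()

# Producers address a topic, never a specific consumer.
broker.send("orders", "order-1 created")
broker.send("payments", "payment-7 received")
broker.send("orders", "order-2 created")

# A consumer asks the broker for the topic it is interested in.
orders = broker.read("orders")
print(orders)  # ['order-1 created', 'order-2 created']
```

The producer and the consumer never reference each other here; the topic name is the only thing they share.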

What are Kafka Partitions?

Kafka topics are divided into a number of partitions, each of which contains records in an immutable sequence. Kafka brokers store the messages for a topic, but the volume of data can be enormous, and it may not be possible to store it all on a single computer. Therefore, the topic is split into multiple partitions that are distributed among multiple computers, since Kafka is a distributed system.
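
The text above does not say how a message ends up in a particular partition. One common approach, assumed here purely for illustration, is to hash a message key and take the result modulo the number of partitions. The sketch below uses CRC32 as a stand-in hash; choose_partition and NUM_PARTITIONS are hypothetical names, and this is not Kafka's actual partitioner.

```python
import zlib


def choose_partition(key: str, num_partitions: int) -> int:
    """Map a message key to a partition index in [0, num_partitions)."""
    # CRC32 is an illustrative stand-in hash. It is stable across runs,
    # unlike Python's built-in hash() for strings.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


NUM_PARTITIONS = 3  # assumed partition count for this example
partitions = [[] for _ in range(NUM_PARTITIONS)]

messages = [("user-1", "login"), ("user-2", "login"), ("user-1", "logout")]
for key, value in messages:
    index = choose_partition(key, NUM_PARTITIONS)
    partitions[index].append((key, value))

# Messages with the same key always land in the same partition,
# so their relative order is preserved within that partition.
for index, contents in enumerate(partitions):
    print(index, contents)
```

Under this scheme each partition could live on a different machine, which is what lets a topic grow beyond the capacity of a single computer.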

What is an Offset?

An offset is a sequential ID given to each message as it arrives at a partition. Once an offset is assigned, it never changes. The first message gets offset zero, the next message receives offset one, and so on.
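
The offset rule described above can be sketched as an append-only list in which a message's position is its offset. PartitionLog, append, and read_from are hypothetical names chosen for this illustration; this is a toy model, not how a Kafka broker is implemented.

```python
class PartitionLog:
    """Append-only log for one partition; the list index plays the role of the offset."""

    def __init__(self):
        self._messages = []

    def append(self, message):
        """Store a message and return the offset assigned to it."""
        offset = len(self._messages)  # first message gets 0, the next 1, and so on
        self._messages.append(message)
        return offset

    def read_from(self, offset):
        """Return (offset, message) pairs starting at the given offset."""
        return list(enumerate(self._messages))[offset:]


log = PartitionLog()
first = log.append("a")
second = log.append("b")
third = log.append("c")

print(first, second, third)  # 0 1 2
print(log.read_from(1))      # [(1, 'b'), (2, 'c')]
```

Because messages are only ever appended, an offset that has been handed out keeps pointing at the same message.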

USE CASES
Here are a few of the popular use cases for Apache Kafka:

Messaging
Website Activity Tracking
Metrics
Log Aggregation
Stream Processing
Event Sourcing
Commit Log
