Kafka Producer and Consumer Message Acknowledgement Options

1. Introduction

Apache Kafka is a messaging and streaming platform that provides acknowledgement options for tuning its reliability guarantees. In this tutorial, let's learn about the acknowledgement options available to producers and consumers in Apache Kafka.

2. Producer Acknowledgement Options

Even with a reliably configured Kafka broker, we must configure our producers to be reliable as well. We can configure producers using one of three acknowledgement modes, which we'll cover next.

2.1. Acknowledge None

We can set the property acks to 0:

static KafkaProducer<String, String> producerack0;
static Properties getProducerProperties() {
    Properties producerProperties = new Properties();
    producerProperties.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG,
      KAFKA_CONTAINER.getBootstrapServers());
    // key.serializer and value.serializer are required configuration
    producerProperties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
      StringSerializer.class.getName());
    producerProperties.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
      StringSerializer.class.getName());
    return producerProperties;
}
static void setUp() throws IOException, InterruptedException {
    Properties producerProperties = getProducerProperties();
    producerProperties.put(ProducerConfig.ACKS_CONFIG,
      "0");
    producerack0 = new KafkaProducer<>(producerProperties);
}

In this configuration, the producer doesn’t wait for a reply from the broker. It assumes that the message is sent successfully. If something goes wrong and the broker doesn’t receive the message, the producer is unaware of it, and the message is lost.

However, because the producer doesn’t wait for any response from the server, it can send messages as fast as the network supports, achieving high throughput.

A message is considered successfully written to Kafka as soon as the producer manages to send it over the network. Sending can still fail on the client side, for example if the record can't be serialized or the network interface fails. But once the request leaves the producer, we get no error even if the partition is offline, a leader election is in progress, or the entire Kafka cluster is unavailable.

Running with acks=0 results in low producer latency, although it doesn't improve end-to-end latency, since consumers won't see messages until they've been replicated to all in-sync replicas.

2.2. Acknowledge Leader

We can instead set the property acks to 1:

static KafkaProducer<String, String> producerack1;
static Properties getProducerProperties() {
    Properties producerProperties = new Properties();
    producerProperties.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG,
      KAFKA_CONTAINER.getBootstrapServers());
    // key.serializer and value.serializer are required configuration
    producerProperties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
      StringSerializer.class.getName());
    producerProperties.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
      StringSerializer.class.getName());
    return producerProperties;
}
static void setUp() throws IOException, InterruptedException {
    Properties producerack1Prop = getProducerProperties();
    producerack1Prop.put(ProducerConfig.ACKS_CONFIG,
      "1");
    producerack1 = new KafkaProducer<>(producerack1Prop);
}

The producer receives a successful response from the broker the moment the leader replica receives the message. If the message can’t be written to the leader due to any error, the producer receives an error response and retries sending the message, avoiding the potential loss of data.

We can still lose a message for other reasons, for instance if the leader crashes before the latest messages have been replicated to the replica that becomes the new leader.

The leader sends either an acknowledgement or an error as soon as it receives the message and writes it to the partition data file. We can still lose data if the leader shuts down or crashes, since some successfully written and acknowledged messages may not have been replicated to the followers yet.

With an acks=1 configuration, it’s possible to write to the leader faster than it can replicate messages, in which case we’ll end up with under-replicated partitions, as the leader acknowledges messages from the producer before replicating them. The latency is higher than an acks=0 configuration as we wait until one replica receives the message.

2.3. Acknowledge All

We can also set the property acks to all:

static KafkaProducer<String, String> producerackAll;
static Properties getProducerProperties() {
    Properties producerProperties = new Properties();
    producerProperties.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG,
      KAFKA_CONTAINER.getBootstrapServers());
    // key.serializer and value.serializer are required configuration
    producerProperties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
      StringSerializer.class.getName());
    producerProperties.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
      StringSerializer.class.getName());
    return producerProperties;
}
static void setUp() throws IOException, InterruptedException {
    Properties producerackAllProp = getProducerProperties();
    producerackAllProp.put(ProducerConfig.ACKS_CONFIG,
      "all");
    producerackAll = new KafkaProducer<>(producerackAllProp);
}

The producer receives a success response from the broker once all in-sync replicas receive the message. This is the safest mode since we can make sure more than one broker has the message and that the message survives even in case of a crash.

The leader waits until all in-sync replicas get the message before sending back an acknowledgement or an error. The min.insync.replicas configuration on the broker lets us specify the minimum number of in-sync replicas that must receive the message before the broker acknowledges it to the producer.

This is the safest option. The producer continues attempting to send the message until it’s fully committed. The producer latency is highest for this configuration as the producer waits for all in-sync replicas to get all the messages before it can mark the message batch as “done” and carry on.
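
As a quick numeric sketch of what min.insync.replicas buys us (the numbers here are illustrative, not from this article's setup): with a replication factor of 3 and min.insync.replicas=2, a write under acks=all still succeeds with one broker down, and producers start receiving errors only when a second replica drops out of sync:

```java
public class MinInsyncSketch {
    // How many broker failures a partition tolerates for writes under acks=all,
    // given its replication factor and the broker's min.insync.replicas setting.
    static int toleratedBrokerFailures(int replicationFactor, int minInsyncReplicas) {
        return replicationFactor - minInsyncReplicas;
    }

    public static void main(String[] args) {
        // replication.factor=3, min.insync.replicas=2: one broker may be down
        System.out.println(toleratedBrokerFailures(3, 2)); // prints 1
    }
}
```

Once fewer than min.insync.replicas replicas are in sync, acks=all producers receive a NotEnoughReplicas error rather than a silent, under-replicated write.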

Setting the acks property to the value -1 is equivalent to setting it to the value all.

The acks property accepts only three values: 0, 1, and all (or -1). Setting it to anything else causes Kafka to throw a ConfigException.

For acks configuration 1 and all, we can handle producer retries using producer properties retries, retry.backoff.ms, and delivery.timeout.ms.
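
As a sketch, these settings might be combined as follows; the raw string keys used here are the values behind the ProducerConfig constants, and the numbers are purely illustrative:

```java
import java.util.Properties;

public class RetryConfigSketch {
    // Raw config keys; these are the strings behind ProducerConfig.RETRIES_CONFIG,
    // ProducerConfig.RETRY_BACKOFF_MS_CONFIG and ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG.
    static Properties retryProps() {
        Properties props = new Properties();
        props.put("acks", "all");                  // wait for all in-sync replicas
        props.put("retries", "5");                 // retry transient send failures
        props.put("retry.backoff.ms", "200");      // pause 200 ms between attempts
        props.put("delivery.timeout.ms", "30000"); // give up on a record after 30 s
        return props;
    }

    public static void main(String[] args) {
        System.out.println(retryProps().getProperty("retries")); // prints 5
    }
}
```

Note that delivery.timeout.ms caps the total time a record may spend retrying, so it bounds how long a single slow or failing send can stall the producer.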

3. Consumer Acknowledgement Options

Data becomes accessible to consumers only after Kafka marks it as committed, meaning it has been written to all in-sync replicas. This guarantees that consumers receive consistent data. Their sole responsibility is to keep track of which messages they have read and which they have yet to read, which is key to not losing messages while consuming them.

When reading data from a partition, a consumer fetches a batch of messages, checks the last offset in the batch, and then requests another batch starting from that offset. This guarantees that a Kafka consumer always gets new data in the correct order without missing any messages.
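
The fetch loop described above can be sketched with a plain in-memory list standing in for the partition; this illustrates only the offset bookkeeping, not the consumer's actual network protocol:

```java
import java.util.List;

public class FetchLoopSketch {
    // Fetch up to batchSize records starting at fromOffset, as described above.
    static List<String> fetchBatch(List<String> partition, int fromOffset, int batchSize) {
        int end = Math.min(fromOffset + batchSize, partition.size());
        return partition.subList(fromOffset, end);
    }

    public static void main(String[] args) {
        List<String> partition = List.of("m0", "m1", "m2", "m3", "m4");
        int offset = 0;
        int read = 0;
        while (offset < partition.size()) {
            List<String> batch = fetchBatch(partition, offset, 2);
            read += batch.size();
            offset += batch.size(); // next request starts after the last offset received
        }
        System.out.println(read); // prints 5: every message seen once, in order
    }
}
```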

Four consumer configuration properties are important to understand in order to configure our consumer for the desired reliability behavior.

3.1. Group ID

Each Kafka consumer belongs to a group, identified by the property group.id.

Let's say we have two consumers with the same group ID. Kafka assigns each one a subset of the partitions in the topic, so each consumer individually reads a subset of the messages, while the group as a whole reads all of them.

If we need a consumer to see, on its own, every single message in its subscribed topics, it needs a unique group.id.
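
Setting the group ID follows the same pattern as the other properties in this tutorial. Here's a minimal sketch using the raw config key (the string behind ConsumerConfig.GROUP_ID_CONFIG); the group name my-group is arbitrary:

```java
import java.util.Properties;

public class GroupIdSketch {
    static Properties groupProps(String groupId) {
        Properties props = new Properties();
        // "group.id" is the raw key behind ConsumerConfig.GROUP_ID_CONFIG
        props.put("group.id", groupId);
        return props;
    }

    public static void main(String[] args) {
        // Two consumers created with these properties split the topic's partitions;
        // a consumer that must see every message gets its own, unique group ID instead.
        System.out.println(groupProps("my-group").getProperty("group.id")); // prints my-group
    }
}
```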

3.2. Auto Offset Reset

The property auto.offset.reset determines consumer behavior when it begins reading a partition without a committed offset or an invalid committed offset:

static Properties getConsumerProperties() {
    Properties consumerProperties = new Properties();
    consumerProperties.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG,
      KAFKA_CONTAINER.getBootstrapServers());
    // key.deserializer and value.deserializer are required configuration
    consumerProperties.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
      StringDeserializer.class.getName());
    consumerProperties.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
      StringDeserializer.class.getName());
    return consumerProperties;
}
Properties consumerProperties = getConsumerProperties(); 
consumerProperties.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false); 
consumerProperties.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "latest");
try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProperties)) {
    // ...
}

or

static Properties getConsumerProperties() {
    Properties consumerProperties = new Properties();
    consumerProperties.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG,
      KAFKA_CONTAINER.getBootstrapServers());
    // key.deserializer and value.deserializer are required configuration
    consumerProperties.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
      StringDeserializer.class.getName());
    consumerProperties.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
      StringDeserializer.class.getName());
    return consumerProperties;
}
Properties consumerProperties = getConsumerProperties(); 
consumerProperties.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false); 
consumerProperties.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProperties)) {
    // ...
}

The default value is latest, which means that, lacking a valid offset, the consumer starts reading from the newest records. The consumer considers only the records written after it starts running.

The alternative value for the property is earliest. It means that, lacking a valid offset, the consumer reads all the data in the partition, starting from the very beginning. Setting auto.offset.reset to none instead causes the consumer to throw an exception when it attempts to consume from an invalid offset.
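
For completeness, here's a sketch of the none variant, using the raw config keys (the strings behind the ConsumerConfig constants used above):

```java
import java.util.Properties;

public class OffsetResetSketch {
    static Properties noneResetProps() {
        Properties props = new Properties();
        props.put("enable.auto.commit", "false");
        // With "none", the consumer throws instead of silently jumping to the
        // latest or earliest offset when no valid committed offset exists.
        props.put("auto.offset.reset", "none");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(noneResetProps().getProperty("auto.offset.reset")); // prints none
    }
}
```

This setting is useful when jumping to an arbitrary position would be a bug, and we'd rather fail loudly and decide on a starting offset explicitly.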

3.3. Enable Auto Commit

The property enable.auto.commit controls whether the consumer commits offsets automatically, and it defaults to true:

static Properties getConsumerProperties() {
    Properties consumerProperties = new Properties();
    consumerProperties.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG,
      KAFKA_CONTAINER.getBootstrapServers());
    // key.deserializer and value.deserializer are required configuration
    consumerProperties.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
      StringDeserializer.class.getName());
    consumerProperties.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
      StringDeserializer.class.getName());
    return consumerProperties;
}
Properties consumerProperties = getConsumerProperties(); 
consumerProperties.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, true);
try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProperties)) {
    // ...
}

If set to false, we can control when the system commits the offsets. This enables us to minimize duplicates and avoid missing data.
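
The commit-after-processing idea can be sketched with an in-memory map standing in for Kafka's committed offsets. In real code the commit would be a call such as consumer.commitSync(); everything below is an illustrative stand-in:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ManualCommitSketch {
    // partition number -> next offset to read; stands in for Kafka's offsets topic
    static final Map<Integer, Long> committed = new HashMap<>();

    static void process(String record) {
        // business logic would go here
    }

    // Commit only after every record in the batch is processed, so a crash
    // mid-batch re-delivers the batch instead of losing it.
    static void processAndCommit(int partition, long firstOffset, List<String> batch) {
        for (String record : batch) {
            process(record);
        }
        committed.put(partition, firstOffset + batch.size());
    }

    public static void main(String[] args) {
        processAndCommit(0, 0L, List.of("m0", "m1", "m2"));
        System.out.println(committed.get(0)); // prints 3
    }
}
```

Committing after processing trades possible duplicates (at-least-once delivery) for the guarantee that no acknowledged-but-unprocessed message is lost.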

Setting the property enable.auto.commit to true allows us to control the frequency of commits using auto.commit.interval.ms.

3.4. Auto Commit Interval

In auto-commit configuration, the property auto.commit.interval.ms configures how frequently the Kafka system commits the offsets:

static Properties getConsumerProperties() {
    Properties consumerProperties = new Properties();
    consumerProperties.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG,
      KAFKA_CONTAINER.getBootstrapServers());
    // key.deserializer and value.deserializer are required configuration
    consumerProperties.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
      StringDeserializer.class.getName());
    consumerProperties.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
      StringDeserializer.class.getName());
    return consumerProperties;
}
Properties consumerProperties = getConsumerProperties(); 
consumerProperties.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, 7000); 
consumerProperties.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, true);
try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProperties)) {
    // ...
}

The default is every five seconds. In general, committing more frequently adds overhead but reduces the number of duplicates that can occur when a consumer stops.
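
As a rough sketch of that tradeoff (the rate and intervals below are illustrative): if a consumer crashes just before its next auto-commit, everything processed since the last commit is re-delivered, so the worst-case duplicate count grows linearly with the commit interval:

```java
public class DuplicateWindowSketch {
    // Worst-case number of re-delivered messages after a crash, assuming a
    // steady processing rate and a crash right before the next auto-commit.
    static long worstCaseDuplicates(long messagesPerSecond, long commitIntervalMs) {
        return messagesPerSecond * commitIntervalMs / 1000;
    }

    public static void main(String[] args) {
        // Default 5 s interval vs. the 7 s interval configured above, at 1,000 msg/s
        System.out.println(worstCaseDuplicates(1_000, 5_000)); // prints 5000
        System.out.println(worstCaseDuplicates(1_000, 7_000)); // prints 7000
    }
}
```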

4. Conclusion

In this article, we learned about the producer and consumer acknowledgement options in Apache Kafka and how to use them. Acknowledgement options let developers fine-tune the balance between performance and reliability, making Kafka a versatile system for a variety of use cases.

As always, the complete code used in this article is available over on GitHub.

The post Kafka Producer and Consumer Message Acknowledgement Options first appeared on Baeldung.