
Re: Exactly once transactions

Please note that for exactly-once it is not sufficient to simply set an option. The producer and consumer must each make sure that a message is processed only once. For instance, the consumer can crash and, after restart, reprocess messages it had already handled; or the producer can crash and end up writing the same message twice to a database, etc.
All of these things can also happen due to a backup and restore, or through a manual admin action.
Those are not "edge" scenarios. In operations they can happen quite often, especially in a containerized infrastructure.
You have to consider this in your technical design for all messaging solutions, not only Kafka.
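One common way to make the consumer side idempotent is to store a unique message ID together with the business write and skip any message whose ID is already present. A minimal sketch of that idea in plain Java (the in-memory map stands in for a real database table with a unique-key constraint; all class and method names here are illustrative, not from any library):

```java
import java.util.HashMap;
import java.util.Map;

// Idempotent sink: the "database" remembers which message IDs it has
// already applied, so redelivered messages are skipped instead of
// being written a second time.
public class IdempotentSink {
    private final Map<String, String> store = new HashMap<>(); // msgId -> payload

    /** Returns true if the message was applied, false if it was a duplicate. */
    public boolean apply(String msgId, String payload) {
        // In a real database this would be an INSERT guarded by a unique
        // constraint on msgId, executed in the same transaction as the
        // business write.
        if (store.containsKey(msgId)) {
            return false; // duplicate delivery: ignore
        }
        store.put(msgId, payload);
        return true;
    }

    public int size() {
        return store.size();
    }
}
```

With this pattern a redelivery after a crash becomes a no-op on the second apply, so the observable effect is exactly-once even though the delivery itself was only at-least-once.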

> On 30.10.2019 at 20:30, Sergi Vladykin <sergi.vladykin@gmail.com> wrote:
>
> Hi!
>
> I am investigating the possibilities of "exactly once" Kafka transactions for
> the consume-transform-produce pattern. As far as I understand, the logic must
> be the following (in pseudo-code):
>
> var cons = createKafkaConsumer(MY_CONSUMER_GROUP_ID);
> cons.subscribe(TOPIC_A);
> for (;;) {
>     var recs = cons.poll();
>     for (var part : recs.partitions()) {
>         var partRecs = recs.records(part);
>         var prod = getOrCreateProducerForPart(MY_TX_ID_PREFIX + part);
>         prod.beginTransaction();
>         sendAllRecs(prod, TOPIC_B, partRecs);
>         prod.sendOffsetsToTransaction(
>             singletonMap(part, lastRecOffset(partRecs) + 1),
>             MY_CONSUMER_GROUP_ID);
>         prod.commitTransaction();
>     }
> }
>
> Is this the right approach?
>
> It looks to me like there is a possible race here, and the same record
> from topic A can be committed to topic B more than once: if a rebalance
> happens after our thread has polled a record but before it creates a
> producer, then another thread will read and commit the same record; after
> that our thread will wake up, create a producer (fencing the other one),
> and successfully commit the same record a second time.
>
> Can anyone please explain how to do "exactly once" in Kafka correctly?
> Examples would be very helpful.
>
> Sergi
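The race described in the quoted mail is closely related to Kafka's transactional fencing: when a second producer calls initTransactions() with the same transactional.id, the coordinator bumps an epoch and rejects further writes and commits from the older ("zombie") producer. A self-contained toy model of that epoch mechanism in plain Java (no Kafka involved; the Coordinator and Producer classes and all names are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of transactional fencing: the coordinator tracks an epoch per
// transactional.id, and any producer holding a stale epoch is rejected.
public class FencingDemo {
    static class Coordinator {
        private final Map<String, Integer> epochs = new HashMap<>();

        int initTransactions(String txId) {
            // Each init bumps the epoch, fencing every older producer
            // registered under the same transactional.id.
            return epochs.merge(txId, 1, (old, inc) -> old + inc);
        }

        void checkFenced(String txId, int epoch) {
            if (epoch < epochs.getOrDefault(txId, 0)) {
                throw new IllegalStateException("producer fenced: " + txId);
            }
        }
    }

    static class Producer {
        private final Coordinator coord;
        private final String txId;
        private final int epoch;

        Producer(Coordinator coord, String txId) {
            this.coord = coord;
            this.txId = txId;
            this.epoch = coord.initTransactions(txId);
        }

        void commitTransaction() {
            coord.checkFenced(txId, epoch); // throws if a newer producer took over
        }
    }
}
```

Note that fencing alone does not close the race in the quoted pseudo-code: there the *stale* thread creates its producer last, so it fences the thread that already committed and then commits the record again. The usual remedies are to tie producer lifecycle to partition assignment (closing per-partition producers from a ConsumerRebalanceListener when partitions are revoked) or, with Kafka 2.5+ clients (KIP-447), to use a single producer and pass consumer.groupMetadata() to sendOffsetsToTransaction so the consumer group itself provides the fencing.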
