Posts

Downscaling controllers in KRaft cluster leaves troublesome traces behind

In a personal cluster I used to have three KRaft controllers running Kafka 4.2.0. Because of "reasons" (the cluster is a lab), I downsized the cluster and now run a single controller, having only modified "controller.quorum.voters" to list the surviving controller. It has worked fine so far. Yes, I know that a single controller is a risk. Today I upgraded the brokers and the controller to Kafka 4.3.0 and tried to upgrade the cluster version using "kafka-features.sh upgrade --release-version 4.3", but it complains that the "old" controllers, offline and destroyed, are not compatible (they were on Kafka 4.2.0 when they were decommissioned). 1. How can I get rid of those dead controllers still haunting me? 2. I have tried to migrate to dynamic controller membership, but just changing "controller.quorum.voters" to "controller.quorum.bootstrap.server" doesn't work, although it is documented in <h...
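The risk of a single controller mentioned in the post can be made concrete with a little quorum arithmetic. The following is a minimal sketch, not Kafka code: it assumes the static `controller.quorum.voters` value is a comma-separated list of `id@host:port` entries, and the host names `controller-1` to `controller-3` and port 9093 are hypothetical placeholders. It does not address how to remove the stale controllers.

```python
def parse_voters(value: str) -> dict[int, tuple[str, int]]:
    """Parse a static voter list of the form 'id@host:port,id@host:port,...'."""
    voters = {}
    for entry in value.split(","):
        node_id, _, endpoint = entry.strip().partition("@")
        host, _, port = endpoint.rpartition(":")
        voters[int(node_id)] = (host, int(port))
    return voters


def majority(voter_count: int) -> int:
    """Smallest number of voters that forms a Raft majority."""
    return voter_count // 2 + 1


def tolerated_failures(voter_count: int) -> int:
    """Voters that can be lost while a majority is still reachable."""
    return voter_count - majority(voter_count)


# Hypothetical values: the original three-voter list and the shrunken one.
before = parse_voters("1@controller-1:9093,2@controller-2:9093,3@controller-3:9093")
after = parse_voters("1@controller-1:9093")

for label, voters in (("before", before), ("after", after)):
    n = len(voters)
    print(f"{label}: voters={n} majority={majority(n)} tolerated_failures={tolerated_failures(n)}")
```

With three voters the quorum survives the loss of one controller; with one voter it survives none, which is the trade-off the poster acknowledges.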

[RESULTS] [VOTE] Release Kafka version 4.2.1

This vote passes with 5 +1 votes (3 binding) and no 0 or -1 votes.

+1 votes
PMC Members:
* Mickael Maison
* Lucas Brutschy
* Chia-Ping Tsai

Committers:
* No votes

Community:
* Jakub Scholz
* Maroš Orsák

0 votes
* No votes

-1 votes
* No votes

Vote thread: https://lists.apache.org/thread/o3rcbdwv2sl264x71yy26tv71s1mxmkv, https://lists.apache.org/thread/y5m2p9572sv7x2odc8b4yqzk7yr1z786

I'll continue with the release process and the release announcement will follow in the next few days.

PoAn

Kafka Streams REPLACE_THREAD recovery delayed by session.timeout.ms since 4.0 — dying consumer no longer sends LeaveGroup (intentional?)

Hi all, We're upgrading a Kafka Streams service from 3.9.0 to 4.x and ran into a behavioral change in REPLACE_THREAD recovery, and we'd like to understand whether it is intended. Summary: When the StreamsUncaughtExceptionHandler returns REPLACE_THREAD, recovery time jumps from sub-second (3.9.x) to roughly `session.timeout.ms` (45 s default) starting with 4.0. We've traced this to the dying consumer no longer sending a `LeaveGroup` request on shutdown, so the broker has to wait for the session to expire before it triggers a rebalance. This appears to be the combined effect of KIP-1092 (which added the GroupMembershipOperation filter in AbstractCoordinator) and Streams' replaceStreamThread() passing REMAIN_IN_GROUP to the consumer's close path. We understand the rationale (avoid partition bouncing for stateful apps where the new thread starts in the same JVM), but the 45 s floor undermines KIP-671's promise of "fast in-place recovery from ...
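The timing difference described in the post can be illustrated with a toy model. This is a sketch under stated assumptions, not Kafka's coordinator logic: it assumes the coordinator notices a departure either immediately on an explicit leave or when the session timer, restarted by the last heartbeat, expires. The names `MemberExit` and `rebalance_trigger_ms` are hypothetical, and network latency, heartbeat intervals and protocol details are ignored.

```python
from dataclasses import dataclass

SESSION_TIMEOUT_MS = 45_000  # the default quoted in the post


@dataclass
class MemberExit:
    last_heartbeat_ms: int   # last heartbeat seen by the (toy) coordinator
    closed_at_ms: int        # when the dying consumer was closed
    sends_leave_group: bool  # True models 3.9.x, False models REMAIN_IN_GROUP


def rebalance_trigger_ms(member: MemberExit,
                         session_timeout_ms: int = SESSION_TIMEOUT_MS) -> int:
    """Time at which the toy coordinator learns the member is gone."""
    if member.sends_leave_group:
        # An explicit leave is seen as soon as the consumer closes.
        return member.closed_at_ms
    # Otherwise only session expiry reveals the departure.
    return member.last_heartbeat_ms + session_timeout_ms


# Hypothetical timeline: last heartbeat at t=9 s, thread replaced at t=10 s.
with_leave = MemberExit(9_000, 10_000, sends_leave_group=True)
without_leave = MemberExit(9_000, 10_000, sends_leave_group=False)

delay_with = rebalance_trigger_ms(with_leave) - with_leave.closed_at_ms
delay_without = rebalance_trigger_ms(without_leave) - without_leave.closed_at_ms
print(f"delay with leave: {delay_with} ms, without leave: {delay_without} ms")
```

In this model the delay without an explicit leave is bounded above by the session timeout, which matches the "~45 s floor" the poster reports only approximately, since the real delay depends on when the last heartbeat was sent.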

Re: [VOTE] 4.2.1 RC5

Hi,

Besides manual checks, I have also run the Strimzi test container tests on an Aarch64 image built from https://dist.apache.org/repos/dist/dev/kafka/4.2.1-rc5/kafka_2.13-4.2.1.tgz; all tests passed without issues.

+1 (non-binding),

Cheers,
Maros

On Thu, 28 May 2026 at 6:05, Chia-Ping Tsai <chia7712@apache.org> wrote:

> +1 (binding)
>
> - Ran E2E tests locally; all passed with a few retries.
> - Set up a 3-node cluster and ran a simple producer/consumer workflow; no
> obvious performance regression.
> - Ran unit/integration tests locally; all passed with minimal retries.
>
> Best,
> Chia-Ping
>
> On 2026/05/14 10:41:11 PoAn Yang wrote:
> > Hello Kafka users, developers and client-developers,
> >
> > This is the third candidate for release of Apache Kafka 4.2.1.
> >
> > This is the first bug fix release for Apache Kafka 4.2 with fixes as
> > described in the release notes. In the t...

Re: [VOTE] 4.2.1 RC5

+1 (binding)

- Ran E2E tests locally; all passed with a few retries.
- Set up a 3-node cluster and ran a simple producer/consumer workflow; no obvious performance regression.
- Ran unit/integration tests locally; all passed with minimal retries.

Best,
Chia-Ping

On 2026/05/14 10:41:11 PoAn Yang wrote:
> Hello Kafka users, developers and client-developers,
>
> This is the third candidate for release of Apache Kafka 4.2.1.
>
> This is the first bug fix release for Apache Kafka 4.2 with fixes as
> described in the release notes. In the third candidate, it adds KAFKA-20572.
>
> Release notes for the 4.2.1 release:
> https://dist.apache.org/repos/dist/dev/kafka/4.2.1-rc5/RELEASE_NOTES.html
>
> Please download, test and vote by Thursday, May 20, 9pm PT.
>
> Kafka's KEYS file containing PGP keys we use to sign the release:
> https://kafka.apache.org/KEYS
>
> * Release artifacts to be voted upon (source and bi...

Re: [ANNOUNCE] Apache Kafka 4.3.0

Thanks to Mickael for driving the 4.3.0 release.

> On May 22, 2026, at 8:29 PM, Mickael Maison <mimaison@apache.org> wrote:
>
> The Apache Kafka community is pleased to announce the release for
> Apache Kafka 4.3.0
>
> All of the changes in this release can be found in the release notes:
> https://www.apache.org/dist/kafka/4.3.0/RELEASE_NOTES.html
>
> An overview of the release can be found in our announcement blog post:
> https://kafka.apache.org/blog/2026/05/22/apache-kafka-4.3.0-release-announcement/
>
> You can download the source and binary release (Scala 2.13) from:
> https://kafka.apache.org/downloads#4.3.0
>
> ---------------------------------------------------------------------------------------------------
>
> Apache Kafka is a distributed streaming platform with four core APIs:
>
> ** The Producer API allows an application to publish a stream of records to
> one or more ...