RE: Kafka Streams REPLACE_THREAD recovery delayed by session.timeout.ms since 4.0 — dying consumer no longer sends LeaveGroup (intentional?)
We have observed similar behavior. We are currently upgrading our Spring Boot application from 4.0.6 to 4.1.0, which in turn upgrades Kafka from 4.1.2 to 4.2.1.
During this upgrade we noticed that the behavior of REPLACE_THREAD / StreamsUncaughtExceptionHandler has changed.
In Kafka 4.1.2, when REPLACE_THREAD was returned, the old StreamThread shut down and left the group, and the new thread could then resume work immediately.
In Kafka 4.2.1, when REPLACE_THREAD is returned, the old StreamThread remains in the group until max.poll.interval.ms has elapsed, and only then does the new StreamThread start receiving work.
We believe this is caused by https://github.com/apache/kafka/commit/68f1da84740092dbd5ebb49ae62035174758b98b#diff-ab27af136b0c45ed402ec44368a91380b018bf06f1a9722324fe6be8d5220f7dR491-R978, where leaveGroupRequested is now set to false / REMAIN_IN_GROUP; previously it was set to true / LEAVE_GROUP.
We are using the classic protocol.
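If the delay is indeed bounded by a consumer timeout, lowering that timeout through the Kafka Streams configuration might shorten the recovery window as a stopgap. Below is a minimal, untested sketch. The class and method names are hypothetical and the values are purely illustrative. Which of the two timeouts actually governs the delay is an assumption: the subject line mentions session.timeout.ms, while max.poll.interval.ms is what is described above. The sketch relies only on the "consumer." prefix that Kafka Streams uses to pass settings to its embedded consumers.

```java
import java.util.Properties;

public class ReplaceThreadTimeouts {

    // Hypothetical helper: builds consumer timeout overrides for a Kafka Streams app.
    // The values are illustrative only, not recommendations and not taken from a real setup.
    public static Properties consumerTimeoutOverrides() {
        Properties props = new Properties();
        // Assumption: one of these two timeouts bounds how long the old member lingers.
        props.setProperty("consumer.session.timeout.ms", "10000");
        props.setProperty("consumer.max.poll.interval.ms", "60000");
        return props;
    }

    public static void main(String[] args) {
        // Print the overrides so they can be copied into the application configuration.
        consumerTimeoutOverrides().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

These entries would be merged into the properties passed to KafkaStreams (in Spring Boot, typically under spring.kafka.streams.properties.*). Lower timeouts can also cause more frequent rebalances for slow but healthy threads, so this only trades one delay for another risk.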
On 2026/05/2...