
Re: Unexpected Rebalances, Any tips on APIs or debug techniques to figure out rebalance causes?

1. Is my understanding of B correct? I think it may be the major factor
for the rebalances in my case, because the consumer processes data slowly.
=> It looks that way, but we cannot confirm it without more information.
You should check the consumer log to see why the consumer left the
group.

2. I ran an experiment to verify B but could not reproduce it: neither a
fast nor a slow consumer triggers a rebalance.
=> You should also adjust the heartbeat interval (heartbeat.interval.ms)
so that the heartbeat can detect the poll expiration.
You can refer to this test:
https://github.com/apache/kafka/blob/trunk/core/src/test/scala/integration/kafka/api/PlaintextConsumerTest.scala#L167

3. Any tips on APIs or debug techniques to figure out rebalance causes?
=> On the server side, you can check for log lines like this:
"Preparing to rebalance group xxx ... (reason: yyyyy)"

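When there are many such lines, a small Go helper can pull out the group
and the reason. This is only a sketch: the regular expression follows the
shape quoted above, the sample line in main is made up for illustration,
and real broker log lines may differ between Kafka versions. The name
parseRebalance is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// rebalanceRe matches "Preparing to rebalance group <name> ... (reason: <text>)",
// optionally followed by a trailing "(logger.name)" suffix at the end of the line.
var rebalanceRe = regexp.MustCompile(
	`Preparing to rebalance group (\S+).*?\(reason: (.*?)\)(?: \([\w.]+\))?\s*$`)

// parseRebalance extracts the group name and the rebalance reason from one
// log line. ok is false when the line is not a rebalance line.
func parseRebalance(line string) (group, reason string, ok bool) {
	m := rebalanceRe.FindStringSubmatch(line)
	if m == nil {
		return "", "", false
	}
	return m[1], m[2], true
}

func main() {
	// Illustrative line only; not copied from a real broker log.
	line := `INFO Preparing to rebalance group demo-group in state PreparingRebalance ` +
		`(reason: removing member consumer1 on LeaveGroup) (kafka.coordinator.group.GroupCoordinator)`
	if group, reason, ok := parseRebalance(line); ok {
		fmt.Printf("group=%s reason=%s\n", group, reason)
	}
}
```

Counting the extracted reasons per group is often enough to tell whether
members are leaving, timing out, or newly joining.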


4. How can I trigger it manually?
=> Same as the answer to question 2.

5. Is it a bad idea to have the same Consumer Group (Same ID) consuming
from multiple topics?
=> It depends on your use case; it is neither good nor bad in itself.


Thank you.
Luke


On Wed, Apr 27, 2022 at 11:58 PM 杨宝栓 <yangbaoshuan@rcrai.com> wrote:

>
>
> Hi,
> We are seeing unexpected rebalances in our Golang consumers, described
> below.
> 1. We have a topic with 36 partitions and one consumer (let's call it
> consumer1) consuming it.
> 2. We run Kafka in Docker with the default configuration.
> 3. The consumer is slow: processing one record takes about 1 s.
> 4. All the consumers for topic A are in the same group.
> 5. The rebalances are intermittent and hard to reproduce. We see
> no obvious errors in the logs.
> 6. No matter how we change the configuration that affects
> rebalancing, the group still rebalances.
> The settings that affect rebalancing are listed below:
> max.poll.interval.ms
> max.poll.records
> request.timeout.ms
> session.timeout.ms
> As far as I understand, a rebalance happens when a consumer is
> considered DEAD by the group coordinator, in one of these cases:
> A. The consumer is busy, so it sends no heartbeats to the group
> coordinator within the configured session timeout.
> B. The consumer is slow because of long-running processing, so the
> interval between poll() calls exceeds the configured
> max.poll.interval.ms.
> Questions:
> 1. Is my understanding of B correct? I think it may be the major
> factor for the rebalances in my case, because the consumer processes
> data slowly.
> 2. I ran an experiment to verify B but could not reproduce it: neither
> a fast nor a slow consumer triggers a rebalance.
> 3. Any tips on APIs or debug techniques to figure out rebalance
> causes?
> 4. How can I trigger it manually?
> 5. Is it a bad idea to have the same Consumer Group (Same ID)
> consuming from multiple topics?
>
