Re: Kafka Group coordinator discovery failing for subsequent restarts

Hi

About question 1: it doesn't matter how many consumers are in the same
consumer group.

So you mean the broker acting as coordinator has not crashed at all before?

May I know whether exactly one broker (the coordinator) is unavailable, or
several? If it is exactly one, you can try moving leadership of the
__consumer_offsets partitions hosted on that broker to another broker and
see whether the problem goes away.
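
To narrow down which broker and which offsets partition are involved, here
is a minimal Python sketch. It rests on two things as I understand them from
the Kafka source, so please verify against your version: the Java client logs
the coordinator with a connection id of Integer.MAX_VALUE minus the broker id,
and the broker picks the __consumer_offsets partition for a group as
abs(groupId.hashCode()) % (number of offsets partitions). The function names
are made up for illustration, and 50 is the partition count from your mail.

```python
INT_MAX = 2**31 - 1
INT_MIN = -(2**31)


def java_string_hash(s: str) -> int:
    """Replicate Java's String.hashCode(): 32-bit signed, over UTF-16 code units."""
    data = s.encode("utf-16-be")
    h = 0
    for i in range(0, len(data), 2):
        unit = (data[i] << 8) | data[i + 1]
        h = (31 * h + unit) & 0xFFFFFFFF  # wrap like a Java int
    return h - (1 << 32) if h >= (1 << 31) else h


def offsets_partition_for(group_id: str, num_partitions: int = 50) -> int:
    """Assumed mapping of a group id to its __consumer_offsets partition."""
    h = java_string_hash(group_id)
    # Assumption: Kafka's abs helper maps Integer.MIN_VALUE to 0.
    positive = 0 if h == INT_MIN else abs(h)
    return positive % num_partitions


def broker_id_from_coordinator_id(coordinator_id: int) -> int:
    """Recover the broker id from the 'id:' value in the client log line."""
    return INT_MAX - coordinator_id


if __name__ == "__main__":
    print(broker_id_from_coordinator_id(2147483631))
    print(offsets_partition_for("my-group"))  # "my-group" is a placeholder
```

With the id from your log (2147483631) this gives broker 16. Feeding a real
group name into offsets_partition_for then tells you which
__consumer_offsets partition to check, and you can look up that partition's
leader when describing the topic.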

I found the following issue, which seems similar to yours, FYR:

https://stackoverflow.com/questions/51952398/kafka-connect-distributed-mode-the-group-coordinator-is-not-available

Best,
Lisheng


Hrishikesh Mishra <sd.hrishi@gmail.com> wrote on Thu, Aug 29, 2019 at 12:19 PM:

> Hi,
>
> We are facing the following issues with our Kafka cluster.
>
> - Kafka Version: 2.0.0
> - We have the following cluster configuration:
> - Number of Brokers: 14
> - Per Broker: 37GB Memory and 14 Cores.
> - Topics: 40 - 50
> - Partitions per topic: 32
> - Replicas: 3
> - Min In-Sync Replicas: 2
> - __consumer_offsets partitions: 50
> - offsets.topic.replication.factor=3
> - default.replication.factor=3
> - Consumers#: ~4000 (will grow to ~7K)
> - Consumer Groups#: ~4000 (will grow to ~7K)
>
>
> Important: each consumer consumes from one topic, and each consumer group
> has only one consumer, due to some architectural constraints.
>
> Two major problems we are facing with consumer groups:
>
> - The first time we start a consumer with a new group name, it works
> very well. But a subsequent restart (with the previous / older group
> name) causes problems for some consumers. We get the following
> errors:
>
> INFO [2019-08-28 19:05:34,481] [main] [AbstractCoordinator]: [Consumer
> clientId=djXXX#XXX-XXX-XXX-XX-5-1478729-XX-XXXXX-XX-ingestion-v2,
> groupId=djXXX#XXX-XXX-XXX-XX-5-1478729-XX-XXXXX-XX-ingestion-v2]
> Discovered
> group coordinator 10.XX.XXX.112:9092 (id: 2147483631 rack: null)
> INFO [2019-08-28 19:05:34,481] [main] [AbstractCoordinator]: [Consumer
> clientId=djXXX#XXX-XXX-XXX-XX-5-1478729-XX-XXXXX-XX-ingestion-v2,
> groupId=djXXX#XXX-XXX-XXX-XX-5-1478729-XX-XXXXX-XX-ingestion-v2] Group
> coordinator 10.XX.XXX.112:9092 (id: 2147483631 rack: null) is
> unavailable
> or invalid, will attempt rediscovery
> INFO [2019-08-28 19:05:34,582] [main] [AbstractCoordinator]: [Consumer
> clientId=djXXX#XXX-XXX-XXX-XX-5-1478729-XX-XXXXX-XX-ingestion-v2,
> groupId=djXXX#XXX-XXX-XXX-XX-5-1478729-XX-XXXXX-XX-ingestion-v2]
> Discovered
> group coordinator 10.32.197.112:9092 (id: 2147483631 rack: null)
>
> These messages keep coming and the consumer is not able to start / poll.
> But if we change the group name, it works the first time without any
> issue (and fails on a subsequent restart). So it also means that there is
> no issue with the broker. Could it be because of having a single consumer
> in the consumer group? If yes, what would be the workaround here?
>
> - The second error occurs while the consumer is up and running. After
> a couple of hours, it starts failing and throws the following error:
> [Consumer clientId=banneXXXX#XX-XXX-XXX-XXX-X-1388688-XXX-XXXXX,
> groupId=bannerXXX#XX-XXX-XXX-XXX-X-1388688-XXX-XXXXX] Offset commit
> failed
> on partition banneXXXX-7 at offset 13711176: This is not the correct
> coordinator
> [Consumer
>
> clientId=banXXerGrXXMXX#XX-XX-XXXXX-XXX-5-1478733-XXX-XXXXX-ingestion-v2,
> groupId=banXXerGrXXMXX#XX-XX-XXXXX-XXX-5-1478733-XXX-XXXXX-ingestion-v2]
> Offset commit failed on partition banXXerGrXXMXX-8 at offset 14741:
> This is
> not the correct coordinator.
>
>
> I wanted to know the following things:
>
> - What is the max limit of consumer groups in a Kafka cluster? I didn't
> find any limitation on the internet; everywhere it is mentioned that it
> is limited by the OS.
> - Is there a problem if a consumer group has only one consumer?
> - Is there some problem with my Kafka configuration?
>
>
>
>
> Regards
> Hrishikesh
>