We were running client version 2.3.0 for a while, then bumped to 2.3.1 for a particular Kafka Streams bug fix. We saw this issue while both versions were running.
Brandon
________________________________
From: Jamie <jamiedd13@aol.co.uk.INVALID>
Sent: Thursday, January 30, 2020 1:03 PM
To: users@kafka.apache.org <users@kafka.apache.org>
Subject: Re: High CPU in 2.2.0 kafka cluster
Hi Brandon,
Which version of Kafka are the consumers running? My understanding is that if they're running a lower version than the brokers, they could be using a different message format, which means the brokers have to convert each record before sending it to the consumer.
Thanks,
Jamie
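
The down-conversion concern above can be sketched roughly as follows. This is only an illustration, not broker code: the helper names are hypothetical, and it assumes the topic's log.message.format.version matches the broker release (2.2.0) rather than being pinned to an older format. It relies on the message format having changed in 0.10.0 (v1) and 0.11.0 (v2).

```python
# Illustrative sketch only: helper names are hypothetical, not Kafka APIs.

def parse_version(version):
    """Turn '2.3.1' into (2, 3, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def max_message_format(version):
    """Newest message format (magic value) a given Kafka version understands."""
    v = parse_version(version)
    if v >= (0, 11, 0):
        return 2  # record batch format, introduced in 0.11.0
    if v >= (0, 10, 0):
        return 1  # format with timestamps, introduced in 0.10.0
    return 0      # original format


def needs_down_conversion(log_format_version, consumer_version):
    """True if the broker would have to convert stored records for this consumer.

    Assumes log_format_version is the format the broker writes to disk.
    """
    return max_message_format(consumer_version) < max_message_format(log_format_version)


if __name__ == "__main__":
    for consumer in ("0.10.2", "2.3.0", "2.3.1"):
        print(consumer, needs_down_conversion("2.2.0", consumer))
```

Under these assumptions, only consumers older than 0.11.0 would force a 2.2.0 broker to down-convert; consumers at or above the broker's version read the stored format directly.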
-----Original Message-----
From: Brandon Barron <brandon.barron@live.com>
To: users@kafka.apache.org <users@kafka.apache.org>
Sent: Thu, 30 Jan 2020 16:11
Subject: High CPU in 2.2.0 kafka cluster
Hi,
We had a small cluster (4 brokers) dealing with very low throughput - a couple hundred messages per minute at the very most. In that cluster we had a little under 3300 total consumers (all were Kafka Streams instances). All broker CPUs were maxed out almost consistently for a few weeks.
We switched traffic to a new cluster eventually. The old cluster sitting idle for a few days was at ~40% CPU, with consumers still running. When I took down all the consumers, the idle CPU on the brokers went to about 4%.
To test, we decided to mirror active traffic in our new cluster to the old cluster (which now has no running consumers). The CPU didn't budge; it's still at ~4% as expected with the low throughput.
One more thing to add: I ran a thread profiler on a couple brokers when the old cluster was taking active traffic with running consumers and the CPU was maxed out. Each time, I saw the ReplicaFetcherThread eating up around 40% of CPU time.
Can you give any advice on what might be the root cause of this?
Thanks,
Brandon