Not really.
The only exact way would be to consume the topic, but this is rather
expensive.
Note: The regular use case for Kafka is that new data is appended all
the time; thus, the current number of messages changes all the time.
Therefore, it does not seem useful to know the current count, as the
information most likely goes stale quickly.
-Matthias
On 4/29/19 5:52 AM, jaaz jozz wrote:
> Hey Matthias,
> Is there a better way than Peter's suggestion?
> Is there a definite way to count the available messages that takes
> compaction and transactions into account?
>
> On Sun, Apr 28, 2019 at 7:41 PM Matthias J. Sax <matthias@confluent.io>
> wrote:
>
>> This won't work if your topic is compacted though. Also, if you are
>> using transactions, it might not be accurate, depending on how many
>> transaction markers are in the topics.
>>
>> -Matthias
>>
>> On 4/28/19 2:59 PM, Peter Bukowinski wrote:
>>> You'll need to do this programmatically with some simple math. There's a
>> binary included with kafka called kafka-run-class that you can use to
>> expose earliest and latest offset information.
>>>
>>> This will return the earliest unexpired offsets for each partition in a
>> topic:
>>>
>>> kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list
>> localhost:9092 --topic TOPIC --time -2
>>>
>>> This will return the latest offset:
>>>
>>> kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list
>> localhost:9092 --topic TOPIC --time -1
>>>
>>> With that info, subtract the earliest from the latest per partition, sum
>> the results, and you'll have the number of messages available in your topic.
>>>
>>> -- Peter
>>>
>>>> On Apr 28, 2019, at 1:41 AM, jaaz jozz <jazzlofi2@gmail.com> wrote:
>>>>
>>>> Hello,
>>>> I want to count how many messages are available in each topic in my Kafka
>>>> cluster.
>>>> I understand that just looking at the latest offset available is not
>>>> correct, because older messages may have been already purged due to
>>>> retention policy.
>>>> So what is the correct way of counting that?
>>>>
>>>> Thanks,
>>>> Jazz.
>>
>>
>
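For reference, the per-partition arithmetic Peter describes can be sketched in a few lines of Python. This is only a rough sketch, assuming the output of both GetOffsetShell runs has been captured as text and that each line has the form TOPIC:PARTITION:OFFSET. The function names and the sample offsets are made up for illustration. As Matthias points out, the result is only an estimate of the offset range: on compacted topics, or when transaction markers are present, it overcounts.

```python
from typing import Dict


def parse_offsets(output: str) -> Dict[int, int]:
    """Parse GetOffsetShell output lines (assumed form TOPIC:PARTITION:OFFSET)."""
    offsets = {}
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split from the right; only the last two fields are needed.
        _topic, partition, offset = line.rsplit(":", 2)
        offsets[int(partition)] = int(offset)
    return offsets


def estimate_message_count(earliest_output: str, latest_output: str) -> int:
    """Sum (latest - earliest) over all partitions listed in the latest output."""
    earliest = parse_offsets(earliest_output)
    latest = parse_offsets(latest_output)
    # A partition missing from the earliest output is treated as starting at 0.
    return sum(latest[p] - earliest.get(p, 0) for p in latest)


if __name__ == "__main__":
    # Hypothetical captured output of the --time -2 and --time -1 runs.
    earliest_out = "TOPIC:0:100\nTOPIC:1:250\nTOPIC:2:0\n"
    latest_out = "TOPIC:0:1100\nTOPIC:1:750\nTOPIC:2:42\n"
    print(estimate_message_count(earliest_out, latest_out))  # 1542
```

The two inputs would come from running the --time -2 and --time -1 commands above and saving their output. Because producers keep appending, the two snapshots are taken at slightly different moments, so the figure is approximate even on a non-compacted topic.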