Hey Matthias,
Is there a better way than Peter's suggestion?
Is there a definitive way to count the available messages that accounts
for compaction and transactions?
On Sun, Apr 28, 2019 at 7:41 PM Matthias J. Sax <matthias@confluent.io>
wrote:
> This won't work if your topic is compacted though. Also, if you are
> using transactions, it might not be accurate, depending on how many
> transaction markers are in the topics.
>
> -Matthias
>
> On 4/28/19 2:59 PM, Peter Bukowinski wrote:
> > You'll need to do this programmatically with some simple math. There's a
> > script included with Kafka called kafka-run-class.sh that you can use to
> > expose earliest and latest offset information.
> >
> > This will return the earliest unexpired offsets for each partition in a
> > topic:
> >
> > kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic TOPIC --time -2
> >
> > This will return the latest offset:
> >
> > kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic TOPIC --time -1
> >
> > With that info, subtract the earliest from the latest per partition, sum
> > the results, and you'll have the number of messages available in your topic.
> >
> > -- Peter
> >
> >> On Apr 28, 2019, at 1:41 AM, jaaz jozz <jazzlofi2@gmail.com> wrote:
> >>
> >> Hello,
> >> I want to count how many messages are available in each topic in my Kafka
> >> cluster.
> >> I understand that just looking at the latest offset available is not
> >> correct, because older messages may already have been purged due to
> >> the retention policy.
> >> So what is the correct way of counting that?
> >>
> >> Thanks,
> >> Jazz.
>
>
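
For reference, the per-partition arithmetic Peter describes (latest offset minus earliest offset, summed over partitions) could be sketched in Python roughly as below. This is only an illustration: it assumes GetOffsetShell prints one `topic:partition:offset` line per partition, and the function names and sample offsets are made up for the example. As Matthias points out, the result counts offsets rather than records, so for compacted topics or topics written with transactions it should be treated as an upper bound, not an exact count.

```python
def parse_offsets(output):
    """Parse GetOffsetShell-style output into {(topic, partition): offset}.

    Assumes one 'topic:partition:offset' line per partition.
    """
    offsets = {}
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        topic, partition, offset = line.rsplit(":", 2)
        offsets[(topic, int(partition))] = int(offset)
    return offsets


def count_offset_span(earliest_output, latest_output):
    """Sum (latest - earliest) over all partitions.

    This is the number of offsets still retained, which equals the number
    of messages only when there are no compaction gaps or transaction markers.
    """
    earliest = parse_offsets(earliest_output)
    latest = parse_offsets(latest_output)
    return sum(latest[key] - earliest[key] for key in latest)


if __name__ == "__main__":
    # Hypothetical output captured from the two commands (--time -2 and --time -1).
    earliest_sample = "TOPIC:0:100\nTOPIC:1:250\n"
    latest_sample = "TOPIC:0:180\nTOPIC:1:400\n"
    print(count_offset_span(earliest_sample, latest_sample))
```

An exact count for a compacted or transactional topic would most likely require actually consuming the topic from the beginning and counting the records returned.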