
Posts

Showing posts from December, 2025

Intermittent Message Latency in Consumer

Problem Description

When using a consumer created with librdkafka to receive messages from Kafka, intermittent message latency is observed: the difference between the message receive time and the timestamp in the message body sometimes exceeds 1 second, although most messages are received within about 10 ms.

Environment Information

Software versions: librdkafka 2.11.0; CentOS 7.6; Kafka 3.6.2 (ZooKeeper-mode deployment).
Kafka cluster: 3 nodes; 64 vCPU and 128 GB RAM per server; gigabit network with all nodes on the same switch (low network latency); HDD RAID1 disks.
Test topic (test): 1 partition, 2 replicas, message.timestamp.type=LogAppendTime, min.insync.replicas=1.
Load topics (testA, testB, testC, testD): 128 partitions and 2 replicas each; 80,000 messages/second in total (20,000 messages/second per topic); 500 bytes per message.
Consumer configuration (librdkafka): fetch.wait.max.ms : 10 ...
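Since the test topic uses message.timestamp.type=LogAppendTime, latency can be measured against the broker-assigned timestamp rather than the producer's clock. A minimal sketch of that measurement in plain Python (not tied to any client library; the function names and the 1-second alert threshold are taken from the symptom described above, everything else is mine):

```python
# Threshold from the report: most messages arrive within ~10 ms,
# but outliers exceed 1 second.
LATENCY_ALERT_SECONDS = 1.0

def message_latency(broker_timestamp_ms: int, receive_time_s: float) -> float:
    """Latency between the broker's LogAppendTime timestamp
    (milliseconds since epoch) and the local receive time
    (seconds since epoch). Assumes reasonably synchronized clocks."""
    return receive_time_s - broker_timestamp_ms / 1000.0

def flag_slow(latencies_s):
    """Return only the latencies that exceed the alert threshold."""
    return [lat for lat in latencies_s if lat > LATENCY_ALERT_SECONDS]

# Example: two normal deliveries (~10 ms) and one 1.3 s outlier.
slow = flag_slow([0.008, 0.010, 1.3])   # -> [1.3]
```

Note that with LogAppendTime this measures broker-to-consumer delay only; clock skew between broker and consumer hosts adds directly to the measured value, so NTP health is worth checking before tuning fetch settings.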

Apache Kafka Contact

Hello Apache Kafka Team, I hope this message finds you well. We are interested in understanding which partners or recommended channels may be available to support customers in Mexico. Our objective is to identify the appropriate contacts so that we can reach out and potentially initiate an RFP process. Any guidance you can provide regarding local or global partners associated with Apache Kafka would be greatly appreciated. Thank you very much for your support. Kind regards, Ana Solis (O): +1 732 382 6565, 48559 ana.solis@gep.com www.gep.com

Kafka role questions for migrating zookeeper based Kafka to Kraft Kafka

Dear Kafka Users group folks,

For a ZooKeeper-based Kafka (3.9.1) to KRaft migration, could you help with the role-related questions below?

1. https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=225153708#KIP866ZooKeepertoKRaftMigration-ControllerMigration says "This migration only supports dedicated KRaft controllers as the target deployment. There will be no support for migrating to a combined broker/controller KRaft deployment." Does this mean that "A new set of nodes will be provisioned to host the controller quorum" (a dedicated new set of controller nodes) is the only supported option for the migration?
2. After the migration (supposing the answer to question 1 is yes), is it possible to change the deployment to use hybrid roles and remove the dedicated controller quorum nodes that were newly set up for the migration?
3. What is the officially recommended KRaft Kafka deployment mode in production: dedicated controllers + brokers, or hybr...

Re: Migration to Kafka4

Not sure if this helps, but one of the main aspects is moving from ZooKeeper to KRaft; we have a blog on how we do it here: https://www.instaclustr.com/support/documentation/kafka/useful-concepts/zookeeper-to-kraft-migration-process/ Regards, Paul Brebner From: Sachin Jangle via users < users@kafka.apache.org > Date: Thursday, 18 December 2025 at 6:22 pm To: users@kafka.apache.org < users@kafka.apache.org > Cc: Sachin Jangle < sachin.jangle@oracle.com > Subject: Migration to Kafka4 Hi, Are there any technical documentation/case studies available for migrating from 3.x to 4.x, other than the HELP documentation? Thanks, Sachin Jangle.

Quotation for Kafka Distributed streaming platform.

Classification: Confidential Hi Team, I hope you are doing well. We need your support to provide us with a quotation, as we are in the acquisition stage of an existing product for which we require the Kafka distributed streaming platform. Please advise on the next step. Bill to and Ship to – HCL Technologies Ltd, Cessna Business Park, Outer Ring Road (ORR), Kadubeesanahalli, Bengaluru, Karnataka 56000 Thanks & Regards Nitin Tyagi +91-7838388207

Re: Kafka connect timestamp header serialization issue of Midnight UTC Timestamps

Hi team, I'm following up on my earlier email regarding the midnight-UTC timestamp serialization issue in Kafka Connect. Could you please advise whether this is expected behavior or if I should open a JIRA ticket? I have signed up for JIRA, but it appears I haven't received any confirmation emails either. If needed, could you please grant access or advise on the process? On Sun, Nov 30, 2025 at 12:29 PM Vinayak Gaikwad < gaikwadvinayak291@gmail.com > wrote: > Hi, > I am encountering an issue with how timestamp headers are serialized in > Kafka Connect and would appreciate clarification on whether this is the > intended behavior or a bug. > > Problem: Incorrect Serialization of Midnight UTC Timestamps > When using connectors that produce timestamp headers (e.g., the MongoDB > source connector), timestamps that fall exactly at midnight UTC are being > serialized incorrectly, losing their time component. Timestamps a...
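For readers following the thread, the reported symptom can be illustrated with a toy serializer. This is a hypothetical sketch of how a value that falls exactly at midnight UTC can lose its time component when a serializer treats it as a date; it is not Kafka Connect's actual code path, and the function names are mine:

```python
from datetime import datetime, timezone

def buggy_serialize(ts: datetime) -> str:
    # Hypothetical bug: when the time-of-day is exactly 00:00:00, the
    # value is treated as a date-only value and the time is dropped.
    if (ts.hour, ts.minute, ts.second, ts.microsecond) == (0, 0, 0, 0):
        return ts.strftime("%Y-%m-%d")          # loses the time component
    return ts.strftime("%Y-%m-%dT%H:%M:%SZ")

def correct_serialize(ts: datetime) -> str:
    # Always emit a full ISO-8601 timestamp, midnight included.
    return ts.strftime("%Y-%m-%dT%H:%M:%SZ")

midnight = datetime(2025, 11, 30, 0, 0, 0, tzinfo=timezone.utc)
# buggy_serialize(midnight)   -> "2025-11-30"
# correct_serialize(midnight) -> "2025-11-30T00:00:00Z"
```

The distinguishing test case is precisely a timestamp whose time-of-day is all zeros: any other instant round-trips identically through both functions.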

Re: Subscribe to Kafka users list

Hi, You can subscribe yourself, see the instructions on https://kafka.apache.org/contact Thanks, Mickael On Mon, Dec 8, 2025 at 3:49 PM Vijay Roy < vkroy241@gmail.com > wrote: > > Hi Team , > > Please add me to Kafka subscriber list as I am interested in learning and > contributing to the Kafka project. > > Thanks, > Vijay

Re: Validating Kafka Disk Throughput Formulas (Write & Read)

Prateek, yes sorry forgot to mention that. Basically it does 2 lots of calculations. 1st calculation is for the producer workload and the replication factor. 2nd calculation includes whatever consumer load is running (combinations of the 3 types I mentioned). So, if you only have the producer workload and replication factor, it will give the network and local storage (total for cluster) values for part 1. Then add consumer workloads to include for part 2 of the calculation. From memory, I don't assume follower fetching (I did this a while ago)! You could check your formula with it (and potentially my model - I haven't validated it!) Hope that's useful, Paul From: Prateek Kohli < prateek.kohli@ericsson.com > Date: Tuesday, 2 December 2025 at 3:42 pm To: Brebner, Paul < Paul.Brebner@netapp.com >, users@kafka.apache.org < users@kafka.apache.org > Subject: RE: Validating Kafka Disk Throughput Formulas (Write & Read) ...

RE: Validating Kafka Disk Throughput Formulas (Write & Read)

Thanks!! Just to confirm, should follower replicas also be considered delayed consumers in this calculator? Prateek From: Brebner, Paul < Paul.Brebner@netapp.com > Sent: 02 December 2025 09:58 To: Prateek Kohli < prateek.kohli@ericsson.com >; users@kafka.apache.org Subject: Re: Validating Kafka Disk Throughput Formulas (Write & Read) Hi - the 3 consumer scenarios actually model (1) real-time consumers = reading from cache, not local disk (2) delayed consumers = reading from local disk and (3) remote consumers = reading from remote tiered storage. So I think for your example delayed consumers (2) is what you want :-) - just set others to 0. Paul From: Prateek Kohli < prateek.kohli@ericsson.com > Date: Tuesday, 2 December...

Re: Validating Kafka Disk Throughput Formulas (Write & Read)

Hi - the 3 consumer scenarios actually model (1) real-time consumers = reading from cache, not local disk (2) delayed consumers = reading from local disk and (3) remote consumers = reading from remote tiered storage. So I think for your example delayed consumers (2) is what you want :-) - just set others to 0. Paul From: Prateek Kohli < prateek.kohli@ericsson.com > Date: Tuesday, 2 December 2025 at 3:20 pm To: Brebner, Paul < Paul.Brebner@netapp.com >, users@kafka.apache.org < users@kafka.apache.org > Subject: RE: Validating Kafka Disk Throughput Formulas (Write & Read) Hi @Brebner, Paul, Thanks for your reply. I checked this calculator: https://github.com/instaclustr/code-samples/blob/main/Kafka/TieredStorage/kafka_calculator_graphs.html ...

RE: Validating Kafka Disk Throughput Formulas (Write & Read)

Hi @Brebner, Paul, Thanks for your reply. I checked this calculator: https://github.com/instaclustr/code-samples/blob/main/Kafka/TieredStorage/kafka_calculator_graphs.html But I don't think it considers scenarios with replication lag. For example, if my replicas are significantly behind the leader, then when they fetch data from the leader, the leader will need to read from disk to serve those followers. Shouldn't we consider disk bandwidth for this scenario as well? From: Brebner, Paul < Paul.Brebner@netapp.com > Sent: 02 December 2025 07:23 To: users@kafka.apache.org Cc: Prateek Kohli < prateek.kohli@ericsson.com > Subject: Re: Validating Kafka Disk Throughput Formulas (Write & Read) Hi Prateek, You may find this bl...

Re: Validating Kafka Disk Throughput Formulas (Write & Read)

Hi Prateek, You may find this blog I wrote on Kafka sizing useful: https://www.instaclustr.com/blog/how-to-size-apache-kafka-clusters-for-tiered-storage-part-1/ With the associated calculator here: https://github.com/instaclustr/code-samples/tree/main/Kafka/TieredStorage This one in particular: https://github.com/instaclustr/code-samples/blob/main/Kafka/TieredStorage/kafka_calculator_graphs.html Just download and use locally with a browser. For your example you want "delayed consumers" only. Regards, Paul Brebner NetApp Instaclustr From: Prateek Kohli via users < users@kafka.apache.org > Date: Monday, 1 December 2025 at 8:49 pm To: users < users@kafka.apache.org > Cc: Prateek Kohli < prateek.kohli@ericsson.com > Subject: Validating Kafka Disk Throughput Formulas (Write & Read) Hi everyone, I'm working on capacity planning for Kafka and wanted to ...

Request for API Access to Kafka Upgrade Notes

Hi Kafka Team, I'm reaching out to request information regarding access to the Kafka upgrade notes. I'm currently developing an AI agent for our team to help automate parts of our due-diligence process. The agent will follow defined SOPs for performing upgrades, read our Kafka configuration files, retrieve the latest Kafka upgrade notes, and then generate a due-diligence report along with a list of required changes for the upgrade. At the moment, we're encountering an issue with retrieving the Kafka upgrade notes via API calls. Could you please advise whether there is an API endpoint available that we can use to access these notes? Thank you in advance for your support. Best regards, Luma Alhajjar DevOps Engineer, CX S&S Cloud Operations Canada SAP Canada, Toronto E: l.alhajjar@sap.com M: +1 (437) 696-6719

Validating Kafka Disk Throughput Formulas (Write & Read)

Hi everyone, I'm working on capacity planning for Kafka and wanted to validate two formulas I'm using to estimate cluster-level disk throughput in a worst-case scenario (when all reads come from disk due to large consumer lag and replication lag).

1. Disk Write Throughput

Write_Throughput = Ingest_MBps × Replication_Factor (e.g., 3)

Explanation: Every MB of data written to Kafka is stored on all replicas (leader + followers), so total disk writes across the cluster scale linearly with the replication factor.

2. Disk Read Throughput (worst case, cache hit = 0%)

Read_Throughput = Ingest_MBps × (Replication_Factor − 1 + Number_of_Consumer_Groups)

Explanation: Leaders must read data from disk to:
* serve followers (RF − 1 times), and
* serve each consumer group (each group reads the full stream).

If page-cache misses are assumed (e.g., heavy lag), all of these reads hit disk, so the terms add up. Are these calculations accurate for estimating cluster...
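The two formulas in the question can be written down as a small calculator. This is a direct transcription of the poster's formulas as stated (a sketch, not a validated sizing model; the function and parameter names are mine):

```python
def disk_write_mbps(ingest_mbps: float, replication_factor: int) -> float:
    # Every MB ingested is written once per replica (leader + followers),
    # so cluster-wide disk writes scale linearly with the replication factor.
    return ingest_mbps * replication_factor

def disk_read_mbps(ingest_mbps: float, replication_factor: int,
                   consumer_groups: int) -> float:
    # Worst case (0% page-cache hits): leaders read from disk once per
    # follower (RF - 1) and once per consumer group.
    return ingest_mbps * (replication_factor - 1 + consumer_groups)

# Example: 100 MB/s ingest, RF = 3, 2 consumer groups.
# write: 100 * 3 = 300 MB/s; read: 100 * (3 - 1 + 2) = 400 MB/s
writes = disk_write_mbps(100, 3)        # -> 300
reads = disk_read_mbps(100, 3, 2)       # -> 400
```

Note these are cluster totals; per-broker figures require dividing by the broker count under an even-distribution assumption, and the worst case only holds while the page-cache hit rate is actually near zero.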