Showing posts from October, 2022

Re: Metadata Refresh and TimeoutException when MAX_BLOCK_MS_CONFIG is set to 0

Hi Luke and Kafka Dev Team, Is there any interest in making the Kafka producer non-blocking when the broker is down and the metadata cache does not have topic details? This seems to be a missing piece: the producer cannot tell the broker being genuinely down apart from the metadata simply not being available yet. I hope there is enough interest in making the producer distinguish broker-down from metadata-not-available. Thanks, Bhavesh On Mon, Oct 10, 2022 at 4:04 PM Bhavesh Mistry < mistry.p.bhavesh@gmail.com > wrote: > Hi Luke, > > Thanks for the pointers. > > Sorry for being late, I was out. > > > > I would like to propose the following, which might be a little different > from the old one: > > The Kafka producer must distinguish between *broker down state* vs *metadata > NOT available* for a given topic. > > > > Like the boot-strap server option, many applications (like ours) do not > dynamically c...
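
A minimal sketch of the behavior under discussion (bootstrap address and topic name are placeholders, not taken from the thread): with max.block.ms set to 0, send() fails immediately whenever metadata for the topic is not already cached, and the resulting TimeoutException looks the same whether or not the broker is actually reachable.

    import java.util.Properties;
    import java.util.concurrent.ExecutionException;
    import java.util.concurrent.Future;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.clients.producer.RecordMetadata;

    public class NonBlockingSendSketch {
        public static void main(String[] args) throws InterruptedException {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put(ProducerConfig.MAX_BLOCK_MS_CONFIG, "0"); // never block in send()

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                Future<RecordMetadata> result =
                        producer.send(new ProducerRecord<>("some-topic", "key", "value"));
                try {
                    result.get();
                } catch (ExecutionException e) {
                    // With max.block.ms=0 this is typically a TimeoutException, and it is
                    // identical whether the broker is down or metadata was simply not
                    // fetched yet - the distinction this thread asks the producer to expose.
                    System.err.println("send failed: " + e.getCause());
                }
            }
        }
    }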

[DISCUSS] KIP-880: X509 SAN based SPIFFE URI ACL within mTLS Client Certificates

Hi all, I wanted to check whether there is any interest and appetite to adopt this feature request: https://issues.apache.org/jira/browse/KAFKA-14340 Istio and other *SPIFFE* based systems use X509 client certificates to provide workload identity. Kafka currently supports client-certificate-based AuthN/Z and mapping to ACLs, but only by inspecting the CN field within a client certificate. There are several POC implementations out there providing a bespoke *KafkaPrincipalBuilder* for this purpose. Two examples include - https://github.com/traiana/kafka-spiffe-principal - https://github.com/boeboe/kafka-istio-principal-builder (written by myself) The gist is to introspect X509 client certificates, look for a URI-based SPIFFE entry in the SAN extension, and return that as a principal that can be used to write ACL rules. This KIP request is to include this functionality in Kafka's main codebase so end-users do...
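
For context, a stripped-down sketch of what such a principal builder looks like (details assumed from the linked POCs, not an official implementation): it pulls the first URI-type SAN entry starting with spiffe:// out of the peer certificate and returns it as the principal name.

    import java.security.cert.CertificateParsingException;
    import java.security.cert.X509Certificate;
    import java.util.List;
    import javax.net.ssl.SSLPeerUnverifiedException;
    import org.apache.kafka.common.security.auth.AuthenticationContext;
    import org.apache.kafka.common.security.auth.KafkaPrincipal;
    import org.apache.kafka.common.security.auth.KafkaPrincipalBuilder;
    import org.apache.kafka.common.security.auth.SslAuthenticationContext;

    public class SpiffePrincipalBuilder implements KafkaPrincipalBuilder {
        private static final int SAN_TYPE_URI = 6; // GeneralName: uniformResourceIdentifier

        @Override
        public KafkaPrincipal build(AuthenticationContext context) {
            if (!(context instanceof SslAuthenticationContext))
                return KafkaPrincipal.ANONYMOUS;
            try {
                X509Certificate cert = (X509Certificate)
                        ((SslAuthenticationContext) context).session().getPeerCertificates()[0];
                if (cert.getSubjectAlternativeNames() != null) {
                    for (List<?> san : cert.getSubjectAlternativeNames()) {
                        Object value = san.get(1);
                        if ((Integer) san.get(0) == SAN_TYPE_URI && value instanceof String
                                && ((String) value).startsWith("spiffe://")) {
                            // Use the SPIFFE ID as the principal name for ACL rules.
                            return new KafkaPrincipal(KafkaPrincipal.USER_TYPE, (String) value);
                        }
                    }
                }
            } catch (SSLPeerUnverifiedException | CertificateParsingException e) {
                // No verified client cert, or unparsable SANs: fall through to anonymous.
            }
            return KafkaPrincipal.ANONYMOUS;
        }
    }

A broker would pick such a class up via the principal.builder.class property in server.properties.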

Re: Topic Compaction

Hi Navneeth, Your configuration looks correct to me. If you observe that compaction is not cleaning up old records, it could be due either to slow compaction or to incorrect configuration. Here are a few things that I would check. First, validate that the log cleaner is running. There are multiple ways to do that: - Option#1: Check if a thread with the name "kafka-log-cleaner-thread-" is running. You can either use a utility such as jstack or jconsole to check the status of running threads, or you can take a thread dump using kill -3 <pid>. You should observe N threads with the prefix "kafka-log-cleaner-thread-", where N is the value of the configuration log.cleaner.threads - Option#2: Check the value of the metric kafka.log:type=LogCleanerManager,name=time-since-last-run-ms to observe the last time the cleaner was run. After that, check the value of the metric max-clean-time-secs to verify that the run did not immediate...
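
For example (the process pattern and JMX port below are assumptions about your deployment, adjust as needed):

    # Option#1, via jstack - the pgrep pattern assumes the standard kafka.Kafka main class:
    jstack $(pgrep -f kafka.Kafka) | grep "kafka-log-cleaner-thread"

    # Option#2, via the bundled JmxTool - assumes remote JMX is enabled on port 9999:
    bin/kafka-run-class.sh kafka.tools.JmxTool \
      --jmx-url service:jmx:rmi:///jndi/rmi://localhost:9999/jmxrmi \
      --object-name kafka.log:type=LogCleanerManager,name=time-since-last-run-ms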

RE: Supported Kafka/Zookeeper Version with ELK 8.4.3

Hi Team, We are still waiting for a reply. Please update us; we need to know which version of Kafka is compatible with ELK 8.4. Still, I can see no one has replied on the user or dev community portals. Thanks Sudip From: Kumar, Sudip Sent: Monday, October 17, 2022 5:23 PM To: users@kafka.apache.org; dev@kafka.apache.org Cc: Rajendra Bangal, Nikhil <nikhil.rajendra-bangal@capgemini.com>; Verma, Harshit <harshit.c.verma@capgemini.com>; Verma, Deepak Kumar <deepak-kumar.verma@capgemini.com>; Arkal, Dinesh Balaji <dinesh-balaji.arkal@capgemini.com>; Saurabh, Shobhit <shobhit.saurabh@capgemini.com> Subject: Supported Kafka/Zookeeper Version with ELK 8.4.3 Importance: High Hi Kafka Team, Currently we are planning to upgrade ELK 7.16 to version 8.4.3. In our ecosystem we are using Kafka as middleware, ingesting data coming from different so...

RE: Kafka Streams - Producer attempted to produce with an old epoch.

A detail I forgot to mention is that I am using EOS, so the streams application was consistently getting this error, causing it to restart that task, which obviously would bottleneck everything. Once I increased the threads on the brokers, the epoch error subsided until the volume increased. Right now the streams application has 36 threads per box (5 boxes w/ 48 threads) and when running normally it keeps up, but once this epoch error starts it causes a cascade of slowness. The CPU is not even fully utilized because this error happens every minute or so. Perhaps some other tuning is needed too, but I'm lost as to what to look at. I do have JMX connections to the broker and streams applications if there's any useful information I should be looking at. -----Original Message----- From: Sophie Blee-Goldman <sophie@confluent.io.INVALID> Sent: Thursday, October 27, 2022 11:22 PM To: users@kafka.apache.org Subject: Re: Kafka Streams - Producer attempted to pro...
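
One knob sometimes suggested for fencing/epoch errors under EOS (an assumption here, not a confirmed diagnosis for this case) is the transactional timeout of the Streams-internal producer, which must comfortably exceed the commit interval. A sketch with illustrative, untuned values:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.streams.StreamsConfig;

    public class EosTimeoutSketch {
        static Properties streamsProps() {
            Properties p = new Properties();
            p.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE_V2);
            // transaction.timeout.ms for the internal producer, set via the producer prefix.
            // Give in-flight transactions far more headroom than the commit interval:
            p.put(StreamsConfig.producerPrefix(ProducerConfig.TRANSACTION_TIMEOUT_CONFIG), 60000);
            p.put(StreamsConfig.COMMIT_INTERVAL_MS_CONFIG, 100);
            return p;
        }
    }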

Re: Kafka Streams - Producer attempted to produce with an old epoch.

I'm not one of the real experts on the producer, and even further from one on broker performance, so someone else may need to chime in, but I did have a few questions: What specifically are you unsatisfied with w.r.t. the performance? Are you hoping for higher throughput of your Streams app's output, or is there something about the brokers? I'm curious why you started with increasing the broker threads, especially if the perf issue/bottleneck is with the Streams app's processing (but maybe it is not). I would imagine that throwing more and more threads at the machine could even make things worse; it definitely will if the thread count gets high enough, though it's hard to say where/when it might start to decline. The point is, if the brokers are eating up all the CPU time with their own threads, then the clients embedded in Streams may be getting starved out at times, causing that StreamThread/consumer to drop out of the group and resulting i...

Kafka Streams - Producer attempted to produce with an old epoch.

Hi, I have a Kafka Streams application deployed on 5 nodes, and with full traffic I am getting the error message: org.apache.kafka.common.errors.InvalidProducerEpochException: Producer attempted to produce with an old epoch. Written offsets would not be recorded and no more records would be sent since the producer is fenced, indicating the task may be migrated out. I have 5 x 24 CPU/48 core machines with 128 GB of RAM. These machines are the Kafka brokers, with 2x1TB disks for Kafka logs, and they also run the Kafka Streams application. The topic has a replication factor of 2 and receives about 250k records per second. I have 2 aggregations in the topology to 2 output topics; the final output topics are in the tens of thousands of records per second. I'm assuming I have a bottleneck somewhere. I increased the broker thread counts and observed that the frequency of this error reduced, but it's still happening. Here's the broker configuration I'm using now, but I might be overshooting so...

Topic Compaction

Hi All, We are using AWS MSK with Kafka version 2.6.1. There is a compacted topic with the below configuration. After reading the documentation, my understanding was that null values in the topic would be removed after the delete retention time, but I can see months-old keys with null values. Is there any other configuration that needs to be changed to remove null values from a compacted topic? Thanks!
cleanup.policy=compact
segment.bytes=1073741824
min.cleanable.dirty.ratio=0.1
delete.retention.ms=3600000
segment.ms=3600000
Regards, Navneeth
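
Worth keeping in mind: tombstones are only dropped after the cleaner has actually compacted the dirty portion and delete.retention.ms has elapsed, and the active segment is never cleaned. A quick way to confirm the overrides are really applied (topic name and bootstrap address are placeholders):

    bin/kafka-configs.sh --bootstrap-server localhost:9092 \
      --entity-type topics --entity-name my-compacted-topic --describe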

AWS MSK Rack Awareness

Hi All, Has anyone implemented rack awareness with AWS MSK? In the broker configuration I see the following values (use1-az2, use1-az4 & use1-az6). How are these values determined, and how can we inject them into Java consumers? Thanks! Regards, Navneeth
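
Those values look like AWS availability-zone IDs, which MSK places in each broker's broker.rack. On the consumer side, the matching knob is client.rack (KIP-392 follower fetching); a sketch, assuming the client runs in use1-az2:

    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;

    public class RackAwareConsumerSketch {
        static Properties consumerProps() {
            Properties p = new Properties();
            p.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092"); // placeholder
            // The AZ ID of the zone this consumer runs in:
            p.put(ConsumerConfig.CLIENT_RACK_CONFIG, "use1-az2");
            return p;
        }
    }

Note that fetching from the nearest replica also requires the brokers to set replica.selector.class=org.apache.kafka.common.replica.RackAwareReplicaSelector.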

Re: Balancing traffic between multiple directories

I don't think the Confluent self-balancing feature works if you have your broker data in multiple directories anyway - it's expecting a single dir per broker and will try to keep the data balanced between brokers. Also, just as an aside, I'm not sure there's much value in using multiple directories. I assume you have these mapped to individual disks? I'd be curious to hear if you actually get any performance benefit out of that, especially when weighed against the increased likelihood of disk failure. I realize that doesn't help your current problem; it's more of a question/discussion point, I guess. I think your only option for moving the data around is the kafka-reassign-partitions.sh script. Alex C On Thu, Oct 27, 2022 at 9:30 AM Andrew Grant <agrant@confluent.io.invalid> wrote: > There's Cruise Control, https://github.com/linkedin/cruise-control , which > is open-source and could help with automated balancing. > > On...

Re: Balancing traffic between multiple directories

There's Cruise Control, https://github.com/linkedin/cruise-control , which is open-source and could help with automated balancing. On Thu, Oct 27, 2022 at 10:26 AM < gaode@hotmail.co.uk > wrote: > Auto rebalancing is a very important feature for running Kafka in a production > environment. Given that Confluent already has this feature, is there any > scope for the open-source version to have it as well? > Or is the idea that the open-source version shouldn't be used in a high-load > production environment? > > ________________________________ > From: sunil chaudhari < sunilmchaudhari05@gmail.com > > Sent: October 27, 2022, 3:11 > To: users@kafka.apache.org < users@kafka.apache.org > > Subject: Re: Balancing traffic between multiple directories > > Hi Lehar, > You are right. There is no better way in open-source Kafka. > However, Confluent has something called the Auto Rebalancing feature. > Can...

Re: Balancing traffic between multiple directories

Auto rebalancing is a very important feature for running Kafka in a production environment. Given that Confluent already has this feature, is there any scope for the open-source version to have it as well? Or is the idea that the open-source version shouldn't be used in a high-load production environment? ________________________________ From: sunil chaudhari < sunilmchaudhari05@gmail.com > Sent: October 27, 2022, 3:11 To: users@kafka.apache.org < users@kafka.apache.org > Subject: Re: Balancing traffic between multiple directories Hi Lehar, You are right. There is no better way in open-source Kafka. However, Confluent has something called the Auto Rebalancing feature. Can you check whether there is a free version with this feature? It starts balancing brokers automatically when it sees an uneven distribution of partitions. Regards, Sunil. On Wed, 26 Oct 2022 at 12:03 PM, Lehar Jain <lehar.j@media.net.invalid> wrote: > Hey Andrew, > ...

Re: Balancing traffic between multiple directories

Hi Lehar, You are right. There is no better way in open-source Kafka. However, Confluent has something called the Auto Rebalancing feature. Can you check whether there is a free version with this feature? It starts balancing brokers automatically when it sees an uneven distribution of partitions. Regards, Sunil. On Wed, 26 Oct 2022 at 12:03 PM, Lehar Jain <lehar.j@media.net.invalid> wrote: > Hey Andrew, > > Thanks for the reply. Currently, we are using the same method you > described. Wanted to make sure there isn't a better way. > > It seems there isn't currently, so we will keep using this. > > On Tue, Oct 25, 2022 at 7:23 PM Andrew Grant <agrant@confluent.io.invalid> > wrote: > > > Hey Lehar, > > > > > > I don't think there's a way to control this during topic creation. I just > > took a look through > > > > > https://github.com/apache/kafka/blob...

Re: Balancing traffic between multiple directories

Hey Andrew, Thanks for the reply. Currently, we are using the same method you described. Wanted to make sure there isn't a better way. It seems there isn't currently, so we will keep using this. On Tue, Oct 25, 2022 at 7:23 PM Andrew Grant <agrant@confluent.io.invalid> wrote: > Hey Lehar, > > > I don't think there's a way to control this during topic creation. I just > took a look through > > https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/admin/AdminUtils.scala > and it does appear partition assignment does not account for each broker's > different log directories. I also took a look at the kafka-topics.sh script; > it has a --replica-assignment argument, but that looks to only allow > specifying brokers. During topic creation, once a replica has been chosen, I > think we then choose the directory with the fewest number of partitions - > see > > https://github....

Re: Balancing traffic between multiple directories

Hey Lehar, I don't think there's a way to control this during topic creation. I just took a look through https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/admin/AdminUtils.scala and it does appear partition assignment does not account for each broker's different log directories. I also took a look at the kafka-topics.sh script; it has a --replica-assignment argument, but that looks to only allow specifying brokers. During topic creation, once a replica has been chosen, I think we then choose the directory with the fewest number of partitions - see https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/LogManager.scala#L1192 What I think you can do is move existing partitions around with the kafka-reassign-partitions.sh script. From running the command locally: --reassignment-json-file <String: The JSON file with the partition reassignment configuration> Th...
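
The reassignment JSON also accepts a per-replica "log_dirs" array (added by KIP-113), so a partition can be steered to a specific directory. A sketch with placeholder topic, broker ID, and path:

    cat > move.json <<'EOF'
    {
      "version": 1,
      "partitions": [
        {"topic": "my-topic", "partition": 0,
         "replicas": [1],
         "log_dirs": ["/data/disk2/kafka-logs"]}
      ]
    }
    EOF
    bin/kafka-reassign-partitions.sh --bootstrap-server localhost:9092 \
      --reassignment-json-file move.json --execute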

Balancing traffic between multiple directories

Hey, We run Kafka brokers with multiple log directories. I wanted to know how Kafka balances traffic between the various directories. Can we have our own strategy to distribute different partitions to different directories? Currently, we are facing an imbalance in the sizes of the aforementioned directories: some directories have a lot of empty space, whereas others are getting filled quickly. Regards

Task :core:compileScala FAILED

Hello, I'm using Kafka version 3.3.1 and OpenJDK version 11.0.16. When I run the command ./gradlew jar -PscalaVersion=2.13.8, I get the error: > java.io.IOException: Cannot run program "/usr/lib/jvm/java-11-openjdk-amd64/bin/javac" (in directory "/root/.gradle/workers"): error=2, No such file or directory. Can anyone help me solve this problem? Regards,
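
The error=2 for javac usually means only a JRE is present at that JVM path; installing the full JDK (package name below assumes Debian/Ubuntu) is the usual fix:

    apt-get install openjdk-11-jdk-headless
    # javac should now exist where Gradle looks for it:
    ls /usr/lib/jvm/java-11-openjdk-amd64/bin/javac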

Re: Help with MM2 active/passive configuration

This is the doc Ryanne mentioned: https://github.com/apache/kafka/tree/trunk/connect/mirror Thanks. Luke On Mon, Oct 24, 2022 at 6:38 PM Chris Peart < chris@peart.me.uk > wrote: > > > Hi Ryanne, > > I cannot find what you mentioned; all I'm looking for are some example > configurations for an active/passive setup. > > Many Thanks, > > Chris > > On 2022-10-19 18:32, Chris Peart wrote: > > > Thanks Ryanne, can you provide a link to the readme please? > > Many Thanks > > Chris > > > > On 19 Oct 2022, at 6:15 pm, Ryanne Dolan < ryannedolan@gmail.com > wrote: > > > > Hey Chris, check out the readme in connect/mirror for examples on how > > to > > run mirror maker as a standalone process. It's similar to how > > Cloudera's > > mirror maker is configured. If you've got a Connect cluster already, I > > recommend using tha...

Re: Help with MM2 active/passive configuration

Hi Ryanne, I cannot find what you mentioned; all I'm looking for are some example configurations for an active/passive setup. Many Thanks, Chris On 2022-10-19 18:32, Chris Peart wrote: > Thanks Ryanne, can you provide a link to the readme please? > Many Thanks > Chris > > On 19 Oct 2022, at 6:15 pm, Ryanne Dolan < ryannedolan@gmail.com > wrote: > > Hey Chris, check out the readme in connect/mirror for examples on how > to > run mirror maker as a standalone process. It's similar to how > Cloudera's > mirror maker is configured. If you've got a Connect cluster already, I > recommend using that and manually configuring the > MirrorSourceConnector. > > Ryanne > > On Wed, Oct 19, 2022, 5:26 AM Chris Peart < chris@peart.me.uk > wrote: > > Hi All, > > I'm new to MirrorMaker and have a production cluster running version > 2.8.1 and have a development c...

Re: Help with MM2 active/passive configuration

Thanks Ryanne, can you provide a link to the readme please? Many Thanks Chris > On 19 Oct 2022, at 6:15 pm, Ryanne Dolan < ryannedolan@gmail.com > wrote: > > Hey Chris, check out the readme in connect/mirror for examples on how to > run mirror maker as a standalone process. It's similar to how Cloudera's > mirror maker is configured. If you've got a Connect cluster already, I > recommend using that and manually configuring the MirrorSourceConnector. > > Ryanne > >> On Wed, Oct 19, 2022, 5:26 AM Chris Peart < chris@peart.me.uk > wrote: >> >> >> >> Hi All, >> >> I'm new to MirrorMaker and have a production cluster running version >> 2.8.1 and a development cluster running the same version. >> >> Our prod cluster has 4 brokers and our dev cluster has 3 brokers; both >> have a replication factor of 3. >> >> I would lik...

Re: Help with MM2 active/passive configuration

Hey Chris, check out the readme in connect/mirror for examples on how to run mirror maker as a standalone process. It's similar to how Cloudera's mirror maker is configured. If you've got a Connect cluster already, I recommend using that and manually configuring the MirrorSourceConnector. Ryanne On Wed, Oct 19, 2022, 5:26 AM Chris Peart < chris@peart.me.uk > wrote: > > > Hi All, > > I'm new to MirrorMaker and have a production cluster running version > 2.8.1 and a development cluster running the same version. > > Our prod cluster has 4 brokers and our dev cluster has 3 brokers; both > have a replication factor of 3. > > I would like to set up active/passive replication using MM2. We used > to do this with Cloudera, but we have decommissioned Cloudera and > would like to know how to configure topic replication using MM2. > > I believe I need a mm2.properties file to achieve this. I do...

Help with MM2 active/passive configuration

Hi All, I'm new to MirrorMaker and have a production cluster running version 2.8.1 and a development cluster running the same version. Our prod cluster has 4 brokers and our dev cluster has 3 brokers; both have a replication factor of 3. I would like to set up active/passive replication using MM2. We used to do this with Cloudera, but we have decommissioned Cloudera and would like to know how to configure topic replication using MM2. I believe I need a mm2.properties file to achieve this. I do see what looks like a configuration in /opt/kafka_2.13-2.8.1/config/connect-mirror-maker.properties, but I'm not sure if this is an active/passive configuration. An example file would be ideal, if possible. Many Thanks, Chris
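
For reference, a minimal active/passive mm2.properties sketch (cluster aliases and hostnames are placeholders), modeled on the shipped connect-mirror-maker.properties:

    clusters = prod, dev
    prod.bootstrap.servers = prod-broker1:9092
    dev.bootstrap.servers = dev-broker1:9092

    # Replicate one way only (active -> passive):
    prod->dev.enabled = true
    prod->dev.topics = .*
    dev->prod.enabled = false

    replication.factor = 3

This can then be started with bin/connect-mirror-maker.sh mm2.properties.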

Supported Kafka/Zookeeper Version with ELK 8.4.3

Hi Kafka Team, Currently we are planning to upgrade ELK 7.16 to version 8.4.3. In our ecosystem we are using Kafka as middleware, ingesting data coming from different sources: a publisher (Logstash shipper) publishes data to different Kafka topics, and a subscriber (Logstash indexer) consumes the data. We have an integration of ELK 7.16 with Kafka 2.5.1 and ZooKeeper 3.5.8. Please suggest: if we upgrade to ELK 8.4.3, which Kafka and ZooKeeper versions will be supported? Please provide us with relevant documents. Let me know if you have any further questions. Thanks, Sudip Kumar, Capgemini-India

Re: SSL configuration in Apache Kafka

Hi Sunil, I have tried replacing localhost with the IP and have also changed ssl.client.auth from none to required, as I want my server and producer to communicate over HTTPS. However, this isn't working for me. When I try sending data to the topic through my Java producer via an API, I get the error "HTTP method names must be tokens" in my application console. This usually happens when an endpoint that expects plain HTTP is hit over HTTPS. But in my case, if I try running the producer API over HTTP, it again gives me the same issue as before, while over HTTP it states "HTTP method names must be tokens". Do you have any suggestions or advice on this? Am I missing some configuration? Please correct me in case of any misconfiguration. Thanks in advance. Best Regards, Namita On Fri, 7 Oct, 2022, 17:57 sunil chaudhari, < sunilmchaudhari05@gmail.com > wrote: > You can try two things. > Instead of localhost, can you publish the kafka service on...

KStream consumers paused after hit by throttling

Hello, I am looking for guidance on an issue we are having with our KStream clusters on version 3.1.1. We observe that some consumers of the topology's input topic get into a paused state and never return to consuming. The outcome is clearly visible as consumer lag rising on a few affected partitions of the topology input topic, and it requires manual intervention: replacing some random KStream cluster host, which triggers a full rebalance. After that, KS tasks are migrated to other nodes and the cluster is operational as it should be; consumption continues normally afterwards. We have enabled debug logs on the client side and found this: {"level":"DEBUG","logger":"org.apache.kafka.clients.consumer.internals.Fetcher","thread":"AAAA-StreamThread-1","message":"[Consumer instanceId=ABCDEF-1, clientId=AAAA-StreamThread-1-consumer, groupId=AAAA] Skipping fetching records for assigned partition Input.topic-28 beca...

Re: Entire Kafka Connect cluster stuck because of a stuck sink connector

Hi, What version of Kafka Connect are you running? This sounds like a bug that was fixed a few releases ago. Cheers, Chris On Wed, Oct 12, 2022, 21:27 Hemanth Savasere < hemanth.savasere@gmail.com > wrote: > We have stumbled upon an issue on a running cluster with multiple > source/sink connectors: > > 1. One of our connectors was a JDBC sink connector connected to an SQL > Server database (using the oracle JDBC driver). > 2. It turns out that the DB instance had a problem causing all queries > to be stuck forever, which in turn made the start method of the > connector > hang forever. > 3. After some time, the entire Kafka Connect cluster was unavailable and > the REST API was not responding giving > {"error_code":500,"message":"Request > timed out"} for most requests. > 4. Pausing (just before the deletion of the consumer group) or deleting > th...

Entire Kafka Connect cluster stuck because of a stuck sink connector

We have stumbled upon an issue on a running cluster with multiple source/sink connectors:
1. One of our connectors was a JDBC sink connector connected to an SQL Server database (using the Oracle JDBC driver).
2. It turns out that the DB instance had a problem causing all queries to be stuck forever, which in turn made the start method of the connector hang forever.
3. After some time, the entire Kafka Connect cluster was unavailable and the REST API was not responding, giving {"error_code":500,"message":"Request timed out"} for most requests.
4. Pausing (just before the deletion of the consumer group) or deleting the problematic connector allowed the cluster to run normally again.
We could reproduce the same issue by adding Thread.sleep(300000) in the start method or in the put method of the ConnectorTask. Wanted to know if there's any wiki/documentation that mentions how to handle this issue. ...
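
A sketch of the reproduction described above (class name and the choice of a SinkTask are illustrative): a start() that blocks, mimicking a hung database connection.

    import java.util.Collection;
    import java.util.Map;
    import org.apache.kafka.connect.sink.SinkRecord;
    import org.apache.kafka.connect.sink.SinkTask;

    public class StuckSinkTask extends SinkTask {
        @Override public String version() { return "0.0.1"; }

        @Override
        public void start(Map<String, String> props) {
            try {
                Thread.sleep(300000); // simulate the DB call hanging, as in the report
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        @Override public void put(Collection<SinkRecord> records) { }
        @Override public void stop() { }
    }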

Issues when setting up Kafka mTLS with certs from GCP's Certificate Authority Service

Hello everyone, I am trying to set up a Kafka cluster with mTLS authentication using certificates signed by GCP's CAS (Certificate Authority Service). I have three Kafka nodes: a master and two workers. Each node has a PEM truststore containing the CA root certificate from the authority on CAS and a PEM keystore containing a signed certificate from CAS and the private key. I followed this webpage < https://codingharbour.com/apache-kafka/using-pem-certificates-with-apache-kafka > for the setup. This is the server.properties file for the master node. Other nodes have a similar config except for the ssl.keystore.location property.
listeners=INTERNAL://:port,EXTERNAL://:port
advertised.listeners=INTERNAL://:port,EXTERNAL://:port
listener.security.protocol.map=INTERNAL:SSL,EXTERNAL:SSL
inter.broker.listener.name=INTERNAL
ssl.enabled.protocols=TLSv1.2
ssl.endpoint.identification.algorithm=
producer.ssl.endpoint.identification.algorithm=
consumer.ssl.endpoint.ide...

Leftover bootstrap connection with Kafka Java Client

Dear Kafka Community, we are facing an issue with kafka-console-consumer.sh / kafka-console-producer.sh (which both use the native Java client) and the bootstrap behavior in conjunction with haproxy. Behavior: After the bootstrap process, the Java client doesn't disconnect and just keeps the bootstrapping socket open, which results in a client disconnect (cD) from haproxy after two minutes. The haproxy documentation states for that cD message: "The client did not read any data for as long as the 'clitimeout' delay". In our case that's a problem, because during application startup there will be a lot of unneeded connections blocking resources on the proxy. Expected behavior: The Java client does the bootstrap and disconnects. Our connection path for bootstrapping: Java client ↔ haproxy ↔ bootstrap URL of Kafka running with Strimzi on Kubernetes. After bootstrapping, the client connects directly to the brokers and the initial bootstrap connection is not u...

Re: consumer loses offset

Hi Lorenzo, In theory, it should commit every 5 seconds IF you keep polling the server. But I saw you "stopped" the consumer for some hours, which means the commit won't happen during this period. So, if it exceeds the retention period, it'll get deleted. That's my assumption. You need to check the logs (both client and server) to find the clues. Luke On Mon, Oct 10, 2022 at 5:53 PM Lorenzo Rovere < l.rovere@reply.it > wrote: > Hi, thanks for your response. > > Is there any chance the offset is never committed to the > "__consumer_offsets" topic, although auto commit is enabled every 5 seconds? > We are checking daily and the offset is always set to NULL. > > > > Lorenzo Rovere > > Technology Reply > Via Avogadri, 2 > 31057 - Silea (TV) - ITALY > phone: +39 0422 1836521 > l.rovere@reply.it > www.reply.it > -----Original Message----- > From: Luke Chen < showuon...
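
The retention window Luke refers to is the broker's offsets.retention.minutes (7 days by default in recent releases). A quick way to check it (host and port are placeholders):

    # --all requires a reasonably recent broker; otherwise check server.properties directly.
    bin/kafka-configs.sh --bootstrap-server localhost:9092 \
      --entity-type brokers --entity-default --describe --all | grep offsets.retention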

Re: Metadata Refresh and TimeoutException when MAX_BLOCK_MS_CONFIG is set to 0

Hi Luke, Thanks for the pointers. Sorry for being late, I was out. I would like to propose the following, which might be a little different from the old one: the Kafka producer must distinguish between *broker down state* vs *metadata NOT available* for a given topic. Like the boot-strap server option, many applications (like ours) do not dynamically create topics, and publish/subscribe to predefined topics. So, the Kafka producer can have a configuration option for "*predefined topics*". When a predefined topic is configured, metadata is fetched for those pre-defined topics before the producer is initialized. Also, these pre-defined topics will always guarantee that metadata is refreshed before it expires, meaning if the metadata cache expires at time X, the producer should automatically issue a metadata refresh request at X-3000 ms so the cache always has the latest mapping of topic-partition states, and continue to fetch every one second ti...

Re: [ANNOUNCE] New committer: Deng Ziming

Congrats Ziming! Regards, Bill On Mon, Oct 10, 2022 at 5:32 PM Ismael Juma < ismael@juma.me.uk > wrote: > Congratulations Ziming! > > Ismael > > On Mon, Oct 10, 2022 at 9:30 AM Jason Gustafson <jason@confluent.io.invalid > > > wrote: > > > Hi All > > > > The PMC for Apache Kafka has invited Deng Ziming to become a committer, > > and we are excited to announce that he has accepted! > > > > Ziming has been contributing to Kafka for about three years. He has > > authored > > more than 100 patches and helped to review nearly as many. In particular, > > he made significant contributions to the KRaft project which had a big > part > > in reaching our production readiness goal in the 3.3 release: > > https://blogs.apache.org/kafka/entry/what-rsquo-s-new-in . > > > > Please join me in congratulating Ziming! Thanks for all of your > > contrib...