

Showing posts from October, 2023

Kafka Behavior During Partition Leader Shutdowns

Hello, I've been testing Kafka availability in ZooKeeper mode during single-broker shutdowns within a Kubernetes setup, and I've come across something interesting that I wanted to run by you. We've noticed that when a partition leader goes down, messages are not delivered until a new leader is elected. While we expect this to happen, there's a part of it that still doesn't add up. The downtime, i.e. the time it takes for the new leader to step up, is about a minute. But interestingly, when we increase the producer-side retries to just 1, all of our messages get delivered successfully. This seems odd to me because, in theory, increasing the retries should only resend the message, giving it an extra 10 seconds before it times out, while the first few messages should still have around 40 seconds to wait for the new leader. So this behavior is a bit of a head-scratcher. I was wondering if you might have any insights or ...
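
For reference, a minimal producer sketch wiring up the settings in play. The broker address, topic name, and the 10s/120s timeouts are illustrative assumptions, not the poster's actual configuration; note that in modern clients `delivery.timeout.ms` bounds the whole send (retries, backoff, and time spent waiting out a leader election included), which may matter more here than the retry count itself.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class RetryConfigSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.ACKS_CONFIG, "all");
        // One retry, as in the experiment described above.
        props.put(ProducerConfig.RETRIES_CONFIG, 1);
        // Assumed per-request timeout of 10s, matching the numbers in the post.
        props.put(ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG, 10_000);
        // Upper bound on the whole send() attempt, retries included (client default).
        props.put(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG, 120_000);
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("test-topic", "key", "value"));
        }
    }
}
```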

Re: Java 1.8 and TLSv1.3

Hi Andreas, The TLS code has run into changes in behavior across different Java versions, so we only wanted to allow TLS 1.3 in the versions we tested against. TLS 1.3 landed in Java 8 a while after we made the relevant changes for Java 11 and newer. That said, Java 8 support is deprecated and will be removed in Apache Kafka 4.0 (in a few months). So, it's not clear we want to invest in making more features available for that version. Thanks, Ismael On Thu, Oct 26, 2023 at 4:49 AM Andreas Martens1 < amartens@uk.ibm.com > wrote: > Hello good people of Kafka, > > I was recently informed that TLS 1.3 doesn't work for connecting our > product to Kafka, and after some digging realised it was true, no matter > how hard I type "TLSv1.3" it doesn't work, weirdly with an error about no > applicable Ciphers. > > So after a bunch more digging I realised that the problem lies in the > Kafka client classes, in Kafka c...

Example configuration for kraft controllers with SASL_PLAINTEXT

Hi there. I have a working 3-node Kafka KRaft-mode network. Everything works fine with no authentication. I am using the new Kafka 3.6. The node_id values for the KRaft controllers are "1000", "1001" and "1002". There is a regular Kafka broker with node_id "1". I am trying to move that controller configuration to "scram-sha-256" authentication. The steps I took were: 1. With the cluster unauthenticated, I created scram-sha-256 credentials for users "1000", "1001" and "1002", using "kafka-configs.sh". The credentials reached the quorum servers and were distributed to the entire cluster, as inspection of the "__cluster_metadata-0" storage files showed. 2. I stopped the quorum servers and added this to the configuration of each one: """ listeners=CONTROLLER://:9093 # A comma-separated list of the names of the listeners used by the controll...
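
As an aside, step 1 can also be done programmatically. Below is a sketch of the kafka-configs.sh credential-creation step using the Java Admin API; the bootstrap address and passwords are placeholders, and the user names mirror the controller node_ids from the post.

```java
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.ScramCredentialInfo;
import org.apache.kafka.clients.admin.ScramMechanism;
import org.apache.kafka.clients.admin.UserScramCredentialUpsertion;

public class CreateControllerScramUsers {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker-1:9092"); // placeholder
        // SCRAM-SHA-256 with 4096 iterations (the Kafka default).
        ScramCredentialInfo info = new ScramCredentialInfo(ScramMechanism.SCRAM_SHA_256, 4096);
        try (Admin admin = Admin.create(props)) {
            admin.alterUserScramCredentials(List.of(
                new UserScramCredentialUpsertion("1000", info, "secret-1000"), // placeholder passwords
                new UserScramCredentialUpsertion("1001", info, "secret-1001"),
                new UserScramCredentialUpsertion("1002", info, "secret-1002")
            )).all().get();
        }
    }
}
```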

KRaft high watermark repeatedly off-by-one only on broker 2

I have been seeing an issue in my KRaft-based, 5-broker, Kafka 3.4.0 clusters, where broker 2 will report something like the following (see logs below): "The latest computed high watermark 16691349 is smaller than the current value 16691350, which suggests that one of the voters has lost committed data." This has happened at least 5 times in the past month, on different days. It has happened in two different clusters. It always happens on broker 2. This error always has lastFetchTimestamp=-1, lastCaughtUpTimestamp=-1 on broker 2, with timestamps that match the log's timestamp (to the second, at least) on every other broker. The error is always off-by-one, where the high watermark is one higher than the current value. This is always preceded by the log cleaner compacting partitions. The error is always logged 5-6 times, and then nothing more is mentioned about it until the next occurrence days or weeks later. I read through the ticket names included in the Kafka 3.5...

Migrating kraft controller to another machine

Hi, I've recently deployed a KRaft cluster and by accident set process.roles to 'controller,broker'. This works, but the controller bounces back and forth around the cluster because of random IO slowdowns. To fix that I wanted to split the controller away from the broker, but it looks like this is not supported, as there's not a lot of interest in implementing KAFKA-14094? As I understand it, even moving a controller to a different IP address is currently not supported by Kafka? I'm very confused, as this seems like a fairly basic feature, so I really doubt I'm understanding it correctly. Are there any workarounds for this? Thanks, Kris

Issue: `Version 3.5 is not a valid version` while trying to upgrade from Confluent Kafka v5.5.1 to v7.5.1

Hello. I faced an issue during a Kafka upgrade that I haven't been able to fix so far. I have a Confluentinc Kafka single test broker deployed onto OCP4. My goal is to upgrade from the ConfluentInc cp-enterprise-kafka v5.5.1 (Kafka v2.3.0-IV1) to cp-kafka v7.5.1 (Kafka v3.5.0), upgrading in accordance with https://kafka.apache.org/documentation/#upgrade_3_5_0 . Initial config (Kafka is up and running): ``` confluentinc image: docker-public-virtual.docker.devstack.vwgroup.com/confluentinc/cp-kafka confluentinc imageTag: 7.5.1 server property inter.broker.protocol.version: "2.3-IV1" log.message.format.version: "2.3-IV1" ``` Updated config (error, see below): ``` confluentinc image: same confluentinc imageTag: same server property inter.broker.protocol.version: "3.5" # omitting the log.message.format.version for 3.* ``` Issue: Exiting Kafka due to fatal exception org.apache.kafka.common.config.ConfigException: Invalid value 3.5 for configurat...

Question regarding controller election problems on Kafka 3.5.0 (protocol 3.5)

Hello all, I have recently upgraded a number of Kafka clusters from Kafka 2.5.1 (protocol 2.5) to Kafka 3.5.0 (protocol 3.5) according to the steps outlined here: https://kafka.apache.org/documentation/#upgrade_350_zk For the majority of these clusters, the new version and protocol have been running smoothly, with no recorded produce and/or consume availability issues. However, for some of my larger clusters, I have noticed that during a controller failover and subsequent election, occasionally a subset of brokers has trouble acknowledging the new controller. This causes the affected brokers to lock up and become completely unresponsive to requests from other brokers within the cluster as well as from clients, resulting in the number of ISRs for various partitions sinking below the minimum set value and eventually going offline. I have to manually bounce these troublesome brokers in order for them to successfully acknowledge the new controller and continue operatio...

Re: EOL and EOS

Not aware of a specific EOL strategy, but note that there is a future release strategy in discussion: https://cwiki.apache.org/confluence/display/KAFKA/Time+Based+Release+Plan#TimeBasedReleasePlan-WhatIsOurEOLPolicy Personally, I would recommend upgrading AK to the latest version at least once a year to get the latest feature enhancements and improvements. With respect to ZooKeeper, assuming that you are asking about using ZK as the quorum and metadata manager for AK, note that AK 3.5.0 itself supports ZooKeeper 3.6.4 https://archive.apache.org/dist/kafka/3.5.0/RELEASE_NOTES.html [KAFKA-14731 < https://issues.apache.org/jira/browse/KAFKA-14731 >]. Only AK 3.6 supports ZK 3.8. ~ nag pavan On Wed, Oct 25, 2023 at 11:18 AM Kiran Satpute < kiransatpute52@gmail.com > wrote: > Hi Team, > > What is EOL and EOS of Apache Kafka 3.3.1 and Apache Zookeeper 3.7.1 > > -- > Thanks & Regard > Kiran Satpute > (9921424521) >

Java 1.8 and TLSv1.3

Hello good people of Kafka, I was recently informed that TLS 1.3 doesn't work for connecting our product to Kafka, and after some digging realised it was true: no matter how hard I type "TLSv1.3", it doesn't work, weirdly failing with an error about no applicable ciphers. After a bunch more digging I realised that the problem lies in the Kafka client classes; in the Kafka clients' SslConfigs.java there is this code:

```java
static {
    if (Java.IS_JAVA11_COMPATIBLE) {
        DEFAULT_SSL_PROTOCOL = "TLSv1.3";
        DEFAULT_SSL_ENABLED_PROTOCOLS = "TLSv1.2,TLSv1.3";
    } else {
        DEFAULT_SSL_PROTOCOL = "TLSv1.2";
        DEFAULT_SSL_ENABLED_PROTOCOLS = "TLSv1.2";
    }
}
```

My initial thought was that these just set the defaults and I should still be able to set TLSv1.3 in my properties, but no. If I change the above block to:

```java
static {
    DEFAULT_SSL_PROTOCOL = "TLSv1.3";
    DEFAULT_...
```
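
For context, this is roughly what the properties-level attempt looks like from the client side — a sketch with placeholder broker address and truststore details. On Java 11+ clients it selects TLSv1.3 as expected, while on the Java 8 builds discussed in this thread it reportedly still fails with the "no applicable ciphers" error.

```java
import java.util.Properties;
import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.config.SslConfigs;
import org.apache.kafka.common.serialization.StringSerializer;

public class Tls13ClientSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9093"); // placeholder
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, "SSL");
        // Explicitly request TLSv1.3 instead of relying on the version-dependent defaults.
        props.put(SslConfigs.SSL_PROTOCOL_CONFIG, "TLSv1.3");
        props.put(SslConfigs.SSL_ENABLED_PROTOCOLS_CONFIG, "TLSv1.2,TLSv1.3");
        props.put(SslConfigs.SSL_TRUSTSTORE_LOCATION_CONFIG, "/path/to/truststore.jks"); // placeholder
        props.put(SslConfigs.SSL_TRUSTSTORE_PASSWORD_CONFIG, "changeit"); // placeholder
        // The TLS handshake happens when the client connects to fetch metadata.
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.partitionsFor("test-topic"); // forces a connection; topic name is hypothetical
        }
    }
}
```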

Re: Mirror Maker 2 - offset sync from source to target

Alexander, Looking at the MM2 documentation https://kafka.apache.org/documentation/#georeplication it appears that there's no "summary" of the auto-sync functionality, only the documentation strings for the individual configurations. Point (5) above is not covered in the public documentation; you would need to read the KIP to learn about it: https://cwiki.apache.org/confluence/display/KAFKA/KIP-545%3A+support+automated+consumer+offset+sync+across+clusters+in+MM+2.0 Point (6) above is also not covered in the public documentation, but has been a hidden limitation of MirrorMaker2 for a long time, one that has recently become more apparent. We still have some work to do to make MirrorMaker2 more resilient to restarts. Thanks! Greg On Mon, Oct 23, 2023 at 9:00 PM Alexander Shapiro (ashapiro) <Alexander.Shapiro@amdocs.com.invalid> wrote: > > Not a problem, Greg. > > Is there some documentation explaining this? ...

Re: Mirror Maker 2 - offset sync from source to target

Not a problem, Greg. Is there some documentation explaining this? I tried to find it in the past. From: Greg Harris <greg.harris@aiven.io.INVALID> Sent: Monday, October 23, 2023 11:23:26 PM To: users@kafka.apache.org Subject: Re: Mirror Maker 2 - offset sync from source to target Alexander, My apologies for calling you Andrew. Greg On Mon, Oct 23, 2023 at 1:22 PM Greg Harris < greg.harris@aiven.io > wrote: > > Andrew, > > Yes, there isn't an explicit "create consumer group" operation, it > should be created when MM2 emits a sync for it. > > Best, > Greg > > On Mon, Oct 23, 20...

Re: Mirror Maker 2 - offset sync from source to target

Offset is an excellent artist On Mon, Oct 23, 2566 BE at 22:24 Greg Harris <greg.harris@aiven.io.invalid> wrote: > Alexander, > > My apologies for calling you Andrew. > > Greg > > On Mon, Oct 23, 2023 at 1:22 PM Greg Harris < greg.harris@aiven.io > wrote: > > > > Andrew, > > > > Yes, there isn't an explicit "create consumer group" operation, it > > should be created when MM2 emits a sync for it. > > > > Best, > > Greg > > > > On Mon, Oct 23, 2023 at 1:15 PM Alexander Shapiro (ashapiro) > > <Alexander.Shapiro@amdocs.com.invalid> wrote: > > > > > > Thanks, one clarification plz > > > > > > In bullet four you mention "4. The target group does not exist, or has > no active consumers" > > > If group on target does not exist, will it be created without active > consumers ? > > ...

Re: Mirror Maker 2 - offset sync from source to target

Alexander, My apologies for calling you Andrew. Greg On Mon, Oct 23, 2023 at 1:22 PM Greg Harris < greg.harris@aiven.io > wrote: > > Andrew, > > Yes, there isn't an explicit "create consumer group" operation, it > should be created when MM2 emits a sync for it. > > Best, > Greg > > On Mon, Oct 23, 2023 at 1:15 PM Alexander Shapiro (ashapiro) > <Alexander.Shapiro@amdocs.com.invalid> wrote: > > > > Thanks, one clarification plz > > > > In bullet four you mention "4. The target group does not exist, or has no active consumers" > > If group on target does not exist, will it be created without active consumers ? > > > > -----Original Message----- > > From: Greg Harris <greg.harris@aiven.io.INVALID> > > Sent: Monday, October 23, 2023 8:56 PM > > To: users@kafka.apache.org > > Subject: Re: Mirror Maker 2 - offset sync from ...

Re: Mirror Maker 2 - offset sync from source to target

Andrew, Yes, there isn't an explicit "create consumer group" operation, it should be created when MM2 emits a sync for it. Best, Greg On Mon, Oct 23, 2023 at 1:15 PM Alexander Shapiro (ashapiro) <Alexander.Shapiro@amdocs.com.invalid> wrote: > > Thanks, one clarification plz > > In bullet four you mention "4. The target group does not exist, or has no active consumers" > If group on target does not exist, will it be created without active consumers ? > > -----Original Message----- > From: Greg Harris <greg.harris@aiven.io.INVALID> > Sent: Monday, October 23, 2023 8:56 PM > To: users@kafka.apache.org > Subject: Re: Mirror Maker 2 - offset sync from source to target > ...
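
If one wants to confirm this from the target side, listing consumer groups there should show the MM2-created group even with no members. A small sketch using the Admin API, with a placeholder bootstrap address:

```java
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.ConsumerGroupListing;

public class ListTargetGroups {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "target-broker:9092"); // placeholder
        try (Admin admin = Admin.create(props)) {
            // Groups synced by MM2 show up here even with no active consumers
            // (their state is typically Empty).
            for (ConsumerGroupListing g : admin.listConsumerGroups().all().get()) {
                System.out.println(g.groupId() + " state=" + g.state());
            }
        }
    }
}
```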

RE: Mirror Maker 2 - offset sync from source to target

Thanks, one clarification plz In bullet four you mention "4. The target group does not exist, or has no active consumers" If the group on the target does not exist, will it be created without active consumers? -----Original Message----- From: Greg Harris <greg.harris@aiven.io.INVALID> Sent: Monday, October 23, 2023 8:56 PM To: users@kafka.apache.org Subject: Re: Mirror Maker 2 - offset sync from source to target Hi Alexander, Sorry, I noticed an inconsistency in my last email. For point 6: If the MirrorCheckpointTask restarts after replication but before the offset is translated, then it may not be able to perform a translation. If the MirrorCheckpointTask does not restart, it should be able to perform translat...

Re: Mirror Maker 2 - offset sync from source to target

Hi Alexander, Sorry, I noticed an inconsistency in my last email. For point 6: If the MirrorCheckpointTask restarts after replication but before the offset is translated, then it may not be able to perform a translation. If the MirrorCheckpointTask does not restart, it should be able to perform translation. So if your MirrorMaker2 is restarting frequently, that may prevent consistent translation. Thanks, Greg On Mon, Oct 23, 2023 at 10:46 AM Alexander Shapiro (ashapiro) <Alexander.Shapiro@amdocs.com.invalid> wrote: > > Hi Greg, > Thank you very much, > it is the most detailed answer I could have expected. > > -----Original Message----- > From: Greg Harris <greg.harris@aiven.io.INVALID> > Sent: Monday, October 23, 2023 8:42 PM > To: users@kafka.apache.org > Subject: Re: Mirror Maker 2 - offset sync from source to target > ...

RE: Mirror Maker 2 - offset sync from source to target

Hi Greg, Thank you very much, it is the most detailed answer I could have expected. -----Original Message----- From: Greg Harris <greg.harris@aiven.io.INVALID> Sent: Monday, October 23, 2023 8:42 PM To: users@kafka.apache.org Subject: Re: Mirror Maker 2 - offset sync from source to target Hi Alexander, Thanks for using MirrorMaker2! If you turn on `sync.group.offsets.enabled`, then the MirrorCheckpointTask will sync the offsets if all of the following are true: 1. The source group exists 2. The source group name matches the configured group filter (group.filter.class, groups, groups.exclude) 3. The source group has an offset for a topic which matches the configured topic filter (topic.filter.class, topics, topics.exclude) 4. The target group does not exist, or has no active consumers 5. The target group has no offset for a specified partition, or the offset i...

Re: Mirror Maker 2 - offset sync from source to target

Hi Alexander, Thanks for using MirrorMaker2! If you turn on `sync.group.offsets.enabled`, then the MirrorCheckpointTask will sync the offsets if all of the following are true: 1. The source group exists 2. The source group name matches the configured group filter (group.filter.class, groups, groups.exclude) 3. The source group has an offset for a topic which matches the configured topic filter (topic.filter.class, topics, topics.exclude) 4. The target group does not exist, or has no active consumers 5. The target group has no offset for a specified partition, or the offset is earlier than the translated offset 6. The MirrorCheckpointTask restarted after replication happened but before the offset could be translated. If one of these isn't true, you won't see translation happening. Are you having a problem with too many consumer groups being created? You can restrict the group or topic filters, as they're very permissive by default. Or is the problem tha...
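
For completeness, translated offsets can also be read from the client side via the connect-mirror-client artifact's RemoteClusterUtils. A sketch under assumed names: the source-cluster alias "source", the group "my-group", and the target bootstrap address are all placeholders.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.connect.mirror.RemoteClusterUtils;

public class TranslateOffsetsSketch {
    public static void main(String[] args) throws Exception {
        // Properties point at the *target* cluster, where MM2's checkpoints live.
        Map<String, Object> targetProps = new HashMap<>();
        targetProps.put("bootstrap.servers", "target-broker:9092"); // placeholder
        Map<TopicPartition, OffsetAndMetadata> translated =
            RemoteClusterUtils.translateOffsets(targetProps, "source", "my-group", Duration.ofSeconds(30));
        translated.forEach((tp, om) -> System.out.println(tp + " -> " + om.offset()));
    }
}
```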

Mirror Maker 2 - offset sync from source to target

Hi, can someone please advise: with sync.group.offsets.enabled: true, syncing offsets from source to target for a particular consumer group, must that group be created on the target even if no actual consumption will be done there?

Re: [External Email] Re: Upgrading Kafka Kraft in Kubernetes

It is definitely confusing that the --bootstrap-server option works only when it comes before the upgrade subcommand. Maybe it is worth opening a JIRA for it? Jakub On Fri, Oct 20, 2023 at 7:40 AM Soukal, Jiří <j.soukal@quadient.com.invalid> wrote: > Hello Jakub, > > Thank you for the reply. I have tried the --bootstrap-server option before but > have used it incorrectly. > > this gives unrecognized arguments: '--bootstrap-server' > ./bin/kafka-features.sh upgrade --metadata 3.6 --bootstrap-server > [server:port] > > but this works: > ./bin/kafka-features.sh --bootstrap-server [server:port] upgrade > --metadata 3.6 > > This is doable using a declarative Kubernetes job. > > Thank you very much, I appreciate your help. > > Jiri > > From: Jakub Scholz < jakub@scholz.cz > > Sent: Thursday, October 19, 2023 5:18 PM > To: users@kafka.apache.org > Subject: [External Email] Re: Upg...

RE: [External Email] Re: Upgrading Kafka Kraft in Kubernetes

Hello Jakub, Thank you for the reply. I have tried the --bootstrap-server option before but have used it incorrectly. This gives unrecognized arguments: '--bootstrap-server': ./bin/kafka-features.sh upgrade --metadata 3.6 --bootstrap-server [server:port] But this works: ./bin/kafka-features.sh --bootstrap-server [server:port] upgrade --metadata 3.6 This is doable using a declarative Kubernetes job. Thank you very much, I appreciate your help. Jiri From: Jakub Scholz < jakub@scholz.cz > Sent: Thursday, October 19, 2023 5:18 PM To: users@kafka.apache.org Subject: [External Email] Re: Upgrading Kafka Kraft in Kubernetes Hi Jiří, Why can't you run it from another Pod? You should be able to specify --bootstrap-server and point it to the brokers to connect to. You can also pass further properties to it using the --command-config option. It should be also possible to use it from the Admin API < https://kafka.apache.org/36/javadoc/org/apac...

Re: Upgrading Kafka Kraft in Kubernetes

Hi Jiří, Why can't you run it from another Pod? You should be able to specify --bootstrap-server and point it at the brokers to connect to. You can also pass further properties to it using the --command-config option. It should also be possible to use it via the Admin API < https://kafka.apache.org/36/javadoc/org/apache/kafka/clients/admin/Admin.html#updateFeatures(java.util.Map,org.apache.kafka.clients.admin.UpdateFeaturesOptions) > directly from anywhere if needed. But there is indeed no way to manage this declaratively in the Kafka properties file as was possible with inter.broker.protocol.version. It also works a bit differently from how inter.broker.protocol.version worked before KRaft: * I think it does more checking of whether all nodes in the cluster support the version etc. * You can't really downgrade it easily (at least not safely). So maybe it is better that you cannot just change some environment variables, as that might result in crash-lo...
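
To illustrate the Admin API route linked above, here is a sketch that reads the highest metadata.version level the cluster's binaries support and upgrades to it. The bootstrap address is a placeholder, and in practice you would pin the exact level you have validated rather than blindly taking the maximum, since there is no easy, safe downgrade.

```java
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.FeatureUpdate;
import org.apache.kafka.clients.admin.UpdateFeaturesOptions;

public class UpgradeMetadataVersion {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092"); // placeholder
        try (Admin admin = Admin.create(props)) {
            // Highest metadata.version feature level these binaries support.
            short maxSupported = admin.describeFeatures().featureMetadata().get()
                    .supportedFeatures().get("metadata.version").maxVersion();
            // Roughly what `kafka-features.sh --bootstrap-server ... upgrade --metadata <ver>` does.
            admin.updateFeatures(
                    Map.of("metadata.version",
                           new FeatureUpdate(maxSupported, FeatureUpdate.UpgradeType.UPGRADE)),
                    new UpdateFeaturesOptions()
            ).all().get();
        }
    }
}
```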