

Showing posts from March, 2022

[Question] What is the best practice in consuming multiple topics?

To whom it may concern: I have a very strange problem when using Kafka, and I hope someone can help me! Background: I need to consume several different topics in my Java project. Each topic has a different configuration, such as max.poll.records, but they all share the same consumer group. There is also different business logic for each topic. What I did: I created a separate KafkaConsumer instance for each topic, because the configuration can only be set when creating the KafkaConsumer instance. Problem: some topic partitions cannot be consumed. This is a really puzzling phenomenon! I found that the client.id changes whenever I redeploy and restart the project: if I don't set client.id, all KafkaConsumer instances draw their ids from the global static variable CONSUMER_CLIENT_ID_SEQUENCE in ConsumerConfig. Is it best practice to configure a different client.id for each KafkaConsumer instance? Or can anyone tell me how to solve my issue? ...
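[Editor's note] A minimal sketch of the setup described above, with placeholder broker address, group, and client-id prefix: one KafkaConsumer per topic, each with its own max.poll.records and an explicit, stable client.id, so restarts don't depend on the auto-generated consumer-<n> sequence.

    import java.util.List;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class PerTopicConsumers {
        // Hypothetical helper: one consumer per topic, same group, explicit client.id.
        static KafkaConsumer<String, String> consumerFor(String topic, int maxPollRecords) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-shared-group");
            // A stable per-topic client.id avoids relying on the auto-generated
            // consumer-<n> ids, which restart from 1 on every redeploy.
            props.put(ConsumerConfig.CLIENT_ID_CONFIG, "my-app-" + topic);
            props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, maxPollRecords);
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                    "org.apache.kafka.common.serialization.StringDeserializer");
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                    "org.apache.kafka.common.serialization.StringDeserializer");
            KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
            consumer.subscribe(List.of(topic));
            return consumer;
        }
    }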

Re: Need Help - getting vulnerability due to Log4j- v1.2.17 jar being used in Kafka_2.11-2.4.0.

Hi Sandip, I just merged the PR https://github.com/apache/kafka/pull/11743 that replaces log4j with reload4j. Reload4j will be part of Apache Kafka 3.2.0 and 3.1.1. Best, Bruno On 30.03.22 04:26, Luke Chen wrote: > Hi Sandip, > > We plan to replace log4j with reload4j in v3.2.0 and v3.1.1. (KAFKA-13660 > < https://issues.apache.org/jira/browse/KAFKA-13660 >) > And plan to upgrade to log4j2 in v4.0.0. > > You can check this discussion thread for more details: > https://lists.apache.org/thread/qo1y3249xldt4cpg6r8zkcq5m1q32bf1 > > Thank you. > Luke > > On Tue, Mar 29, 2022 at 10:18 PM Sandip Bhunia > <sandip.bhunia@tcs.com.invalid> wrote: > >> Dear Team, >> >> We are getting vulnerability due to Log4j- v1.2.17 jar being used in >> Kafka_2.11-2.4.0. >> We tried to upgrade the same to Kafka_2.13-3.1.0 to remediate >> vulnerability due to Log4j- v1.2.17 (obso...

Re: Kafka Connect - offset.storage.topic reuse across clusters

Hi Robin, I'm interested in a use case in which I need to be able to have a Connect cluster fail, and then bring up a new cluster with the same offset topics and connectors. By new cluster I mean a cluster with a new ` group.id `. I am aware I could just use the same group id as before, but I would like to explore this route. I'm keen to learn more about the reasons why the case described above, and those in my original thread, aren't recommended. Thank you, Jordan On Wed, 30 Mar 2022 at 14:00, Robin Moffatt <robin@confluent.io.invalid> wrote: > Hi Jordan, > > Is there a good reason for wanting to do this? I can think of multiple > reasons why you shouldn't do this even if technically it works in some > cases. > Or is it just curiosity as to whether you can/should? > > thanks, Robin. > > > -- > > Robin Moffatt | Principal Developer Advocate | robin@confluent.io | @rmoff > > > On We...

Re: Kafka Connect - offset.storage.topic reuse across clusters

Hi Jordan, Is there a good reason for wanting to do this? I can think of multiple reasons why you shouldn't do this even if technically it works in some cases. Or is it just curiosity as to whether you can/should? thanks, Robin. -- Robin Moffatt | Principal Developer Advocate | robin@confluent.io | @rmoff On Wed, 30 Mar 2022 at 13:36, Jordan Wyatt < jwyatt1995@gmail.com > wrote: > Hi, > > I've recently been experimenting with setting the values of the `offset`, > `config` and `status` storage topics within Kafka Connect. > > I'm aware from various sources (Robin Moffatt blogs, StackOverflow, > Confluent Kafka Connect docs) that these topics should not be shared across > different connect **clusters**, e.g. for each unique set of workers with a > given ` group.id `, a unique set of internal storage topics should be used. > > These discussions and docs usually talk about sharing all three >...

Kafka Connect - offset.storage.topic reuse across clusters

Hi, I've recently been experimenting with setting the values of the `offset`, `config` and `status` storage topics within Kafka Connect. I'm aware from various sources (Robin Moffatt blogs, StackOverflow, Confluent Kafka Connect docs) that these topics should not be shared across different connect **clusters**, e.g. for each unique set of workers with a given ` group.id `, a unique set of internal storage topics should be used. These discussions and docs usually talk about sharing all three topics at once; however, I am interested in reusing only the offset storage topic. I am struggling to find the risks of sharing this offset topic between different connect clusters. I'm aware of issues with sharing the config and status topics from blogs and my own testing (clusters can end up running connectors from other clusters, for example), but I cannot find a case against sharing the offset topic despite guidance to avoid this. The use cases I am inter...
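[Editor's note] For context, a sketch of the per-cluster layout the guidance implies, with hypothetical group and topic names; the thread asks what specifically breaks if only the offset.storage.topic line were shared between the two worker configs.

    # Worker config for Connect cluster A
    group.id=connect-cluster-a
    offset.storage.topic=connect-offsets-a
    config.storage.topic=connect-configs-a
    status.storage.topic=connect-status-a

    # Worker config for Connect cluster B -- the question is whether
    # offset.storage.topic here could safely point at connect-offsets-a
    group.id=connect-cluster-b
    offset.storage.topic=connect-offsets-b
    config.storage.topic=connect-configs-b
    status.storage.topic=connect-status-b

One plausible hazard: records in the offsets topic are keyed by connector name, so same-named connectors in two clusters sharing the topic could read, and overwrite, each other's source positions.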

Call for Presentations now open, ApacheCon North America 2022

[You are receiving this because you are subscribed to one or more user or dev mailing lists of an Apache Software Foundation project.] ApacheCon draws participants at all levels to explore "Tomorrow's Technology Today" across 300+ Apache projects and their diverse communities. ApacheCon showcases the latest developments in ubiquitous Apache projects and emerging innovations through hands-on sessions, keynotes, real-world case studies, trainings, hackathons, community events, and more. The Apache Software Foundation will be holding ApacheCon North America 2022 at the New Orleans Sheraton, October 3rd through 6th, 2022. The Call for Presentations is now open, and will close at 00:01 UTC on May 23rd, 2022. We are accepting presentation proposals for any topic that is related to the Apache mission of producing free software for the public good. This includes, but is not limited to: Community Big Data Search IoT Cloud Fintech Pulsar Tomcat You ...

Re: Apache Kafka Questions

Hello Team, Thank you for answering the questions I sent earlier. We have a new set, and I would really like to have them verified and confirmed by an expert. Please answer them with enough specificity that they can be replicated in a local user environment. f. How are consumer groups created? Are new consumers automatically bound to a default group (using a consumer group id)? g. How do offsets help in achieving fault tolerance? h. Can we identify the IP address of the broker leader from which the message originated via the meta.properties file? i. In the server.properties file, if I set the partition count to >1, are partitions implicitly created, and if so, are they all within the same node? Thanks and Regards, C. Jatin Shyam T: @jatinchhab ________________________________ From: Jatin Chhabriya Sent: Wednesday, March 16, 2022 7:32 PM To: users@kafka.apache.org < users@kafka.apache.org > Cc: Murali Krishna < murali.krishna@circularedge.com > Subje...
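[Editor's note] On question (f): the Java client has no explicit create-group call and no default group; a group comes into existence the first time a consumer configured with that group.id subscribes and polls. A minimal sketch, with placeholder broker address, group, and topic names:

    import java.time.Duration;
    import java.util.List;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class GroupDemo {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            // The group "reporting-app" is created broker-side the first time a
            // member using this group.id subscribes and polls; there is no
            // separate "create consumer group" API and no default group.
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "reporting-app");
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                    "org.apache.kafka.common.serialization.StringDeserializer");
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                    "org.apache.kafka.common.serialization.StringDeserializer");
            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(List.of("some-topic"));
                consumer.poll(Duration.ofSeconds(1)); // joins (and thus creates) the group
            }
        }
    }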

Re: Need Help - getting vulnerability due to Log4j- v1.2.17 jar being used in Kafka_2.11-2.4.0.

Hi Sandip, We plan to replace log4j with reload4j in v3.2.0 and v3.1.1 (KAFKA-13660 < https://issues.apache.org/jira/browse/KAFKA-13660 >) and plan to upgrade to log4j2 in v4.0.0. You can check this discussion thread for more details: https://lists.apache.org/thread/qo1y3249xldt4cpg6r8zkcq5m1q32bf1 Thank you. Luke On Tue, Mar 29, 2022 at 10:18 PM Sandip Bhunia <sandip.bhunia@tcs.com.invalid> wrote: > Dear Team, > > We are getting a vulnerability due to the Log4j v1.2.17 jar being used in > Kafka_2.11-2.4.0. > We tried to upgrade to Kafka_2.13-3.1.0 to remediate the > vulnerability due to Log4j v1.2.17 (an obsolete version; Log4j 1.x > reached end of life in 2015 and is no longer supported) but found this > version of Kafka does not use Log4j v2.x. > > As per your website there is no such information available. Please let us > know when this will get upgraded. Please let us know how to get this > vulnerability ...

Re: Newbie looking for a connector I can configure on my mac

Hi Andrew Otto (Andrew 2? :D ), Well noted. I should have worded it as "You don't need a Confluent subscription to use them" :) Cheers, Liam On Wed, 30 Mar 2022 at 13:49, Andrew Otto < otto@wikimedia.org > wrote: > > And while a lot of the connector documentation is on the Confluent > website, you can still use them with FOSS Kafka so long as you're in line > with the Confluent Community Licence > > *Drive by unhelpful comment:* > While this is true legally, the fact that (most?) actual connector > implementations are CCL and not FOSS means that, for organizations that use > purely FOSS software (like the Wikimedia Foundation), Kafka Connect is > effectively unusable. > > Okay carry on! :) > > - Andrew Otto > > > > > > On Tue, Mar 29, 2022 at 8:27 PM Liam Clarke-Hutchinson < > lclarkeh@redhat.com > > wrote: > > > Hi Andrew, > > > >...

Re: Newbie looking for a connector I can configure on my mac

> And while a lot of the connector documentation is on the Confluent website, you can still use them with FOSS Kafka so long as you're in line with the Confluent Community Licence *Drive by unhelpful comment:* While this is true legally, the fact that (most?) actual connector implementations are CCL and not FOSS means that, for organizations that use purely FOSS software (like the Wikimedia Foundation), Kafka Connect is effectively unusable. Okay carry on! :) - Andrew Otto On Tue, Mar 29, 2022 at 8:27 PM Liam Clarke-Hutchinson < lclarkeh@redhat.com > wrote: > Hi Andrew, > > So if you've downloaded Apache Kafka, you can run a standalone connect > instance using the bin/connect-standalone.sh script mentioned. And while a > lot of the connector documentation is on the Confluent website, you can > still use them with FOSS Kafka so long as you're in line with the Confluent > Community Licence (basically, IIRC, you c...

Re: Newbie looking for a connector I can configure on my mac

Hi Andrew, So if you've downloaded Apache Kafka, you can run a standalone Connect instance using the bin/connect-standalone.sh script mentioned. And while a lot of the connector documentation is on the Confluent website, you can still use them with FOSS Kafka so long as you're in line with the Confluent Community Licence (basically, IIRC, you can use them for free, but not to run a SaaS or similar that competes with Confluent; but IANAL). I agree that there's not much useful documentation for your use case. I will look into writing a tutorial for it; would you be happy to give me feedback on it as I go? The most important configuration initially is plugin.path, which is where your standalone KC process will look for those JARs. You can see an example properties file for standalone Connect under the config/ dir in the Kafka you downloaded. Note that it has the plugin path commented out initially. So, Kafka ships with a connector that exposes a fil...
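[Editor's note] A sketch of the file-based setup being described, loosely following the example properties Kafka ships under config/ (file names and paths below are placeholders; on newer Kafka releases the FileStream example connectors may need to be added to plugin.path rather than sitting on the default classpath):

    # connect-standalone.properties (trimmed sketch of the shipped example)
    bootstrap.servers=localhost:9092
    key.converter=org.apache.kafka.connect.storage.StringConverter
    value.converter=org.apache.kafka.connect.storage.StringConverter
    offset.storage.file.filename=/tmp/connect.offsets

    # file-source.properties -- tails a local file into a topic
    name=local-file-source
    connector.class=org.apache.kafka.connect.file.FileStreamSourceConnector
    tasks.max=1
    file=/tmp/source.txt
    topic=file-demo

Started with bin/connect-standalone.sh connect-standalone.properties file-source.properties, appending lines to /tmp/source.txt should then produce records on the file-demo topic.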

Newbie looking for a connector I can configure on my mac

I found the quick start https://kafka.apache.org/quickstart example very helpful. It made it really easy to understand how to download, start up, create a topic, and push some data through Kafka. I did not find https://kafka.apache.org/quickstart#quickstart_kafkaconnect useful. I am looking for something very simple for learning how to configure and use connectors using the Apache Kafka distribution, not Confluent. I can run on my Mac or a Linux server. Being a newbie, I want to keep things super simple. I do not want to have to debug firewalls, ACLs, … I do not have a database, access to Twitter, … I thought maybe some sort of source/sink using the local file system? Any suggestions? Kind regards Andy p.s. I have read a lot of documentation; most of it is very high level. Can anyone recommend a "hands-on" tutorial?

Re: Setting up the CooperativeStickyAssignor in Java

Hi Liam, I've gotten the cooperative sticky assignor to work with the latest fs2-kafka wrapper. There was a bug in my code where the `.parJoinUnbounded` which processes the streams needed to move out one scope of execution to pull in the notification message stream. It's possible that the 2.4.0 version of the fs2-kafka wrapper would also work after my fix. "timestamp":"2022-03-25T17:22:08.736Z" Updating assignment with Assigned partitions: [dw_notifications_aoa_v3-8, dw_notifications_aoa_v3-7] Current owned partitions: [dw_notifications_aoa_v3-9, dw_notifications_aoa_v3-8, dw_notifications_aoa_v3-7] Added partitions (assigned - owned): [] Revoked partitions (owned - assigned): [dw_notifications_aoa_v3-9] "timestamp":"2022-03-25T17:25:33.721Z" Updating assignment with Assigned partitions: [dw_notifications_aoa_v3-4, dw_notifications_aoa_v3-2, dw_notifi...
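[Editor's note] For reference, the plain-Java equivalent of the setting under discussion (the thread configures it through the fs2-kafka wrapper; broker address and group name below are placeholders):

    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.CooperativeStickyAssignor;

    public class AssignorConfig {
        public static Properties consumerProps() {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "notifications-app");
            // Cooperative rebalancing: members keep the partitions they already
            // own and only revoke the ones that actually move, instead of the
            // stop-the-world revoke-everything behavior of the eager protocol.
            props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG,
                    CooperativeStickyAssignor.class.getName());
            return props;
        }
    }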

Python client failed to connect secured Kafka: SSL handshake failed: error:1408F10B:SSL routines:ssl3_get_record:wrong version number

Hi Kafka Team, Recently I moved a Kafka cluster from CentOS 8 to Ubuntu Server 20.04: same Kafka version (2.13-3.0.0), same Kafka configuration (see below), same JDK (openjdk-11-jdk) on the server, but the Python client fails to connect. # SASL-SSL security.inter.broker.protocol=SASL_SSL sasl.enabled.mechanisms=SCRAM-SHA-512 sasl.mechanism.inter.broker.protocol=SCRAM-SHA-512 ssl.client.auth=required ssl.endpoint.identification.algorithm= ssl.keystore.location=/data/ssl/2022-03-25/kafka.server.keystore.jks ssl.keystore.password=sasl_ssl ssl.key.password=sasl_ssl ssl.truststore.location=/data/ssl/2022-03-25/kafka.server.truststore.jks ssl.truststore.password=sasl_ssl ssl.enabled.protocols=TLSv1.2,TLSv1.1,TLSv1 ssl.truststore.type=JKS ssl.keystore.type=JKS I created the client JKS files and converted the CA cert to a PEM for Python; my Java application can send/receive messages to/from Kafka successfully. keytool -keystore kafka.truststore.jks -alias CARoot -import -file ca-cert -storepass sasl_s...
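[Editor's note] A "wrong version number" handshake error usually means the two sides disagree on the wire protocol, e.g. a TLS client talking to a listener that answers in plaintext, or the reverse. As a hedged sketch (credentials and paths are placeholders), the client-side settings a working Java client would typically use against the broker config above look like this; it's worth confirming the Python client sets the equivalent of every one of them rather than defaulting to PLAINTEXT:

    # client.properties (sketch)
    security.protocol=SASL_SSL
    sasl.mechanism=SCRAM-SHA-512
    sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \
      username="alice" password="alice-secret";
    ssl.truststore.location=/path/to/kafka.client.truststore.jks
    ssl.truststore.password=changeit
    # The broker sets ssl.client.auth=required, so the client needs a keystore too
    ssl.keystore.location=/path/to/kafka.client.keystore.jks
    ssl.keystore.password=changeit
    ssl.key.password=changeit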

Kraft - Adding new controllers in production

Hello! I'm wondering if there's a right way to add new controllers to an existing cluster without downtime. I've tried the following: I have three controllers joined in a cluster, then one by one I change the configuration to 4 voters, stop the controller, delete quorum-state, and start the controller. When this is done for all 3 existing controllers, I start the fourth one. In most of my experiments there is consensus on who is the leader after the 4th controller is up, but there were some cases where a leader couldn't be elected and I had to restart each controller again. Although the leader was elected after the transition, I'm not sure whether the cluster temporarily lost the ability to elect a leader during the transition. I think this problem should go away if there are at least 5 controllers to begin with. Then I do the same thing but change the configurations to 5 voters. When I run the fifth one it can't seem to figure out who ...
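[Editor's note] For reference, the voter list being edited in the procedure above looks roughly like this (host names, ports, and node ids are placeholders). At the time of this thread the KRaft quorum was statically configured, with no dynamic add/remove of voters, and every voter must be restarted with an identical list, which is why each existing controller has to be touched:

    # controller.properties (sketch) -- all controllers must agree on this list
    process.roles=controller
    node.id=4
    controller.quorum.voters=1@ctrl1:9093,2@ctrl2:9093,3@ctrl3:9093,4@ctrl4:9093
    listeners=CONTROLLER://:9093
    controller.listener.names=CONTROLLER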

Re: Running multiple MM2 instances

Hi Julia, Sounds like KAFKA-9981 [1]. This is a known issue with MirrorMaker 2 that impacts horizontal scalability and has not yet been addressed. There is some work in progress to fix this issue [2], but the effort hasn't received much attention to date. There may be other issues as well, but until KAFKA-9981 is addressed, running MirrorMaker 2 in a multi-node cluster will be at best difficult and, at worst, impossible. [1] - https://issues.apache.org/jira/browse/KAFKA-9981 [2] - https://cwiki.apache.org/confluence/display/KAFKA/KIP-710%3A+Full+support+for+distributed+mode+in+dedicated+MirrorMaker+2.0+clusters Cheers, Chris On Wed, Mar 23, 2022 at 11:15 AM Kalimova, Julia <Julia.Kalimova@blackrock.com.invalid> wrote: > > > Hello! > > > > I'm wondering if it's possible to scale up MM2 by running multiple > instances per data center? > > For scalability purposes, I would like to run 2 instances of > con...

Running multiple MM2 instances

Hello! I'm wondering if it's possible to scale up MM2 by running multiple instances per data center? For scalability purposes, I would like to run 2 instances of connect-mirror-maker.sh in one data center. However, I cannot get 2 instances of MirrorMaker to work at the same time: once I start up the second MirrorMaker, it takes over from the first one, and the first one completely stops replicating; i.e., rather than scaling out and rebalancing, the whole workload is still handled by a single instance. What am I missing here? I would greatly appreciate any guidance with this! Thank you!

Re: Transactions and `endOffsets` Java client consumer method

Hi Chris, Since you are using read_committed mode, the txn marker should indeed be skipped by `endOffsets()`, whether for committed or aborted txns. For example, if the log looks like this: offsets: 0, 1, 2, 3, t_c(4) // or t_a(4), which means "abort marker" then the endOffset should return "5". That's why I'm unclear about why you're seeing this issue. Guozhang On Tue, Mar 22, 2022 at 6:19 AM Chris Jansen < chris.jansen@permutive.com > wrote: > Thanks Luke, > > If the transaction marker should be hidden, does it follow that an aborted > transaction at the end of the log should also be hidden from clients that > are in read_committed mode? > > Happy to do a KIP/PR. > > Thanks again, > > Chris > > On Tue, Mar 22, 2022 at 10:21 AM Luke Chen < showuon@gmail.com > wrote: > > > Hi Chris, > > > > Yes, the transaction marker should be hidden to clie...
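[Editor's note] A minimal sketch of the access pattern under discussion, against a hypothetical single-partition topic: a read_committed consumer comparing its position to endOffsets(). Following the example above, with records at offsets 0-3 and a transaction marker at 4, endOffsets() reports 5 even though poll() never delivers offset 4 itself:

    import java.time.Duration;
    import java.util.List;
    import java.util.Map;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class EndOffsetCheck {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "end-offset-check");
            props.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed");
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                TopicPartition tp = new TopicPartition("txn-topic", 0); // hypothetical topic
                consumer.assign(List.of(tp));
                consumer.poll(Duration.ofSeconds(1)); // fetch committed records only

                // With records at 0..3 and a commit/abort marker at 4, the thread's
                // expectation is endOffsets() == 5, while poll() never surfaces
                // offset 4 (the marker) to the application.
                Map<TopicPartition, Long> end = consumer.endOffsets(List.of(tp));
                long position = consumer.position(tp);
                System.out.printf("position=%d endOffset=%d caughtUp=%b%n",
                        position, end.get(tp), position >= end.get(tp));
            }
        }
    }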

Re: Transactions and `endOffsets` Java client consumer method

Thanks Luke, If the transaction marker should be hidden, does it follow that an aborted transaction at the end of the log should also be hidden from clients that are in read_committed mode? Happy to do a KIP/PR. Thanks again, Chris On Tue, Mar 22, 2022 at 10:21 AM Luke Chen < showuon@gmail.com > wrote: > Hi Chris, > > Yes, the transaction marker should be hidden from clients. > There are similar issues reported: > https://issues.apache.org/jira/browse/KAFKA-10683 > You're welcome to submit a KIP/PR to improve it. > > Thank you. > Luke > > > On Tue, Mar 22, 2022 at 5:16 PM Chris Jansen < chris.jansen@permutive.com > > wrote: > > > Hi Guozhang, > > > > Sorry, I should have been more clear. By "an unreadable end of the log", I > > mean the `endOffsets` method returns an offset for a record that is never > > surfaced to the caller of `poll`. > > > > I...