

Showing posts from March, 2019

Re: Need help to find references to antipatterns/pitfalls/incorrect ways to use Kafka

Yes, but you have more than 1 POS terminal per location so you still don't need 20,000 partitions. Just one per location. How many locations do you have? It doesn't matter anyway since you can build a Kafka cluster with up to 200,000 partitions if you use the latest versions of Kafka. https://blogs.apache.org/kafka/entry/apache-kafka-supports-more-partitions "As a rule of thumb, we recommend each broker to have up to 4,000 partitions and each cluster to have up to 200,000 partitions" -hans > On Apr 1, 2019, at 2:02 AM, Alexander Kuterin < akuterin@gmail.com > wrote: > > Thanks, Hans! > We use location specific SKU pricing and send specific price lists to the > specific POS terminal. > > On Mon, Apr 1, 2019 at 3:01, Hans Jespersen < hans@confluent.io > wrote: > >> Doesn't every one of the 20,000 POS terminals want to get the same price >> list messages? If so then there is no need for 20,000 partitio...
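A minimal sketch of the keying approach discussed above (topic name, location ID, broker address, and serializers are all assumptions): publishing every price list with the location ID as the record key keeps per-location ordering inside a single topic, without needing one partition per POS terminal.

```
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class PriceListProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");            // assumed broker address
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            String locationId = "store-0042";                         // hypothetical location key
            String priceList = "{\"sku\":\"A1\",\"price\":9.99}";     // hypothetical payload
            // Records sharing a key go to the same partition, so price lists
            // for one location are delivered in order to that location's agent.
            producer.send(new ProducerRecord<>("price-lists", locationId, priceList));
        }
    }
}
```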

Re: Need help to find references to antipatterns/pitfalls/incorrect ways to use Kafka

Thanks, Hans! We use location specific SKU pricing and send specific price lists to the specific POS terminal. On Mon, Apr 1, 2019 at 3:01, Hans Jespersen < hans@confluent.io > wrote: > Doesn't every one of the 20,000 POS terminals want to get the same price > list messages? If so then there is no need for 20,000 partitions. > > -hans > > > On Mar 31, 2019, at 7:24 PM, < akuterin@gmail.com > < akuterin@gmail.com > > wrote: > > > > Hello! > > > > > > > > I ask for your help in connection with my recent task: > > > > - Price lists are delivered to 20,000 points of sale with a frequency of > <10 > > price lists per day. > > > > - The order in which the price lists follow is important. It is also > > important that the price lists are delivered to the point of sale online. > > > > - At each point of sale, an agent application is depl...

Re: Need help to find references to antipatterns/pitfalls/incorrect ways to use Kafka

Yes, we have location specific SKU pricing. On Mon, Apr 1, 2019 at 4:08, Steve Howard < steve.howard@confluent.io > wrote: > Do you have location specific SKU pricing? > > On Sun, Mar 31, 2019, 7:25 PM < akuterin@gmail.com wrote: > > > Hello! > > > > > > > > I ask for your help in connection with my recent task: > > > > - Price lists are delivered to 20,000 points of sale with a frequency of > > <10 > > price lists per day. > > > > - The order in which the price lists follow is important. It is also > > important that the price lists are delivered to the point of sale online. > > > > - At each point of sale, an agent application is deployed, which > processes > > the received price lists. > > > > > > > > This task is not particularly difficult. Help in solving the task is not > > required. > > > > >...

Re: Need help to find references to antipatterns/pitfalls/incorrect ways to use Kafka

Do you have location specific SKU pricing? On Sun, Mar 31, 2019, 7:25 PM < akuterin@gmail.com wrote: > Hello! > > > > I ask for your help in connection with my recent task: > > - Price lists are delivered to 20,000 points of sale with a frequency of > <10 > price lists per day. > > - The order in which the price lists follow is important. It is also > important that the price lists are delivered to the point of sale online. > > - At each point of sale, an agent application is deployed, which processes > the received price lists. > > > > This task is not particularly difficult. Help in solving the task is not > required. > > > > The difficulty is that Kafka in our company is a new "silver bullet", and > the project manager requires me to implement the following technical > decision: > > deploy 20,000 Kafka consumer instances (one instance for each poi...

Re: Need help to find references to antipatterns/pitfalls/incorrect ways to use Kafka

Doesn't every one of the 20,000 POS terminals want to get the same price list messages? If so then there is no need for 20,000 partitions. -hans > On Mar 31, 2019, at 7:24 PM, < akuterin@gmail.com > < akuterin@gmail.com > wrote: > > Hello! > > > > I ask for your help in connection with my recent task: > > - Price lists are delivered to 20,000 points of sale with a frequency of <10 > price lists per day. > > - The order in which the price lists follow is important. It is also > important that the price lists are delivered to the point of sale online. > > - At each point of sale, an agent application is deployed, which processes > the received price lists. > > > > This task is not particularly difficult. Help in solving the task is not > required. > > > > The difficulty is that Kafka in our company is a new "silver bullet", and > the p...

Re: Need help to find references to antipatterns/pitfalls/incorrect ways to use Kafka

I don't want to be a downer, but because Kafka is relatively new, the reference material you seek probably doesn't exist. Kafka is flexible and can be made to work in many different scenarios, not all of them ideal. It sounds like you've already reached a conclusion that Kafka is the wrong solution for your requirements. Please share with us the evidence that you used to reach this conclusion. It would be helpful if you described the technical problems you encountered in your experiments so that others can give their opinion on whether they can be resolved or whether they are deal-breakers. -- Peter > On Mar 31, 2019, at 4:24 PM, < akuterin@gmail.com > < akuterin@gmail.com > wrote: > > Hello! > > > > I ask for your help in connection with my recent task: > > - Price lists are delivered to 20,000 points of sale with a frequency of <10 > price lists per day. > > - The order in which the price lis...

Need help to find references to antipatterns/pitfalls/incorrect ways to use Kafka

Hello! I ask for your help in connection with my recent task: - Price lists are delivered to 20,000 points of sale with a frequency of <10 price lists per day. - The order in which the price lists follow is important. It is also important that the price lists are delivered to the point of sale online. - At each point of sale, an agent application is deployed, which processes the received price lists. This task is not particularly difficult. Help in solving the task is not required. The difficulty is that Kafka in our company is a new "silver bullet", and the project manager requires me to implement the following technical decision: deploy 20,000 Kafka consumer instances (one instance for each point of sale) for one topic partitioned into 20,000 partitions - one partition per consumer. The technical problems observed in experiments with this approach do not convince him. Please give me references to the book...
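For illustration only, a hedged sketch of the consumer side that the replies in this thread point towards (topic name, location key, and group naming are assumptions, not part of the original question): each POS agent runs an ordinary consumer in its own group and keeps only the records keyed with its own location, instead of relying on 20,000 partitions. The trade-off is that every agent receives all price lists and filters client-side.

```
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class PosAgent {
    public static void main(String[] args) {
        String myLocation = "store-0042";                             // hypothetical location ID
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");             // assumed broker address
        props.put("group.id", "pos-agent-" + myLocation);             // one group per terminal
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("price-lists"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    if (myLocation.equals(record.key())) {            // keep only this location's price lists
                        System.out.println("apply price list: " + record.value());
                    }
                }
            }
        }
    }
}
```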

Fwd: leader none, with only one replica and no ISR

Hi all, I would like to know whether nobody is able to answer the several questions I have asked over the past few months, or whether I have just been banned from the mailing list ... thanks a lot & best regards, Adrien ---------- Forwarded message --------- From: Adrien Ruffie < adryen31@gmail.com > Date: Thu, Mar 28, 2019 at 12:18 Subject: leader none, with only one replica and no ISR To: < users@kafka.apache.org > Hello all, since yesterday several of my topics have the following description: ./kafka-topics.sh --zookeeper ctl1.edeal.online:2181 --describe | grep -P "none" !2032 Topic: edeal_cell_dev Partition: 0 Leader: none Replicas: 5 Isr: Topic: edeal_number_dev Partition: 0 Leader: none Replicas: 8 Isr: Topic: edeal_timeup_dev Partition: 0 Leader: none Replicas: 8 Isr: No leader, only one replica, and no ISR ... I tried to delete it with --delete from kafka-topics.sh, but no...

Re: Kafka console consumer group

Hi, auto offset commit is enabled by default in the consumer. When auto offset commit is enabled, the consumer commits the offset (based on the auto.offset.reset config, default: latest) when it first joins the group. Thanks, Manikumar On Fri, Mar 29, 2019 at 8:33 PM Sharmadha Sainath < sharmadha94@gmail.com > wrote: > Hi all, > > > I have a basic question regarding kafka consumer groups. I will detail it > with an example : > > > 1.I run a console-producer on a topic and produce messages : > > > kafka-console-producer.sh --broker-list localhost:9092 --topic first_topic > > >message1 > > >message2 > > > > 2.I run console-consumer with --group first_group without --from-beginning > > > kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic > first_topic --group first_group > > > (no messages come) > > > > 3. when I described first_group...
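A small sketch of the mechanism Manikumar describes (broker address, group and topic names are assumptions): for a group with no committed offsets, the starting position comes from auto.offset.reset, and setting it to "earliest" is roughly the programmatic analogue of --from-beginning.

```
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class OffsetResetExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "first_group");
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());
        // Default is "latest": a brand-new group starts at the log end and, with
        // auto commit enabled (also the default), commits that position after it
        // joins, which is why the group can show an offset without reading anything.
        // "earliest" starts a new group from the beginning of the log instead.
        props.put("auto.offset.reset", "earliest");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("first_topic"));
            consumer.poll(Duration.ofSeconds(5))
                    .forEach(r -> System.out.println(r.value()));
        }
    }
}
```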

Re: Kafka console consumer group

You can specify an offset to consume from with --offset; it defaults to latest if not provided. On Sat, 30 Mar. 2019, 4:03 am Sharmadha Sainath, < sharmadha94@gmail.com > wrote: > Hi all, > > > I have a basic question regarding kafka consumer groups. I will detail it > with an example : > > > 1.I run a console-producer on a topic and produce messages : > > > kafka-console-producer.sh --broker-list localhost:9092 --topic first_topic > > >message1 > > >message2 > > > > 2.I run console-consumer with --group first_group without --from-beginning > > > kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic > first_topic --group first_group > > > (no messages come) > > > > 3. when I described first_group using kafka-consumer-group, I see the > current offset is 2 > > > > Now I run console-consumer with --group second_group ...
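As a rough programmatic analogue of the --offset flag (topic, partition, offset, and broker address are assumptions), a consumer can also assign a partition explicitly and seek to a specific offset before polling:

```
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

public class SeekToOffset {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());
        props.put("enable.auto.commit", "false");   // no group management needed with assign()

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            TopicPartition tp = new TopicPartition("first_topic", 0);
            consumer.assign(Collections.singletonList(tp));
            // Equivalent of --partition 0 --offset 0 (or --from-beginning) for this partition.
            consumer.seek(tp, 0L);
            consumer.poll(Duration.ofSeconds(5))
                    .forEach(r -> System.out.println(r.value()));
        }
    }
}
```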

Kafka console consumer group

Hi all, I have a basic question regarding Kafka consumer groups. I will detail it with an example : 1. I run a console-producer on a topic and produce messages : kafka-console-producer.sh --broker-list localhost:9092 --topic first_topic >message1 >message2 2. I run console-consumer with --group first_group without --from-beginning kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic first_topic --group first_group (no messages come) 3. When I described first_group using kafka-consumer-group, I see the current offset is 2. Now I run console-consumer with --group second_group with --from-beginning and I see the messages : kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic first_topic --group second_group --from-beginning message1 message2 My question here is: how, with first_group, based on the --from-beginning flag, is the current offset set to 2 without actually reading any messages?...

Re: Offsets of deleted consumer groups do not get deleted correctly

Hi, You should keep the compact policy for the topic __consumer_offsets. This topic stores, for each group/topic/partition, the offset consumed. As only the latest message for a group/topic/partition is relevant, the compact policy will keep only this message. When you delete a group, it actually produces a tombstone to this topic (i.e. a message with a NULL body). Then, when log compaction runs, it will permanently remove the tombstone. But for the tombstones to actually be deleted, keep in mind: * compaction runs only on rolled-out segments * deletion of a tombstone only occurs once the delete.retention.ms delay has expired Best regards On Fri, Mar 29, 2019 at 2:16 PM Claudia Wegmann < c.wegmann@kasasi.de > wrote: > Hey there, > > I've got the problem that the "__consumer_offsets" topic grows pretty big > over time. After some digging, I found offsets for consumer groups that > were deleted a long time ago still being present in t...
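A hedged sketch of how one might verify these settings on __consumer_offsets with the AdminClient (the broker address is an assumption; the config names are the standard topic configs mentioned above):

```
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.Config;
import org.apache.kafka.common.config.ConfigResource;

public class OffsetsTopicConfig {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");            // assumed broker address

        try (AdminClient admin = AdminClient.create(props)) {
            ConfigResource topic =
                new ConfigResource(ConfigResource.Type.TOPIC, "__consumer_offsets");
            Config config = admin.describeConfigs(Collections.singleton(topic))
                                 .all().get().get(topic);
            // cleanup.policy should stay "compact"; tombstones for deleted groups are
            // only removed after the segment rolls and delete.retention.ms has passed.
            System.out.println("cleanup.policy      = " + config.get("cleanup.policy").value());
            System.out.println("delete.retention.ms = " + config.get("delete.retention.ms").value());
            System.out.println("segment.ms          = " + config.get("segment.ms").value());
        }
    }
}
```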

Offsets of deleted consumer groups do not get deleted correctly

Hey there, I've got the problem that the "__consumer_offsets" topic grows pretty big over time. After some digging, I found offsets for consumer groups that were deleted a long time ago still being present in the topic. Many of them are offsets for console consumers, that have been deleted with "kafka-consumer-groups.sh --delete --group ...". As far as I understand log cleaning, those offsets should have been deleted a long time ago, because these consumers are no longer active. When I query "kafka-consumer-groups.sh --bootstrap-server ... --list" I don't see those consumers either. Is there a bug in "kafka-consumer-groups.sh --delete --group ..." that lets Kafka hang on to those consumer groups? How can I get the log cleaner to delete these old offsets? Is there another way than setting "cleanup.policy" to "delete"? Thanks for your help! Best, Claudia

leader none, with only one replica and no ISR

Hello all, since yesterday several of my topics have the following description: ./kafka-topics.sh --zookeeper ctl1.edeal.online:2181 --describe | grep -P "none" !2032 Topic: edeal_cell_dev Partition: 0 Leader: none Replicas: 5 Isr: Topic: edeal_number_dev Partition: 0 Leader: none Replicas: 8 Isr: Topic: edeal_timeup_dev Partition: 0 Leader: none Replicas: 8 Isr: No leader, only one replica, and no ISR ... I tried to delete them with --delete from kafka-topics.sh, but nothing changed. After that I tried to follow this: https://medium.com/@contactsunny/manually-delete-apache-kafka-topics-424c7e016ff3 but to no avail ... the /brokers/topics/edeal_cell{number/timeup}_dev entries are still there, but without partitions ... I've run out of ideas ... could someone please help me? Thanks a lot. Adrian
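One hedged way to narrow this down (the bootstrap address and port are assumptions): "Leader: none" with an empty ISR usually means the partition's only replica lives on a broker that is not currently registered in the cluster, which can be checked by listing the live brokers and comparing against the replica IDs (5 and 8) shown above.

```
import java.util.Collection;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.common.Node;

public class LiveBrokers {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "ctl1.edeal.online:9092");    // assumed broker address/port

        try (AdminClient admin = AdminClient.create(props)) {
            Collection<Node> nodes = admin.describeCluster().nodes().get();
            // If broker 5 or 8 is not in this list, the partitions that name it as
            // their only replica cannot elect a leader (Leader: none, empty ISR).
            for (Node n : nodes) {
                System.out.println("live broker id: " + n.id());
            }
        }
    }
}
```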

KafkaConsumer.poll() and exceptions

Hi all From reading the javadoc I've made the assumption that all exceptions thrown by KafkaConsumer.poll() are unrecoverable. What exactly does this mean with regards to the consumer instance itself? Can it be used again in any way e.g. close then call subscribe again? Or do you need to replace it with a new consumer instance? Thanks Mark
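Not an authoritative answer, but a common pattern sketched under the assumption stated in the question (exceptions from poll() other than WakeupException are unrecoverable): close the failed instance and construct a fresh KafkaConsumer instead of calling subscribe()/poll() on it again. Broker address, group, and topic are placeholders.

```
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.KafkaException;
import org.apache.kafka.common.errors.WakeupException;
import org.apache.kafka.common.serialization.StringDeserializer;

public class PollLoop {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "example-group");
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        try {
            consumer.subscribe(Collections.singletonList("some-topic"));
            while (true) {
                consumer.poll(Duration.ofSeconds(1))
                        .forEach(r -> System.out.println(r.value()));
            }
        } catch (WakeupException e) {
            // Expected when another thread calls consumer.wakeup() to shut down.
        } catch (KafkaException e) {
            // Treated here as unrecoverable: do not reuse this instance; close it
            // and create a fresh KafkaConsumer if processing should continue.
        } finally {
            consumer.close();
        }
    }
}
```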

Re: Kafka avro producer

Hi Karishma, you can definitely use Avro without the Confluent schema registry. Just write your own serializer/deserializer. However, you need to share the Avro schema version between your producer and consumer somehow, and also think about changes to your Avro schema. Greets Valentin Sent from my iPad > On 27.03.2019 at 07:28, KARISHMA MALIK < karishma39malik@gmail.com > wrote: > > On Wed 27 Mar, 2019, 11:57 AM KARISHMA MALIK, < karishma39malik@gmail.com > > wrote: > >> Hi Team >> Is there any possible method to use apache Kafka avro producer without >> using schema registry ? >> >> Thanks >> Karishma >> M:7447426338 >>
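A minimal sketch of such a hand-rolled serializer (class name and error handling are assumptions): it writes plain Avro binary with no registry involved, so producer and consumer must share the schema out of band, e.g. via a common .avsc file, which is exactly the schema-distribution concern raised above.

```
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Map;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.EncoderFactory;
import org.apache.kafka.common.errors.SerializationException;
import org.apache.kafka.common.serialization.Serializer;

// Writes plain Avro binary with no schema registry. Producer and consumer must
// agree on the Schema out of band, and schema evolution has to be managed manually.
public class PlainAvroSerializer implements Serializer<GenericRecord> {

    @Override
    public void configure(Map<String, ?> configs, boolean isKey) { }

    @Override
    public byte[] serialize(String topic, GenericRecord record) {
        if (record == null) {
            return null;
        }
        try (ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            Schema schema = record.getSchema();
            BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
            new GenericDatumWriter<GenericRecord>(schema).write(record, encoder);
            encoder.flush();
            return out.toByteArray();
        } catch (IOException e) {
            throw new SerializationException("Failed to serialize Avro record", e);
        }
    }

    @Override
    public void close() { }
}
```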

Re: Kafka avro producer

Not really possible, as the producer assumes you are using the schema registry. You can use Avro for the (de)serialisation in some other way, but you need to create (de)serializers that fit that approach. On Wed, Mar 27, 2019 at 17:33, lsroudi abdel < lsroudi@gmail.com > wrote: > It depends on your use case; you could push your schema with the first > message and get it on the other side > > On Wed, Mar 27, 2019 at 17:15, KARISHMA MALIK < karishma39malik@gmail.com > > wrote: > > > On Wed 27 Mar, 2019, 11:57 AM KARISHMA MALIK, < karishma39malik@gmail.com > > > > wrote: > > > > > Hi Team > > > Is there any possible method to use apache Kafka avro producer without > > > using schema registry ? > > > > > > Thanks > > > Karishma > > > M:7447426338 > > > > > >

Re: Kafka avro producer

Ideally, we first need to provision our schema on the cluster statically; only after that can the put operation be performed. Is dynamic schema provisioning supported in Kafka? Not sure... I guess not. On Wed, Mar 27, 2019, 22:03 lsroudi abdel < lsroudi@gmail.com > wrote: > It depends on your use case; you could push your schema with the first > message and get it on the other side > > On Wed, Mar 27, 2019 at 17:15, KARISHMA MALIK < karishma39malik@gmail.com > > wrote: > > > On Wed 27 Mar, 2019, 11:57 AM KARISHMA MALIK, < karishma39malik@gmail.com > > > > wrote: > > > > > Hi Team > > > Is there any possible method to use apache Kafka avro producer without > > > using schema registry ? > > > > > > Thanks > > > Karishma > > > M:7447426338 > > > > > >

Re: Kafka avro producer

It depends on your use case; you could push your schema with the first message and get it on the other side. On Wed, Mar 27, 2019 at 17:15, KARISHMA MALIK < karishma39malik@gmail.com > wrote: > On Wed 27 Mar, 2019, 11:57 AM KARISHMA MALIK, < karishma39malik@gmail.com > > wrote: > > > Hi Team > > Is there any possible method to use apache Kafka avro producer without > > using schema registry ? > > > > Thanks > > Karishma > > M:7447426338 > > >

Re: Kafka Streams upgrade.from config while upgrading to 2.1.0

Hello Anirudh, The config `upgrade.from` is recommended for a safe and smooth upgrade. In your case it is possible that, when rolling-bouncing the instances, the first upgraded instance happens to be the leader of the group, and hence even without the config it can recognize the other instances; but if you are unlucky and the leader is bounced later, then it would not be able to recognize the other instances, causing it to crash. In other words, setting this config is the safe choice, but it does not mean you are doomed to fail the upgrade if you do not do it this way. Guozhang On Tue, Mar 26, 2019 at 10:38 PM Anirudh Vyas < anirudh2403@gmail.com > wrote: > Hey, > We run 3 instances. > > Anirudh > > On Tue, Mar 26, 2019 at 9:28 PM Matthias J. Sax < matthias@confluent.io > > wrote: > > > Not sure. How many instances do you run? If it's only one, you don't > > need the config. > > > > -...
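For reference, a hedged sketch of where the config goes during the first rolling bounce of a 1.1 -> 2.1.0 upgrade (application id and bootstrap servers are taken from the log excerpt quoted in this thread and are otherwise placeholders); after every instance runs 2.1.0, a second bounce without upgrade.from completes the upgrade, as the linked upgrade guide describes.

```
import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

public class UpgradeFromExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "application_id");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "bootstrap-server-01:6667");
        // First rolling bounce: keep sending 1.1-compatible rebalance metadata so an
        // already-upgraded instance is still understood if it becomes the group leader.
        props.put(StreamsConfig.UPGRADE_FROM_CONFIG, StreamsConfig.UPGRADE_FROM_11);
        // Second rolling bounce, once every instance runs 2.1.0: remove upgrade.from.
    }
}
```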

Re: Kafka avro producer

On Wed 27 Mar, 2019, 11:57 AM KARISHMA MALIK, < karishma39malik@gmail.com > wrote: > Hi Team > Is there any possible method to use apache Kafka avro producer without > using schema registry ? > > Thanks > Karishma > M:7447426338 >

Re: Kafka Streams upgrade.from config while upgrading to 2.1.0

Hey, We run 3 instances. Anirudh On Tue, Mar 26, 2019 at 9:28 PM Matthias J. Sax < matthias@confluent.io > wrote: > Not sure. How many instances do you run? If it's only one, you don't > need the config. > > -Matthias > > On 3/26/19 5:17 AM, Anirudh Vyas wrote: > > Hi, > > I am in the process of upgrading my Kafka streams services from 1.1 to > > 2.1.0. I am following the upgrade guide: > > https://kafka.apache.org/20/documentation/streams/upgrade-guide . > > > > My service is running on kafka version 2.0 and using kafka streams > 1.1.1. I > > updated my kafka-streams to 2.1.0 but DID NOT pass the config value > > `upgrade.from` (it is null), as can be verified from the logs > > ``` > > [INFO ] 2019-03-26 18:05:49,550 [main] > > org.apache.kafka.streams.StreamsConfig:logAll: StreamsConfig values: > > application.id = application_id > ...

Re: Proxying the Kafka protocol

MG> That depends on the underlying protocol you are attempting to proxy (see below) ________________________________ From: James Grant < james@queeg.org > Sent: Monday, March 25, 2019 1:21 PM To: users@kafka.apache.org Subject: Re: Proxying the Kafka protocol Thank you all. We have in the past exposed message streams backed by Kafka via an HTTP/POST and Websocket service which worked very well. We were able to filter messages based on schema compliance and it was very simple for the teams that generate the data to use. It also had no trouble scaling to the 100K messages/sec level. However, not exposing the Kafka protocol has its drawbacks when you try to bring in other tools and teams who are already familiar with Kafka. So we looked for something that would provide: * Native Kafka protocol support MG> If the protocol you are trying to proxy is TCP/IP, then try the Juniper TCP/IP proxy: MG> https://www.juniper.net/documentation/en_US/junos/t...

Re: Tracking progress for messages generated by a batch process

Thanks all for the feedback and suggestions. @Henn I did consider something like your suggestion, though it brings a couple of challenges: 1) Not all messages emitted from batch job will make it to streaming app 3, some will be filtered out (granted you could change all applications to not filter messages and instead send on some kind of empty message) 2) You could still solve the problem, but would need some tracking at each stage. As you say it means you need some other datastore for tracking, and need to call when processing each message. Overall I think it would be quite invasive, requiring code changes in each component, and further code changes if the pipeline had additional stages added. Definitely still do-able, but I was leaning towards a monitoring component as it could be more generic, and extending it to other pipelines or apps would require only configuration changes @星月夜 I did also consider your suggestion, but you would need to add metadata to the last...

Re: [ANNOUNCE] Apache Kafka 2.2.0

Congratulations on this amazing release! Lots of cool new features :) I've also released a YouTube video that will hopefully help the community get up to speed: https://www.youtube.com/watch?v=kaWbp1Cnfo4&t=5s Happy watching! On Tue, Mar 26, 2019 at 7:02 PM Matthias J. Sax < mjsax@apache.org > wrote: > The Apache Kafka community is pleased to announce the release for Apache > Kafka 2.2.0 > > - Added SSL support for custom principal name > - Allow SASL connections to periodically re-authenticate > - Command line tool bin/kafka-topics.sh adds AdminClient support > - Improved consumer group management > - default group.id is `null` instead of empty string > - API improvement > - Producer: introduce close(Duration) > - AdminClient: introduce close(Duration) > - Kafka Streams: new flatTransform() operator in Streams DSL > - KafkaStreams (and other classed) now implement AutoClosable to > ...

[ANNOUNCE] Apache Kafka 2.2.0

The Apache Kafka community is pleased to announce the release for Apache Kafka 2.2.0 - Added SSL support for custom principal name - Allow SASL connections to periodically re-authenticate - Command line tool bin/kafka-topics.sh adds AdminClient support - Improved consumer group management - default group.id is `null` instead of empty string - API improvement - Producer: introduce close(Duration) - AdminClient: introduce close(Duration) - Kafka Streams: new flatTransform() operator in Streams DSL - KafkaStreams (and other classes) now implement AutoCloseable to support try-with-resources - New Serdes and default method implementations - Kafka Streams exposed internal client.id via ThreadMetadata - Metric improvements: All `-min`, `-avg` and `-max` metrics will now output `NaN` as default value All of the changes in this release can be found in the release notes: https://www.apache.org/dist/kafka/2.2.0/RELEASE_NOTES.html You can downl...
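As a small illustration of one of the items above (broker address, topic, and serializers are assumptions), the new Duration-based producer close listed in the 2.2.0 notes looks roughly like this:

```
import java.time.Duration;
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class CloseWithDuration {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");            // assumed broker address
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        producer.send(new ProducerRecord<>("some-topic", "hello"));  // hypothetical topic
        // New in 2.2 per the release notes: a Duration-based close overload.
        producer.close(Duration.ofSeconds(5));
    }
}
```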

Re: Kafka Streams upgrade.from config while upgrading to 2.1.0

Not sure. How many instances do you run? If it's only one, you don't need the config. -Matthi...

Re: KafkaStreams backoff for non-existing topic

Thank you guys for your response. That was very helpful. With that in mind, and with some discussions with the team, we decided to let the application die, so we can monitor and relaunch it externally. I think we had something similar to what is described in KAFKA-7970, where our streams application was wrapped by another application, so the application would not die. But we are going to move away from that approach and launch the streams application as a separate process anyway, so that won't be an issue anymore. Thanks again. Murilo On Mon, 25 Mar 2019 at 22:14, Guozhang Wang < wangguoz@gmail.com > wrote: > Hi Patrik, > > Are you referring to the source topic not available or state directory not > available? > > For source topic not available case, as Murilo asked about, the main issue > is of the "split brain" case: in very early versions of Kafka, Streams does > check if topics are available at the beginning before ...

Kafka Streams upgrade.from config while upgrading to 2.1.0

Hi, I am in the process of upgrading my Kafka streams services from 1.1 to 2.1.0. I am following the upgrade guide: https://kafka.apache.org/20/documentation/streams/upgrade-guide . My service is running on kafka version 2.0 and using kafka streams 1.1.1. I updated my kafka-streams to 2.1.0 but DID NOT pass the config value `upgrade.from` (it is null), as can be verified from the logs ``` [INFO ] 2019-03-26 18:05:49,550 [main] org.apache.kafka.streams.StreamsConfig:logAll: StreamsConfig values: application.id = application_id application.server = bootstrap.servers = [bootstrap-server-01:6667, bootstrap-server-02:6667] buffered.records.per.partition = 10000 cache.max.bytes.buffering = 10485760 client.id = commit.interval.ms = 15000 connections.max.idle.ms = 540000 default.deserialization.exception.handler = class org.apache.kafka.streams.errors.LogAndFailExceptionHandler ...

Re: KafkaStreams backoff for non-existing topic

Hi Patrik, Are you referring to the source topic not being available, or the state directory not being available? For the source-topic-not-available case, as Murilo asked about, the main issue is the "split brain" case: in very early versions of Kafka, Streams does check if topics are available at the beginning before starting up, but then there is an observation that when multiple instances of a Streams app are starting up at the same time, depending on 1) concurrent edge cases if topics are created just before / around the same time that the Streams instances are started, and 2) which broker they talk to to refresh their metadata, they may get different answers and hence behave differently, i.e. some move on starting normally while some others report an error and shut down. So what we did in KAFKA-5037 is that only one instance, the leader of the corresponding consumer group, will make the decision. And if that leader decides that not all topics are available, it will pr...

Re: KafkaStreams backoff for non-existing topic

Hi Guozhang Just a small question, why can't this be checked when trying to instantiate KafkaStreams? The Topology should know all topics, and the existence of the topics could be verified with the AdminClient. This would allow failing fast, similar to when the state directory is not available. Or am I missing something? best regards Patrik On Mon, 25 Mar 2019 at 23:15, Guozhang Wang < wangguoz@gmail.com > wrote: > Hello Murilo, > > Just to give some more background to John's message and KAFKA-7970 here. > The main reason of trickiness is around the scenario of "topics being > partially available", e.g. say your application is joining to topics A and > B, while topicA exists but topicB does not (it is surprisingly common due > to either human errors, or topic creation race conditions, etc). Then you > have a few options at hand: > > 1. Just start the app normally, which will only process data from topicA > ...
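A hedged sketch of the application-level pre-flight check Patrik describes (bootstrap address and topic names are assumptions): list the existing topics with the AdminClient and fail fast before building and starting the KafkaStreams instance. Note Guozhang's caveat elsewhere in the thread that with multiple instances such a check can still race with topic creation.

```
import java.util.Arrays;
import java.util.List;
import java.util.Properties;
import java.util.Set;
import org.apache.kafka.clients.admin.AdminClient;

public class TopicPreflightCheck {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");            // assumed broker address
        List<String> required = Arrays.asList("topicA", "topicB");   // the app's source topics

        try (AdminClient admin = AdminClient.create(props)) {
            Set<String> existing = admin.listTopics().names().get();
            for (String topic : required) {
                if (!existing.contains(topic)) {
                    // Fail fast instead of starting against a topic that does not exist yet.
                    throw new IllegalStateException("Missing source topic: " + topic);
                }
            }
        }
        // Only now build the Topology and start the KafkaStreams instance.
    }
}
```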

Re: KafkaStreams backoff for non-existing topic

Hello Murilo, Just to give some more background to John's message and KAFKA-7970 here. The main source of trickiness is the scenario of "topics being partially available", e.g. say your application is joining topics A and B, while topicA exists but topicB does not (it is surprisingly common due to either human errors, or topic creation race conditions, etc). Then you have a few options at hand: 1. Just start the app normally, which will only process data from topicA and none from topicB. When topicB is created later the app will auto-rebalance to get the data (this is guaranteed by Streams itself). However, until then, the join operator would see no data from topicB to join against while proceeding. This behavior is actually the case before Kafka version 2.0 and many users complained about it. 2. Do not start the app at all, notify the users that some topics are missing, and stop. This is what we changed in KAFKA-5037. 3. We can also, l...