
Posts

Showing posts from January, 2020

RE: kafka-run-class.sh kafka.tools.GetOffsetShell in SASL enabled cluster

Team, Could you please help on this? Thanks, Bibin John From: JOHN, BIBIN Sent: Thursday, January 30, 2020 11:59 PM To: users@kafka.apache.org Subject: kafka-run-class.sh kafka.tools.GetOffsetShell in SASL enabled cluster Team, I am getting the below error when trying to use kafka-run-class.sh kafka.tools.GetOffsetShell in a SASL-enabled cluster. I set KAFKA_OPTS with the path to the JAAS file. Could you please help me understand the reason for this? Kafka version: 2.3.0 [2020-01-31 00:55:12,934] WARN [Consumer clientId=GetOffsetShell, groupId=null] Bootstrap broker <hostname>:9092 (id: -1 rack: null) disconnected (org.apache.kafka.clients.NetworkClient) [2020-01-31 00:55:13,223] WARN [Consumer clientId=GetOffsetShell, groupId=null] Bootstrap broker <hostname>:9092 (id: -1 rack: null) disconnected (org.apache.kafka.clients.NetworkClient) [2020-01-31 00:55:13,521] WARN [Consumer clientId=GetOffsetShell, groupId=null] Bootstrap broker <hostname>:9092 (id: -1 rack: nul...

Re: High CPU in 2.2.0 kafka cluster

We were running client version 2.3.0 for a while, then bumped to 2.3.1 for a particular kafka streams bug fix. We saw this issue while both versions were running. Brandon ________________________________ From: Jamie <jamiedd13@aol.co.uk.INVALID> Sent: Thursday, January 30, 2020 1:03 PM To: users@kafka.apache.org < users@kafka.apache.org > Subject: Re: High CPU in 2.2.0 kafka cluster Hi Brandon, Which version of Kafka are the consumers running? My understanding is that if they're running a version lower than the brokers then they could be using a different format for the messages which means the brokers have to convert each record before sending to the consumer. Thanks, Jamie -----Original Message----- From: Brandon Barron < brandon.barron@live.com > To: users@kafka.apache.org < users@kafka.apache.org > Sent: Thu, 30 Jan 2020 16:11 Subject: High CPU in 2.2.0 kafka cluster Hi, We had a small cluster (4 brokers) dealing with ver...

Kafka audit logs in Apache Kafka?

Hi, Does Apache Kafka have audit logs equivalent of https://docs.confluent.io/current/security/audit-logs.html or does only Confluent's Kafka have this? Thanks, Otis -- Monitoring - Log Management - Alerting - Anomaly Detection Solr & Elasticsearch Consulting Support Training - http://sematext.com/

Eclipse Scala IDE with MacOS (Catalina) and OpenJDK11

Hey Dev Group, Apologies if this was meant for the dev group; I thought I would put it out here. It seems that I cannot get the Eclipse Scala IDE to start on MacOS (Catalina). I must be honest that I haven't tried it on a Mac before. Additionally, I am using OpenJDK 11. I have put it inside the /Library/Java/OpenJDK/ subdirectory and confirmed that the Java path and executables are discovered correctly. Any pointers from folks out there who are using the Eclipse Scala IDE with Mac and OpenJDK 11 would be appreciated. If I get it to work, I will probably note this down on Cwiki for all future comms. Regards, M. Manna

Re: Resource based kafka assignor

Hi Srinivas, Your approach sounds fine, as long as you don't need the view of the assignment to be strictly consistent. As a rough approximation, it could work. On the other hand, if you're writing a custom assignor, you could consider using the SubscriptionInfo field of the JoinGroup request to encode arbitrary information from the members to the leader, which it can use in making decisions. So you could encode a custom "node id" there and not have to rely on patterns in the group id. Or you could even just directly encode node load information and use that to influence the assignment. IIRC, there's no explicit "trigger rebalance" command, but you can still make it happen by doing something like unsubscribing and resubscribing again. I hope this helps! John On Thu, Jan 30, 2020, at 09:25, Devaki, Srinivas wrote: > Also, want to clarify one more doubt, > > is there any way for the client to explicitly trigger a rebalance ...
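To make the SubscriptionInfo suggestion above concrete: a minimal sketch, assuming Kafka clients 2.4+, where the public ConsumerPartitionAssignor interface exposes user data (older clients have a similar hook on the internal PartitionAssignor interface). The NODE_ID environment variable is hypothetical.

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Set;
import org.apache.kafka.clients.consumer.RoundRobinAssignor;

// Piggy-backs a node id on each member's JoinGroup subscription so the group
// leader can see which member runs on which machine when assigning partitions.
public class NodeAwareAssignor extends RoundRobinAssignor {

    @Override
    public String name() {
        return "node-aware"; // all members of the group must use the same assignor name
    }

    // Encoded by every member; the leader receives it inside each member's
    // Subscription (via Subscription#userData()) when assign() runs.
    @Override
    public ByteBuffer subscriptionUserData(Set<String> topics) {
        String nodeId = System.getenv().getOrDefault("NODE_ID", "unknown"); // hypothetical env var
        return ByteBuffer.wrap(nodeId.getBytes(StandardCharsets.UTF_8));
    }
}

Overriding assign() would then let the leader group members by node id, or decode per-node load figures, before delegating to the round-robin logic.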

Re: Kafka consumers freeze

Hey Tim On Fri, 31 Jan 2020 at 13:06, Sullivan, Tim < TIM_SULLIVAN@homedepot.com > wrote: > > > Is there a way I can proactively check my consumers to see if > they are consuming? Periodically some or all of my consumers stop > consuming. The only way I am made aware of this is when my down stream > feeds folks alert me that their data isn't flowing into Kafka. My normal > solution is to bump the kafka servers and then they begin to consume. > > > > Any help will be greatly appreciated. > Does using Burrow < https://github.com/linkedin/Burrow > help you? Also, using cruise-control < https://github.com/linkedin/cruise-control > you could dynamically rebalance the workload, if some consumers are not in steady state. Regards, > > > > Tim Sullivan > > Be Well > > > > Sr. Software Engineer > > Supply Chain – IT | Data & Analy...
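If a full Burrow deployment is more than you need, a rough in-house probe can approximate the same check: compare each partition's committed offset against the log end offset, and alert when the lag keeps growing while the log end moves. A minimal sketch, assuming Kafka clients 2.0+ and a hypothetical bootstrap address and group id:

import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;

public class LagCheck {
    public static void main(String[] args) throws Exception {
        Properties adminProps = new Properties();
        adminProps.put("bootstrap.servers", "broker:9092"); // hypothetical

        try (AdminClient admin = AdminClient.create(adminProps)) {
            // Committed offsets for the group under suspicion.
            Map<TopicPartition, OffsetAndMetadata> committed =
                admin.listConsumerGroupOffsets("my-group") // hypothetical group id
                     .partitionsToOffsetAndMetadata().get();

            Properties consProps = new Properties();
            consProps.put("bootstrap.servers", "broker:9092");
            consProps.put("key.deserializer", ByteArrayDeserializer.class.getName());
            consProps.put("value.deserializer", ByteArrayDeserializer.class.getName());
            try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(consProps)) {
                Map<TopicPartition, Long> ends = consumer.endOffsets(committed.keySet());
                committed.forEach((tp, om) ->
                    System.out.printf("%s lag=%d%n", tp, ends.get(tp) - om.offset()));
            }
        }
    }
}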

Re: Kafka consumers freeze

Hi Tim, Can you use the "bytes-consumed-rate" JMX metric, which reports the bytes consumed per topic? What errors are you seeing in the consumer when it freezes? Thanks, Jamie Sent from AOL Mobile Mail Get the new AOL app: mail.mobile.aol.com On Friday, 31 January 2020, Sullivan, Tim < TIM_SULLIVAN@homedepot.com > wrote: ...
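The same rate is also available in-process from the consumer's metrics() map, so an application can watch itself without a JMX scraper. A minimal sketch, assuming the standard consumer fetch metric name "bytes-consumed-rate" (the per-topic variant carries a "topic" tag):

import java.util.Map;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.Metric;
import org.apache.kafka.common.MetricName;

public final class ConsumerRates {
    // Logs each bytes-consumed-rate metric the consumer currently exposes;
    // a rate stuck at 0.0 while the topic is moving suggests a frozen consumer.
    public static void log(KafkaConsumer<?, ?> consumer) {
        for (Map.Entry<MetricName, ? extends Metric> e : consumer.metrics().entrySet()) {
            MetricName name = e.getKey();
            if ("bytes-consumed-rate".equals(name.name())) {
                System.out.printf("%s %s = %s%n",
                    name.group(), name.tags(), e.getValue().metricValue());
            }
        }
    }
}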

Kafka consumers freeze

Is there a way I can proactively check my consumers to see if they are consuming? Periodically some or all of my consumers stop consuming. The only way I am made aware of this is when my downstream feeds folks alert me that their data isn't flowing into Kafka. My normal solution is to bump the Kafka servers and then they begin to consume. Any help will be greatly appreciated. Tim Sullivan Be Well Sr. Software Engineer Supply Chain – IT | Data & Analytics The Home Depot 2250 Newmarket Parkway | Atlanta, GA 30339 Work Cell 470-455-8346 Personal Cell 678-525-2583 Home Phone: 770-945-3315

kafka-run-class.sh kafka.tools.GetOffsetShell in SASL enabled cluster

Team, I am getting the below error when trying to use kafka-run-class.sh kafka.tools.GetOffsetShell in a SASL-enabled cluster. I set KAFKA_OPTS with the path to the JAAS file. Could you please help me understand the reason for this? Kafka version: 2.3.0 [2020-01-31 00:55:12,934] WARN [Consumer clientId=GetOffsetShell, groupId=null] Bootstrap broker <hostname>:9092 (id: -1 rack: null) disconnected (org.apache.kafka.clients.NetworkClient) [2020-01-31 00:55:13,223] WARN [Consumer clientId=GetOffsetShell, groupId=null] Bootstrap broker <hostname>:9092 (id: -1 rack: null) disconnected (org.apache.kafka.clients.NetworkClient) [2020-01-31 00:55:13,521] WARN [Consumer clientId=GetOffsetShell, groupId=null] Bootstrap broker <hostname>:9092 (id: -1 rack: null) disconnected (org.apache.kafka.clients.NetworkClient)
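A likely explanation for the repeated disconnects: the JAAS file set via KAFKA_OPTS only supplies credentials, and GetOffsetShell in this era offers no way to set security.protocol or sasl.mechanism, so its internal consumer speaks PLAINTEXT to a SASL listener and the broker closes the connection. Until the tool accepts client security configs, a plain consumer with the right properties can fetch the same end offsets. A minimal sketch, assuming SASL_PLAINTEXT with the PLAIN mechanism and a hypothetical topic; adjust both to match the cluster:

import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;

public class SaslEndOffsets {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "<hostname>:9092");
        props.put("security.protocol", "SASL_PLAINTEXT"); // assumption: match your listener
        props.put("sasl.mechanism", "PLAIN");             // assumption: match your cluster
        // JAAS can come from KAFKA_OPTS (-Djava.security.auth.login.config=...)
        // or inline via the sasl.jaas.config property.
        props.put("key.deserializer", ByteArrayDeserializer.class.getName());
        props.put("value.deserializer", ByteArrayDeserializer.class.getName());

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            List<TopicPartition> parts =
                Collections.singletonList(new TopicPartition("my-topic", 0)); // hypothetical topic
            Map<TopicPartition, Long> ends = consumer.endOffsets(parts);
            ends.forEach((tp, off) -> System.out.println(tp + " -> " + off));
        }
    }
}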

Re: How to change/increase ISR

Hey Upendra, https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools The above should guide you through the reassignment of partitions/replicas. Also, you should read about offsets.topic.num.partitions and offsets.topic.replication.factor. I hope this helps you. Regards, On Thu, 30 Jan 2020 at 21:48, Upendra Yadav < upendra1024@gmail.com > wrote: > Hi Team, > > Is there a way to change the ISR for existing topics? > I want this for user topics as well as for the __consumer_offsets topic. > > By mistake, the __consumer_offsets topic was configured with replication > factor 1 and 1 ISR. > > kafka broker and client version: 0.10.0.1 > > Thanks, > Upendra >
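For reference, on clusters running 2.4 or newer the same replication-factor change can be driven programmatically through the admin client; on a 0.10.0.1 cluster like this one you are limited to the JSON-plan workflow of kafka-reassign-partitions.sh described on that wiki page. A minimal sketch with hypothetical broker ids (1, 2, 3), repeated for every partition of the topic:

import java.util.Arrays;
import java.util.Map;
import java.util.Optional;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewPartitionReassignment;
import org.apache.kafka.common.TopicPartition;

public class RaiseReplication {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker:9092"); // hypothetical

        try (AdminClient admin = AdminClient.create(props)) {
            // Grow partition 0 of __consumer_offsets from one replica to three.
            admin.alterPartitionReassignments(Map.of(
                new TopicPartition("__consumer_offsets", 0),
                Optional.of(new NewPartitionReassignment(Arrays.asList(1, 2, 3)))
            )).all().get();
        }
    }
}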

Re: High CPU in 2.2.0 kafka cluster

Hi Brandon, Which version of Kafka are the consumers running? My understanding is that if they're running a version lower than the brokers then they could be using a different format for the messages which means the brokers have to convert each record before sending to the consumer. Thanks,  Jamie -----Original Message----- From: Brandon Barron < brandon.barron@live.com > To: users@kafka.apache.org < users@kafka.apache.org > Sent: Thu, 30 Jan 2020 16:11 Subject: High CPU in 2.2.0 kafka cluster Hi, We had a small cluster (4 brokers) dealing with very low throughput - a couple hundred messages per minute at the very most. In that cluster we had a little under 3300 total consumers (all were kafka streams instances). All broker CPUs were maxed out almost consistently for a few weeks. We switched traffic to a new cluster eventually. The old cluster sitting idle for a few days was at ~40% CPU, with consumers still running. When I took down all t...

Re: Kafka streams DSL going into infinite loop

>> I really don't know what TOPOLOGY_OPTIMIZATION is for. If you enable optimization, Kafka Streams tries to generate a more efficient To...
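Since optimization is opt-in and rewrites the topology at build time, a quick way to see what the flag does is to print the plan with and without it. A minimal sketch, assuming a 2.x StreamsConfig where the constant is still named TOPOLOGY_OPTIMIZATION:

import java.util.Properties;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.Topology;

public class ComparePlans {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();
        // ... define the branches and joins here ...

        Properties props = new Properties();
        props.put(StreamsConfig.TOPOLOGY_OPTIMIZATION, StreamsConfig.OPTIMIZE); // "all"
        // default is StreamsConfig.NO_OPTIMIZATION ("none")

        Topology topology = builder.build(props); // pass props so the DSL may rewrite the plan
        System.out.println(topology.describe()); // compare repartition topics with and without
    }
}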

Re: Kafka streams DSL going into infinite loop

The only Streams-specific props I am using are: props.put(StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG, 0); props.put(StreamsConfig.COMMIT_INTERVAL_MS_CONFIG, 1000); props.put(StreamsConfig.TOPOLOGY_OPTIMIZATION, StreamsConfig.OPTIMIZE); Yes, there are three sub-topologies and the key does change between joins. Is this something to do with any of the above configs? I really don't know what TOPOLOGY_OPTIMIZATION is for. The way I start my stream is: final KafkaStreams streams = new KafkaStreams(topology, props); final CountDownLatch latch = new CountDownLatch(1); // attach shutdown handler to catch control-c Runtime.getRuntime().addShutdownHook(new Thread("streams-shutdown-hook") { ... }); try { streams.cleanUp(); streams.start(); latch.await(); } catch (Throwable e) { System.exit(1); } System.exit(0); Is the method used to start the stream OK? There is indeed no loop, so why th...

Re: Kafka streams DSL going into infinite loop

Your program does not seem to contain a loop. Hence, it's unclear to me atm what could be the issue. Does the application commit offset on the ...

Kafka streams DSL going into infinite loop

Hi, I have found that when I define my topology using the Streams DSL, it tends to go into an infinite loop. This usually happens if I start my stream, shut it down, and restart it again after a while (by which time the source topic has moved ahead). Stream processing seems to be stuck in a loop and does not seem to progress. My topology is something like this: source.branch( (k, v) -> ..., (k, v) -> ..., (k, v) -> ..., (k, v) -> ..., (k, v) -> ..., (k, v) -> ..., (k, v) -> ..., (k, v) -> ... ) stream12 = stream1.join(stream2, ...).peek((k, v12) -> log(v12)) stream34 = stream3.join(stream4, ...).peek((k, v34) -> log(v34)) stream56 = stream5.join(stream6, ...).peek((k, v56) -> log(v56)) stream78 = stream7.join(stream8, ...).peek((k, v78) -> log(v78)) stream1234 = stream12.join(stream34, ...).peek((k, v1234) -> log(v1234)) stream123478 = stream1234....

Re: KStreams in-memory state-store

What you say is correct. This is a severe bug rendering standby tasks useless for in-memory state stores. Can you please open a ticket so we can fix i...

High CPU in 2.2.0 kafka cluster

Hi, We had a small cluster (4 brokers) dealing with very low throughput - a couple hundred messages per minute at the very most. In that cluster we had a little under 3300 total consumers (all were kafka streams instances). All broker CPUs were maxed out almost consistently for a few weeks. We switched traffic to a new cluster eventually. The old cluster sitting idle for a few days was at ~40% CPU, with consumers still running. When I took down all the consumers, the idle CPU on the brokers went to about 4%. To test, we decided to mirror active traffic in our new cluster to the old cluster (which now has no running consumers). The CPU didn't budge; it's still at ~4% as expected with the low throughput. One more thing to add: I ran a thread profiler on a couple brokers when the old cluster was taking active traffic with running consumers and the CPU was maxed out. Each time, I saw the ReplicaFetcherThread eating up around 40% of CPU time. Can you give any advice o...

Can't consume data from topic after re-installation kafka

Hello all, As said in the title, after re-installing Kafka on a Cloudera cluster we have a problem consuming data from topics: we can still produce to a topic, but nothing is delivered to the consumers. We think the problem comes from leader election, because the broker logs contain many errors such as: 2020-01-23 14:56:15,467 ERROR state.change.logger: [Controller id=414 epoch=101] Controller 414 epoch 101 failed to change state for partition __consumer_offsets-5 from OfflinePartition to OnlinePartition kafka.common.StateChangeFailedException: Failed to elect leader for partition __consumer_offsets-5 under strategy OfflinePartitionLeaderElectionStrategy at kafka.controller.PartitionStateMachine$$anonfun$doElectLeaderForPartitions$3.apply(PartitionStateMachine.scala:390) at kafka.controller.PartitionStateMachine$$anonfun$doElectLeaderForPartitions$3.apply(PartitionStateMachine.scala:388) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:...

Re: Resource based kafka assignor

Also, want to clarify one more doubt, is there any way for the client to explicitly trigger a rebalance without dying itself? On Thu, Jan 30, 2020 at 7:54 PM Devaki, Srinivas < me@eightnoteight.space > wrote: > Hi All, > > We have a set of logstash consumer groups running under the same set of > instances, we have decided to run separate consumer groups subscribing > multiple topics instead of running single consumer group for all topics(the > reasoning behind this decision is because of how our elasticsearch cluster > is designed). > > Since we are running multiple consumer groups, sometimes we have detected > that a few ec2 nodes are receiving multiple high throughput topics in > different consumer groups. which was expected based on the implementation > of round robin assignor. > > So I've decided to make a partition assignor which will consider the > assignment based on other consumer group assignment. ...

Resource based kafka assignor

Hi All, We have a set of Logstash consumer groups running on the same set of instances; we decided to run separate consumer groups subscribing to multiple topics instead of a single consumer group for all topics (the reasoning behind this decision is how our Elasticsearch cluster is designed). Since we are running multiple consumer groups, we have sometimes seen a few EC2 nodes receiving multiple high-throughput topics in different consumer groups, which is expected given how the round-robin assignor works. So I've decided to write a partition assignor that takes the assignments of other consumer groups into account. Could you please give me some pointers on how to proceed? These are my initial ideas on the problem. Solution #0: write an assignor, use a specific consumer id pattern across all consumer groups, and in the assignor do a describe on all consumer groups and based on the topic throughput...

How to change/increase ISR

Hi Team, Is there a way to change the ISR for existing topics? I want this for user topics as well as for the __consumer_offsets topic. By mistake, the __consumer_offsets topic was configured with replication factor 1 and 1 ISR. Kafka broker and client version: 0.10.0.1 Thanks, Upendra

Re: offset reset on unavailability

unsubscribe On Wed, Jan 29, 2020 at 6:31 PM Sergey Shelukhin <Sergey.Shelukhin@microsoft.com.invalid> wrote: > Hi. > We've run into a situation where Kafka cluster was unstable, but some > brokers were still up and responding. > Some of the consumers restarted at that time and were not able to get > their commit offset. > We run with auto.offset.reset earliest by default, for bootstrap; after > some time, these consumers reset their commit offset to earliest and > started reprocessing a bunch of events. > > We are using Confluent.Kafka client. > Is that an expected behavior? > Is there an option to only reset offset on the positive ack that the > offset is not stored for this consumer? > We'd like the cases when the offset cannot be retrieved due to a transient > condition to result in retries, or at least a failure. >

offset reset on unavailability

Hi. We've run into a situation where Kafka cluster was unstable, but some brokers were still up and responding. Some of the consumers restarted at that time and were not able to get their commit offset. We run with auto.offset.reset earliest by default, for bootstrap; after some time, these consumers reset their commit offset to earliest and started reprocessing a bunch of events. We are using Confluent.Kafka client. Is that an expected behavior? Is there an option to only reset offset on the positive ack that the offset is not stored for this consumer? We'd like the cases when the offset cannot be retrieved due to a transient condition to result in retries, or at least a failure.
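The closest built-in safeguard is auto.offset.reset=none: a consumer that cannot find a committed offset then raises an error instead of silently rewinding, and the application decides what to do. A minimal sketch with the Java client and hypothetical addresses (the Confluent .NET client exposes the equivalent AutoOffsetReset.Error setting):

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.NoOffsetForPartitionException;
import org.apache.kafka.common.serialization.StringDeserializer;

public class NoResetConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker:9092"); // hypothetical
        props.put("group.id", "my-group");             // hypothetical
        props.put("auto.offset.reset", "none");        // fail instead of rewinding
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("my-topic")); // hypothetical
            while (true) {
                try {
                    consumer.poll(Duration.ofSeconds(1))
                            .forEach(r -> System.out.println(r.offset()));
                } catch (NoOffsetForPartitionException e) {
                    // No committed offset was found for a partition: alert and
                    // decide explicitly (retry, seek, or page a human).
                }
            }
        }
    }
}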

Re: Fwd: Jira access

What is your Jira account ID? -Matthias On 1/27/20 3:59 PM, Alexandra Rodoni wrote: > Hi, > Could you grant me the permission to assign J...

Deleted topics come back after restart

Hello, I found that we have around 1000 topics in our dev environment because many devs create new topics for testing. So I cleaned up the old ones, and it was all fine. However, after restarting the cluster during an update of the heartbeat container, the topics are all back again. The broker setting delete.topic.enable is set to true, and the logs show that after the delete command Kafka marked the partitions for deletion and then also deleted the segment list, log, offset index and time index. I ran it like this: ./kafka-topics.sh --bootstrap-server localhost:9092 --topic postman1 --delete How do I get rid of the old topics forever? Thanks Sebastian

Streams Processing meetup on Wednesday, February 5, 2020 at LinkedIn, Sunnyvale

*[bcc: (users,dev)@kafka.apache.org]* Hello, The Streams Infra team invites you to attend the Streams Processing meetup to be held on Wednesday, February 5, 2020. This meetup will focus on Apache Kafka, Apache Samza and related streaming technologies. *Where*: Unify Conference room, 950 W Maude Ave, Sunnyvale *When*: 5:00 - 8:00 PM *RSVP*: Please RSVP here (only if attending in person): https://www.meetup.com/Stream-Processing-Meetup-LinkedIn/events/267283444/ A streaming link will be posted approximately 30 minutes prior to the event. *Agenda:* - 5:00 PM: Doors open and catered food available - 5:00 - 6:00 PM: Networking - 6:00 - 6:30 PM: *High-performance data replication at Salesforce with Mirus* - Paul Davidson, Salesforce: At Salesforce we manage high-volume Apache Kafka clusters in a growing number of data centers around the globe. In the past we relied on Kafka's Mirror Maker tool for cross-data center re...

KStreams in-memory state-store

Hi all, I have a question about Kafka Streams, particularly the in-memory state store (org.apache.kafka.streams.state.internals.InMemoryKeyValueStore). I believe the topology is irrelevant here, but let's say I have one source topic with a single partition feeding data into one stateful processor associated with a single in-memory state store. This results in a topology with a single task. This topology is run in 2 application instances: - First instance (A) runs the task in active mode - Second instance (B) runs the task as standby Our use case is low-latency processing, hence we need to keep rebalance downtime as low as possible (ideally a few hundred milliseconds). Let's say that we kill instance A, which triggers a rebalance and B takes over the processing. We found that, when the task on B transitions from STANDBY into ACTIVE mode, it closes the in-memory state store and effectively throws away any state read from the changelog while it was in STANDBY. No checkpoi...

RE: org.apache.kafka.common.KafkaException: org.apache.kafka.common.serialization.ByteArraySerializer is not an instance of org.apache.kafka.common.serialization.Serializer

Thanks for the reply, looks like half the message got deleted. Much appreciated! Extracts from the pom:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-jpa</artifactId>
    <exclusions>
        <exclusion>
            <artifactId>spring-boot-starter-logging</artifactId>
            <groupId>org.springframework.boot</groupId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-log4j2</artifactId>
</dependency>
<dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-web</artifactId>
</dependency>
<de...

Re: org.apache.kafka.common.KafkaException: org.apache.kafka.common.serialization.ByteArraySerializer is not an instance of org.apache.kafka.common.serialization.Serializer

unsubscribe On Mon, Jan 27, 2020 at 8:51 AM < buksvdl@gmail.com > wrote: > > > Hi, I would appreciate any help on this. > > Thanks a stack! > > Buks > > > > <dependency> > <groupId>org.apache.kafka</groupId> > <artifactId>kafka-clients</artifactId> > <version>${kafka.version}</version> > </dependency> > > > > <kafka.version>2.3.1</kafka.version> > > > > <parent> > <groupId>org.springframework.boot</groupId> > <artifactId>spring-boot-starter-parent</artifactId> > <version>2.2.4.RELEASE</version> > <relativePath/> <!-- lookup parent from repository --> > </parent> > > > > > > Sent from Mail < https://go.microsoft.com/fwlink/?LinkId=550986 > for > Windows 10 > > >

Re: Reg: Guidance for Machine Capacity Plan

Hi, Please help/guide me to identify the size of the cluster. Thanks and Regards, Gowtham S. On Fri, 24 Jan 2020 at 14:31, Gowtham S < gowtham.co.inc@gmail.com > wrote: > Hi, > > We are in the process of deploying Kafka in our service. We need to decide > on machine capacity, and we arrived at the formula below for deriving > total machine capacity: > > Total broker storage = message bytes per second * retention period (seconds) * > replication factor > > Do I need to consider the topic index files in the calculation? Please > help/guide me if I am missing any parameter required in the formula. > > Index file calculation (Reference > < https://issues.apache.org/jira/browse/KAFKA-3300 >) > > Currently, the initial/max size of the offset index file is configured by log.index.max.bytes. > This will be the offset index file size for the active log segment until it > rolls out. > > Theoretically,...
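As a worked example with assumed numbers: at 2 MB/s of incoming message data, 7 days (604,800 s) of retention and replication factor 3, the formula gives 2 MB/s × 604,800 s × 3 ≈ 3.6 TB of log data across the cluster. Index files add comparatively little on top of that (with the default log.index.max.bytes of 10 MB against the default 1 GB segments, on the order of 1-2%), but it is prudent to leave separate headroom for OS page cache, temporary throughput spikes, and partition reassignments rather than sizing disks to the formula exactly.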

Re: org.apache.kafka.common.KafkaException: org.apache.kafka.common.serialization.ByteArraySerializer is not an instance of org.apache.kafka.common.serialization.Serializer

Hey Buks, On Mon, 27 Jan 2020 at 07:51, < buksvdl@gmail.com > wrote: > > > Hi, I would appreciate any help on this. > > Thanks a stack! > > Buks > > > > <dependency> > <groupId>org.apache.kafka</groupId> > <artifactId>kafka-clients</artifactId> > <version>${kafka.version}</version> > </dependency> > > > > <kafka.version>2.3.1</kafka.version> > > > > <parent> > <groupId>org.springframework.boot</groupId> > <artifactId>spring-boot-starter-parent</artifactId> > <version>2.2.4.RELEASE</version> > <relativePath/> <!-- lookup parent from repository --> > </parent> > > > > > > Sent from Mail < https://go.microsoft.com/fwlink/?LinkId=550986 > for > Windows 10 > > > Seems like you...

org.apache.kafka.common.KafkaException: org.apache.kafka.common.serialization.ByteArraySerializer is not an instance of org.apache.kafka.common.serialization.Serializer

Hi, I would appreciate any help on this. Thanks a stack! Buks

<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka-clients</artifactId>
    <version>${kafka.version}</version>
</dependency>

<kafka.version>2.3.1</kafka.version>

<parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>2.2.4.RELEASE</version>
    <relativePath/> <!-- lookup parent from repository -->
</parent>

Sent from Mail for Windows 10

Re: min.insync.replicas and producer acks

Pushkar, On Sat, 25 Jan 2020 at 11:19, Pushkar Deole < pdeole2015@gmail.com > wrote: > Thank you for a quick response. > > What would happen if I set the producer acks to be 'one' and > min.insync.replicas to 2. In this case the producer will return when only > leader received the message but will not wait for other replicas to receive > the message. In this case, how min.insync.replicas of 2 will be guaranteed > by kafka? > To be fair, this is documented across various areas on official channels, confluence pages, and confluent websites. We suggest that you explore them first to understand where the confusion is. If you believe something is incorrectly documented, or rather ambiguous, we will be happy to explain. Regards, > On Sat, Jan 25, 2020 at 12:50 PM Boyang Chen < reluctanthero104@gmail.com > > wrote: > > > Hey Pushkar, > > > > producer ack only has 3 options: none, on...

Re: min.insync.replicas and producer acks

I mean, the producer acks to be 'none' On Sat, Jan 25, 2020 at 4:49 PM Pushkar Deole < pdeole2015@gmail.com > wrote: > Thank you for a quick response. > > What would happen if I set the producer acks to be 'one' and > min.insync.replicas to 2. In this case the producer will return when only > leader received the message but will not wait for other replicas to receive > the message. In this case, how min.insync.replicas of 2 will be guaranteed > by kafka? > > On Sat, Jan 25, 2020 at 12:50 PM Boyang Chen < reluctanthero104@gmail.com > > wrote: > >> Hey Pushkar, >> >> producer ack only has 3 options: none, one, or all. You could not nominate >> an arbitrary number. >> >> On Fri, Jan 24, 2020 at 7:53 PM Pushkar Deole < pdeole2015@gmail.com > >> wrote: >> >> > Hi All, >> > >> > I am a bit confused about min.insync.replicas a...

Re: min.insync.replicas and producer acks

Thank you for a quick response. What would happen if I set the producer acks to be 'one' and min.insync.replicas to 2. In this case the producer will return when only leader received the message but will not wait for other replicas to receive the message. In this case, how min.insync.replicas of 2 will be guaranteed by kafka? On Sat, Jan 25, 2020 at 12:50 PM Boyang Chen < reluctanthero104@gmail.com > wrote: > Hey Pushkar, > > producer ack only has 3 options: none, one, or all. You could not nominate > an arbitrary number. > > On Fri, Jan 24, 2020 at 7:53 PM Pushkar Deole < pdeole2015@gmail.com > > wrote: > > > Hi All, > > > > I am a bit confused about min.insync.replicas and producer acks. Are > these > > two configurations achieve the same thing? e.g. if I set > > min.insync.replicas to 2, I can also achieve it by setting producer acks > to > > 2 so the producer won't...

Re: min.insync.replicas and producer acks

Hey Pushkar, producer ack only has 3 options: none, one, or all. You could not nominate an arbitrary number. On Fri, Jan 24, 2020 at 7:53 PM Pushkar Deole < pdeole2015@gmail.com > wrote: > Hi All, > > I am a bit confused about min.insync.replicas and producer acks. Are these > two configurations achieve the same thing? e.g. if I set > min.insync.replicas to 2, I can also achieve it by setting producer acks to > 2 so the producer won't get a ack until 2 replicas received the message? >
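To make the relationship concrete: acks is a producer setting with three values (0, 1, all), while min.insync.replicas is a topic/broker setting that only has teeth under acks=all, where it sets a floor on how far the in-sync replica set may shrink before writes are rejected. A minimal sketch with hypothetical addresses:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class SafeProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092"); // hypothetical
        props.put(ProducerConfig.ACKS_CONFIG, "all"); // wait for the whole current ISR
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        // min.insync.replicas=2 is set on the topic (or broker), not here.
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("my-topic", "key", "value")); // hypothetical topic
        }
    }
}

With min.insync.replicas=2 on the topic, a send under acks=all fails with NotEnoughReplicasException once fewer than 2 replicas are in sync, rather than silently losing the durability guarantee.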

min.insync.replicas and producer acks

Hi All, I am a bit confused about min.insync.replicas and producer acks. Do these two configurations achieve the same thing? E.g., if I set min.insync.replicas to 2, can I also achieve that by setting producer acks to 2, so the producer won't get an ack until 2 replicas have received the message?

Re: MirrorMaker 2 Plugin class loader Error

Hi Ryanne, I'm wondering if I'm doing something wrong here. I'm using 2.4.0 now but seeing the same behavior. I'm still seeing the Plugin errors ("ERROR Plugin class loader for connector: 'org.apache.kafka.connect.mirror.MirrorSourceConnector'"). The internal topics are created on both the source and destination Kafka clusters, but the user topics are not, and there is no message replication happening. I'm using the connect-mirror-maker.sh dedicated MirrorMaker cluster script with the properties file in https://github.com/apache/kafka/blob/trunk/config/connect-mirror-maker.properties . Any pointers? Is anyone else seeing this behavior? Regards, Rajeev On Tuesday, November 12, 2019, 10:51:24 AM EST, Vishal Santoshi < vishal.santoshi@gmail.com > wrote: +1 On Mon, Nov 11, 2019 at 2:07 PM Ryanne Dolan < ryannedolan@gmail.com > wrote: > Rajeev, the config errors are unavoidable at present and can be ignored or ...