

Showing posts from November, 2019

Deleting topics in Windows - fix estimate or workaround

Hi All, I hope we are all aware of the critical bug on Windows where Kafka crashes when we delete a topic. This affects other areas too, such as stream processing, when trying to reset a stream using StreamsResetter. Right now the only workaround I have found is to stop ZooKeeper and the Kafka server and manually delete the directories and files containing stream, topic and offset information. However, doing this every time is unproductive. I see that there are multiple critical bugs logged in JIRA around this issue: https://issues.apache.org/jira/browse/KAFKA-6203 and https://issues.apache.org/jira/browse/KAFKA-1194 . I would like to know when the fix will be available. I see that multiple pull requests have been issued around fixes for these issues. I wanted to know whether one or more pull requests need to be merged to get the fix out, or whether there is something I can try config-wise as a workaround for this issue. Please note that there can be few...
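
For reference, a minimal sketch of deleting a topic with the Java AdminClient, the operation that hits this bug on a Windows broker; the topic name and bootstrap address are placeholders:

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;

    public class DeleteTopicExample {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            try (AdminClient admin = AdminClient.create(props)) {
                // On Windows, the broker-side file deletion this triggers is what
                // can crash the broker (see KAFKA-1194); on Linux it completes normally.
                admin.deleteTopics(Collections.singletonList("my-topic")).all().get();
            }
        }
    }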

RE: More partitions => less throughput?

What is happening, IMHO, is that when you have multiple partitions, each consumer fetches from its own partition and finds only 1/64th the amount of data (compared to the single-partition case) each time it is its turn. You therefore end up with a chattier situation, where each request to the broker carries too few messages, compared to the single-partition case, where each batch sent to the broker contains a higher message count. Eric -----Original Message----- From: Craig Pastro < siyopao@gmail.com > Sent: Thursday, November 28, 2019 9:10 PM To: users@kafka.apache.org Subject: More partitions => less throughput? External Hello there, I was wondering if anyone here could help me with some insight into a conundrum that I am facing. Basically, the story is that I am running three Kafka brokers via Docker on a single VM with log.flush.interval.messages = 1 and min.insync.replicas = ...
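
If the problem really is many small requests, producer batching is the usual lever; a hedged sketch (the linger and batch-size values are illustrative, not recommendations):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class BatchingProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            // Wait up to 10 ms to fill batches instead of sending many tiny requests.
            props.put(ProducerConfig.LINGER_MS_CONFIG, 10);
            // Allow up to 64 KB per partition batch (the default is 16 KB).
            props.put(ProducerConfig.BATCH_SIZE_CONFIG, 64 * 1024);
            KafkaProducer<String, String> producer = new KafkaProducer<>(props);
            // ... send records here, then close cleanly.
            producer.close();
        }
    }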

[VOTE] 2.4.0 RC2

Hello Kafka users, developers and client-developers, This is the third candidate for the release of Apache Kafka 2.4.0. This release includes many new features, including: - Allow consumers to fetch from the closest replica - Support for incremental cooperative rebalancing in the consumer rebalance protocol - MirrorMaker 2.0 (MM2), a new multi-cluster, cross-datacenter replication engine - New Java Authorizer interface - Support for non-key joins in KTable - Administrative API for replica reassignment - Sticky partitioner - Return topic metadata and configs in the CreateTopics response - Securing internal Connect REST endpoints - API to delete consumer offsets, exposed via the AdminClient Release notes for the 2.4.0 release: https://home.apache.org/~manikumar/kafka-2.4.0-rc2/RELEASE_NOTES.html *** Please download, test and vote by Thursday, December 5th, 9am PT Kafka's KEYS file containing the PGP keys we use to sign the release: https://kafka.apache.org/KEYS...

Re: More partitions => less throughput?

Testing multiple broker VMs on a single host won't give you accurate performance numbers unless that is how you will be deploying Kafka in production. (Don't do this.) All your Kafka networking is being handled by a single host, so instead of being spread out between machines to increase total possible throughput, the brokers are competing with each other. Given that this is the test environment you settled on, you should tune the number of partitions, taking the number of producers and consumers, and also the average message size, into account. If you have only one producer, then a single consumer should be sufficient to read the data in real time. If you have multiple producers, you may need to scale up the consumer count and use consumer groups. -- Peter > On Nov 30, 2019, at 8:57 AM, Tom Brown < tombrown52@gmail.com > wrote: > > I think the number of partitions needs to be tuned to the size of the > cluster; 64 partitions on what is essentially a single ...
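
For the multiple-producer case above, scaling consumption just means starting more instances that share a group.id; a minimal sketch (topic and group names are placeholders):

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class GroupConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            // Every instance started with the same group.id splits the topic's
            // partitions between them; add instances to scale consumption.
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "benchmark-group");
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("benchmark-topic"));
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                    for (ConsumerRecord<String, String> r : records) {
                        System.out.printf("partition=%d offset=%d%n", r.partition(), r.offset());
                    }
                }
            }
        }
    }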

Re: More partitions => less throughput?

I think the number of partitions needs to be tuned to the size of the cluster; 64 partitions on what is essentially a single box seems high. Do you know what hardware you will be deploying on in production? Can you run your benchmark on that instead of a VM? —Tom On Thursday, November 28, 2019, Craig Pastro < siyopao@gmail.com > wrote: > Hello there, > > I was wondering if anyone here could help me with some insight into a > conundrum that I am facing. > > Basically, the story is that I am running three Kafka brokers via Docker on > a single VM with log.flush.interval.messages = 1 and min.insync.replicas = > 2. Then I create two topics: both with replication factor = 3, but one with > one partition and the other with 64. > > Then I try to run a benchmark using these topics and what I find is as > follows: > > 1 partition, 1381.02 records/sec, 685.87 ms average latency > 64 partitions, 601.00 records/sec, 1298...

Re: [VOTE] 2.4.0 RC1

Hi All, We will consider KAFKA-9244 < https://issues.apache.org/jira/browse/KAFKA-9244 > as a blocker and include the fix in the 2.4 release. I am canceling this VOTE and will create a third release candidate. Thank you all for testing. On Fri, Nov 29, 2019 at 10:52 AM Matthias J. Sax < matthias@confluent.io > wrote: > I did not find the bug -- it was reported by Kin Sui > ( https://issues.apache.org/jira/browse/KAFKA-9244 ) > > Whether the bug is a blocker is a judgment call though, because it's > technically not a regression. However, if we don't include the fix in > 2.4.0, as Adam pointed out, the new foreign-key join would compute > incorrect results, and thus it's at least a critical issue. > > > -Matthias > > > > On 11/28/19 11:48 AM, Adam Bellemare wrote: > > mjsax found an important issue for the foreign-key joiner, which I think > > should be a blocker (if it isn't already) ...

Re: Kafka Streams or Samza

Hi On Fri, 29 Nov 2019 at 11:11, Roberts Roth <roberts_roth@yahoo.com.invalid> wrote: > Hello > > I was confused: for realtime streams, shall we use Kafka or Samza? > > We have deployed a large-scale Kafka cluster in our production > environment. Shall we re-use Kafka's streaming feature, or deploy a new > cluster of Samza? > > Thanks for your suggestion > This is too broad for discussion (at least in my opinion). I think you need to consider the business impact and development cost of managing Kafka vs. Samza. It's not about which one the whitepapers say is best, but about your needs and adaptability. Also, future support is an important question. Have you had such a discussion amongst your teams? Thanks, > > regards. > Roberts >

Kafka Streams or Samza

Hello, I was confused: for realtime streams, shall we use Kafka or Samza? We have deployed a large-scale Kafka cluster in our production environment. Shall we re-use Kafka's streaming feature, or deploy a new cluster of Samza? Thanks for your suggestions. Regards, Roberts
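
For context, "re-using Kafka's streaming feature" means embedding a Kafka Streams application in an ordinary JVM process, with no separate processing cluster; a minimal sketch (topic names and the transformation are placeholders):

    import java.util.Properties;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.KStream;

    public class StreamsApp {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
            props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

            StreamsBuilder builder = new StreamsBuilder();
            KStream<String, String> input = builder.stream("input-topic");
            // Unlike Samza, this runs inside your own process; the existing
            // Kafka cluster does all the heavy lifting.
            input.mapValues(v -> v.toUpperCase()).to("output-topic");

            new KafkaStreams(builder.build(), props).start();
        }
    }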

Re: [VOTE] 2.4.0 RC1

I did not find the bug -- it was reported by Kin Sui ( https://issues.apache.org/jira/browse/KAFKA-9244 ) Whether the bug is a blocker is a judgment cal...

More partitions => less throughput?

Hello there, I was wondering if anyone here could help me with some insight into a conundrum that I am facing. Basically, the story is that I am running three Kafka brokers via Docker on a single VM with log.flush.interval.messages = 1 and min.insync.replicas = 2. Then I create two topics: both with replication factor = 3, but one with one partition and the other with 64. Then I run a benchmark using these topics, and what I find is as follows: 1 partition, 1381.02 records/sec, 685.87 ms average latency; 64 partitions, 601.00 records/sec, 1298.18 ms average latency. This is the opposite of what I expected. In neither case am I even close to the IOPS the disk can handle. So what I would like to know is whether there is any obvious reason, that I am missing, for the slowdown with more partitions. If it is helpful, the docker-compose file and the benchmarking code can be found at https://github.com/siyopao/kafka-benchmark . (Any comments or adv...
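
Not the poster's actual benchmark (that is in the linked repo), but a minimal sketch of the kind of send loop such a benchmark runs; topic name, record size and count are placeholders:

    import java.util.Properties;
    import java.util.concurrent.TimeUnit;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.ByteArraySerializer;

    public class ProducerBench {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class.getName());
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class.getName());
            // acks=all matters here: min.insync.replicas=2 makes every write
            // wait for a second replica before it is acknowledged.
            props.put(ProducerConfig.ACKS_CONFIG, "all");

            byte[] payload = new byte[1024]; // 1 KB dummy records
            int count = 100_000;
            try (KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props)) {
                long start = System.nanoTime();
                for (int i = 0; i < count; i++) {
                    producer.send(new ProducerRecord<>("benchmark-topic", payload));
                }
                producer.flush(); // wait for all in-flight sends before timing stops
                double secs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start) / 1000.0;
                System.out.printf("%.2f records/sec%n", count / secs);
            }
        }
    }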

Re: [VOTE] 2.4.0 RC1

mjsax found an important issue for the foreign-key joiner, which I think should be a blocker (if it isn't already) since it is functionally incorrect without the fix: https://github.com/apache/kafka/pull/7758 On Tue, Nov 26, 2019 at 6:26 PM Sean Glover < sean.glover@lightbend.com > wrote: > Hi, > > I also used Eric's test script. I had a few issues running it that I > address below[0][1], otherwise looks good. > > - Signing keys all good > - All md5, sha1sums and sha512sums are good > - A couple transient test failures that passed on a second run > (ReassignPartitionsClusterTest.shouldMoveSinglePartitionWithinBroker, > SaslScramSslEndToEndAuthorizationTest. > testNoDescribeProduceOrConsumeWithoutTopicDescribeAcl) > - Passes our own test suite for Alpakka Kafka ( > https://travis-ci.org/akka/alpakka-kafka/builds/616861540 , > https://github.com/akka/alpakka-kafka/pull/971 ) > > +1 (non-binding...

Re: Producer slowness and Exception on broker restart - kafka.common.KafkaException: Unknown group metadata version 1

Hi, We also did a version upgrade and then downgraded to the old version due to some issue, but that was done about a year ago. Do I need to follow this to solve this issue? http://mail-archives.apache.org/mod_mbox/kafka-dev/201907.mbox/%3CJIRA.13246826.1563975667000.40318.1563975720170@Atlassian.JIRA%3E In it, the broken records are evicted by temporarily setting the segment time to a very low value and deactivating compaction: /opt/kafka/bin/kafka-topics.sh --alter --config segment.ms=900000 --topic __consumer_offsets --zookeeper localhost:2181 /opt/kafka/bin/kafka-topics.sh --alter --config cleanup.policy=delete --topic __consumer_offsets --zookeeper localhost:2181 < wait for the cleaner to clean up > /opt/kafka/bin/kafka-topics.sh --alter --config segment.ms=604800000 --topic __consumer_offsets --zookeeper localhost:2181 /opt/kafka/bin/kafka-topics.sh --alter --config cleanup.policy=compact --topic __consumer_offsets --zookeeper localhos...
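
As an aside, on brokers recent enough to support it (2.3+), the same temporary config changes can be made through the Java AdminClient instead of the ZooKeeper-based tooling; a hedged sketch, assuming the same segment.ms value as above:

    import java.util.Arrays;
    import java.util.Collection;
    import java.util.Collections;
    import java.util.Map;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.AlterConfigOp;
    import org.apache.kafka.clients.admin.ConfigEntry;
    import org.apache.kafka.common.config.ConfigResource;

    public class AlterTopicConfig {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            try (AdminClient admin = AdminClient.create(props)) {
                ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "__consumer_offsets");
                // Temporarily shrink segments and switch to delete; remember to
                // restore the original values afterwards, as in the shell steps.
                Map<ConfigResource, Collection<AlterConfigOp>> updates = Collections.singletonMap(
                    topic,
                    Arrays.asList(
                        new AlterConfigOp(new ConfigEntry("segment.ms", "900000"), AlterConfigOp.OpType.SET),
                        new AlterConfigOp(new ConfigEntry("cleanup.policy", "delete"), AlterConfigOp.OpType.SET)));
                admin.incrementalAlterConfigs(updates).all().get();
            }
        }
    }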

Re: Attempt to prove Kafka transactions work

On Wed, Nov 20, 2019 at 6:35 PM Edward Capriolo < edlinuxguru@gmail.com > wrote: > > > On Wednesday, November 20, 2019, Matthias J. Sax < matthias@confluent.io > > wrote: > >> I am not sure what Spring does, but using Kafka Streams, writing the >> output and committing the offset would be part of the same transaction. >> >> It seems Spring is doing something else, and thus it seems it does not >> use the EOS API correctly. >> >> If you use transactions to copy data from an input to an output topic, >> committing offsets must be done on the producer as part of the >> transaction; the consumer would not commit offsets. >> >> To me, it seems that Spring is committing offsets using the consumer, >> independently of whether the transaction was successful or not. This would be an >> incorrect usage of the API. >> >> >> -Matthias >> >> On 11/20/19 6:16 AM, ...
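
A minimal sketch of the correct pattern Matthias describes, where the producer, not the consumer, commits the offsets inside the transaction; topic, group and transactional IDs are placeholders:

    import java.time.Duration;
    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.consumer.OffsetAndMetadata;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.TopicPartition;
    import org.apache.kafka.common.serialization.StringDeserializer;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class ExactlyOnceCopy {
        public static void main(String[] args) {
            Properties cp = new Properties();
            cp.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            cp.put(ConsumerConfig.GROUP_ID_CONFIG, "copy-group");
            cp.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false"); // offsets ride in the txn
            cp.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed");
            cp.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            cp.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

            Properties pp = new Properties();
            pp.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            pp.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "copy-app-1");
            pp.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            pp.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(cp);
                 KafkaProducer<String, String> producer = new KafkaProducer<>(pp)) {
                consumer.subscribe(Collections.singletonList("input-topic"));
                producer.initTransactions();
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                    if (records.isEmpty()) continue;
                    producer.beginTransaction();
                    Map<TopicPartition, OffsetAndMetadata> offsets = new HashMap<>();
                    for (ConsumerRecord<String, String> r : records) {
                        producer.send(new ProducerRecord<>("output-topic", r.key(), r.value()));
                        offsets.put(new TopicPartition(r.topic(), r.partition()),
                                    new OffsetAndMetadata(r.offset() + 1));
                    }
                    // Offsets are committed by the PRODUCER, inside the transaction,
                    // never by the consumer -- the point made above.
                    producer.sendOffsetsToTransaction(offsets, "copy-group");
                    producer.commitTransaction();
                }
            }
        }
    }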

Re: BufferOverflowException on rolling new segment after upgrading Kafka from 1.1.0 to 2.3.1

Bumping this up with a new update: I've investigated another occurrence of this exception. For the analysis, I used: 1) a memory dump taken from the broker, 2) the Kafka log file, 3) the Kafka state-change log, 4) the log, index and time-index files of a failed segment, 5) the Kafka source code, versions 2.3.1 and 1.1.0. Here's what the exception looks like in the Kafka log: 2019/11/19 16:03:00 INFO [ProducerStateManager partition=ad_group_metrics-62] Writing producer snapshot at offset 13886052 (kafka.log.ProducerStateManager) 2019/11/19 16:03:00 INFO [Log partition=ad_group_metrics-62, dir=/mnt/kafka] Rolled new log segment at offset 13886052 in 1 ms. (kafka.log.Log) 2019/11/19 16:03:00 ERROR [ReplicaManager broker=17] Error processing append operation on partition ad_group_metrics-62 (kafka.server.ReplicaManager) 2019/11/19 16:03:00 java.nio.BufferOverflowException 2019/11/19 16:03:00 at java.nio.Buffer.nextPutIndex(Buffer.java:527) 2019/11/19 16:03:00 at java.nio.Dire...

Re: Transactional Producer

Hi! I think we need to step back a little bit and understand what it is you > are trying to achieve; that will be beneficial for giving you an accurate > answer. > Sure, I'm working on a pet project that is a simple key-value database replicated over Kafka. I have already implemented simple atomic updates like putIfAbsent, but now I want to support transactional updates of multiple keys. Thus, I'm trying to understand the limitations of Kafka transactions and how to correctly apply them to the task. > What order can I expect for these published messages? > - This depends on different factors, like linger, batch size, buffers, etc., > and even the network latency. > Obviously, I'm not much interested in the case where the batch size and linger are both huge, everything is batched together, and transactions are published one after another. I'm looking at the extreme case where there is no batching at all and records were ...
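
For the multi-key update case described here, a minimal sketch of the producer side, assuming a compacted topic named kv-store (all names are hypothetical):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.KafkaException;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class MultiKeyTxn {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "kv-writer-1");
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            KafkaProducer<String, String> producer = new KafkaProducer<>(props);
            producer.initTransactions();
            try {
                producer.beginTransaction();
                // Both updates become visible atomically to read_committed
                // consumers; within one partition they keep their send order.
                producer.send(new ProducerRecord<>("kv-store", "key-a", "value-1"));
                producer.send(new ProducerRecord<>("kv-store", "key-b", "value-2"));
                producer.commitTransaction();
            } catch (KafkaException e) {
                // Abort on recoverable errors; fatal ones (e.g. a fenced
                // producer) require closing the producer instead.
                producer.abortTransaction();
            } finally {
                producer.close();
            }
        }
    }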

Re: best config for Kafka 0.10.0.1 consumer.assign

I have added this to my consumer config, and now it works fine: receive.buffer.bytes=1048576 On Wed, Nov 13, 2019 at 10:41 AM Upendra Yadav < upendra1024@gmail.com > wrote: > Hi, > > I'm using the consumer assign method with a 15000 ms poll timeout to > consume single-partition data from another DC. > > Below are my consumer configs: > enable.auto.commit=false > max.poll.records=4000 > max.partition.fetch.bytes=4096000 > key.deserializer=org.apache.kafka.common.serialization.ByteArrayDeserializer > value.deserializer=org.apache.kafka.common.serialization.ByteArrayDeserializer > > With this, my consumer works fine, but when I change > max.partition.fetch.bytes to 16384000, my consumer does not receive any > messages; there is no exception. If I'm using consumer assign, do I need to tune > the properties below: > fetch.max.bytes > session.timeout.ms > heartbeat.interval.m...
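
For reference, a minimal sketch of an assign()-based consumer with these fetch settings, written against a modern Java client API rather than the 0.10 client from the thread; addresses, topic name and values are placeholders:

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;
    import org.apache.kafka.common.serialization.ByteArrayDeserializer;

    public class AssignedConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "remote-dc:9092");
            props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
            props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 4000);
            // Per-partition fetch cap; fetch.max.bytes caps the whole response,
            // and receive.buffer.bytes sizes the TCP buffer, which matters most
            // on high-latency cross-DC links (the fix reported above).
            props.put(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, 4_096_000);
            props.put(ConsumerConfig.RECEIVE_BUFFER_CONFIG, 1_048_576);
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, ByteArrayDeserializer.class.getName());
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, ByteArrayDeserializer.class.getName());
            try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
                // assign() pins one partition with no group coordination, so
                // session.timeout.ms / heartbeat.interval.ms do not apply here.
                consumer.assign(Collections.singletonList(new TopicPartition("my-topic", 0)));
                ConsumerRecords<byte[], byte[]> records = consumer.poll(Duration.ofMillis(15000));
                System.out.println("fetched " + records.count() + " records");
            }
        }
    }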