

Showing posts from November, 2020

Large messages max.message.bytes > message.max.bytes

Hi all! I have a question regarding configuration for large messages. I understand that Kafka has broker- and topic-level settings for maximum message size: message.max.bytes (broker config) and max.message.bytes (topic config). Can max.message.bytes for a topic be larger than the server default message.max.bytes? The problem I am facing is that now and then there are messages that hit the 1 MB wall even when compressed. It is rare that the messages get larger, but there is also no theoretical maximum size for them. I am not allowed to tamper with the general setting, since it could have performance implications for all other users, but what could the performance implications be if max.message.bytes is considerably higher on one topic? Best Regards, Sakke F
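For reference, topic-level overrides of this kind are applied through the Admin API (or kafka-configs.sh). Below is a minimal sketch, assuming a hypothetical topic name, broker address, and example size; note that producers (max.request.size) and consumers (max.partition.fetch.bytes / fetch.max.bytes) have their own limits that would also need raising for larger messages.

```java
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;
import java.util.Collections;
import java.util.Properties;

public class RaiseTopicMessageSize {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // hypothetical broker address
        try (Admin admin = Admin.create(props)) {
            // Hypothetical topic that should accept larger records
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "large-events");
            AlterConfigOp raiseLimit = new AlterConfigOp(
                    new ConfigEntry("max.message.bytes", "5242880"), // 5 MB -- example value only
                    AlterConfigOp.OpType.SET);
            admin.incrementalAlterConfigs(
                    Collections.singletonMap(topic, Collections.singletonList(raiseLimit)))
                 .all().get();
        }
    }
}
```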

Re: Reading all messages from a Kafka topic for a state

Well, your KTable (by default) will be stored on local disk and thus a replay is not necessary. In case you lose your local state, Kafka Streams will first restore the state before resuming processing. -Matthias On 11/30/20 6:20 AM, Tomer Cohen wrote: > Thanks Matthias for the detailed and helpful explanation > > Is there a way that I can ensure that the KTable will be read from the > beginning of the topic on every restart of the application without having > to generate a new " application.id "? > > Thanks, > > Tomer >
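To illustrate the behaviour Matthias describes: a KTable built from a topic is materialized into a local state store, and on restart under the same application.id Kafka Streams reuses that store, restoring it first if it has been lost. A minimal sketch, with hypothetical topic, store, and broker names:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.KeyValueStore;
import java.util.Properties;

public class KTableStateSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "ktable-state-app"); // same id => existing state is reused
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // hypothetical
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        // Materialized into a local RocksDB store; on restart the store is restored
        // (if needed) before processing resumes, so no manual re-read is required.
        KTable<String, String> table = builder.table(
                "state-topic",
                Materialized.<String, String, KeyValueStore<Bytes, byte[]>>as("state-store"));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```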

Re: Large messages max.message.bytes > message.max.bytes

Topic limits can't override broker limits. On Tue, Dec 1, 2020 at 6:42 PM Sakke Ferraris < sakke@ferrasys.com > wrote: > Hi all! > > I have questions regarding configuration for large messages. I understand > that kafka has settings for broker and topic for message max sizes; > message.max.bytes (broker config) and max.message.bytes (topic config). > > I wonder if max.message.bytes for a topic can be larger than the server > default message.max.bytes? The problem I am facing is that now and then > there are messages that hit the 1MB wall even compressed. It is rare that > the messages get larger, but also there is no theoretical max size for the > messages. > > I am not allowed to tamper with the general setting since it could have > performance implications for all other users, but what could the > performance implications be if the max.message.bytes is considerably higher > on one topic? > > Best R...

Re: [VOTE] 2.7.0 RC3

Thanks for the vote, Gwen. Here's an update for Jenkins build * Successful Jenkins builds for the 2.7 branches: Unit/integration tests: https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka-2.7-jdk8/detail/kafka-2.7-jdk8/63/ On Sun, Nov 29, 2020 at 2:20 AM Gwen Shapira < gwen@confluent.io > wrote: > +1 (binding) - assuming we get a successful Jenkins build for the branch. > > I built from sources, tested resulting binaries locally, verified > signature and checksums. > > Thank you for the release, Bill. > > On Wed, Nov 25, 2020 at 7:31 AM Bill Bejeck < bbejeck@gmail.com > wrote: > > > > This is the fourth candidate for the release of Apache Kafka 2.7.0. > > > > This is a major release that includes many new features, including: > > > > * Configurable TCP connection timeout and improve the initial metadata > fetch > > * Enforce broker-wide and per-listener c...

Regarding framing producer rate in-terms of software as well as hardware configurations

Team, *Use-case:* *IMAP*. I have an application in which an org has users who use IMAP to send mails, and the mail contents are produced to Kafka. The scaling factors here are: 1. orgs can grow from 1 to a million 2. users can grow from 1 to a million. For this use-case, I need to calculate the producer rate and broker response rate for a single machine. So far we have identified the factors involved in producer rate: 1. Message size 2. Request size 3. Request rate overhead 4. Request latency 5. Round Trip Time 6. Number of Sender Threads 7. Number of Processor Threads at Broker 8. Replication factor. Variables identified at the network layer, kernel, and NIC: 1. sysctl_wmem 2. Tx queues 3. Ring Buffer 4. Driver Queue 5. NAPI Polling. Observations made so far: 1. SocketChannel is the entry point for sending data at the application level. 2. sendfile() system ...
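As a complement to deriving the rate from those factors, it can help to measure it directly; Kafka also ships kafka-producer-perf-test.sh for exactly this. Below is a rough probe sketch exercising the batching-related knobs, assuming a hypothetical broker address, topic name, and example message size/count:

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.ByteArraySerializer;
import java.util.Properties;

public class ProducerRateProbe {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");   // hypothetical
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class);
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, 64 * 1024); // batching knob (message/request size factor)
        props.put(ProducerConfig.LINGER_MS_CONFIG, 10);         // trade a little latency for larger batches
        props.put(ProducerConfig.ACKS_CONFIG, "all");           // interacts with the replication factor

        byte[] payload = new byte[1024]; // 1 KB message -- example size only
        int count = 100_000;
        long start = System.nanoTime();
        try (KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props)) {
            for (int i = 0; i < count; i++) {
                producer.send(new ProducerRecord<>("mail-contents", payload)); // hypothetical topic
            }
            producer.flush();
        }
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.printf("%.0f msgs/s, %.2f MB/s%n",
                count / seconds, count * payload.length / seconds / 1e6);
    }
}
```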

Re: [VOTE] 2.7.0 RC3

+1 (binding) - assuming we get a successful Jenkins build for the branch. I built from sources, tested resulting binaries locally, verified signature and checksums. Thank you for the release, Bill. On Wed, Nov 25, 2020 at 7:31 AM Bill Bejeck < bbejeck@gmail.com > wrote: > > This is the fourth candidate for the release of Apache Kafka 2.7.0. > > This is a major release that includes many new features, including: > > * Configurable TCP connection timeout and improve the initial metadata fetch > * Enforce broker-wide and per-listener connection creation rate (KIP-612, > part 1) > * Throttle Create Topic, Create Partition and Delete Topic Operations > * Add TRACE-level end-to-end latency metrics to Streams > * Add Broker-side SCRAM Config API > * Support PEM format for SSL certificates and private key > * Add RocksDB Memory Consumption to RocksDB Metrics > * Add Sliding-Window support for Aggregations > > This ...

Re: Where can I check the life cycle of each Kafka version?

Hello Joson, The EOL policy of Kafka is 3 releases, which is about a year: https://cwiki.apache.org/confluence/display/KAFKA/Time+Based+Release+Plan And you can infer the life cycle of each release from their publish date here: https://kafka.apache.org/downloads Please use English to engage the community next time so that more people can help you :) Guozhang On Sat, Nov 28, 2020 at 11:23 AM joson.yang@ce-service.com.cn < joson.yang@ce-service.com.cn > wrote: > Hello, > > Where can I check the life cycle of each Kafka version? > > -- Guozhang

Maintaining same offset while migrating from Confluent Replicator to Apache Mirror Maker 2.0

Hi All, We are currently trying to migrate from Confluent Replicator to Apache open-source MirrorMaker 2.0. We are facing an issue where messages which were already replicated by Replicator get replicated again when MirrorMaker is started on the same topic. This should not happen, as it duplicates messages at the target cluster. Here are more details: 1. RCA: Replicator assigns a consumer group for replicating messages, and this consumer group maintains the offset on the source topic; but we are not able to assign the same consumer group in the consumer config of MirrorMaker 2. 2. MirrorMaker 1.0: works, since the same consumer group can be assigned in the consumer.properties file and messages are picked up right where Replicator stopped. 3. Tried running and configuring source.cluster.consumer.group.id in MirrorMaker 2.0 in all available options (in cluster mode, in connect-standalone and connect-distributed mode) but mirror maker 2.0 i...
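One thing worth noting is that MirrorMaker 2 is a Kafka Connect source connector and tracks its source position in Connect's offset storage rather than in a consumer group, so reusing Replicator's group.id on its own does not make MM2 resume from that point. As a first diagnostic step, the offsets committed by Replicator's group on the source cluster can be inspected with the Admin API; a minimal sketch, assuming a hypothetical group name and bootstrap address:

```java
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import java.util.Map;
import java.util.Properties;

public class ReplicatorOffsetCheck {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "source-cluster:9092"); // hypothetical source cluster
        try (Admin admin = Admin.create(props)) {
            // "replicator-group" is a placeholder for whatever group.id Replicator actually used.
            Map<TopicPartition, OffsetAndMetadata> offsets =
                    admin.listConsumerGroupOffsets("replicator-group")
                         .partitionsToOffsetAndMetadata().get();
            offsets.forEach((tp, om) ->
                    System.out.printf("%s -> committed offset %d%n", tp, om.offset()));
        }
    }
}
```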

[VOTE] 2.6.1 RC2

Hello Kafka users, developers and client-developers, This is the third candidate for release of Apache Kafka 2.6.1. Since RC1, the following JIRAs have been fixed: KAFKA-10758 Release notes for the 2.6.1 release: https://home.apache.org/~mimaison/kafka-2.6.1-rc2/RELEASE_NOTES.html *** Please download, test and vote by Wednesday, December 2, 5PM PT Kafka's KEYS file containing PGP keys we use to sign the release: https://kafka.apache.org/KEYS * Release artifacts to be voted upon (source and binary): https://home.apache.org/~mimaison/kafka-2.6.1-rc2/ * Maven artifacts to be voted upon: https://repository.apache.org/content/groups/staging/org/apache/kafka/ * Javadoc: https://home.apache.org/~mimaison/kafka-2.6.1-rc2/javadoc/ * Tag to be voted upon (off 2.6 branch) is the 2.6.1 tag: https://github.com/apache/kafka/releases/tag/2.6.1-rc2 * Documentation: https://kafka.apache.org/26/documentation.html * Protocol: https://kafka.apache.org/26...

Re: [VOTE] 2.7.0 RC3

Hi Tom, Thanks for reviewing the docs and providing the fix. I agree that this doesn't necessarily meet the bar for a new RC; we'll fix the 2.7 HTML -Bill On Wed, Nov 25, 2020 at 11:17 AM Tom Bentley < tbentley@redhat.com > wrote: > Hi Bill, > > One very minor problem I just spotted is > https://kafka.apache.org/27/documentation.html#brokerconfigs_listeners , > because the <code> tag is not properly closed the HTML doesn't render > properly (in Chromium and Firefox at least) and all the rest of the configs > are shown in a monospaced font. I opened a PR > https://github.com/apache/kafka/pull/9655 , but unless there's some other > reason for an RC4 it might be better to just fix the generated HTML for 2.7 > > Kind regards, > > Tom > > On Wed, Nov 25, 2020 at 3:37 PM Bill Bejeck < bbejeck@gmail.com > wrote: > > > This is the fourth candidate for the release of Apache Ka...

Re: [VOTE] 2.7.0 RC3

Hi Bill, One very minor problem I just spotted is https://kafka.apache.org/27/documentation.html#brokerconfigs_listeners , because the <code> tag is not properly closed the HTML doesn't render properly (in Chromium and Firefox at least) and all the rest of the configs are shown in a monospaced font. I opened a PR https://github.com/apache/kafka/pull/9655 , but unless there's some other reason for an RC4 it might be better to just fix the generated HTML for 2.7 Kind regards, Tom On Wed, Nov 25, 2020 at 3:37 PM Bill Bejeck < bbejeck@gmail.com > wrote: > This is the fourth candidate for the release of Apache Kafka 2.7.0. > > This is a major release that includes many new features, including: > > * Configurable TCP connection timeout and improve the initial metadata > fetch > * Enforce broker-wide and per-listener connection creation rate (KIP-612, > part 1) > * Throttle Create Topic, Create Partition and Delete To...

[VOTE] 2.7.0 RC3

This is the fourth candidate for the release of Apache Kafka 2.7.0. This is a major release that includes many new features, including: * Configurable TCP connection timeout and improve the initial metadata fetch * Enforce broker-wide and per-listener connection creation rate (KIP-612, part 1) * Throttle Create Topic, Create Partition and Delete Topic Operations * Add TRACE-level end-to-end latency metrics to Streams * Add Broker-side SCRAM Config API * Support PEM format for SSL certificates and private key * Add RocksDB Memory Consumption to RocksDB Metrics * Add Sliding-Window support for Aggregations This release also includes a few other features, 53 improvements, and 84 bug fixes. Release notes for the 2.7.0 release: https://home.apache.org/~bbejeck/kafka-2.7.0-rc3/RELEASE_NOTES.html *** Please download, test and vote by Wednesday, December 2, 12PM ET Kafka's KEYS file containing PGP keys we use to sign the release: https://kafka.apache.org/KEYS ...

Re: Many dups received by consumer (kafka_2.13)

Liam, many thanks! We already jumped to v2.6.0 :) I appreciate your help Regards, Den On Wed, 25 Nov 2020 at 15:10, Liam Clarke-Hutchinson < liam.clarke@adscale.co.nz >: > Can you upgrade Kafka to 2.5.1? This problem was fixed in that release. > https://issues.apache.org/jira/browse/KAFKA-9839 > > On Wed, Nov 25, 2020 at 9:44 PM Dev Op < dsd7150@gmail.com > wrote: > > > Hello community! Hasn't anyone faced a similar problem? I see nobody can > > give me advice on what's happening with our Kafka cluster. :( > > > > On Mon, 9 Nov 2020 at 10:57, Dev Op < dsd7150@gmail.com >: > > > Hello all! > > > > > > Please help me understand why my consumer starts receiving > > > duplicates. I think it is because of problems on my kafka1 node. > > > > > > Cluster consists of three nodes: kafka1 (192.168.137.19, id=1), > > > kafka...

Re: Reg: Max number of partitions in a topic

Yes, you can have 1000 partitions. But, there are implications of having a large number. Each partition has a leader. Clients downloading metadata to find those 1000 leaders will be a bit slower than finding 100 leaders. Producers that are buffering messages in order to use batching create a buffer per partition, so you may see increased memory usage. Likewise, if one of your brokers failed, the more partition leaders on it, the more leader elections that have to occur. I'd suggest benchmarking your use case with different partition counts and see where your sweet spot is. This old blog post has some good ideas: https://www.confluent.io/blog/how-choose-number-topics-partitions-kafka-cluster/ Cheers, Liam Clarke On Tue, Nov 24, 2020 at 10:51 PM Gowtham S < gowtham.co.inc@gmail.com > wrote: > Hi, > Can we have 1000 partitions in a Single topic? If not how many partitions > will a single topic have at the max? > Anyone, please tell. > ...
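As a concrete footnote to the above: nothing in Kafka itself caps a topic at 1000 partitions; the limits are the operational ones described in the reply. The partition count is just a parameter at topic creation time, e.g. via the Admin API. A minimal sketch with a hypothetical topic name, broker address, and example values:

```java
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.NewTopic;
import java.util.Collections;
import java.util.Properties;

public class CreateManyPartitions {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // hypothetical
        try (Admin admin = Admin.create(props)) {
            // 1000 partitions, replication factor 3 -- example values only
            NewTopic topic = new NewTopic("wide-topic", 1000, (short) 3);
            admin.createTopics(Collections.singleton(topic)).all().get();
        }
    }
}
```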

Re: Many dups received by consumer (kafka_2.13)

Can you upgrade Kafka to 2.5.1? This problem was fixed in that release. https://issues.apache.org/jira/browse/KAFKA-9839 On Wed, Nov 25, 2020 at 9:44 PM Dev Op < dsd7150@gmail.com > wrote: > Hello community! Hasn't anyone faced a similar problem? I see nobody can > give me advice on what's happening with our Kafka cluster. :( > > On Mon, 9 Nov 2020 at 10:57, Dev Op < dsd7150@gmail.com >: > > > Hello all! > > > > Please help me understand why my consumer starts receiving > > duplicates. I think it is because of problems on my kafka1 node. > > > > Cluster consists of three nodes: kafka1 (192.168.137.19, id=1), > > kafka2 (192.168.137.20, id=2), kafka3 ( 192.168.137.21, id=3) > > Version of Kafka: kafka_2.13-2.4.1 > > Configs: > > - Broker config (server.properties from kafka1): > > https://pastebin.com/MR20rZdQ > > - Zookeeper config (zookeeper.propert...

Re: Many dups received by consumer (kafka_2.13)

Hello community! Hasn't anyone faced a similar problem? I see nobody can give me advice on what's happening with our Kafka cluster. :( On Mon, 9 Nov 2020 at 10:57, Dev Op < dsd7150@gmail.com >: > Hello all! > > Please help me understand why my consumer starts receiving > duplicates. I think it is because of problems on my kafka1 node. > > Cluster consists of three nodes: kafka1 (192.168.137.19, id=1), > kafka2 (192.168.137.20, id=2), kafka3 ( 192.168.137.21, id=3) > Version of Kafka: kafka_2.13-2.4.1 > Configs: > - Broker config (server.properties from kafka1): > https://pastebin.com/MR20rZdQ > - Zookeeper config (zookeeper.properties from kafka1): > https://pastebin.com/vCpFU0gp > > /opt/kafka_2.13-2.4.1/bin/kafka-topics.sh --describe --topic in_raw > --zookeeper localhost:2181 > Topic: in_raw PartitionCount: 1 ReplicationFactor: 3 Configs: > Topic: in_raw Partition...
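One way to narrow a problem like this down is to log partition and offset for every record the consumer sees: if the same partition/offset pair arrives twice, the duplicates come from re-consumption (e.g. a rebalance or offset reset), whereas identical payloads at distinct offsets mean they were produced more than once upstream. A minimal sketch against the in_raw topic from the thread, with a hypothetical probe group id:

```java
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class DuplicateProbe {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "192.168.137.19:9092"); // one of the brokers above
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "duplicate-probe");              // hypothetical group id
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("in_raw"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> r : records) {
                    // Same partition/offset seen twice => re-consumption;
                    // distinct offsets with identical payloads => duplicates produced upstream.
                    System.out.printf("p=%d offset=%d key=%s%n", r.partition(), r.offset(), r.key());
                }
            }
        }
    }
}
```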

Re: GlobalKTable restoration - Unexplained performance penalty

Hey, I get the log *after* the restart was triggered for my app (and my app actually restarted, meaning I get it as part of my app's bootstrap logging). On Tue, Nov 24, 2020 at 12:03 AM Guozhang Wang < wangguoz@gmail.com > wrote: > Hello Nitay, > > Thanks for letting us know about your observations. When you restart the > application from empty local state Kafka Streams will try to restore all > the records up to the current log end for those global KTables, and during > this period of time there should be no processing. > > Do you mind sharing where you get the "totalRecordsToBeRestored", is it > before the restarting was triggered? > > Guozhang > > On Mon, Nov 23, 2020 at 4:15 AM Nitay Kufert < nitay.k@ironsrc.com > wrote: > > > Hey all, > > We have been running a kafka-stream based service in production for the > > last couple of years (we have 4 brokers on this specific cluster). ...
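For threads like this it can help to watch restoration progress directly rather than relying only on a single "totalRecordsToBeRestored" figure; Kafka Streams lets you register an application-wide restore listener via KafkaStreams#setGlobalStateRestoreListener, which reports how many records each store and partition still has to replay. A minimal sketch (the printed counts are per store and partition):

```java
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.streams.processor.StateRestoreListener;

// Register before streams.start(), e.g.:
//   streams.setGlobalStateRestoreListener(new RestoreProgressListener());
public class RestoreProgressListener implements StateRestoreListener {

    @Override
    public void onRestoreStart(TopicPartition tp, String storeName, long startingOffset, long endingOffset) {
        System.out.printf("Restoring %s %s: %d records to go%n", storeName, tp, endingOffset - startingOffset);
    }

    @Override
    public void onBatchRestored(TopicPartition tp, String storeName, long batchEndOffset, long numRestored) {
        System.out.printf("%s %s: restored batch of %d records (up to offset %d)%n",
                storeName, tp, numRestored, batchEndOffset);
    }

    @Override
    public void onRestoreEnd(TopicPartition tp, String storeName, long totalRestored) {
        System.out.printf("Finished restoring %s %s: %d records in total%n", storeName, tp, totalRestored);
    }
}
```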