
Showing posts from June, 2019

Re: Can kafka internal state be purged ?

Yes, this all makes sense. The option untilWallClockTimeLimit would solve the problem for us. I can see the difficulty with delivering the "final" result. The problem here is assuming that something will be *complete* at some point in time and a final result can be delivered. The time for a window is not synchronized with the "event" that we want to detect. Hence, the application *has* to deal with state that may not be *complete*, i.e. some sort of cross-window aggregation has to be implemented outside the application. I am wondering how you can possibly avoid that. Let me know how I can help.

Thanks,
-mohan

On 6/28/19, 1:21 PM, "John Roesler" <john@confluent.io> wrote: Ok, good, that's what I was hoping. I think that's a good strategy, at the end of the "real" data, just write a dummy record with the same keys with a high timestamp to flush everything else through. For the most part, I'...
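For concreteness, a minimal sketch of the "dummy record" flush John describes in the quoted reply, assuming a plain producer; the topic name and keys are illustrative, not from the thread:

import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class FlushRecordSender {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        // The keys seen in the "real" data; assumed to be known to the caller.
        List<String> keys = List.of("key-a", "key-b");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // A timestamp well past the end of the last window advances stream
            // time, so suppress() can emit the pending "final" results.
            long flushTs = System.currentTimeMillis() + Duration.ofHours(1).toMillis();
            for (String key : keys) {
                producer.send(new ProducerRecord<>("input-topic", null, flushTs, key, "flush"));
            }
        }
    }
}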

Re: Kafka streams (2.1.1) - org.rocksdb.RocksDBException:Too many open files

Hey all,

If you want to figure it out theoretically: if you print out the topology description, you'll have some number of state stores listed in there. The number of RocksDB instances should just be (#global_state_stores + sum(#partitions_of_topic_per_local_state_store)). The number of stream threads isn't relevant here.

You can also figure it out empirically: the first level of subdirectories in the state dir are Tasks, and then within that, the next level is Stores. You should see the store directory names match up with the stores listed in the topology description. The number of Store directories is exactly the number of RocksDB instances you have.

There are also metrics corresponding to each of the state stores, so you can compute it from what you find in the metrics.

Hope that helps,
-john

On Thu, Jun 27, 2019 at 6:46 AM Patrik Kleindl <pkleindl@gmail.com> wrote: > > Hi Kiran > Without much research my guess would be "num_...
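As a minimal illustration of the theoretical route, printing the topology description is one call; the builder contents are whatever your application defines:

import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;

public class TopologyInspector {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();
        // ... define your streams, tables, and stores here ...
        Topology topology = builder.build();

        // The description lists sub-topologies with their state stores, plus
        // global stores; combine it with per-topic partition counts to get
        // #global_state_stores + sum(#partitions_of_topic_per_local_state_store).
        System.out.println(topology.describe());
    }
}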

Re: Can kafka internal state be purged ?

Ok, good, that's what I was hoping. I think that's a good strategy: at the end of the "real" data, just write a dummy record with the same keys with a high timestamp to flush everything else through.

For the most part, I'd expect a production program to get a steady stream of traffic with increasing timestamps, so windows would constantly be getting flushed out as stream time moves forward. Some folks have reported that they don't get enough traffic through their program to flush out the suppressed results on a regular basis, though. Right now, the best solution is to have the same Producer that writes data to the input topics also write "heartbeat/dummy" records periodically when there is no data to send, just to keep stream time moving forward. But this isn't a perfect solution, as you have pointed out in this thread; you really want the heartbeat records to go to all partitions, and also go to all re-partitions if there are...
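A rough sketch of that heartbeat idea, assuming a single input topic and a 30-second cadence (both made up for illustration); note that, as John says, this does not reach repartition topics:

import java.util.Properties;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class HeartbeatSender {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

        scheduler.scheduleAtFixedRate(() -> {
            // partitionsFor tells us how many partitions the input topic has,
            // so the heartbeat reaches every one of them and advances stream
            // time on each, not just wherever the default partitioner lands.
            int partitions = producer.partitionsFor("input-topic").size();
            long now = System.currentTimeMillis();
            for (int p = 0; p < partitions; p++) {
                producer.send(new ProducerRecord<>("input-topic", p, now, "heartbeat", ""));
            }
        }, 0, 30, TimeUnit.SECONDS);
    }
}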

Re: org.apache.kafka.streams.processor.TimestampExtractor#extract method in version 2.3 always returns -1 as value

Jonathan, Matthias,

I've created a Jira for this issue: https://issues.apache.org/jira/browse/KAFKA-8615 . Jonathan, I plan to work on this when I get back from vacation on 7/8. If you would like to work on this yourself before then, feel free to do so and assign the ticket to yourself.

Thanks,
Bill

On Thu, Jun 27, 2019 at 1:38 PM Matthias J. Sax <matthias@confluent.io> wrote: > Sounds like a regression to me. > > We did change some code to track partition time differently. Can you > open a Jira? > > -Matthias > > On 6/26/19 7:58 AM, Jonathan Santilli wrote: > > Sure Bill, sure, it is the same code I have reported the issue for the > > suppress some months ago: > > https://stackoverflow.com/questions/54145281/why-do-the-offsets-of-the-consumer-group-app-id-of-my-kafka-streams-applicatio > > > > In fact, I have reported at that moment, that after restarting the app, > the >...
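For reference, a custom extractor implementing org.apache.kafka.streams.processor.TimestampExtractor looks roughly like this; the fallback policy here is just one example:

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.streams.processor.TimestampExtractor;

public class SafeTimestampExtractor implements TimestampExtractor {
    @Override
    public long extract(ConsumerRecord<Object, Object> record, long previousTimestamp) {
        long ts = record.timestamp();
        if (ts >= 0) {
            return ts;
        }
        // The record carries no valid timestamp; fall back to the previous
        // partition time if the runtime supplied one. The bug discussed in
        // this thread is that this second argument is always -1 in 2.3.0.
        return previousTimestamp >= 0 ? previousTimestamp : System.currentTimeMillis();
    }
}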

Re: Kafka Connect JDBC Sink - SQLServerException: Column in table is of a type that is invalid for use as a key column in an index

Just checked the constraint, and the maximum index key size is 900 bytes before SQL Server 2016.

On 2019/06/28 14:55:01, sendoh <unicorn.banachi@gmail.com> wrote:
> On 2019/06/28 14:50:37, sendoh <unicorn.banachi@gmail.com> wrote:
> > I encountered the same issue as https://github.com/confluentinc/kafka-connect-jdbc/issues/379
> >
> > I think I could contribute a custom `buildCreateTableStatement` implementation in SqlServerDatabaseDialect that checks whether the primary key is a string, sets it as VARCHAR(450), and logs a warning.
> >
> > Any suggestions?
>
> or simply like the HANA implementation, which only uses 1000 for VARCHAR:
> https://github.com/confluentinc/kafka-connect-jdbc/blob/bb352573e68698b14045e3fceed5d9b7632eb9e4/src/main/java/io/confluent/connect/jdbc/dialect/SapHanaDatabaseDialect.java#L102
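To make the proposal concrete, here is an illustrative-only sketch of the type-mapping idea; the class and method names are hypothetical, not the actual kafka-connect-jdbc DatabaseDialect API:

public class SqlServerKeyTypeHelper {
    // SQL Server rejects index key columns wider than 900 bytes (pre-2016),
    // so an unbounded string primary key cannot back the PK index.
    static String columnTypeForPrimaryKey(String fieldName, boolean isStringField) {
        if (isStringField) {
            // VARCHAR(450) stays comfortably under the 900-byte limit.
            System.out.printf("WARN: capping string primary key column %s to VARCHAR(450)%n", fieldName);
            return "VARCHAR(450)";
        }
        return "BIGINT"; // non-string keys keep their natural mapping (illustrative)
    }
}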

Re: Kafka Connect JDBC Sink - SQLServerException: Column in table is of a type that is invalid for use as a key column in an index

On 2019/06/28 14:50:37, sendoh <unicorn.banachi@gmail.com> wrote:
> I encountered the same issue as https://github.com/confluentinc/kafka-connect-jdbc/issues/379
>
> I think I could contribute a custom `buildCreateTableStatement` implementation in SqlServerDatabaseDialect that checks whether the primary key is a string, sets it as VARCHAR(450), and logs a warning.
>
> Any suggestions?

or simply like the HANA implementation, which only uses 1000 for VARCHAR: https://github.com/confluentinc/kafka-connect-jdbc/blob/bb352573e68698b14045e3fceed5d9b7632eb9e4/src/main/java/io/confluent/connect/jdbc/dialect/SapHanaDatabaseDialect.java#L102

Kafka Connect JDBC Sink - SQLServerException: Column in table is of a type that is invalid for use as a key column in an index

I encountered the same issue as https://github.com/confluentinc/kafka-connect-jdbc/issues/379

I think I could contribute a custom `buildCreateTableStatement` implementation in SqlServerDatabaseDialect that checks whether the primary key is a string, sets it as VARCHAR(450), and logs a warning.

Any suggestions?

Re: update of GlobalKTable

Your expectation and understanding of how GlobalKTables work is correct. If you can reproduce the issue in a...

Questions about KIP-415: Incremental Cooperative Rebalancing in Kafka Connect

Hi team,

We are using Kafka Connect at present and have encountered a problem with version 2.1.0: all connectors kept restarting when one new connector was added to the cluster, and that connector then failed to start due to a network problem (firewall not open). The Connect daemon also failed to serve requests since the herder thread was blocked, and we could not delete the failed connector until we restarted the Connect process.

We searched for solutions in newer versions and found that KIP-415 may be what we want, since it seems able to avoid the stop-the-world behavior when any connector change happens. After testing on branch 2.3.0, which includes KIP-415, we found an unexpected behavior. Here are the steps for reproducing the case:

1. Set up a Kafka Connect cluster by starting one worker with the distributed config, connecting to an existing Kafka.
2. Create two topics (say T1, T2) to be the destinations for two source connectors.
3. Create a FileStreamSourceConnector (say C1...
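For anyone following along, the worker-side settings introduced by KIP-415 in 2.3.0 look roughly like the following; names and defaults are to the best of my understanding, so check the 2.3.0 docs for the authoritative values:

# 'compatible' negotiates incremental cooperative rebalancing once every
# worker in the group supports it, and falls back to the old eager
# (stop-the-world) protocol while any pre-2.3 worker is present.
connect.protocol=compatible
# How long connectors/tasks from a departed worker stay unassigned before
# they are finally rebalanced to the remaining workers.
scheduled.rebalance.max.delay.ms=300000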

Re: org.apache.kafka.streams.processor.TimestampExtractor#extract method in version 2.3 always returns -1 as value

Sounds like a regression to me. We did change some code to track partition time differently. Can you ope...

Re: Topic gets truncated

On 2019/06/27 15:39:29, Jan Kosecki <jan.kosecki91@gmail.com> wrote:
> Hi,
>
> I have a Hyperledger Fabric cluster that uses a cluster of 3 Kafka nodes + 3 ZooKeepers. Fabric doesn't support pruning yet, so any change to recorded offsets is detected by it, and it then fails to reconnect to the Kafka cluster. This morning all Kafka nodes went offline and then slowly restarted. After that, Fabric doesn't work anymore. After checking the logs, I've learnt that Fabric has recorded a last offset of 13356:
>
> 2019-06-27 10:34:51.325 UTC [orderer/consensus/kafka] newChain -> INFO 144 [channel: audit] Starting chain with last persisted offset 13356 and last recorded block 2881
>
> However, when I compared it with the Kafka logs of the first node that started (kafka-0), I found out that, although initially the high-water mark was around this value, for some reason, during reboot, the node tru...
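A quick way to check what the broker actually retains is to compare the client's recorded offset against the partition's beginning and end offsets; a sketch, assuming the channel topic is named "audit" with a single partition:

import java.util.Collections;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;

public class OffsetCheck {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.deserializer", ByteArrayDeserializer.class.getName());
        props.put("value.deserializer", ByteArrayDeserializer.class.getName());

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            TopicPartition tp = new TopicPartition("audit", 0);
            Map<TopicPartition, Long> begin = consumer.beginningOffsets(Collections.singleton(tp));
            Map<TopicPartition, Long> end = consumer.endOffsets(Collections.singleton(tp));
            // If the end offset is well below what the client last recorded
            // (13356 here), the partition really was truncated broker-side.
            System.out.printf("begin=%d end=%d%n", begin.get(tp), end.get(tp));
        }
    }
}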