Hi Stan and Gaurav,
Just to clarify some points mentioned here before
KAFKA-14616: I raised a year ago so it's not related to JBOD work. It is rather a blocker bug for KRAFT in general. The PR from Colin should fix this. Am not sure if it is a blocker for 3.7 per-say as it was a major bug since 3.3 and got missed from all other releases.
Regarding the JBOD's work:
KAFKA-16082: Is not a blocker for 3.7 instead it's nice fix. The pr https://github.com/apache/kafka/pull/15136 is quite a small one and was approved by Proven and I but it is waiting for a committer's approval.
KAFKA-16162: This is a blocker for 3.7. Same it's a small pr https://github.com/apache/kafka/pull/15270 and it is approved Proven and I and the PR is waiting for committer's approval.
KAFKA-16157: This is a blocker for 3.7. There is one small suggestion for the pr https://github.com/apache/kafka/pull/15263 but I don't think any of the current feedback is blocking the pr from getting approved. Assuming we get a committer's approval on it.
KAFKA-16195: Same it's a blocker but it has approval from Proven and I and we are waiting for committer's approval on the pr https://github.com/apache/kafka/pull/15262.
If we can't get a committer approval for KAFKA-16162, KAFKA-16157 and KAFKA-16195 in time for 3.7 then we can mark JBOD as early release assuming we merge at least KAFKA-16195.
Regards,
Omnia
> On 26 Jan 2024, at 15:39, kafka@gnarula.com wrote:
>
> Apologies, I duplicated KAFKA-16157 twice in my previous message. I intended to mention KAFKA-16195
> with the PR at https://github.com/apache/kafka/pull/15262 as the second JIRA.
>
> Thanks,
> Gaurav
>
>> On 26 Jan 2024, at 15:34, kafka@gnarula.com wrote:
>>
>> Hi Stan,
>>
>> I wanted to share some updates about the bugs you shared earlier.
>>
>> - KAFKA-14616: I've reviewed and tested the PR from Colin and have observed
>> the fix works as intended.
>> - KAFKA-16162: I reviewed Proven's PR and found some gaps in the proposed fix. I've
>> therefore raised https://github.com/apache/kafka/pull/15270 following a discussion with Luke in JIRA.
>> - KAFKA-16082: I don't think this is marked as a blocker anymore. I'm awaiting
>> feedback/reviews at https://github.com/apache/kafka/pull/15136
>>
>> In addition to the above, there are 2 JIRAs I'd like to bring everyone's attention to:
>>
>> - KAFKA-16157: This is similar to KAFKA-14616 and is marked as a blocker. I've raised
>> https://github.com/apache/kafka/pull/15263 and am awaiting reviews on it.
>> - KAFKA-16157: I raised this yesterday and have addressed feedback from Luke. This should
>> hopefully get merged soon.
>>
>> Regards,
>> Gaurav
>>
>>
>>> On 24 Jan 2024, at 11:51, kafka@gnarula.com wrote:
>>>
>>> Hi Stanislav,
>>>
>>> Thanks for bringing these JIRAs/PRs up.
>>>
>>> I'll be testing the open PRs for KAFKA-14616 and KAFKA-16162 this week and I hope to have some feedback
>>> by Friday. I gather the latter JIRA is marked as a WIP by Proven and he's away. I'll try to build on his work in the meantime.
>>>
>>> As for KAFKA-16082, we haven't been able to deduce a data loss scenario. There's a PR open
>>> by me for promoting an abandoned future replica with approvals from Omnia and Proven,
>>> so I'd appreciate a committer reviewing it.
>>>
>>> Regards,
>>> Gaurav
>>>
>>> On 23 Jan 2024, at 20:17, Stanislav Kozlovski <stanislav@confluent.io.INVALID> wrote:
>>>>
>>>> Hey all, I figured I'd give an update about what known blockers we have
>>>> right now:
>>>>
>>>> - KAFKA-16101: KRaft migration rollback documentation is incorrect -
>>>> https://github.com/apache/kafka/pull/15193; This need not block RC
>>>> creation, but we need the docs updated so that people can test properly
>>>> - KAFKA-14616: Topic recreation with offline broker causes permanent URPs -
>>>> https://github.com/apache/kafka/pull/15230 ; I am of the understanding that
>>>> this is blocking JBOD for 3.7
>>>> - KAFKA-16162: New created topics are unavailable after upgrading to 3.7 -
>>>> a strict blocker with an open PR https://github.com/apache/kafka/pull/15232
>>>> - although I understand Proveen is out of office
>>>> - KAFKA-16082: JBOD: Possible dataloss when moving leader partition - I am
>>>> hearing mixed opinions on whether this is a blocker (
>>>> https://github.com/apache/kafka/pull/15136)
>>>>
>>>> Given that there are 3 JBOD blocker bugs, and I am not confident they will
>>>> all be merged this week - I am on the edge of voting to revert JBOD from
>>>> this release, or mark it early access.
>>>>
>>>> By all accounts, it seems that if we keep with JBOD the release will have
>>>> to spill into February, which is a month extra from the time-based release
>>>> plan we had of start of January.
>>>>
>>>> Can I ask others for an opinion?
>>>>
>>>> Best,
>>>> Stan
>>>>
>>>> On Thu, Jan 18, 2024 at 1:21 PM Luke Chen <showuon@gmail.com> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I think I've found another blocker issue: KAFKA-16162
>>>>> <https://issues.apache.org/jira/browse/KAFKA-16162> .
>>>>> The impact is after upgrading to 3.7.0, any new created topics/partitions
>>>>> will be unavailable.
>>>>> I've put my findings in the JIRA.
>>>>>
>>>>> Thanks.
>>>>> Luke
>>>>>
>>>>> On Thu, Jan 18, 2024 at 9:50 AM Matthias J. Sax <mjsax@apache.org> wrote:
>>>>>
>>>>>> Stan, thanks for driving this all forward! Excellent job.
>>>>>>
>>>>>> About
>>>>>>
>>>>>>> StreamsStandbyTask - https://issues.apache.org/jira/browse/KAFKA-16141
>>>>>>> StreamsUpgradeTest - https://issues.apache.org/jira/browse/KAFKA-16139
>>>>>>
>>>>>> For `StreamsUpgradeTest` it was a test setup issue and should be fixed
>>>>>> now in trunk and 3.7 (and actually also in 3.6...)
>>>>>>
>>>>>> For `StreamsStandbyTask` the failing test exposes a regression bug, so
>>>>>> it's a blocker. I updated the ticket accordingly. We already have an
>>>>>> open PR that reverts the code introducing the regression.
>>>>>>
>>>>>>
>>>>>> -Matthias
>>>>>>
>>>>>> On 1/17/24 9:44 AM, Proven Provenzano wrote:
>>>>>>> We have another blocking issue for the RC :
>>>>>>> https://issues.apache.org/jira/browse/KAFKA-16157. This bug is similar
>>>>>> to
>>>>>>> https://issues.apache.org/jira/browse/KAFKA-14616. The new issue
>>>>> however
>>>>>>> can lead to the new topic having partitions that a producer cannot
>>>>> write
>>>>>> to.
>>>>>>>
>>>>>>> --Proven
>>>>>>>
>>>>>>> On Tue, Jan 16, 2024 at 12:04 PM Proven Provenzano <
>>>>>> pprovenzano@confluent.io>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> I have a PR https://github.com/apache/kafka/pull/15197 for
>>>>>>>> https://issues.apache.org/jira/browse/KAFKA-16131 that is building
>>>>> now.
>>>>>>>> --Proven
>>>>>>>>
>>>>>>>> On Mon, Jan 15, 2024 at 5:03 AM Jakub Scholz <jakub@scholz.cz> wrote:
>>>>>>>>
>>>>>>>>> *> Hi Jakub,> > Thanks for trying the RC. I think what you found is a
>>>>>>>>> blocker bug because it *
>>>>>>>>> *> will generate huge amount of logspam. I guess we didn't find it in
>>>>>>>>> junit
>>>>>>>>> tests *
>>>>>>>>> *> since logspam doesn't fail the automated tests. But certainly it's
>>>>>> not
>>>>>>>>> suitable *
>>>>>>>>> *> for production. Did you file a JIRA yet?*
>>>>>>>>>
>>>>>>>>> Hi Colin,
>>>>>>>>>
>>>>>>>>> I opened https://issues.apache.org/jira/browse/KAFKA-16131.
>>>>>>>>>
>>>>>>>>> Thanks & Regards
>>>>>>>>> Jakub
>>>>>>>>>
>>>>>>>>> On Mon, Jan 15, 2024 at 8:57 AM Colin McCabe <cmccabe@apache.org>
>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Stanislav,
>>>>>>>>>>
>>>>>>>>>> Thanks for making the first RC. The fact that it's titled RC2 is
>>>>>> messing
>>>>>>>>>> with my mind a bit. I hope this doesn't make people think that we're
>>>>>>>>>> farther along than we are, heh.
>>>>>>>>>>
>>>>>>>>>> On Sun, Jan 14, 2024, at 13:54, Jakub Scholz wrote:
>>>>>>>>>>> *> Nice catch! It does seem like we should have gated this behind
>>>>> the
>>>>>>>>>>> metadata> version as KIP-858 implies. Is the cluster configured
>>>>> with
>>>>>>>>>>> multiple log> dirs? What is the impact of the error messages?*
>>>>>>>>>>>
>>>>>>>>>>> I did not observe any obvious impact. I was able to send and
>>>>> receive
>>>>>>>>>>> messages as normally. But to be honest, I have no idea what else
>>>>>>>>>>> this might impact, so I did not try anything special.
>>>>>>>>>>>
>>>>>>>>>>> I think everyone upgrading an existing KRaft cluster will go
>>>>> through
>>>>>>>>> this
>>>>>>>>>>> stage (running Kafka 3.7 with an older metadata version for at
>>>>> least
>>>>>> a
>>>>>>>>>>> while). So even if it is just a logged exception without any other
>>>>>>>>>> impact I
>>>>>>>>>>> wonder if it might scare users from upgrading. But I leave it to
>>>>>>>>> others
>>>>>>>>>> to
>>>>>>>>>>> decide if this is a blocker or not.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hi Jakub,
>>>>>>>>>>
>>>>>>>>>> Thanks for trying the RC. I think what you found is a blocker bug
>>>>>>>>> because
>>>>>>>>>> it will generate huge amount of logspam. I guess we didn't find it
>>>>> in
>>>>>>>>> junit
>>>>>>>>>> tests since logspam doesn't fail the automated tests. But certainly
>>>>>> it's
>>>>>>>>>> not suitable for production. Did you file a JIRA yet?
>>>>>>>>>>
>>>>>>>>>>> On Sun, Jan 14, 2024 at 10:17 PM Stanislav Kozlovski
>>>>>>>>>>> <stanislav@confluent.io.invalid> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hey Luke,
>>>>>>>>>>>>
>>>>>>>>>>>> This is an interesting problem. Given the fact that the KIP for
>>>>>>>>> having a
>>>>>>>>>>>> 3.8 release passed, I think it weights the scale towards not
>>>>> calling
>>>>>>>>>> this a
>>>>>>>>>>>> blocker and expecting it to be solved in 3.7.1.
>>>>>>>>>>>>
>>>>>>>>>>>> It is unfortunate that it would not seem safe to migrate to KRaft
>>>>> in
>>>>>>>>>> 3.7.0
>>>>>>>>>>>> (given the inability to rollback safely), but if that's true - the
>>>>>>>>> same
>>>>>>>>>>>> case would apply for 3.6.0. So in any case users w\ould be
>>>>> expected
>>>>>>>>> to
>>>>>>>>>> use a
>>>>>>>>>>>> patch release for this.
>>>>>>>>>>
>>>>>>>>>> Hi Luke,
>>>>>>>>>>
>>>>>>>>>> Thanks for testing rollback. I think this is a case where the
>>>>>>>>>> documentation is wrong. The intention was to for the steps to
>>>>>> basically
>>>>>>>>> be:
>>>>>>>>>>
>>>>>>>>>> 1. roll all the brokers into zk mode, but with migration enabled
>>>>>>>>>> 2. take down the kraft quorum
>>>>>>>>>> 3. rmr /controller, allowing a hybrid broker to take over.
>>>>>>>>>> 4. roll all the brokers into zk mode without migration enabled (if
>>>>>>>>> desired)
>>>>>>>>>>
>>>>>>>>>> With these steps, there isn't really unavailability since a ZK
>>>>>>>>> controller
>>>>>>>>>> can be elected quickly after the kraft quorum is gone.
>>>>>>>>>>
>>>>>>>>>>>> Further, since we will have a 3.8 release - it is
>>>>>>>>>>>> likely we will ultimately recommend users upgrade from that
>>>>> version
>>>>>>>>>> given
>>>>>>>>>>>> its aim is to have strategic KRaft feature parity with ZK.
>>>>>>>>>>>> That being said, I am not 100% on this. Let me know whether you
>>>>>> think
>>>>>>>>>> this
>>>>>>>>>>>> should block the release, Luke. I am also tagging Colin and David
>>>>> to
>>>>>>>>>> weigh
>>>>>>>>>>>> in with their opinions, as they worked on the migration logic.
>>>>>>>>>>
>>>>>>>>>> The rollback docs are new in 3.7 so the fact that they're wrong is a
>>>>>>>>> clear
>>>>>>>>>> blocker, I think. But easy to fix, I believe. I will create a PR.
>>>>>>>>>>
>>>>>>>>>> best,
>>>>>>>>>> Colin
>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Hey Kirk and Chris,
>>>>>>>>>>>>
>>>>>>>>>>>> Unless I'm missing something - KAFKALESS-16029 is simply a bad log
>>>>>>>>> due
>>>>>>>>>> to
>>>>>>>>>>>> improper closing. And the PR description implies this has been
>>>>>>>>> present
>>>>>>>>>>>> since 3.5. While annoying, I don't see a strong reason for this to
>>>>>>>>> block
>>>>>>>>>>>> the release.
>>>>>>>>>>>>
>>>>>>>>>>>> Hey Jakub,
>>>>>>>>>>>>
>>>>>>>>>>>> Nice catch! It does seem like we should have gated this behind the
>>>>>>>>>> metadata
>>>>>>>>>>>> version as KIP-858 implies. Is the cluster configured with
>>>>> multiple
>>>>>>>>> log
>>>>>>>>>>>> dirs? What is the impact of the error messages?
>>>>>>>>>>>>
>>>>>>>>>>>> Tagging Igor (the author of the KIP) to weigh in.
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Stanislav
>>>>>>>>>>>>
>>>>>>>>>>>> On Sat, Jan 13, 2024 at 7:22 PM Jakub Scholz <jakub@scholz.cz>
>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I was trying the RC2 and run into the following issue ... when I
>>>>>>>>> run
>>>>>>>>>>>>> 3.7.0-RC2 KRaft cluster with metadata version set to 3.6-IV2
>>>>>>>>> metadata
>>>>>>>>>>>>> version, I seem to be getting repeated errors like this in the
>>>>>>>>>> controller
>>>>>>>>>>>>> logs:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2024-01-13 16:58:01,197 INFO [QuorumController id=0]
>>>>>>>>>>>> assignReplicasToDirs:
>>>>>>>>>>>>> event failed with UnsupportedVersionException in 15 microseconds.
>>>>>>>>>>>>> (org.apache.kafka.controller.QuorumController)
>>>>>>>>>>>>> [quorum-controller-0-event-handler]
>>>>>>>>>>>>> 2024-01-13 16:58:01,197 ERROR [ControllerApis nodeId=0]
>>>>> Unexpected
>>>>>>>>>> error
>>>>>>>>>>>>> handling request RequestHeader(apiKey=ASSIGN_REPLICAS_TO_DIRS,
>>>>>>>>>>>>> apiVersion=0, clientId=1000, correlationId=14, headerVersion=2)
>>>>> --
>>>>>>>>>>>>> AssignReplicasToDirsRequestData(brokerId=1000, brokerEpoch=5,
>>>>>>>>>>>>> directories=[DirectoryData(id=w_uxN7pwQ6eXSMrOKceYIQ,
>>>>>>>>>>>>> topics=[TopicData(topicId=bvAKLSwmR7iJoKv2yZgygQ,
>>>>>>>>>>>>> partitions=[PartitionData(partitionIndex=2),
>>>>>>>>>>>>> PartitionData(partitionIndex=1)]),
>>>>>>>>>>>>> TopicData(topicId=uNe7f5VrQgO0zST6yH1jDQ,
>>>>>>>>>>>>> partitions=[PartitionData(partitionIndex=0)])])]) with context
>>>>>>>>>>>>>
>>>>> RequestContext(header=RequestHeader(apiKey=ASSIGN_REPLICAS_TO_DIRS,
>>>>>>>>>>>>> apiVersion=0, clientId=1000, correlationId=14, headerVersion=2),
>>>>>>>>>>>>> connectionId='172.16.14.219:9090-172.16.14.217:53590-7',
>>>>>>>>>> clientAddress=/
>>>>>>>>>>>>> 172.16.14.217, principal=User:CN=my-cluster-kafka,O=io.strimzi,
>>>>>>>>>>>>> listenerName=ListenerName(CONTROLPLANE-9090),
>>>>> securityProtocol=SSL,
>>>>>>>>>>>>>
>>>>> clientInformation=ClientInformation(softwareName=apache-kafka-java,
>>>>>>>>>>>>> softwareVersion=3.7.0), fromPrivilegedListener=false,
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>> principalSerde=Optional[org.apache.kafka.common.security.authenticator.DefaultKafkaPrincipalBuilder@71004ad2
>>>>>>>>>>>>> ])
>>>>>>>>>>>>> (kafka.server.ControllerApis) [quorum-controller-0-event-handler]
>>>>>>>>>>>>> java.util.concurrent.CompletionException:
>>>>>>>>>>>>> org.apache.kafka.common.errors.UnsupportedVersionException:
>>>>>>>>> Directory
>>>>>>>>>>>>> assignment is not supported yet.
>>>>>>>>>>>>>
>>>>>>>>>>>>> at
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>> java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:332)
>>>>>>>>>>>>> at
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>> java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:347)
>>>>>>>>>>>>> at
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>> java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:636)
>>>>>>>>>>>>> at
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>> java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
>>>>>>>>>>>>> at
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>> java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2162)
>>>>>>>>>>>>> at
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>> org.apache.kafka.controller.QuorumController$ControllerWriteEvent.complete(QuorumController.java:880)
>>>>>>>>>>>>> at
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>> org.apache.kafka.controller.QuorumController$ControllerWriteEvent.handleException(QuorumController.java:871)
>>>>>>>>>>>>> at
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>> org.apache.kafka.queue.KafkaEventQueue$EventContext.completeWithException(KafkaEventQueue.java:148)
>>>>>>>>>>>>> at
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>> org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:137)
>>>>>>>>>>>>> at
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>> org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:210)
>>>>>>>>>>>>> at
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>> org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:181)
>>>>>>>>>>>>> at java.base/java.lang.Thread.run(Thread.java:840)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Caused by:
>>>>>>>>> org.apache.kafka.common.errors.UnsupportedVersionException:
>>>>>>>>>>>>> Directory assignment is not supported yet.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Is that expected? I guess with the metadata version set to
>>>>>>>>> 3.6-IV2, it
>>>>>>>>>>>>> makes sense that the request is not supported. But shouldn't then
>>>>>>>>> the
>>>>>>>>>>>>> request not be sent at all by the brokers? (I did not opened a
>>>>> JIRA
>>>>>>>>>> for
>>>>>>>>>>>> it,
>>>>>>>>>>>>> but I can open one if you agree this is not expected)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks & Regards
>>>>>>>>>>>>> Jakub
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sat, Jan 13, 2024 at 8:03 AM Luke Chen <showuon@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Stanislav,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I commented in the "Apache Kafka 3.7.0 Release" thread, but
>>>>> maybe
>>>>>>>>>> you
>>>>>>>>>>>>>> missed it.
>>>>>>>>>>>>>> cross-posting here:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> There is a bug KAFKA-16101
>>>>>>>>>>>>>> <https://issues.apache.org/jira/browse/KAFKA-16101> reporting
>>>>>>>>> that
>>>>>>>>>>>>> "Kafka
>>>>>>>>>>>>>> cluster will be unavailable during KRaft migration rollback".
>>>>>>>>>>>>>> The impact for this issue is that if brokers try to rollback to
>>>>>>>>> ZK
>>>>>>>>>> mode
>>>>>>>>>>>>>> during KRaft migration process, there will be a period of time
>>>>>>>>> the
>>>>>>>>>>>>> cluster
>>>>>>>>>>>>>> is unavailable.
>>>>>>>>>>>>>> Since ZK migrating to KRaft feature is a production ready
>>>>>>>>> feature, I
>>>>>>>>>>>>> think
>>>>>>>>>>>>>> this should be addressed soon.
>>>>>>>>>>>>>> Do you think this is a blocker for v3.7.0?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks.
>>>>>>>>>>>>>> Luke
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sat, Jan 13, 2024 at 8:36 AM Chris Egerton <
>>>>>>>>>> fearthecellos@gmail.com
>>>>>>>>>>>>>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks, Kirk!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> @Stanislav--do you believe that this warrants a new RC?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Jan 12, 2024, 19:08 Kirk True <kirk@kirktrue.pro>
>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi Chris/Stanislav,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'm working on the 'Unable to find FetchSessionHandler' log
>>>>>>>>>> problem
>>>>>>>>>>>>>>>> (KAFKA-16029) and have put out a draft PR (
>>>>>>>>>>>>>>>> https://github.com/apache/kafka/pull/15186). I will use the
>>>>>>>>>>>>> quickstart
>>>>>>>>>>>>>>>> approach as a second means to reproduce/verify while I wait
>>>>>>>>> for
>>>>>>>>>> the
>>>>>>>>>>>>>> PR's
>>>>>>>>>>>>>>>> Jenkins job to finish.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Kirk
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, Jan 12, 2024, at 11:31 AM, Chris Egerton wrote:
>>>>>>>>>>>>>>>>> Hi Stanislav,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks for running this release!
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> To verify, I:
>>>>>>>>>>>>>>>>> - Built from source using Java 11 with both:
>>>>>>>>>>>>>>>>> - - the 3.7.0-rc2 tag on GitHub
>>>>>>>>>>>>>>>>> - - the kafka-3.7.0-src.tgz artifact from
>>>>>>>>>>>>>>>>>
>>>>>>>>> https://home.apache.org/~stanislavkozlovski/kafka-3.7.0-rc2/
>>>>>>>>>>>>>>>>> - Checked signatures and checksums
>>>>>>>>>>>>>>>>> - Ran the quickstart using both:
>>>>>>>>>>>>>>>>> - - The kafka_2.13-3.7.0.tgz artifact from
>>>>>>>>>>>>>>>>>
>>>>>>>>> https://home.apache.org/~stanislavkozlovski/kafka-3.7.0-rc2/
>>>>>>>>>>>> with
>>>>>>>>>>>>>> Java
>>>>>>>>>>>>>>>> 11
>>>>>>>>>>>>>>>>> and Scala 13 in KRaft mode
>>>>>>>>>>>>>>>>> - - Our shiny new broker Docker image,
>>>>>>>>> apache/kafka:3.7.0-rc2
>>>>>>>>>>>>>>>>> - Ran all unit tests
>>>>>>>>>>>>>>>>> - Ran all integration tests for Connect and MM2
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I found two minor areas for concern:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 1. (Possibly a blocker)
>>>>>>>>>>>>>>>>> When running the quickstart, I noticed this ERROR-level log
>>>>>>>>>>>> message
>>>>>>>>>>>>>>> being
>>>>>>>>>>>>>>>>> emitted frequently (not not every time) when I killed my
>>>>>>>>>> console
>>>>>>>>>>>>>>> consumer
>>>>>>>>>>>>>>>>> via ctrl-C:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> [2024-01-12 11:00:31,088] ERROR [Consumer
>>>>>>>>>>>>>> clientId=console-consumer,
>>>>>>>>>>>>>>>>> groupId=console-consumer-74388] Unable to find
>>>>>>>>>>>> FetchSessionHandler
>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>> node
>>>>>>>>>>>>>>>>> 1. Ignoring fetch response
>>>>>>>>>>>>>>>>> (org.apache.kafka.clients.consumer.internals.AbstractFetch)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I see that this error message is already reported in
>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/KAFKA-16029. I
>>>>>>>>> think we
>>>>>>>>>>>>> should
>>>>>>>>>>>>>>>>> prioritize fixing it for this release. I know it's probably
>>>>>>>>>>>> benign
>>>>>>>>>>>>>> but
>>>>>>>>>>>>>>>> it's
>>>>>>>>>>>>>>>>> really not a good look for us when basic operations log
>>>>>>>>> error
>>>>>>>>>>>>>> messages,
>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>> it may give new users some headaches.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 2. (Probably not a blocker)
>>>>>>>>>>>>>>>>> The following unit tests failed the first time around, and
>>>>>>>>>> all of
>>>>>>>>>>>>>> them
>>>>>>>>>>>>>>>>> passed the second time I ran them:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> - (clients)
>>>>>>>>>>>>>>>>
>>>>>>>>> ClientUtilsTest.testParseAndValidateAddressesWithReverseLookup()
>>>>>>>>>>>>>>>>> - (clients) SelectorTest.testConnectionsByClientMetric()
>>>>>>>>>>>>>>>>> - (clients)
>>>>>>>>> Tls13SelectorTest.testConnectionsByClientMetric()
>>>>>>>>>>>>>>>>> - (connect)
>>>>>>>>>>>>>> TopicAdminTest.retryEndOffsetsShouldRetryWhenTopicNotFound
>>>>>>>>>>>>>>> (I
>>>>>>>>>>>>>>>>> thought I fixed this one! 🤬🤬)
>>>>>>>>>>>>>>>>> - (core)
>>>>>>>>>> ProducerIdManagerTest.testUnrecoverableErrors(Errors)[2]
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks again for your work on this release, and
>>>>>>>>>> congratulations
>>>>>>>>>>>> to
>>>>>>>>>>>>>>> Kafka
>>>>>>>>>>>>>>>>> Streams for having zero flaky unit tests during my
>>>>>>>>>>>>>> highly-experimental
>>>>>>>>>>>>>>>>> single laptop run!
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Thu, Jan 11, 2024 at 1:33 PM Stanislav Kozlovski
>>>>>>>>>>>>>>>>> <stanislav@confluent.io.invalid> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hello Kafka users, developers, and client-developers,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> This is the first candidate for release of Apache Kafka
>>>>>>>>>> 3.7.0.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Note it's named "RC2" because I had a few "failed" RCs
>>>>>>>>> that
>>>>>>>>>> I
>>>>>>>>>>>> had
>>>>>>>>>>>>>>>>>> cut/uploaded but ultimately had to scrap prior to
>>>>>>>>> announcing
>>>>>>>>>>>> due
>>>>>>>>>>>>> to
>>>>>>>>>>>>>>> new
>>>>>>>>>>>>>>>>>> blockers arriving before I could even announce them.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Further - I haven't yet been able to set up the system
>>>>>>>>> tests
>>>>>>>>>>>>>>>> successfully.
>>>>>>>>>>>>>>>>>> And the integration/unit tests do have a few failures
>>>>>>>>> that I
>>>>>>>>>>>> have
>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>> spend
>>>>>>>>>>>>>>>>>> time triaging. I would appreciate any help in case anyone
>>>>>>>>>>>> notices
>>>>>>>>>>>>>> any
>>>>>>>>>>>>>>>> tests
>>>>>>>>>>>>>>>>>> failing that they're subject matters experts in. Expect
>>>>>>>>> me
>>>>>>>>>> to
>>>>>>>>>>>>>> follow
>>>>>>>>>>>>>>>> up in
>>>>>>>>>>>>>>>>>> a day or two with more detailed analysis.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Major changes include:
>>>>>>>>>>>>>>>>>> - Early Access to KIP-848 - the next generation of the
>>>>>>>>>> consumer
>>>>>>>>>>>>>>>> rebalance
>>>>>>>>>>>>>>>>>> protocol
>>>>>>>>>>>>>>>>>> - KIP-858: Adding JBOD support to KRaft
>>>>>>>>>>>>>>>>>> - KIP-714: Observability into Client metrics via a
>>>>>>>>>> standardized
>>>>>>>>>>>>>>>> interface
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Check more information in the WIP blog post:
>>>>>>>>>>>>>>>>>> https://github.com/apache/kafka-site/pull/578
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Release notes for the 3.7.0 release:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>> https://home.apache.org/~stanislavkozlovski/kafka-3.7.0-rc2/RELEASE_NOTES.html
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> *** Please download, test and vote by Thursday, January
>>>>>>>>> 18,
>>>>>>>>>> 9am
>>>>>>>>>>>>> PT
>>>>>>>>>>>>>>> ***
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Usually these deadlines tend to be 2-3 days, but due to
>>>>>>>>> this
>>>>>>>>>>>>> being
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>> first RC and the tests not having ran yet, I am giving
>>>>>>>>> it a
>>>>>>>>>> bit
>>>>>>>>>>>>>> more
>>>>>>>>>>>>>>>> time.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Kafka's KEYS file containing PGP keys we use to sign the
>>>>>>>>>>>> release:
>>>>>>>>>>>>>>>>>> https://kafka.apache.org/KEYS
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> * Release artifacts to be voted upon (source and binary):
>>>>>>>>>>>>>>>>>>
>>>>>>>>>> https://home.apache.org/~stanislavkozlovski/kafka-3.7.0-rc2/
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> * Docker release artifact to be voted upon:
>>>>>>>>>>>>>>>>>> apache/kafka:3.7.0-rc2
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> * Maven artifacts to be voted upon:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>
>>>>> https://repository.apache.org/content/groups/staging/org/apache/kafka/
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> * Javadoc:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>
>>>>> https://home.apache.org/~stanislavkozlovski/kafka-3.7.0-rc2/javadoc/
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> * Tag to be voted upon (off 3.7 branch) is the 3.7.0 tag:
>>>>>>>>>>>>>>>>>> https://github.com/apache/kafka/releases/tag/3.7.0-rc2
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> * Documentation:
>>>>>>>>>>>>>>>>>> https://kafka.apache.org/37/documentation.html
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> * Protocol:
>>>>>>>>>>>>>>>>>> https://kafka.apache.org/37/protocol.html
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> * Successful Jenkins builds for the 3.7 branch:
>>>>>>>>>>>>>>>>>> Unit/integration tests:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>> https://ci-builds.apache.org/job/Kafka/job/kafka/job/3.7/58/
>>>>>>>>>>>>>>>>>> There are failing tests here. I have to follow up with
>>>>>>>>>> triaging
>>>>>>>>>>>>>> some
>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>>> the failures and figuring out if they're actual problems
>>>>>>>>> or
>>>>>>>>>>>>> simply
>>>>>>>>>>>>>>>> flakes.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> System tests:
>>>>>>>>>>>>>>>> https://jenkins.confluent.io/job/system-test-kafka/job/3.7/
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> No successful system test runs yet. I am working on
>>>>>>>>> getting
>>>>>>>>>> the
>>>>>>>>>>>>> job
>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>> run.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> * Successful Docker Image Github Actions Pipeline for 3.7
>>>>>>>>>>>> branch:
>>>>>>>>>>>>>>>>>> Attached are the scan_report and report_jvm output files
>>>>>>>>>> from
>>>>>>>>>>>> the
>>>>>>>>>>>>>>>> Docker
>>>>>>>>>>>>>>>>>> Build run:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>
>>>>>> https://github.com/apache/kafka/actions/runs/7486094960/job/20375761673
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> And the final docker image build job - Docker Build Test
>>>>>>>>>>>>> Pipeline:
>>>>>>>>>>>>>>>>>> https://github.com/apache/kafka/actions/runs/7486178277
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The image is apache/kafka:3.7.0-rc2 -
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>> https://hub.docker.com/layers/apache/kafka/3.7.0-rc2/images/sha256-5b4707c08170d39549fbb6e2a3dbb83936a50f987c0c097f23cb26b4c210c226?context=explore
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> /**************************************
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>> Stanislav Kozlovski
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Stanislav
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Best,
>>>> Stanislav
>>>
>>
>
Just to clarify some points mentioned here before
KAFKA-14616: I raised a year ago so it's not related to JBOD work. It is rather a blocker bug for KRAFT in general. The PR from Colin should fix this. Am not sure if it is a blocker for 3.7 per-say as it was a major bug since 3.3 and got missed from all other releases.
Regarding the JBOD's work:
KAFKA-16082: Is not a blocker for 3.7 instead it's nice fix. The pr https://github.com/apache/kafka/pull/15136 is quite a small one and was approved by Proven and I but it is waiting for a committer's approval.
KAFKA-16162: This is a blocker for 3.7. Same it's a small pr https://github.com/apache/kafka/pull/15270 and it is approved Proven and I and the PR is waiting for committer's approval.
KAFKA-16157: This is a blocker for 3.7. There is one small suggestion for the pr https://github.com/apache/kafka/pull/15263 but I don't think any of the current feedback is blocking the pr from getting approved. Assuming we get a committer's approval on it.
KAFKA-16195: Same it's a blocker but it has approval from Proven and I and we are waiting for committer's approval on the pr https://github.com/apache/kafka/pull/15262.
If we can't get a committer approval for KAFKA-16162, KAFKA-16157 and KAFKA-16195 in time for 3.7 then we can mark JBOD as early release assuming we merge at least KAFKA-16195.
Regards,
Omnia
> On 26 Jan 2024, at 15:39, kafka@gnarula.com wrote:
>
> Apologies, I duplicated KAFKA-16157 twice in my previous message. I intended to mention KAFKA-16195
> with the PR at https://github.com/apache/kafka/pull/15262 as the second JIRA.
>
> Thanks,
> Gaurav
>
>> On 26 Jan 2024, at 15:34, kafka@gnarula.com wrote:
>>
>> Hi Stan,
>>
>> I wanted to share some updates about the bugs you shared earlier.
>>
>> - KAFKA-14616: I've reviewed and tested the PR from Colin and have observed
>> the fix works as intended.
>> - KAFKA-16162: I reviewed Proven's PR and found some gaps in the proposed fix. I've
>> therefore raised https://github.com/apache/kafka/pull/15270 following a discussion with Luke in JIRA.
>> - KAFKA-16082: I don't think this is marked as a blocker anymore. I'm awaiting
>> feedback/reviews at https://github.com/apache/kafka/pull/15136
>>
>> In addition to the above, there are 2 JIRAs I'd like to bring everyone's attention to:
>>
>> - KAFKA-16157: This is similar to KAFKA-14616 and is marked as a blocker. I've raised
>> https://github.com/apache/kafka/pull/15263 and am awaiting reviews on it.
>> - KAFKA-16157: I raised this yesterday and have addressed feedback from Luke. This should
>> hopefully get merged soon.
>>
>> Regards,
>> Gaurav
>>
>>
>>> On 24 Jan 2024, at 11:51, kafka@gnarula.com wrote:
>>>
>>> Hi Stanislav,
>>>
>>> Thanks for bringing these JIRAs/PRs up.
>>>
>>> I'll be testing the open PRs for KAFKA-14616 and KAFKA-16162 this week and I hope to have some feedback
>>> by Friday. I gather the latter JIRA is marked as a WIP by Proven and he's away. I'll try to build on his work in the meantime.
>>>
>>> As for KAFKA-16082, we haven't been able to deduce a data loss scenario. There's a PR open
>>> by me for promoting an abandoned future replica with approvals from Omnia and Proven,
>>> so I'd appreciate a committer reviewing it.
>>>
>>> Regards,
>>> Gaurav
>>>
>>> On 23 Jan 2024, at 20:17, Stanislav Kozlovski <stanislav@confluent.io.INVALID> wrote:
>>>>
>>>> Hey all, I figured I'd give an update about what known blockers we have
>>>> right now:
>>>>
>>>> - KAFKA-16101: KRaft migration rollback documentation is incorrect -
>>>> https://github.com/apache/kafka/pull/15193; This need not block RC
>>>> creation, but we need the docs updated so that people can test properly
>>>> - KAFKA-14616: Topic recreation with offline broker causes permanent URPs -
>>>> https://github.com/apache/kafka/pull/15230 ; I am of the understanding that
>>>> this is blocking JBOD for 3.7
>>>> - KAFKA-16162: New created topics are unavailable after upgrading to 3.7 -
>>>> a strict blocker with an open PR https://github.com/apache/kafka/pull/15232
>>>> - although I understand Proveen is out of office
>>>> - KAFKA-16082: JBOD: Possible dataloss when moving leader partition - I am
>>>> hearing mixed opinions on whether this is a blocker (
>>>> https://github.com/apache/kafka/pull/15136)
>>>>
>>>> Given that there are 3 JBOD blocker bugs, and I am not confident they will
>>>> all be merged this week - I am on the edge of voting to revert JBOD from
>>>> this release, or mark it early access.
>>>>
>>>> By all accounts, it seems that if we keep with JBOD the release will have
>>>> to spill into February, which is a month extra from the time-based release
>>>> plan we had of start of January.
>>>>
>>>> Can I ask others for an opinion?
>>>>
>>>> Best,
>>>> Stan
>>>>
>>>> On Thu, Jan 18, 2024 at 1:21 PM Luke Chen <showuon@gmail.com> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I think I've found another blocker issue: KAFKA-16162
>>>>> <https://issues.apache.org/jira/browse/KAFKA-16162> .
>>>>> The impact is after upgrading to 3.7.0, any new created topics/partitions
>>>>> will be unavailable.
>>>>> I've put my findings in the JIRA.
>>>>>
>>>>> Thanks.
>>>>> Luke
>>>>>
>>>>> On Thu, Jan 18, 2024 at 9:50 AM Matthias J. Sax <mjsax@apache.org> wrote:
>>>>>
>>>>>> Stan, thanks for driving this all forward! Excellent job.
>>>>>>
>>>>>> About
>>>>>>
>>>>>>> StreamsStandbyTask - https://issues.apache.org/jira/browse/KAFKA-16141
>>>>>>> StreamsUpgradeTest - https://issues.apache.org/jira/browse/KAFKA-16139
>>>>>>
>>>>>> For `StreamsUpgradeTest` it was a test setup issue and should be fixed
>>>>>> now in trunk and 3.7 (and actually also in 3.6...)
>>>>>>
>>>>>> For `StreamsStandbyTask` the failing test exposes a regression bug, so
>>>>>> it's a blocker. I updated the ticket accordingly. We already have an
>>>>>> open PR that reverts the code introducing the regression.
>>>>>>
>>>>>>
>>>>>> -Matthias
>>>>>>
>>>>>> On 1/17/24 9:44 AM, Proven Provenzano wrote:
>>>>>>> We have another blocking issue for the RC :
>>>>>>> https://issues.apache.org/jira/browse/KAFKA-16157. This bug is similar
>>>>>> to
>>>>>>> https://issues.apache.org/jira/browse/KAFKA-14616. The new issue
>>>>> however
>>>>>>> can lead to the new topic having partitions that a producer cannot
>>>>> write
>>>>>> to.
>>>>>>>
>>>>>>> --Proven
>>>>>>>
>>>>>>> On Tue, Jan 16, 2024 at 12:04 PM Proven Provenzano <
>>>>>> pprovenzano@confluent.io>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> I have a PR https://github.com/apache/kafka/pull/15197 for
>>>>>>>> https://issues.apache.org/jira/browse/KAFKA-16131 that is building
>>>>> now.
>>>>>>>> --Proven
>>>>>>>>
>>>>>>>> On Mon, Jan 15, 2024 at 5:03 AM Jakub Scholz <jakub@scholz.cz> wrote:
>>>>>>>>
>>>>>>>>> *> Hi Jakub,> > Thanks for trying the RC. I think what you found is a
>>>>>>>>> blocker bug because it *
>>>>>>>>> *> will generate huge amount of logspam. I guess we didn't find it in
>>>>>>>>> junit
>>>>>>>>> tests *
>>>>>>>>> *> since logspam doesn't fail the automated tests. But certainly it's
>>>>>> not
>>>>>>>>> suitable *
>>>>>>>>> *> for production. Did you file a JIRA yet?*
>>>>>>>>>
>>>>>>>>> Hi Colin,
>>>>>>>>>
>>>>>>>>> I opened https://issues.apache.org/jira/browse/KAFKA-16131.
>>>>>>>>>
>>>>>>>>> Thanks & Regards
>>>>>>>>> Jakub
>>>>>>>>>
>>>>>>>>> On Mon, Jan 15, 2024 at 8:57 AM Colin McCabe <cmccabe@apache.org>
>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Stanislav,
>>>>>>>>>>
>>>>>>>>>> Thanks for making the first RC. The fact that it's titled RC2 is
>>>>>> messing
>>>>>>>>>> with my mind a bit. I hope this doesn't make people think that we're
>>>>>>>>>> farther along than we are, heh.
>>>>>>>>>>
>>>>>>>>>> On Sun, Jan 14, 2024, at 13:54, Jakub Scholz wrote:
>>>>>>>>>>> *> Nice catch! It does seem like we should have gated this behind
>>>>> the
>>>>>>>>>>> metadata> version as KIP-858 implies. Is the cluster configured
>>>>> with
>>>>>>>>>>> multiple log> dirs? What is the impact of the error messages?*
>>>>>>>>>>>
>>>>>>>>>>> I did not observe any obvious impact. I was able to send and
>>>>> receive
>>>>>>>>>>> messages as normally. But to be honest, I have no idea what else
>>>>>>>>>>> this might impact, so I did not try anything special.
>>>>>>>>>>>
>>>>>>>>>>> I think everyone upgrading an existing KRaft cluster will go
>>>>> through
>>>>>>>>> this
>>>>>>>>>>> stage (running Kafka 3.7 with an older metadata version for at
>>>>> least
>>>>>> a
>>>>>>>>>>> while). So even if it is just a logged exception without any other
>>>>>>>>>> impact I
>>>>>>>>>>> wonder if it might scare users from upgrading. But I leave it to
>>>>>>>>> others
>>>>>>>>>> to
>>>>>>>>>>> decide if this is a blocker or not.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hi Jakub,
>>>>>>>>>>
>>>>>>>>>> Thanks for trying the RC. I think what you found is a blocker bug
>>>>>>>>> because
>>>>>>>>>> it will generate huge amount of logspam. I guess we didn't find it
>>>>> in
>>>>>>>>> junit
>>>>>>>>>> tests since logspam doesn't fail the automated tests. But certainly
>>>>>> it's
>>>>>>>>>> not suitable for production. Did you file a JIRA yet?
>>>>>>>>>>
>>>>>>>>>>> On Sun, Jan 14, 2024 at 10:17 PM Stanislav Kozlovski
>>>>>>>>>>> <stanislav@confluent.io.invalid> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hey Luke,
>>>>>>>>>>>>
>>>>>>>>>>>> This is an interesting problem. Given the fact that the KIP for
>>>>>>>>> having a
>>>>>>>>>>>> 3.8 release passed, I think it weights the scale towards not
>>>>> calling
>>>>>>>>>> this a
>>>>>>>>>>>> blocker and expecting it to be solved in 3.7.1.
>>>>>>>>>>>>
>>>>>>>>>>>> It is unfortunate that it would not seem safe to migrate to KRaft
>>>>> in
>>>>>>>>>> 3.7.0
>>>>>>>>>>>> (given the inability to rollback safely), but if that's true - the
>>>>>>>>> same
>>>>>>>>>>>> case would apply for 3.6.0. So in any case users w\ould be
>>>>> expected
>>>>>>>>> to
>>>>>>>>>> use a
>>>>>>>>>>>> patch release for this.
>>>>>>>>>>
>>>>>>>>>> Hi Luke,
>>>>>>>>>>
>>>>>>>>>> Thanks for testing rollback. I think this is a case where the
>>>>>>>>>> documentation is wrong. The intention was to for the steps to
>>>>>> basically
>>>>>>>>> be:
>>>>>>>>>>
>>>>>>>>>> 1. roll all the brokers into zk mode, but with migration enabled
>>>>>>>>>> 2. take down the kraft quorum
>>>>>>>>>> 3. rmr /controller, allowing a hybrid broker to take over.
>>>>>>>>>> 4. roll all the brokers into zk mode without migration enabled (if
>>>>>>>>> desired)
>>>>>>>>>>
>>>>>>>>>> With these steps, there isn't really unavailability since a ZK
>>>>>>>>> controller
>>>>>>>>>> can be elected quickly after the kraft quorum is gone.
>>>>>>>>>>
>>>>>>>>>>>> Further, since we will have a 3.8 release - it is
>>>>>>>>>>>> likely we will ultimately recommend users upgrade from that
>>>>> version
>>>>>>>>>> given
>>>>>>>>>>>> its aim is to have strategic KRaft feature parity with ZK.
>>>>>>>>>>>> That being said, I am not 100% on this. Let me know whether you
>>>>>> think
>>>>>>>>>> this
>>>>>>>>>>>> should block the release, Luke. I am also tagging Colin and David
>>>>> to
>>>>>>>>>> weigh
>>>>>>>>>>>> in with their opinions, as they worked on the migration logic.
>>>>>>>>>>
>>>>>>>>>> The rollback docs are new in 3.7 so the fact that they're wrong is a
>>>>>>>>> clear
>>>>>>>>>> blocker, I think. But easy to fix, I believe. I will create a PR.
>>>>>>>>>>
>>>>>>>>>> best,
>>>>>>>>>> Colin
>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Hey Kirk and Chris,
>>>>>>>>>>>>
>>>>>>>>>>>> Unless I'm missing something - KAFKALESS-16029 is simply a bad log
>>>>>>>>> due
>>>>>>>>>> to
>>>>>>>>>>>> improper closing. And the PR description implies this has been
>>>>>>>>> present
>>>>>>>>>>>> since 3.5. While annoying, I don't see a strong reason for this to
>>>>>>>>> block
>>>>>>>>>>>> the release.
>>>>>>>>>>>>
>>>>>>>>>>>> Hey Jakub,
>>>>>>>>>>>>
>>>>>>>>>>>> Nice catch! It does seem like we should have gated this behind the
>>>>>>>>>> metadata
>>>>>>>>>>>> version as KIP-858 implies. Is the cluster configured with
>>>>> multiple
>>>>>>>>> log
>>>>>>>>>>>> dirs? What is the impact of the error messages?
>>>>>>>>>>>>
>>>>>>>>>>>> Tagging Igor (the author of the KIP) to weigh in.
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Stanislav
>>>>>>>>>>>>
>>>>>>>>>>>> On Sat, Jan 13, 2024 at 7:22 PM Jakub Scholz <jakub@scholz.cz>
>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I was trying the RC2 and run into the following issue ... when I
>>>>>>>>> run
>>>>>>>>>>>>> 3.7.0-RC2 KRaft cluster with metadata version set to 3.6-IV2
>>>>>>>>> metadata
>>>>>>>>>>>>> version, I seem to be getting repeated errors like this in the
>>>>>>>>>> controller
>>>>>>>>>>>>> logs:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2024-01-13 16:58:01,197 INFO [QuorumController id=0]
>>>>>>>>>>>> assignReplicasToDirs:
>>>>>>>>>>>>> event failed with UnsupportedVersionException in 15 microseconds.
>>>>>>>>>>>>> (org.apache.kafka.controller.QuorumController)
>>>>>>>>>>>>> [quorum-controller-0-event-handler]
>>>>>>>>>>>>> 2024-01-13 16:58:01,197 ERROR [ControllerApis nodeId=0]
>>>>> Unexpected
>>>>>>>>>> error
>>>>>>>>>>>>> handling request RequestHeader(apiKey=ASSIGN_REPLICAS_TO_DIRS,
>>>>>>>>>>>>> apiVersion=0, clientId=1000, correlationId=14, headerVersion=2)
>>>>> --
>>>>>>>>>>>>> AssignReplicasToDirsRequestData(brokerId=1000, brokerEpoch=5,
>>>>>>>>>>>>> directories=[DirectoryData(id=w_uxN7pwQ6eXSMrOKceYIQ,
>>>>>>>>>>>>> topics=[TopicData(topicId=bvAKLSwmR7iJoKv2yZgygQ,
>>>>>>>>>>>>> partitions=[PartitionData(partitionIndex=2),
>>>>>>>>>>>>> PartitionData(partitionIndex=1)]),
>>>>>>>>>>>>> TopicData(topicId=uNe7f5VrQgO0zST6yH1jDQ,
>>>>>>>>>>>>> partitions=[PartitionData(partitionIndex=0)])])]) with context
>>>>>>>>>>>>>
>>>>> RequestContext(header=RequestHeader(apiKey=ASSIGN_REPLICAS_TO_DIRS,
>>>>>>>>>>>>> apiVersion=0, clientId=1000, correlationId=14, headerVersion=2),
>>>>>>>>>>>>> connectionId='172.16.14.219:9090-172.16.14.217:53590-7',
>>>>>>>>>> clientAddress=/
>>>>>>>>>>>>> 172.16.14.217, principal=User:CN=my-cluster-kafka,O=io.strimzi,
>>>>>>>>>>>>> listenerName=ListenerName(CONTROLPLANE-9090),
>>>>> securityProtocol=SSL,
>>>>>>>>>>>>>
>>>>> clientInformation=ClientInformation(softwareName=apache-kafka-java,
>>>>>>>>>>>>> softwareVersion=3.7.0), fromPrivilegedListener=false,
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>> principalSerde=Optional[org.apache.kafka.common.security.authenticator.DefaultKafkaPrincipalBuilder@71004ad2
>>>>>>>>>>>>> ])
>>>>>>>>>>>>> (kafka.server.ControllerApis) [quorum-controller-0-event-handler]
>>>>>>>>>>>>> java.util.concurrent.CompletionException:
>>>>>>>>>>>>> org.apache.kafka.common.errors.UnsupportedVersionException:
>>>>>>>>> Directory
>>>>>>>>>>>>> assignment is not supported yet.
>>>>>>>>>>>>>
>>>>>>>>>>>>> at
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>> java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:332)
>>>>>>>>>>>>> at
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>> java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:347)
>>>>>>>>>>>>> at
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>> java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:636)
>>>>>>>>>>>>> at
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>> java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
>>>>>>>>>>>>> at
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>> java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2162)
>>>>>>>>>>>>> at
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>> org.apache.kafka.controller.QuorumController$ControllerWriteEvent.complete(QuorumController.java:880)
>>>>>>>>>>>>> at
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>> org.apache.kafka.controller.QuorumController$ControllerWriteEvent.handleException(QuorumController.java:871)
>>>>>>>>>>>>> at
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>> org.apache.kafka.queue.KafkaEventQueue$EventContext.completeWithException(KafkaEventQueue.java:148)
>>>>>>>>>>>>> at
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>> org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:137)
>>>>>>>>>>>>> at
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>> org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:210)
>>>>>>>>>>>>> at
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>> org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:181)
>>>>>>>>>>>>> at java.base/java.lang.Thread.run(Thread.java:840)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Caused by:
>>>>>>>>> org.apache.kafka.common.errors.UnsupportedVersionException:
>>>>>>>>>>>>> Directory assignment is not supported yet.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Is that expected? I guess with the metadata version set to
>>>>>>>>> 3.6-IV2, it
>>>>>>>>>>>>> makes sense that the request is not supported. But shouldn't then
>>>>>>>>> the
>>>>>>>>>>>>> request not be sent at all by the brokers? (I did not opened a
>>>>> JIRA
>>>>>>>>>> for
>>>>>>>>>>>> it,
>>>>>>>>>>>>> but I can open one if you agree this is not expected)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks & Regards
>>>>>>>>>>>>> Jakub
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sat, Jan 13, 2024 at 8:03 AM Luke Chen <showuon@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Stanislav,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I commented in the "Apache Kafka 3.7.0 Release" thread, but
>>>>> maybe
>>>>>>>>>> you
>>>>>>>>>>>>>> missed it.
>>>>>>>>>>>>>> cross-posting here:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> There is a bug KAFKA-16101
>>>>>>>>>>>>>> <https://issues.apache.org/jira/browse/KAFKA-16101> reporting
>>>>>>>>> that
>>>>>>>>>>>>> "Kafka
>>>>>>>>>>>>>> cluster will be unavailable during KRaft migration rollback".
>>>>>>>>>>>>>> The impact for this issue is that if brokers try to rollback to
>>>>>>>>> ZK
>>>>>>>>>> mode
>>>>>>>>>>>>>> during KRaft migration process, there will be a period of time
>>>>>>>>> the
>>>>>>>>>>>>> cluster
>>>>>>>>>>>>>> is unavailable.
>>>>>>>>>>>>>> Since ZK migrating to KRaft feature is a production ready
>>>>>>>>> feature, I
>>>>>>>>>>>>> think
>>>>>>>>>>>>>> this should be addressed soon.
>>>>>>>>>>>>>> Do you think this is a blocker for v3.7.0?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks.
>>>>>>>>>>>>>> Luke
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sat, Jan 13, 2024 at 8:36 AM Chris Egerton <
>>>>>>>>>> fearthecellos@gmail.com
>>>>>>>>>>>>>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks, Kirk!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> @Stanislav--do you believe that this warrants a new RC?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Jan 12, 2024, 19:08 Kirk True <kirk@kirktrue.pro>
>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi Chris/Stanislav,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'm working on the 'Unable to find FetchSessionHandler' log
>>>>>>>>>> problem
>>>>>>>>>>>>>>>> (KAFKA-16029) and have put out a draft PR (
>>>>>>>>>>>>>>>> https://github.com/apache/kafka/pull/15186). I will use the
>>>>>>>>>>>>> quickstart
>>>>>>>>>>>>>>>> approach as a second means to reproduce/verify while I wait
>>>>>>>>> for
>>>>>>>>>> the
>>>>>>>>>>>>>> PR's
>>>>>>>>>>>>>>>> Jenkins job to finish.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Kirk
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, Jan 12, 2024, at 11:31 AM, Chris Egerton wrote:
>>>>>>>>>>>>>>>>> Hi Stanislav,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks for running this release!
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> To verify, I:
>>>>>>>>>>>>>>>>> - Built from source using Java 11 with both:
>>>>>>>>>>>>>>>>> - - the 3.7.0-rc2 tag on GitHub
>>>>>>>>>>>>>>>>> - - the kafka-3.7.0-src.tgz artifact from
>>>>>>>>>>>>>>>>>
>>>>>>>>> https://home.apache.org/~stanislavkozlovski/kafka-3.7.0-rc2/
>>>>>>>>>>>>>>>>> - Checked signatures and checksums
>>>>>>>>>>>>>>>>> - Ran the quickstart using both:
>>>>>>>>>>>>>>>>> - - The kafka_2.13-3.7.0.tgz artifact from
>>>>>>>>>>>>>>>>>
>>>>>>>>> https://home.apache.org/~stanislavkozlovski/kafka-3.7.0-rc2/
>>>>>>>>>>>> with
>>>>>>>>>>>>>> Java
>>>>>>>>>>>>>>>> 11
>>>>>>>>>>>>>>>>> and Scala 13 in KRaft mode
>>>>>>>>>>>>>>>>> - - Our shiny new broker Docker image,
>>>>>>>>> apache/kafka:3.7.0-rc2
>>>>>>>>>>>>>>>>> - Ran all unit tests
>>>>>>>>>>>>>>>>> - Ran all integration tests for Connect and MM2
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I found two minor areas for concern:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 1. (Possibly a blocker)
>>>>>>>>>>>>>>>>> When running the quickstart, I noticed this ERROR-level log
>>>>>>>>>>>> message
>>>>>>>>>>>>>>> being
>>>>>>>>>>>>>>>>> emitted frequently (not not every time) when I killed my
>>>>>>>>>> console
>>>>>>>>>>>>>>> consumer
>>>>>>>>>>>>>>>>> via ctrl-C:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> [2024-01-12 11:00:31,088] ERROR [Consumer
>>>>>>>>>>>>>> clientId=console-consumer,
>>>>>>>>>>>>>>>>> groupId=console-consumer-74388] Unable to find
>>>>>>>>>>>> FetchSessionHandler
>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>> node
>>>>>>>>>>>>>>>>> 1. Ignoring fetch response
>>>>>>>>>>>>>>>>> (org.apache.kafka.clients.consumer.internals.AbstractFetch)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I see that this error message is already reported in
>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/KAFKA-16029. I
>>>>>>>>> think we
>>>>>>>>>>>>> should
>>>>>>>>>>>>>>>>> prioritize fixing it for this release. I know it's probably
>>>>>>>>>>>> benign
>>>>>>>>>>>>>> but
>>>>>>>>>>>>>>>> it's
>>>>>>>>>>>>>>>>> really not a good look for us when basic operations log
>>>>>>>>> error
>>>>>>>>>>>>>> messages,
>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>> it may give new users some headaches.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 2. (Probably not a blocker)
>>>>>>>>>>>>>>>>> The following unit tests failed the first time around, and
>>>>>>>>>> all of
>>>>>>>>>>>>>> them
>>>>>>>>>>>>>>>>> passed the second time I ran them:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> - (clients)
>>>>>>>>>>>>>>>>
>>>>>>>>> ClientUtilsTest.testParseAndValidateAddressesWithReverseLookup()
>>>>>>>>>>>>>>>>> - (clients) SelectorTest.testConnectionsByClientMetric()
>>>>>>>>>>>>>>>>> - (clients)
>>>>>>>>> Tls13SelectorTest.testConnectionsByClientMetric()
>>>>>>>>>>>>>>>>> - (connect)
>>>>>>>>>>>>>> TopicAdminTest.retryEndOffsetsShouldRetryWhenTopicNotFound
>>>>>>>>>>>>>>> (I
>>>>>>>>>>>>>>>>> thought I fixed this one! 🤬🤬)
>>>>>>>>>>>>>>>>> - (core)
>>>>>>>>>> ProducerIdManagerTest.testUnrecoverableErrors(Errors)[2]
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks again for your work on this release, and
>>>>>>>>>> congratulations
>>>>>>>>>>>> to
>>>>>>>>>>>>>>> Kafka
>>>>>>>>>>>>>>>>> Streams for having zero flaky unit tests during my
>>>>>>>>>>>>>> highly-experimental
>>>>>>>>>>>>>>>>> single laptop run!
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Thu, Jan 11, 2024 at 1:33 PM Stanislav Kozlovski
>>>>>>>>>>>>>>>>> <stanislav@confluent.io.invalid> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hello Kafka users, developers, and client-developers,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> This is the first candidate for release of Apache Kafka
>>>>>>>>>> 3.7.0.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Note it's named "RC2" because I had a few "failed" RCs
>>>>>>>>> that
>>>>>>>>>> I
>>>>>>>>>>>> had
>>>>>>>>>>>>>>>>>> cut/uploaded but ultimately had to scrap prior to
>>>>>>>>> announcing
>>>>>>>>>>>> due
>>>>>>>>>>>>> to
>>>>>>>>>>>>>>> new
>>>>>>>>>>>>>>>>>> blockers arriving before I could even announce them.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Further - I haven't yet been able to set up the system
>>>>>>>>> tests
>>>>>>>>>>>>>>>> successfully.
>>>>>>>>>>>>>>>>>> And the integration/unit tests do have a few failures
>>>>>>>>> that I
>>>>>>>>>>>> have
>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>> spend
>>>>>>>>>>>>>>>>>> time triaging. I would appreciate any help in case anyone
>>>>>>>>>>>> notices
>>>>>>>>>>>>>> any
>>>>>>>>>>>>>>>> tests
>>>>>>>>>>>>>>>>>> failing that they're subject matters experts in. Expect
>>>>>>>>> me
>>>>>>>>>> to
>>>>>>>>>>>>>> follow
>>>>>>>>>>>>>>>> up in
>>>>>>>>>>>>>>>>>> a day or two with more detailed analysis.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Major changes include:
>>>>>>>>>>>>>>>>>> - Early Access to KIP-848 - the next generation of the
>>>>>>>>>> consumer
>>>>>>>>>>>>>>>> rebalance
>>>>>>>>>>>>>>>>>> protocol
>>>>>>>>>>>>>>>>>> - KIP-858: Adding JBOD support to KRaft
>>>>>>>>>>>>>>>>>> - KIP-714: Observability into Client metrics via a
>>>>>>>>>> standardized
>>>>>>>>>>>>>>>> interface
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Check more information in the WIP blog post:
>>>>>>>>>>>>>>>>>> https://github.com/apache/kafka-site/pull/578
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Release notes for the 3.7.0 release:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>> https://home.apache.org/~stanislavkozlovski/kafka-3.7.0-rc2/RELEASE_NOTES.html
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> *** Please download, test and vote by Thursday, January
>>>>>>>>> 18,
>>>>>>>>>> 9am
>>>>>>>>>>>>> PT
>>>>>>>>>>>>>>> ***
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Usually these deadlines tend to be 2-3 days, but due to
>>>>>>>>> this
>>>>>>>>>>>>> being
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>> first RC and the tests not having ran yet, I am giving
>>>>>>>>> it a
>>>>>>>>>> bit
>>>>>>>>>>>>>> more
>>>>>>>>>>>>>>>> time.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Kafka's KEYS file containing PGP keys we use to sign the
>>>>>>>>>>>> release:
>>>>>>>>>>>>>>>>>> https://kafka.apache.org/KEYS
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> * Release artifacts to be voted upon (source and binary):
>>>>>>>>>>>>>>>>>>
>>>>>>>>>> https://home.apache.org/~stanislavkozlovski/kafka-3.7.0-rc2/
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> * Docker release artifact to be voted upon:
>>>>>>>>>>>>>>>>>> apache/kafka:3.7.0-rc2
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> * Maven artifacts to be voted upon:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>
>>>>> https://repository.apache.org/content/groups/staging/org/apache/kafka/
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> * Javadoc:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>
>>>>> https://home.apache.org/~stanislavkozlovski/kafka-3.7.0-rc2/javadoc/
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> * Tag to be voted upon (off 3.7 branch) is the 3.7.0 tag:
>>>>>>>>>>>>>>>>>> https://github.com/apache/kafka/releases/tag/3.7.0-rc2
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> * Documentation:
>>>>>>>>>>>>>>>>>> https://kafka.apache.org/37/documentation.html
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> * Protocol:
>>>>>>>>>>>>>>>>>> https://kafka.apache.org/37/protocol.html
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> * Successful Jenkins builds for the 3.7 branch:
>>>>>>>>>>>>>>>>>> Unit/integration tests:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>> https://ci-builds.apache.org/job/Kafka/job/kafka/job/3.7/58/
>>>>>>>>>>>>>>>>>> There are failing tests here. I have to follow up with
>>>>>>>>>> triaging
>>>>>>>>>>>>>> some
>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>>> the failures and figuring out if they're actual problems
>>>>>>>>> or
>>>>>>>>>>>>> simply
>>>>>>>>>>>>>>>> flakes.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> System tests:
>>>>>>>>>>>>>>>> https://jenkins.confluent.io/job/system-test-kafka/job/3.7/
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> No successful system test runs yet. I am working on
>>>>>>>>> getting
>>>>>>>>>> the
>>>>>>>>>>>>> job
>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>> run.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> * Successful Docker Image Github Actions Pipeline for 3.7
>>>>>>>>>>>> branch:
>>>>>>>>>>>>>>>>>> Attached are the scan_report and report_jvm output files
>>>>>>>>>> from
>>>>>>>>>>>> the
>>>>>>>>>>>>>>>> Docker
>>>>>>>>>>>>>>>>>> Build run:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>
>>>>>> https://github.com/apache/kafka/actions/runs/7486094960/job/20375761673
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> And the final docker image build job - Docker Build Test
>>>>>>>>>>>>> Pipeline:
>>>>>>>>>>>>>>>>>> https://github.com/apache/kafka/actions/runs/7486178277
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The image is apache/kafka:3.7.0-rc2 -
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>> https://hub.docker.com/layers/apache/kafka/3.7.0-rc2/images/sha256-5b4707c08170d39549fbb6e2a3dbb83936a50f987c0c097f23cb26b4c210c226?context=explore
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> /**************************************
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>> Stanislav Kozlovski
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Stanislav
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Best,
>>>> Stanislav
>>>
>>
>
Comments
Post a Comment