On 2023-04-22 11:23, David Arthur wrote:
> Akshay, this looks a lot like
> https://issues.apache.org/jira/browse/KAFKA-14035 which was fixed for
> 3.3.0. Can you upload complete controller logs to that JIRA (or a new
> one if you prefer)?
>
> Thanks!
> David
>
> On Sat, Apr 22, 2023 at 2:54 AM Luke Chen <showuon@gmail.com> wrote:
>
>> Hi Akshay,
>>
>> Thanks for reporting the issue.
>> It looks like a bug.
>> Could you open a JIRA <https://issues.apache.org/jira/browse/KAFKA>
>> ticket to track it?
>>
>> Thank you.
>> Luke
>>
>>
>> On Fri, Apr 21, 2023 at 10:16 PM Akshay Kumar <akshaykumar@ameyo.com.invalid> wrote:
>>
>> > Hello team,
>> >
>> > - We are using ZooKeeper-less Kafka (KRaft mode).
>> > - The cluster has 3 nodes.
>> > - One of the nodes shuts down on its own at random times.
>> > - We checked the logs but could not find the exact cause.
>> > - The logs are shared below. Kafka version: 3.3.1
>> >
>> > *Logs - *
>> >
>> > [2023-04-13 01:49:17,411] WARN [Controller 1] Renouncing the leadership due to a metadata log event. We were the leader at epoch 37110, but in the new epoch 37111, the leader is (none). Reverting to last committed offset 28291464. (org.apache.kafka.controller.QuorumController)
>> > [2023-04-13 01:49:17,531] INFO [RaftManager nodeId=1] Completed transition to Unattached(epoch=37112, voters=[1, 2, 3], electionTimeoutMs=982) (org.apache.kafka.raft.QuorumState)
>> >
>> > [2023-04-13 02:00:33,902] WARN [Controller 1] Renouncing the leadership due to a metadata log event. We were the leader at epoch 37116, but in the new epoch 37117, the leader is (none). Reverting to last committed offset 28292807. (org.apache.kafka.controller.QuorumController)
>> > [2023-04-13 02:00:33,936] INFO [RaftManager nodeId=1] Completed transition to Unattached(epoch=37118, voters=[1, 2, 3], electionTimeoutMs=1497) (org.apache.kafka.raft.QuorumState)
>> >
>> > [2023-04-13 02:00:35,014] ERROR [Controller 1] processBrokerHeartbeat: unable to start processing because of NotControllerException. (org.apache.kafka.controller.QuorumController)
>> >
>> > [2023-04-13 02:12:21,883] WARN [Controller 1] Renouncing the leadership due to a metadata log event. We were the leader at epoch 37129, but in the new epoch 37131, the leader is (none). Reverting to last committed offset 28294206. (org.apache.kafka.controller.QuorumController)
>> >
>> > [2023-04-13 02:13:41,328] WARN [Controller 1] Renouncing the leadership due to a metadata log event. We were the leader at epoch 37141, but in the new epoch 37142, the leader is (none). Reverting to last committed offset 28294325. (org.apache.kafka.controller.QuorumController)
>> >
>> > [2023-04-13 02:13:41,328] INFO [Controller 1] writeNoOpRecord: failed with NotControllerException in 16561838 us (org.apache.kafka.controller.QuorumController)
>> >
>> > [2023-04-13 02:13:41,328] INFO [Controller 1] maybeFenceReplicas: failed with NotControllerException in 8520846 us (org.apache.kafka.controller.QuorumController)
>> >
>> > [2023-04-13 02:13:41,328] INFO [BrokerToControllerChannelManager broker=1 name=heartbeat] Client requested disconnect from node 1 (org.apache.kafka.clients.NetworkClient)
>> > [2023-04-13 02:13:41,329] INFO [BrokerLifecycleManager id=1] Unable to send a heartbeat because the RPC got timed out before it could be sent. (kafka.server.BrokerLifecycleManager)
>> > [2023-04-13 02:13:41,351] ERROR Encountered fatal fault: exception while renouncing leadership (org.apache.kafka.server.fault.ProcessExitingFaultHandler)
>> > java.lang.NullPointerException
>> >     at org.apache.kafka.timeline.SnapshottableHashTable$HashTier.mergeFrom(SnapshottableHashTable.java:125)
>> >     at org.apache.kafka.timeline.Snapshot.mergeFrom(Snapshot.java:68)
>> >     at org.apache.kafka.timeline.SnapshotRegistry.deleteSnapshot(SnapshotRegistry.java:236)
>> >     at org.apache.kafka.timeline.SnapshotRegistry$SnapshotIterator.remove(SnapshotRegistry.java:67)
>> >     at org.apache.kafka.timeline.SnapshotRegistry.revertToSnapshot(SnapshotRegistry.java:214)
>> >     at org.apache.kafka.controller.QuorumController.renounce(QuorumController.java:1232)
>> >     at org.apache.kafka.controller.QuorumController.access$3300(QuorumController.java:150)
>> >     at org.apache.kafka.controller.QuorumController$QuorumMetaLogListener.lambda$handleLeaderChange$3(QuorumController.java:1076)
>> >     at org.apache.kafka.controller.QuorumController$QuorumMetaLogListener.lambda$appendRaftEvent$4(QuorumController.java:1101)
>> >     at org.apache.kafka.controller.QuorumController$ControlEvent.run(QuorumController.java:496)
>> >     at org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:121)
>> >     at org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:200)
>> >     at org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:173)
>> >     at java.lang.Thread.run(Thread.java:750)
>> > [2023-04-13 02:13:41,385] INFO [BrokerServer id=1] Transition from STARTED to SHUTTING_DOWN (kafka.server.BrokerServer)
>> >
>> >
>> >
>> > Regards,
>> > *Akshay Kumar*
>> >
>>
Hi,
New information.
Kind regards,