
Re: Kafka connect timestamp header serialization issue of Midnight UTC Timestamps

Hi Vinayak,

Thanks for your report. Yes, it is definitely an issue. I have opened
https://issues.apache.org/jira/browse/KAFKA-20752 and my team is going to fix it.
If you have free cycles, please feel free to take it over.

Best,
Chia-Ping

On 2026/01/02 13:59:05 Vinayak Gaikwad wrote:
> Hi team,
>
> Can anyone provide any suggestions on this issue?
>
> On Sun, Nov 30, 2025 at 12:29 PM Vinayak Gaikwad <
> gaikwadvinayak291@gmail.com> wrote:
>
> > Hi,
> > I am encountering an issue with how timestamp headers are serialized in
> > Kafka Connect and would appreciate clarification on whether this is the
> > intended behavior or a bug.
> >
> > Problem: Incorrect Serialization of Midnight UTC Timestamps
> > When using connectors that produce timestamp headers (e.g., the MongoDB
> > source connector), timestamps that fall exactly at midnight UTC are being
> > serialized incorrectly, losing their time component. Timestamps at any
> > other time of day are serialized correctly.
> > Example:
> > - Input Timestamp: 2025-01-01T00:00:00.000Z (midnight UTC)
> > - Expected Serialized Value: "2025-01-01T00:00:00.000Z"
> > - Actual Serialized Value: "2025-01-01" (time component lost)
> >
> > Analysis
> > I traced this to the org.apache.kafka.connect.data.Values.dateFormatFor()
> > <https://github.com/apache/kafka/blob/0e1c6fb6bb503aeda27ce1d73cd827b7a227d769/connect/api/src/main/java/org/apache/kafka/connect/data/Values.java#L769-L777> which
> > determines the format based on the millisecond value:
> >
> > // Simplified code for reference
> > if (value.getTime() % MILLIS_PER_DAY == 0) {
> >     return DATE_FORMAT; // "yyyy-MM-dd"
> > } else {
> >     return TIMESTAMP_FORMAT; // "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'"
> > }
> >
> > For midnight UTC timestamps, the condition value.getTime() %
> > MILLIS_PER_DAY == 0 is true, causing the method to return the DATE format,
> > even if the value's schema is a Timestamp logical type. The issue is that
> > dateFormatFor() is not schema-aware and makes an assumption based solely on
> > the millisecond value.
> > I have created a test case that reproduces and confirms this failure.
> >
> > Questions:
> > 1. Is this behavior intended? Should timestamps at midnight UTC be
> > formatted differently than other timestamps?
> > 2. Should the logical type be respected? When a value has
> > Timestamp.SCHEMA, should it always be formatted with the full timestamp
> > format (yyyy-MM-dd'T'HH:mm:ss.SSS'Z') regardless of the millisecond value?
> >
> > Environment
> > - Kafka Version: 4.1.1
> > - Component: connect-api module (org.apache.kafka.connect.data.Values)
> >
> > I would appreciate any guidance on whether this is expected behavior or if
> > I should file a JIRA issue. I am happy to help with a patch if needed.
> > Thank you for your time and for maintaining Kafka.
> >
> > Best regards,
> > Vinayak Gaikwad.
> >
> >
> --
> Thanks,
> Vinayak Gaikwad.
>
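
For readers who want to see the reported behavior without a Connect cluster, here is a minimal standalone sketch of the value-based heuristic described in the quoted message. It does not call Kafka at all: the class MidnightTimestampSketch and the methods formatByValue, formatAsTimestamp, and utcFormat are hypothetical names made up for illustration, and formatAsTimestamp is only one possible schema-aware alternative, not the actual patch for KAFKA-20752.

```java
import java.text.SimpleDateFormat;
import java.time.Instant;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical illustration class; not part of Kafka Connect.
public class MidnightTimestampSketch {
    static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;
    static final String DATE_PATTERN = "yyyy-MM-dd";
    static final String TIMESTAMP_PATTERN = "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'";

    // Builds a formatter pinned to UTC so output does not depend on the local zone.
    static SimpleDateFormat utcFormat(String pattern) {
        SimpleDateFormat format = new SimpleDateFormat(pattern);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        return format;
    }

    // Mirrors the simplified heuristic from the report: the pattern is chosen
    // from the millisecond value alone, with no knowledge of the schema.
    static String formatByValue(Date value) {
        String pattern = (value.getTime() % MILLIS_PER_DAY == 0)
                ? DATE_PATTERN
                : TIMESTAMP_PATTERN;
        return utcFormat(pattern).format(value);
    }

    // Assumed alternative: when the caller knows the value is a Timestamp,
    // always use the full timestamp pattern.
    static String formatAsTimestamp(Date value) {
        return utcFormat(TIMESTAMP_PATTERN).format(value);
    }

    public static void main(String[] args) {
        Date midnight = Date.from(Instant.parse("2025-01-01T00:00:00.000Z"));
        Date afternoon = Date.from(Instant.parse("2025-01-01T12:34:56.789Z"));

        System.out.println(formatByValue(midnight));     // 2025-01-01
        System.out.println(formatByValue(afternoon));    // 2025-01-01T12:34:56.789Z
        System.out.println(formatAsTimestamp(midnight)); // 2025-01-01T00:00:00.000Z
    }
}
```

The first line of output reproduces the "Actual Serialized Value" from the report, and the last line shows the "Expected Serialized Value" once the format no longer depends on the millisecond value.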
