
Transactional markers are not deleted from log segments when policy is compact

Hi all,

We use Kafka version 2.11-1.1.1 (Kafka 1.1.1, Scala 2.11 build). We produce
and consume transactional messages, and we recently noticed that 2 partitions
of the __consumer_offsets topic have very high disk usage (256GB).
When we looked at the log segments for these 2 partitions, there were files
that were 6 months old.
By dumping the content of an old log segment using the following command

kafka-run-class.sh kafka.tools.DumpLogSegments --deep-iteration \
  --print-data-log --files 00000000003949894887.log | less


we found that all the records were COMMIT transaction markers.

offset: 1924582627 position: 183 CreateTime: 1548972578376 isvalid: true keysize: 4 valuesize: 6 magic: 2 compresscodec: NONE producerId: 126015 producerEpoch: 0 sequence: -1 isTransactional: true headerKeys: [] endTxnMarker: COMMIT coordinatorEpoch: 28
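
For anyone who wants to check a record like this quickly, here is a rough Python sketch that splits the dumped line into fields and converts CreateTime to a UTC date. The helper names (parse_record, ms_to_utc) are made up for illustration and are not part of any Kafka tool; the regex simply assumes the "key: value" layout seen in the line above.

```python
import re
from datetime import datetime, timedelta, timezone

# Record line copied from the dump output above, joined onto one line.
RECORD = (
    "offset: 1924582627 position: 183 CreateTime: 1548972578376 isvalid: "
    "true keysize: 4 valuesize: 6 magic: 2 compresscodec: NONE producerId: "
    "126015 producerEpoch: 0 sequence: -1 isTransactional: true headerKeys: "
    "[] endTxnMarker: COMMIT coordinatorEpoch: 28"
)


def parse_record(line):
    """Split a 'key: value key: value ...' line into a dict of strings.

    Hypothetical helper: assumes no value contains whitespace.
    """
    return dict(re.findall(r"(\w+): (\S+)", line))


def ms_to_utc(ms):
    """Convert epoch milliseconds to an aware UTC datetime (integer math only)."""
    seconds, millis = divmod(ms, 1000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(milliseconds=millis)


fields = parse_record(RECORD)
created = ms_to_utc(int(fields["CreateTime"]))

# A record that carries an endTxnMarker field is a transaction control marker.
is_marker = "endTxnMarker" in fields

print(fields["endTxnMarker"], fields["producerId"], created.isoformat())
```

For the record above, CreateTime works out to 2019-01-31 22:09:38 UTC, which matches the segment files being months old.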


Why are the commit transaction markers not compacted and deleted?


Log cleaner config

max.message.bytes 10000120
min.cleanable.dirty.ratio 0.1
compression.type uncompressed
cleanup.policy compact
retention.ms 2160000000
segment.bytes 104857600

# By default the log cleaner is disabled and the log retention policy
# will default to just delete segments after their retention expires.
# If log.cleaner.enable=true is set the cleaner will be enabled and
# individual logs can then be marked for log compaction.
log.cleaner.enable=true
# give larger heap space to log cleaner
log.cleaner.dedupe.buffer.size=1342177280
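
To make the raw numbers in the config above easier to read, here is a small Python sketch that converts them to days, MiB and GiB. It is unit conversion only, using the values quoted in this post; it says nothing about how the log cleaner actually applies these settings.

```python
# Values copied from the config listed above.
RETENTION_MS = 2160000000           # retention.ms
SEGMENT_BYTES = 104857600           # segment.bytes
DEDUPE_BUFFER_BYTES = 1342177280    # log.cleaner.dedupe.buffer.size

MS_PER_DAY = 24 * 60 * 60 * 1000

retention_days = RETENTION_MS / MS_PER_DAY      # milliseconds -> days
segment_mib = SEGMENT_BYTES / 2**20             # bytes -> MiB
dedupe_gib = DEDUPE_BUFFER_BYTES / 2**30        # bytes -> GiB

print(f"retention.ms                   = {retention_days} days")
print(f"segment.bytes                  = {segment_mib} MiB")
print(f"log.cleaner.dedupe.buffer.size = {dedupe_gib} GiB")
```

That gives 25 days, 100 MiB and 1.25 GiB respectively.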
