
Understanding Kafka’s Cleanup Policies: Delete, Compact, and Combination 🧹🔄

· kafka, java

Apache Kafka is a distributed streaming platform capable of handling trillions of events a day. A crucial part of managing Kafka topics is understanding the cleanup policies: Delete, Compact, and how to combine them. Let’s dive in!


The Delete policy is the default cleanup policy in Kafka. Old log segments are discarded once they exceed the topic’s retention limit, set either by time (retention.ms) or by size (retention.bytes). This policy is straightforward and keeps disk usage bounded.

The Compact Policy 📦

The Compact policy, on the other hand, retains at least the latest value for each key within a topic partition, instead of discarding data by age or size. Even if a key has been updated many times, the older values are eventually removed and the most recent one is kept. It’s ideal for scenarios where the complete history of updates for a key isn’t necessary, such as changelog or state-snapshot topics.
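To make the "latest value per key" rule concrete, here is a minimal in-memory sketch. It is not Kafka's log cleaner: the LogRecord type and the compact method are hypothetical names chosen for illustration, and the real cleaner works on segment files in the background and leaves the active segment alone.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CompactionSketch {

    /** One log entry (hypothetical type, for illustration only). */
    record LogRecord(long offset, String key, String value) {}

    /** Keeps only the highest-offset record for each key, in offset order. */
    static List<LogRecord> compact(List<LogRecord> log) {
        Map<String, LogRecord> latest = new LinkedHashMap<>();
        for (LogRecord r : log) {
            // Remove before re-inserting so the key moves to the end of the
            // map, which keeps the surviving records sorted by offset.
            latest.remove(r.key());
            latest.put(r.key(), r);
        }
        return new ArrayList<>(latest.values());
    }

    public static void main(String[] args) {
        List<LogRecord> log = List.of(
            new LogRecord(0, "user-1", "alice@old.example"),
            new LogRecord(1, "user-2", "bob@example"),
            new LogRecord(2, "user-1", "alice@new.example"));
        // The record at offset 0 is superseded by offset 2 and is dropped.
        for (LogRecord r : compact(log)) {
            System.out.println(r);
        }
    }
}
```

Note that the surviving records keep their original offsets; compaction removes entries but never renumbers them.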

Combining Delete and Compact 🤝

Kafka allows you to set both Delete and Compact policies together. When combined, segments that exceed the retention time or size limits are discarded, and the segments that remain are compacted. This hybrid approach keeps only the latest value per key while still bounding how long any record can live.
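The combined behaviour can be sketched in the same in-memory style. This is a simplification under stated assumptions: retention is applied per record here, whereas Kafka discards whole segments, and LogRecord, clean, nowMs and retentionMs are hypothetical names used only for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DeleteAndCompactSketch {

    /** One log entry with a timestamp (hypothetical type, for illustration only). */
    record LogRecord(long offset, long timestampMs, String key, String value) {}

    /** Drops records older than the retention window, then keeps the latest record per key. */
    static List<LogRecord> clean(List<LogRecord> log, long nowMs, long retentionMs) {
        Map<String, LogRecord> latest = new LinkedHashMap<>();
        for (LogRecord r : log) {
            if (nowMs - r.timestampMs() > retentionMs) {
                continue; // "delete": past the retention window
            }
            // "compact": a newer record for the same key replaces the older one.
            latest.remove(r.key());
            latest.put(r.key(), r);
        }
        return new ArrayList<>(latest.values());
    }

    public static void main(String[] args) {
        List<LogRecord> log = List.of(
            new LogRecord(0, 1_000, "a", "1"),  // too old, discarded
            new LogRecord(1, 6_000, "b", "2"),  // superseded by offset 2
            new LogRecord(2, 7_000, "b", "3"),
            new LogRecord(3, 8_000, "c", "4"));
        for (LogRecord r : clean(log, 10_000, 5_000)) {
            System.out.println(r);
        }
    }
}
```

One consequence worth noticing: key "a" disappears entirely, because its only record aged out. With compaction alone it would have been kept indefinitely.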

Default Policy and How to Change It 🔧

As mentioned, the default policy is Delete. However, you can change the cleanup policy for a topic when creating it with the kafka-topics tool using the --config option, or alter it later with the kafka-configs tool.
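As a rough illustration of the two tools, the sketch below only assembles the command-line arguments (for example to hand to a ProcessBuilder); it does not run anything. The class and method names, the topic name, the broker address, and the partition and replication values are all placeholders. The Apache distribution ships the tools as kafka-topics.sh and kafka-configs.sh, so adjust the executable names to your installation.

```java
import java.util.List;

public class CleanupPolicyCommands {

    /** Arguments for creating a topic with a cleanup policy via kafka-topics. */
    static List<String> createTopicArgs(String bootstrapServer, String topic, String policy) {
        return List.of(
            "kafka-topics", "--bootstrap-server", bootstrapServer,
            "--create", "--topic", topic,
            "--partitions", "1", "--replication-factor", "1", // placeholder sizing
            "--config", "cleanup.policy=" + policy);
    }

    /** Arguments for changing the cleanup policy of an existing topic via kafka-configs. */
    static List<String> alterTopicArgs(String bootstrapServer, String topic, String policy) {
        // kafka-configs separates config entries with commas, so a list value
        // such as compact,delete is wrapped in square brackets.
        String value = policy.contains(",") ? "[" + policy + "]" : policy;
        return List.of(
            "kafka-configs", "--bootstrap-server", bootstrapServer,
            "--entity-type", "topics", "--entity-name", topic,
            "--alter", "--add-config", "cleanup.policy=" + value);
    }

    public static void main(String[] args) {
        System.out.println(String.join(" ",
            createTopicArgs("localhost:9092", "user-profiles", "compact")));
        System.out.println(String.join(" ",
            alterTopicArgs("localhost:9092", "user-profiles", "compact,delete")));
    }
}
```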

Key Points to Remember 🔑

Default Policy: Delete 🗑️

Compact Policy: Retains the latest message per key 📦

Combination: Offers benefits of compaction and controlled retention 🤝

Changing Policy: Use Kafka’s topic and config tools 🔧

Remember, choosing the right cleanup policy is essential for efficient Kafka management and ensuring data integrity. 🚀

Documentation: cleanup.policy 🥼

This config designates the retention policy to use on log segments. The “delete” policy (which is the default) will discard old segments when their retention time or size limit has been reached. The “compact” policy will enable log compaction, which retains the latest value for each key. It is also possible to specify both policies in a comma-separated list (e.g. “delete,compact”). In this case, old segments will be discarded per the retention time and size configuration, while retained segments will be compacted.

Type: list

Default: delete

Valid Values: [compact, delete]

Server Default Property: log.cleanup.policy

Importance: medium
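Because the property is a list restricted to two valid values, a setting such as “delete,compact” can be checked before it is sent to the cluster. The helper below is a hypothetical client-side sketch, not Kafka's own validation code; the broker performs its own check regardless.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class CleanupPolicyParser {

    // Valid values as listed in the documentation above.
    private static final Set<String> VALID = Set.of("compact", "delete");

    /** Splits a comma-separated cleanup.policy value and rejects unknown entries. */
    static Set<String> parse(String raw) {
        Set<String> policies = new LinkedHashSet<>();
        for (String part : raw.split(",")) {
            String policy = part.trim();
            if (!VALID.contains(policy)) {
                throw new IllegalArgumentException(
                    "Invalid cleanup.policy entry: '" + policy + "'");
            }
            policies.add(policy);
        }
        return policies;
    }

    public static void main(String[] args) {
        System.out.println(parse("delete,compact"));
        System.out.println(parse("compact"));
    }
}
```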

#kafka #cleanup #policy #delete #compact

Doc: https://docs.confluent.io/platform/current/installation/configuration/topic-configs.html