Understanding Kafka's Cleanup Policies: Delete, Compact, and Combination 🧹
Apache Kafka is a powerful distributed streaming platform that handles trillions of events a day. A crucial aspect of managing Kafka topics is understanding the cleanup policies: Delete, Compact, and how to combine them. Let's dive in!
The Delete Policy
The Delete policy is the default cleanup policy in Kafka. It discards old log segments once they exceed the configured retention limit, whether by time (retention.ms) or by size (retention.bytes). This policy is straightforward and keeps old data from exhausting disk space.
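To make this concrete, here is a minimal sketch (not from the original post) that creates a topic with the delete policy and explicit retention limits using Kafka's Java AdminClient. The broker address (localhost:9092), the topic name "events", and the retention values are assumptions for illustration.

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.common.config.TopicConfig;

import java.util.Collections;
import java.util.Map;
import java.util.Properties;

public class CreateDeleteTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Assumed broker address for this example.
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // "events" is a hypothetical topic name; 3 partitions, replication factor 1.
            NewTopic topic = new NewTopic("events", 3, (short) 1)
                    .configs(Map.of(
                            TopicConfig.CLEANUP_POLICY_CONFIG, TopicConfig.CLEANUP_POLICY_DELETE, // "delete"
                            TopicConfig.RETENTION_MS_CONFIG, "604800000",    // keep segments for 7 days...
                            TopicConfig.RETENTION_BYTES_CONFIG, "1073741824" // ...or until ~1 GiB per partition
                    ));
            admin.createTopics(Collections.singleton(topic)).all().get();
        }
    }
}
```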
The Compact Policy
The Compact policy, on the other hand, retains at least the latest value for each key in a partition's log. When the log cleaner runs, older values for a key are removed, so even if a key has been updated multiple times, only the most recent value is eventually kept. It's ideal for scenarios where the complete history of updates for a key isn't necessary.
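As an illustration (not part of the original post), the sketch below produces several updates for the same key to a topic assumed to be configured with cleanup.policy=compact; once the log cleaner has compacted the partition, only the most recent value per key survives. The broker address and the topic name "user-profiles" are assumptions.

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class CompactedTopicExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        // "user-profiles" is a hypothetical topic assumed to have cleanup.policy=compact.
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("user-profiles", "user-42", "name=Alice,city=Paris"));
            producer.send(new ProducerRecord<>("user-profiles", "user-42", "name=Alice,city=Berlin"));
            producer.send(new ProducerRecord<>("user-profiles", "user-42", "name=Alice,city=Tokyo"));
            producer.flush();
            // After compaction, consumers reading from the beginning of the partition will
            // eventually see only the latest value for "user-42" ("...city=Tokyo");
            // records still in the active segment are not compacted yet.
        }
    }
}
```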
Combining Delete and Compact
Kafka allows you to set both Delete and Compact policies together (cleanup.policy=compact,delete). When combined, segments that fall outside the retention time or size limits are discarded, while the segments that remain are compacted per key. This hybrid approach offers the benefits of both policies.
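A hedged sketch of what creating such a hybrid topic could look like with the Java AdminClient, setting cleanup.policy=compact,delete together with a retention limit; the topic name "session-state", the broker address, and the retention value are assumptions for the example.

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.common.config.TopicConfig;

import java.util.Collections;
import java.util.Map;
import java.util.Properties;

public class CreateCompactDeleteTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker

        try (AdminClient admin = AdminClient.create(props)) {
            // "session-state" is a hypothetical topic name.
            NewTopic topic = new NewTopic("session-state", 3, (short) 1)
                    .configs(Map.of(
                            // Both cleaners apply: segments past retention are deleted,
                            // and the segments that remain are compacted per key.
                            TopicConfig.CLEANUP_POLICY_CONFIG, "compact,delete",
                            TopicConfig.RETENTION_MS_CONFIG, "259200000" // 3 days
                    ));
            admin.createTopics(Collections.singleton(topic)).all().get();
        }
    }
}
```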
Default Policy and How to Change It
As mentioned, the default policy is Delete. However, you can set the cleanup policy for a topic at creation time with the kafka-topics tool using the --config option, or alter it later with the kafka-configs tool.
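The kafka-configs tool is the usual way to alter this setting; for completeness, here is a rough sketch of the programmatic equivalent using the AdminClient's incrementalAlterConfigs API. The broker address and the topic name "events" are assumptions.

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;
import org.apache.kafka.common.config.TopicConfig;

import java.util.Collection;
import java.util.Collections;
import java.util.Map;
import java.util.Properties;

public class AlterCleanupPolicy {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker

        try (AdminClient admin = AdminClient.create(props)) {
            // Switch a hypothetical existing "events" topic from delete to compact.
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "events");
            AlterConfigOp setPolicy = new AlterConfigOp(
                    new ConfigEntry(TopicConfig.CLEANUP_POLICY_CONFIG, TopicConfig.CLEANUP_POLICY_COMPACT),
                    AlterConfigOp.OpType.SET);

            Map<ConfigResource, Collection<AlterConfigOp>> updates =
                    Map.of(topic, Collections.singleton(setPolicy));
            admin.incrementalAlterConfigs(updates).all().get();
        }
    }
}
```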
Key Points to Remember
Default Policy: Delete
Compact Policy: Retains the latest message per key
Combination: Offers benefits of compaction and controlled retention
Changing Policy: Use Kafka's topic and config tools
Remember, choosing the right cleanup policy is essential for efficient Kafka management and ensuring data integrity.
Documentation: cleanup.policy
This config designates the retention policy to use on log segments. The "delete" policy (which is the default) will discard old segments when their retention time or size limit has been reached. The "compact" policy will enable log compaction, which retains the latest value for each key. It is also possible to specify both policies in a comma-separated list (e.g. "delete,compact"). In this case, old segments will be discarded per the retention time and size configuration, while retained segments will be compacted.
Type: list
Default: delete
Valid Values: [compact, delete]
Server Default Property: log.cleanup.policy
Importance: medium
#kafka #cleanup #policy #delete #compact
Doc: https://docs.confluent.io/platform/current/installation/configuration/topic-configs.html