SGLang Memory Management & KV Cache (Part 1)

SGLang Memory Management System

TL;DR

| Cache Class | Module | When to Use / Condition |
|---|---|---|
| RadixCache | mem_cache/radix_cache.py | Default |
| ChunkCache | mem_cache/chunk_cache.py | disable_radix_cache=True |
| SWAChunkCache | mem_cache/chunk_cache.py | disable_radix_cache=True + sliding window |
| HiRadixCache | mem_cache/hiradix_cache.py | enable_hierarchical_cache=True |
| SWARadixCache | mem_cache/swa_radix_cache.py | Sliding-window attention models |
| MambaRadixCache | mem_cache/mamba_radix_cache.py | Mamba / SSM-hybrid models |
| LMCRadixCache | mem_cache/storage/lmcache/ | enable_lmcache=True |
| RadixCacheCpp | mem_cache/radix_cache_cpp.py | Experimental C++ radix tree |
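
The mapping above can be read as a small decision function. The following is a minimal sketch, not SGLang's actual selection code: `CacheOptions` and `pick_cache_class` are hypothetical names, the fields `uses_sliding_window`, `is_mamba_hybrid`, and `use_cpp_radix_tree` are made up for illustration, and the order of the checks is an assumption, since the table does not say which condition wins when several apply. Only the three flag names `disable_radix_cache`, `enable_hierarchical_cache`, and `enable_lmcache` come from the table.

```python
from dataclasses import dataclass


@dataclass
class CacheOptions:
    # Flag names taken from the table above.
    disable_radix_cache: bool = False
    enable_hierarchical_cache: bool = False
    enable_lmcache: bool = False
    # Hypothetical fields, named here for illustration only.
    uses_sliding_window: bool = False
    is_mamba_hybrid: bool = False
    use_cpp_radix_tree: bool = False


def pick_cache_class(opts: CacheOptions) -> str:
    """Return the name of the cache class the table associates with opts.

    The precedence of the checks is an assumption of this sketch.
    """
    if opts.disable_radix_cache:
        # Both chunk-based caches live in mem_cache/chunk_cache.py.
        return "SWAChunkCache" if opts.uses_sliding_window else "ChunkCache"
    if opts.enable_hierarchical_cache:
        return "HiRadixCache"
    if opts.uses_sliding_window:
        return "SWARadixCache"
    if opts.is_mamba_hybrid:
        return "MambaRadixCache"
    if opts.enable_lmcache:
        return "LMCRadixCache"
    if opts.use_cpp_radix_tree:
        return "RadixCacheCpp"
    # No special condition applies: fall back to the default.
    return "RadixCache"


if __name__ == "__main__":
    print(pick_cache_class(CacheOptions()))  # RadixCache
    print(pick_cache_class(CacheOptions(disable_radix_cache=True)))  # ChunkCache
```

Returning class names as strings keeps the sketch runnable without SGLang installed; real code would return or construct the classes themselves.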
Table of Contents