KV Cache Analyzer

Example

{"block_size":64,"hash_ids":[2001,2002],"input_length":128}
{"block_size":64,"hash_ids":[2003,2004],"input_length":128}
{"block_size":64,"hash_ids":[2005,2006],"input_length":128}

Did you know you can deploy it locally? Check our repo.

KV Cache Eviction

When adding a new block would exceed the selected memory budget, the simulator evicts one cached block according to the selected policy: FIFO, LRU, or Optimal.
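
To make the three policies concrete, here is a minimal Python sketch of an eviction simulator. It is not the analyzer's own code. It assumes the budget is expressed as a number of blocks (`capacity`) rather than bytes, that "Optimal" means Belady's policy (evict the block whose next use is furthest in the future), and that hit rate is hits divided by total block accesses. The function name `simulate` is hypothetical.

```python
from collections import OrderedDict


def simulate(accesses, capacity, policy):
    """Return the hit rate for a sequence of block IDs.

    policy is "fifo", "lru", or "optimal" (assumed here to be Belady's policy).
    capacity is the budget in blocks (an assumption; the tool uses a memory budget).
    """
    if capacity <= 0 or not accesses:
        return 0.0

    # next_use[i] = index of the next access to the same block, or infinity.
    next_use = [float("inf")] * len(accesses)
    last_seen = {}
    for i in range(len(accesses) - 1, -1, -1):
        next_use[i] = last_seen.get(accesses[i], float("inf"))
        last_seen[accesses[i]] = i

    cache = OrderedDict()  # block ID -> index of its next use
    hits = 0
    for i, block in enumerate(accesses):
        if block in cache:
            hits += 1
            cache[block] = next_use[i]  # updating a value keeps its position
            if policy == "lru":
                cache.move_to_end(block)  # mark as most recently used
            continue
        if len(cache) >= capacity:
            if policy == "optimal":
                # Evict the block reused furthest in the future (or never).
                victim = max(cache, key=cache.get)
            else:
                # FIFO: oldest insertion. LRU: least recently used.
                victim = next(iter(cache))
            del cache[victim]
        cache[block] = next_use[i]
    return hits / len(accesses)


accesses = [1, 2, 3, 1, 4, 1, 2, 3]
for policy in ("fifo", "lru", "optimal"):
    print(policy, simulate(accesses, capacity=3, policy=policy))
```

With this access sequence and room for three blocks, the sketch gives hit rates of 0.125 for FIFO, 0.25 for LRU, and 0.375 for Optimal. Optimal needs the full future access sequence, so it serves as an upper bound for comparison rather than a policy a live system can run.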

Want to see the KV cache memory breakdown? Try our KV Cache Size Calculator.

Want to scale KV cache storage across your inference cluster? Try Mooncake.

Found an issue or have feedback? Open an issue and let us know.

KV Cache Hit Rate