KV Cache Analyzer

Trace
Block size
Customize trace

Drop or choose a JSONL / JSONL.GZ trace. Each valid line needs hash_ids and input_length; block_size is optional when Block size is set above. hash_ids must be decimal integers or decimal strings. Requests are replayed in file order, so sort production traces by timestamp first. Computed locally in your browser.

Example
{"block_size":64,"hash_ids":[2001,2002],"input_length":128}
{"block_size":64,"hash_ids":[2003,2004],"input_length":128}
{"block_size":64,"hash_ids":[2005,2006],"input_length":128}

Did you know you can deploy it locally? Check our repo.

Model family
Model
KV precision
Indexer precision
Include draft KV cache

KV Cache Eviction
When adding a new block would exceed the selected memory budget, the simulator evicts one cached block according to the selected policy: FIFO, LRU, or Optimal.

Want to see the KV cache memory breakdown? Try our KV Cache Size Calculator.
Want to scale KV cache storage across your inference cluster? Try Mooncake.
Found an issue or have feedback? Open an issue and let us know.

KV Cache Hit Rate
Ideal Prefill Throughput Speedup

KV Cache Size Calculator, May 20, 2026
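
The replay and eviction behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the analyzer's actual code. It assumes the memory budget is expressed as a number of blocks (the real tool derives it from the model and KV precision), and it assumes the hit rate is the fraction of requested blocks already in the cache. The names load_trace and simulate are hypothetical.

```python
import json
from collections import OrderedDict


def load_trace(lines, default_block_size=None):
    """Parse JSONL lines into per-request lists of integer hash_ids, in file order."""
    requests = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        rec = json.loads(line)
        block_size = rec.get("block_size", default_block_size)
        # Assumption: a line is valid only if it has hash_ids, input_length,
        # and a block size (from the line itself or from the default).
        if block_size is None or "hash_ids" not in rec or "input_length" not in rec:
            continue
        # hash_ids may be decimal integers or decimal strings.
        requests.append([int(h) for h in rec["hash_ids"]])
    return requests


def simulate(requests, capacity_blocks, policy="LRU"):
    """Replay requests in order and return the block-level hit rate."""
    accesses = [h for req in requests for h in req]

    # For each access, precompute the index of the next access to the same
    # block (infinity if there is none). Only the Optimal policy uses it.
    next_use = [float("inf")] * len(accesses)
    seen_later = {}
    for i in range(len(accesses) - 1, -1, -1):
        next_use[i] = seen_later.get(accesses[i], float("inf"))
        seen_later[accesses[i]] = i

    cache = OrderedDict()  # block id -> index of its next use
    hits = 0
    for i, block in enumerate(accesses):
        if block in cache:
            hits += 1
            cache[block] = next_use[i]
            if policy == "LRU":
                cache.move_to_end(block)  # mark as most recently used
            continue
        if capacity_blocks <= 0:
            continue
        if len(cache) >= capacity_blocks:
            if policy == "Optimal":
                # Evict the block whose next use is farthest in the future.
                victim = max(cache, key=cache.get)
                del cache[victim]
            else:
                # FIFO: oldest inserted; LRU: least recently used.
                cache.popitem(last=False)
        cache[block] = next_use[i]
    return hits / len(accesses) if accesses else 0.0


example = [
    '{"block_size":64,"hash_ids":[2001,2002],"input_length":128}',
    '{"block_size":64,"hash_ids":[2003,2004],"input_length":128}',
    '{"block_size":64,"hash_ids":[2005,2006],"input_length":128}',
]
for name in ("FIFO", "LRU", "Optimal"):
    print(name, simulate(load_trace(example), capacity_blocks=4, policy=name))
```

In the example trace no hash ID repeats, so every policy reports a hit rate of 0. Traces with shared prefixes (repeated hash_ids across requests) are where the three policies start to differ.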