Essentially, a decoder-only Transformer model transforms data from any modality into a KVCache, making it a central element of LLM serving optimizations, including but not limited to caching, scheduling, compression, and offloading. KVCache.AI is a collaborative effort with leading industry partners such as Approaching.AI and Moonshot AI, focused on developing effective and practical techniques that benefit both academic research and open-source development.
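To make the role of the KVCache concrete, here is a minimal, illustrative sketch (not KVCache.AI or Mooncake code) of single-head autoregressive decoding: each new token's key and value tensors are appended to a growing cache and reused at every later step, which is exactly the object that serving systems then cache, schedule, compress, or offload. All names (`decode_step`, `kv_cache`, the weight matrices) are hypothetical and chosen only for this example.

```python
# Minimal sketch, assuming PyTorch: why decode-time attention produces a KV cache.
import torch

def decode_step(x_t, W_q, W_k, W_v, kv_cache):
    """One autoregressive decode step for a single attention head.

    x_t:      (d_model,) embedding of the newest token
    kv_cache: dict with "k" and "v" tensors of shape (t, d_head)
    """
    q = x_t @ W_q                                   # query for the new token only
    k = x_t @ W_k                                   # its key ...
    v = x_t @ W_v                                   # ... and value
    kv_cache["k"] = torch.cat([kv_cache["k"], k[None, :]])  # append, never recompute
    kv_cache["v"] = torch.cat([kv_cache["v"], v[None, :]])
    scores = kv_cache["k"] @ q / kv_cache["k"].shape[-1] ** 0.5
    attn = torch.softmax(scores, dim=0)             # attend over all cached positions
    return attn @ kv_cache["v"]                     # context vector for the new token

if __name__ == "__main__":
    d_model, d_head = 16, 8
    W_q, W_k, W_v = (torch.randn(d_model, d_head) for _ in range(3))
    cache = {"k": torch.empty(0, d_head), "v": torch.empty(0, d_head)}
    for _ in range(4):                              # decode four tokens
        decode_step(torch.randn(d_model), W_q, W_k, W_v, cache)
    print(cache["k"].shape)                         # torch.Size([4, 8]): cache grows per token
```

Because the cache grows linearly with sequence length (and multiplies across layers, heads, and concurrent requests), managing it efficiently is what the projects below focus on.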
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations. Made with Approaching.AI!
Mooncake, a delicious mooncake made with Moonshot AI!
A KVCache-centric disaggregated architecture for LLM serving.