KVCache.ai

A decoder-only Transformer model essentially transforms data from any modality into KVCache, which makes KVCache a central element in LLM serving optimizations. These optimizations include, but are not limited to, caching, scheduling, compression, and offloading. KVCache.AI is a collaborative endeavor with leading industry partners such as Approaching.AI and Moonshot AI. The project focuses on developing effective and practical techniques that benefit both academic research and open-source development.
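To make the KVCache idea concrete, here is a minimal sketch of single-head attention with a key/value cache, written in plain Python. It is an illustration under simplifying assumptions (one head, no positional encoding, tiny hand-picked weights), not code from any KVCache.AI project; the names `KVCache`, `step`, and `last_token_without_cache` are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(W, x):
    # W is a list of rows; returns the matrix-vector product W @ x.
    return [dot(row, x) for row in W]

class KVCache:
    """Hypothetical per-sequence store of the keys and values seen so far."""
    def __init__(self):
        self.keys = []
        self.values = []

    def append(self, k, v):
        self.keys.append(k)
        self.values.append(v)

def attend(q, keys, values):
    """Scaled dot-product attention of one query over all stored keys/values."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [dot(q, k) * scale for k in keys]
    m = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(dim)]

def step(x, Wq, Wk, Wv, cache):
    """Process one token: compute only its own key/value, then reuse the cache."""
    cache.append(matvec(Wk, x), matvec(Wv, x))
    return attend(matvec(Wq, x), cache.keys, cache.values)

def last_token_without_cache(xs, Wq, Wk, Wv):
    """Reference path: recompute every key and value from scratch."""
    keys = [matvec(Wk, x) for x in xs]
    values = [matvec(Wv, x) for x in xs]
    return attend(matvec(Wq, xs[-1]), keys, values)

# Toy 2-dimensional weights and a 3-token "prompt" (illustrative values only).
Wq = [[1.0, 0.0], [0.0, 1.0]]
Wk = [[0.5, 0.0], [0.0, 0.5]]
Wv = [[1.0, 1.0], [0.0, 1.0]]
prompt = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

cache = KVCache()
for x in prompt:  # prefill: the prompt is turned into cached keys and values
    out = step(x, Wq, Wk, Wv, cache)

new_token = [0.5, -1.0]
decoded = step(new_token, Wq, Wk, Wv, cache)  # decode: one new key/value pair
```

The cached path does a constant amount of projection work per new token, while the reference path reprojects the whole sequence. That gap is why the cache, rather than the raw input, becomes the object that serving systems store, move, compress, and schedule around.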

SubProjects

KTransformers


A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations. Made with Approaching AI!

Mooncake


Mooncake, delicious mooncake made with Moonshot AI!

A KVCache-centric disaggregated architecture for LLM serving.