Explore our open-source projects focused on KVCache optimization and LLM serving.
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations. Made with Approaching AI!
Mooncake, delicious mooncake made with Moonshot AI!
A KVCache-centric disaggregated architecture for LLM serving.