A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations. Made with Approaching AI.
Jul 27, 2024
Mooncake, delicious mooncake made with Moonshot AI! A KVCache-centric disaggregated architecture for LLM serving.
Jun 25, 2024