🎉 Announcing LightLLM v1.1.0: More Efficient, More Powerful!
We are thrilled to introduce LightLLM v1.1.0, featuring major architectural and optimization upgrades for higher performance and broader applicability.
✨ Key Highlights
🚀 CPU-GPU Unified Folding Architecture
- Drastically reduces system-level CPU overhead by overlapping CPU-side scheduling with GPU execution (see the sketch below)
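Conceptually, folding means the CPU prepares step N+1 while the GPU executes step N, so scheduling and batch-assembly cost is hidden behind GPU time. Below is a minimal, self-contained Python sketch of that overlap; `cpu_prepare`, `gpu_execute`, and the timings are hypothetical stand-ins, not LightLLM internals.

```python
import threading
import queue
import time

def cpu_prepare(step):
    # Hypothetical stand-in for CPU-side work: scheduling,
    # tokenization, batch assembly for the next model step.
    time.sleep(0.002)
    return f"batch-{step}"

def gpu_execute(batch):
    # Hypothetical stand-in for the GPU forward pass.
    time.sleep(0.010)

def run_serial(num_steps):
    # Baseline: CPU and GPU work strictly alternate.
    start = time.perf_counter()
    for step in range(num_steps):
        gpu_execute(cpu_prepare(step))
    print(f"serial: {time.perf_counter() - start:.3f}s")

def run_folded(num_steps):
    # A producer thread keeps a small queue of prepared batches,
    # so the GPU loop never waits on CPU-side overhead.
    batches = queue.Queue(maxsize=2)

    def producer():
        for step in range(num_steps):
            batches.put(cpu_prepare(step))

    threading.Thread(target=producer, daemon=True).start()
    start = time.perf_counter()
    for _ in range(num_steps):
        gpu_execute(batches.get())
    print(f"folded: {time.perf_counter() - start:.3f}s")

if __name__ == "__main__":
    run_serial(100)   # ~num_steps * (cpu + gpu) time
    run_folded(100)   # ~num_steps * gpu time
```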
⚡ Deep Model Optimizations
- Enhanced support for DeepSeek and Qwen3-MoE
- Integration of DeepEP / DeepGEMM and fused MoE Triton optimizations
- New balanced DP (data-parallel) request scheduler (a toy sketch of the balancing idea follows this list)
- Added support for MTP (Multi-Token Prediction)
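For intuition on the balanced DP scheduler: the core idea is to route each incoming request to the data-parallel rank carrying the least outstanding work, so ranks stay evenly loaded. The toy sketch below assumes pending token count as the load metric; the names (`BalancedDPScheduler`, `dispatch`, `complete`) are illustrative and do not reflect LightLLM's actual scheduler.

```python
import heapq

class BalancedDPScheduler:
    """Toy balanced scheduler: route each request to the
    data-parallel rank with the fewest pending tokens.
    (Illustrative only; not LightLLM's implementation.)"""

    def __init__(self, num_ranks):
        # Min-heap of (pending_tokens, rank_id) pairs.
        self.heap = [(0, rank) for rank in range(num_ranks)]
        heapq.heapify(self.heap)

    def dispatch(self, request_tokens):
        # Pick the least-loaded rank and charge it the new work.
        pending, rank = heapq.heappop(self.heap)
        heapq.heappush(self.heap, (pending + request_tokens, rank))
        return rank

    def complete(self, rank, request_tokens):
        # Credit the rank when a request finishes.
        self.heap = [(p - request_tokens if r == rank else p, r)
                     for p, r in self.heap]
        heapq.heapify(self.heap)

sched = BalancedDPScheduler(num_ranks=4)
for tokens in [512, 128, 2048, 256, 1024]:
    print(f"{tokens}-token request -> DP rank {sched.dispatch(tokens)}")
```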
⚙️ Autotuner for Triton Kernels
- Automatically tunes the model's Triton kernel operators for the target hardware at service startup (see the example below)
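The general mechanism is the one Triton itself exposes via `@triton.autotune`: benchmark a set of candidate launch configurations once, then cache the winner per input shape. The element-wise kernel below demonstrates that decorator; LightLLM's startup autotuner is its own machinery, so treat this purely as an illustration of the concept.

```python
import torch
import triton
import triton.language as tl

@triton.autotune(
    configs=[
        triton.Config({"BLOCK_SIZE": 256}, num_warps=4),
        triton.Config({"BLOCK_SIZE": 512}, num_warps=4),
        triton.Config({"BLOCK_SIZE": 1024}, num_warps=8),
    ],
    key=["n_elements"],  # re-benchmark when the problem size changes
)
@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n_elements = x.numel()
    # BLOCK_SIZE is chosen by the autotuner, so it is not passed here.
    grid = lambda meta: (triton.cdiv(n_elements, meta["BLOCK_SIZE"]),)
    add_kernel[grid](x, y, out, n_elements)
    return out

x = torch.randn(1 << 20, device="cuda")
print(torch.allclose(add(x, x), x + x))  # True
```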
🏆 ACL Outstanding Paper Award: Pre³
- Our paper Pre³ (deterministic pushdown automata for faster structured LLM generation) received an Outstanding Paper Award at ACL 2025
🖼️ Improved Multimodal Inference
- Further optimizations make inference on multimodal models faster and more reliable
📖 Learn More
More details can be found in the LightLLM v1.1.0 blog post.