DeepSeek v3, a new 671B-parameter Mixture-of-Experts model, has been released and is currently the top-performing open-weights model available. Serving models of this size presents significant challenges, but inference startup Baseten has deployed DeepSeek v3 on NVIDIA H200 GPUs using the SGLang framework. The deployment highlights the key requirements for running mission-critical AI inference at scale: model-level performance, efficient serving infrastructure, and robust orchestration.
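As a rough illustration (not Baseten's actual configuration, which the summary does not detail), serving DeepSeek v3 with SGLang on a multi-GPU H200 node is typically done by launching its server with tensor parallelism; the flags below follow SGLang's documented CLI, but the specific values are assumptions:

```shell
# Sketch of an SGLang deployment of DeepSeek v3 (illustrative settings,
# not Baseten's production config).
# --tp 8 shards the 671B MoE model across 8 GPUs on the node.
# --trust-remote-code is needed because the checkpoint ships custom model code.
python3 -m sglang.launch_server \
  --model-path deepseek-ai/DeepSeek-V3 \
  --tp 8 \
  --trust-remote-code \
  --port 30000
```

Once running, the server exposes an OpenAI-compatible HTTP endpoint on the given port, so existing client code can point at it with only a base-URL change.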
Summary written by gemini-2.5-flash-lite from 1 source.
RANK_REASON New open-weights model release from a significant lab (DeepSeek) that achieves top benchmark performance.