The DS4 model is reportedly running on NVIDIA's DGX Spark, using the GB10 chip and CUDA. Initial measurements put generation speed at 12 tokens per second, with observed memory throughput limited to 270 GB/s. The work currently lives only in a private branch, which suggests it is still experimental.
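The two reported figures can be related with simple arithmetic. The sketch below assumes that token generation is memory-bandwidth-bound and that the 270 GB/s figure was measured during that same generation; neither assumption is confirmed by the sources. The function names are hypothetical and chosen only for illustration.

```python
def implied_gb_per_token(bandwidth_gb_s: float, tokens_per_s: float) -> float:
    """Memory traffic per generated token, in GB.

    Assumption (not from the sources): decoding is limited purely by
    memory bandwidth, so traffic per token = bandwidth / token rate.
    """
    if tokens_per_s <= 0:
        raise ValueError("tokens_per_s must be positive")
    return bandwidth_gb_s / tokens_per_s


def bandwidth_bound_tokens_per_s(bandwidth_gb_s: float, gb_per_token: float) -> float:
    """Upper bound on token rate under the same bandwidth-bound assumption."""
    if gb_per_token <= 0:
        raise ValueError("gb_per_token must be positive")
    return bandwidth_gb_s / gb_per_token


if __name__ == "__main__":
    # Reported figures: 270 GB/s observed throughput, 12 tokens per second.
    per_token = implied_gb_per_token(270.0, 12.0)
    print(f"Implied memory traffic: {per_token} GB per token")
```

Under these assumptions, the reported numbers would correspond to roughly 22.5 GB of memory traffic per generated token. If the workload is not bandwidth-bound, this figure says little about the model itself.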
Summary written by gemini-2.5-flash-lite from 2 sources.
IMPACT An early performance data point for running large models on DGX Spark hardware; the figures are preliminary and come from an unreleased branch.
RANK_REASON The cluster describes a model running on specific hardware and reports performance metrics, which qualifies it as a research milestone or technical report.