cuDNN: Efficient Primitives for Deep Learning

PulseAugur coverage of cuDNN: Efficient Primitives for Deep Learning. Every cluster that mentions it across labs, papers, and developer communities is listed here, ranked by signal.

Total: 2 over 90d

Releases: 0 over 90d

Papers: 0 over 90d

RECENT · PAGE 1/1 · 2 TOTAL

TOOL · CL_44358 · May 22 · 15:59

Together AI optimizes attention for Blackwell GPUs with FlashAttention-4

Together AI has released FlashAttention-4, an optimized algorithm and kernel co-design tailored for NVIDIA's Blackwell GPUs. This new version addresses the asymmetric hardware scaling of modern accelerators, where tenso…

RESEARCH · CL_18472 · May 6 · 04:00

NVIDIA open-sources cuDNN kernels after 12 years, including MoE and sparse attention

NVIDIA has open-sourced parts of its cuDNN library, which had been closed-source for 12 years. The release includes over 20 Mixture-of-Experts (MoE) kernels and NSA sparse attention kernels. The codeb…
