ENTITY Cutlass

Cutlass

PulseAugur coverage of Cutlass — every cluster mentioning Cutlass across labs, papers, and developer communities, ranked by signal.

Total · 30d: 4 (4 over 90d)

Releases · 30d: 0 (0 over 90d)

Papers · 30d: 1 (1 over 90d)

SENTIMENT · 30D

2 days with sentiment data

RECENT · PAGE 1/1 · 4 TOTAL

TOOL · CL_75452 · Jun 6 · 22:04

CUDA/C++ inference engine built for NVIDIA's DVLT 3D model

dvlt.cu is a new inference engine written from scratch in CUDA/C++ for NVIDIA's DVLT 3D transformer model. The standalone 5 MB binary has minimal dependencies, relying only on cuBLASLt and the header-o…

TOOL · CL_51969 · May 26 · 08:50

TileLang simplifies GPU kernel writing with Python interface

TileLang, a new programming language, aims to simplify GPU kernel development by offering a middle ground between high-level frameworks like Triton and low-level libraries like CUTLASS. TileLang allows developers to …

RESEARCH · CL_13517 · May 3 · 08:26

CuTeDSL emerges as new GPU kernel path for LLM inference, challenging CUTLASS

The landscape of GPU kernel engineering for LLM inference is shifting, with CuTeDSL emerging as a potential successor to C++ CuTe/CUTLASS. This evolution is highlighted by industry trends in technologies like FlashAtten…

RESEARCH · CL_11176 · May 1 · 01:38

Moonshot AI open-sources FlashKDA, boosting Kimi Delta Attention 2.5x on H200 GPUs

Moonshot AI has released FlashKDA, an open-source implementation of Kimi Delta Attention. The new kernel achieves up to 2.5 times faster inference on NVIDIA H200 GPUs. It is built using CUTLASS and optimized for…