ENTITY ik_llama.cpp

PulseAugur coverage of ik_llama.cpp — every cluster mentioning ik_llama.cpp across labs, papers, and developer communities, ranked by signal.

Total · 30d: 3 (3 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 0 (0 over 90d)

SENTIMENT · 30D: 2 days with sentiment data

RECENT · PAGE 1/1 · 3 TOTAL

TOOL · CL_55274 · May 27 · 19:26

Qwen 3.5 35B model runs at 10.33 t/s on $300 laptop

A user on r/LocalLLaMA has described running the Qwen 3.5 35B model on a budget laptop, reaching an inference speed of 10.33 tokens per second on a $300 Lenovo Ideapad Slim 3i wit…

TOOL · CL_43106 · May 21 · 21:33

Qwen 3.6 model hits 110 tokens/sec on consumer GPUs via llama.cpp

The 35-billion-parameter version of the open-weight Qwen 3.6 model has reached an inference speed of 110 tokens per second on consumer GPUs with 12GB of VRAM. This performance was enabled by a specialized var…

RESEARCH · CL_03577 · Apr 25 · 15:42

llama.cpp and ik_llama.cpp add FP4 inference support for VRAM savings

The llama.cpp and ik_llama.cpp projects have both integrated support for FP4 (4-bit floating-point) inference, a significant advancement for model quantization. llama.cpp now includes NVFP4, an Nvidia-specific format, w…