PulseAugur

GPTQ

PulseAugur coverage of GPTQ — every cluster mentioning GPTQ across labs, papers, and developer communities, ranked by signal.

Total · 30d: 6 (6 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 5 (5 over 90d)
TIER MIX · 90D
RELATIONSHIPS
SENTIMENT · 30D: 2 days with sentiment data

RECENT · PAGE 1/1 · 6 TOTAL
  1. TOOL · CL_30718

    New paper details improved quantization for LLM matrix multiplication

    Researchers have published a paper detailing advancements in quantized matrix multiplication, specifically for large language models (LLMs). This second part of their work focuses on scenarios where the covariance matri…

  2. TOOL · CL_27223

    ExLlamaV3, Unsloth Qwen, and Phi3 agent see major local AI updates

    This week's local AI news highlights significant updates to the ExLlamaV3 inference library, enhancing efficiency for running quantized Llama models on consumer GPUs. Additionally, new GGUF-quantized versions of Qwen 3.…

  3. RESEARCH · CL_15961

    New methods accelerate LLMs via efficient sparsification, quantization, and compression

    Researchers have developed several new methods for compressing and optimizing large language models (LLMs) to improve efficiency and reduce computational costs. SparseForge focuses on efficient semi-structured sparsific…

  4. RESEARCH · CL_11807

    New methods tackle LLM quantization for improved efficiency and accuracy

    Researchers have developed several new methods to improve the efficiency of large language models (LLMs) through quantization. OSAQ focuses on suppressing weight outliers using a low-rank Hessian property for accurate l…

  5. RESEARCH · CL_01274

    Hugging Face introduces advanced quantization techniques for efficient LLMs

    Researchers are developing advanced quantization techniques to make large language models (LLMs) more efficient. New methods like AutoRound, LATMiX, and GSQ aim to reduce model size and computational requirements, enabl…

  6. RESEARCH · CL_01035

    Optimizing transformer inference: techniques for faster, cheaper large models

    Large transformer models present significant inference challenges due to their substantial memory footprint and computation costs, which scale quadratically with input length. Researchers and practitioners are exploring…
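The clusters above all revolve around weight quantization for LLM inference. As a rough illustration of the basic idea (not the Hessian-based GPTQ scheme or any specific paper's method — all shapes and values here are illustrative), a minimal per-channel round-to-nearest int8 sketch in NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy activations and weights; real LLM layers are far larger, and methods
# like GPTQ quantize more carefully than plain round-to-nearest.
X = rng.standard_normal((4, 64)).astype(np.float32)   # activations
W = rng.standard_normal((64, 32)).astype(np.float32)  # weights

# Symmetric int8 quantization with one scale per output channel (column).
scale = np.abs(W).max(axis=0) / 127.0
W_q = np.clip(np.round(W / scale), -127, 127).astype(np.int8)

# Quantized matmul: integer weights, per-channel scales folded back in after.
Y_q = (X @ W_q.astype(np.float32)) * scale
Y = X @ W

rel_err = np.linalg.norm(Y - Y_q) / np.linalg.norm(Y)
print(f"relative error: {rel_err:.4f}")
```

Storing `W_q` instead of `W` cuts weight memory 4x versus fp32 (2x versus fp16) at the cost of a small reconstruction error, which is the trade-off every method listed above tries to improve.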