PulseAugur

New HGQ-LUT and da4ml methods speed up DNN training and FPGA deployment

Researchers have developed HGQ-LUT, a new method for training lookup-table (LUT) based neural networks that makes training over 100 times faster on modern GPUs. The approach introduces specialized layers and fine-grained quantization to explore accuracy-resource trade-offs automatically, without manual tuning. HGQ-LUT is integrated into open-source toolchains, enabling practical deployment of these efficient DNNs in applications such as those at the CERN Large Hadron Collider.
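Why LUT mapping removes arithmetic at inference time can be seen in a toy sketch (hypothetical code, not the HGQ-LUT API): once a neuron's inputs are quantized to a few bits, its entire input-output function fits in a precomputed table that FPGA tools can map onto logic LUT primitives.

# Toy sketch of the LUT-mapping idea, not the HGQ-LUT implementation; all
# names here are hypothetical. A neuron whose inputs are quantized to a few
# bits has so few possible input patterns that its whole function fits in a table.
import itertools

N_INPUTS, BITS = 3, 2          # 3 inputs, 2 bits each -> 4**3 = 64 patterns
LEVELS = 1 << BITS

WEIGHTS = (0.5, -1.0, 0.25)    # fixed (trained) weights
BIAS = 0.1

def neuron(x):
    """Reference arithmetic: weighted sum + bias, ReLU, requantized to 2 bits."""
    y = max(0.0, sum(w * v for w, v in zip(WEIGHTS, x)) + BIAS)
    return min(LEVELS - 1, int(round(y)))   # clip back into the 2-bit range

# Precompute every possible output once; on an FPGA a table like this is what
# gets mapped onto logic LUT primitives, so inference needs no multipliers.
TABLE = {x: neuron(x) for x in itertools.product(range(LEVELS), repeat=N_INPUTS)}

x = (3, 1, 0)
assert TABLE[x] == neuron(x)   # a single table read replaces all arithmetic
print(TABLE[x])                # -> 1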

Summary written by gemini-2.5-flash-lite from 3 sources.

IMPACT Accelerates the training of LUT-based DNNs destined for FPGAs, enabling more efficient real-time inference for latency-critical applications.

RANK_REASON This is a research paper detailing a new training method for LUT-based DNNs deployed on FPGAs.

Read on arXiv cs.LG →

COVERAGE [3]

  1. arXiv cs.LG TIER_1 · Chang Sun, Zhiqiang Que, Bakhtiar Zadeh, Qibin Liu, Kevin H. Alvarez, Wayne Luk, Maria Spiropulu

    HGQ-LUT: Fast LUT-Aware Training and Efficient Architectures for DNN Inference

    arXiv:2604.22293v1 Announce Type: cross Abstract: Lookup-table (LUT) based neural networks can deliver ultra-low latency and excellent hardware efficiency on FPGAs by mapping arithmetic operations directly onto the logic primitives. However, state-of-the-art LUT-aware training (LAT) approaches remain difficult to use in practice…

  2. arXiv cs.LG TIER_1 · Chang Sun, Zhiqiang Que, Vladimir Loncar, Wayne Luk, Maria Spiropulu

    da4ml: Distributed Arithmetic for Real-time Neural Networks on FPGAs

    arXiv:2507.04535v2 Announce Type: replace-cross Abstract: Neural networks with a latency requirement on the order of microseconds, like the ones used at the CERN Large Hadron Collider, are typically deployed on FPGAs fully unrolled and pipelined. A bottleneck for the deployment o… (the distributed-arithmetic idea behind this work is sketched after this list)

  3. arXiv cs.LG TIER_1 · Maria Spiropulu

    HGQ-LUT: Fast LUT-Aware Training and Efficient Architectures for DNN Inference

    Lookup-table (LUT) based neural networks can deliver ultra-low latency and excellent hardware efficiency on FPGAs by mapping arithmetic operations directly onto the logic primitives. However, state-of-the-art LUT-aware training (LAT) approaches remain difficult to use in practice…
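For the da4ml entry (source 2), the core trick named in the title, distributed arithmetic, replaces hardware multipliers in a constant-matrix-vector multiply with shifts and adds. Below is a minimal sketch of that idea under the assumption of integer weights; the helper names are hypothetical, and da4ml's actual algorithm additionally shares common subexpressions across the whole matrix.

# Minimal sketch of the distributed-arithmetic idea behind da4ml: multiply
# by fixed integer constants using only shifts and adds, so a constant
# matrix-vector product needs no hardware multipliers.

def const_mul(x: int, c: int) -> int:
    """Multiply x by a fixed constant c using shift-and-add only."""
    acc, k = 0, 0
    neg = c < 0
    c = abs(c)
    while c:
        if c & 1:           # for every set bit of c ...
            acc += x << k   # ... add a shifted copy of x
        c >>= 1
        k += 1
    return -acc if neg else acc

def const_matvec(W, x):
    """y = W @ x with fixed integer weights, multiplier-free."""
    return [sum(const_mul(xi, wij) for wij, xi in zip(row, x)) for row in W]

W = [[3, -5], [7, 2]]   # fixed (trained) weights
x = [4, 1]
assert const_matvec(W, x) == [3*4 - 5*1, 7*4 + 2*1]  # [7, 30]
print(const_matvec(W, x))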