PulseAugur

New methods enhance LLM control without sacrificing performance or reasoning

Three new arXiv papers propose methods for steering large language model (LLM) behavior at inference time without sacrificing generation quality. The first, Prompt-only SV (PrOSV), intervenes only on prompt tokens and outperforms traditional full-sequence steering vectors on benchmarks such as AxBench. The second, FLAS (Flow-based Activation Steering), learns a concept-conditioned velocity field that transports activations, and it consistently outperforms prompting on Gemma models. The third, SKOP (Steering via Key-Orthogonal Projections), constrains attention rerouting to preserve reasoning and retrieval performance, achieving a better trade-off between utility and steering efficacy.
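
As a rough illustration of the difference between prompt-only and full-sequence steering, the sketch below adds a fixed steering vector to per-token activations. It is a toy example, not the PrOSV training procedure: the function `add_steering`, its parameters, and the toy activations are hypothetical names chosen for illustration, and plain additive steering is assumed.

```python
from typing import List

Vector = List[float]


def add_steering(hidden: List[Vector], steer: Vector, prompt_len: int,
                 prompt_only: bool = True, scale: float = 1.0) -> List[Vector]:
    """Return a copy of `hidden` with `scale * steer` added at selected positions.

    hidden:      one activation vector per token position (prompt tokens first).
    prompt_len:  number of prompt tokens; later positions are generated tokens.
    prompt_only: if True, modify only prompt positions; otherwise every position.
    All names here are hypothetical and serve only this illustration.
    """
    out = []
    for pos, h in enumerate(hidden):
        if prompt_only and pos >= prompt_len:
            out.append(list(h))  # generated token: left untouched
        else:
            out.append([x + scale * s for x, s in zip(h, steer)])
    return out


# Toy input: 3 prompt tokens followed by 2 generated tokens, hidden size 2.
hidden = [[0.0, 0.0] for _ in range(5)]
steer = [1.0, -1.0]

prompt_only_out = add_steering(hidden, steer, prompt_len=3, prompt_only=True)
full_seq_out = add_steering(hidden, steer, prompt_len=3, prompt_only=False)
```

In a real transformer, generated tokens would still attend to the steered prompt positions, so a prompt-only intervention can plausibly shape the output even though the generated positions are never modified directly.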

IMPACT New techniques for inference-time LLM control could enable more nuanced and reliable AI applications by improving steering accuracy and reducing performance degradation.

RANK_REASON Three new arXiv papers introduce novel methods for controlling LLM behavior at inference time.


AI-generated summary · Google Gemini · from 5 sources.

COVERAGE [5]

  1. arXiv cs.LG TIER_1 English(EN) · Yuntai Bao, Qinfeng Li, Xinyan Yu, Xuhong Zhang, Ge Su, Wenqi Zhang, Liu Yan, Haiqin Weng, Jianwei Yin ·

    Towards Steering without Sacrifice: Principled Training of Steering Vectors for Prompt-only Interventions

    arXiv:2605.05983v1 · Announce Type: new · Abstract: Recently, steering vectors (SVs) have emerged as an effective and lightweight approach to steer behaviors of large language models (LLMs), among which fine-tuned SVs are more effective than optimization-free ones. However, current a…

  2. arXiv cs.LG TIER_1 English(EN) · Zehao Jin, Ruixuan Deng, Junran Wang, Xinjie Shen, Chao Zhang ·

    Beyond Steering Vector: Flow-based Activation Steering for Inference-Time Intervention

    arXiv:2605.05892v1 · Announce Type: cross · Abstract: Activation steering has emerged as a promising alternative for controlling language-model behavior at inference time by modifying intermediate representations while keeping model parameters frozen. However, large-scale evaluations…

  3. arXiv cs.CL TIER_1 English(EN) · Haoyan Luo, Mateo Espinosa Zarlenga, Mateja Jamnik ·

    Don't Lose Focus: Activation Steering via Key-Orthogonal Projections

    arXiv:2605.06342v1 · Announce Type: new · Abstract: Activation steering controls LLM behaviour towards target behaviour by intervening in internal representations, yet it often degrades reasoning and retrieval performance. We argue that a primary cause of this trade-off is attention …

  4. arXiv cs.CL TIER_1 English(EN) · Mateja Jamnik ·

    Don't Lose Focus: Activation Steering via Key-Orthogonal Projections

    Activation steering controls LLM behaviour towards target behaviour by intervening in internal representations, yet it often degrades reasoning and retrieval performance. We argue that a primary cause of this trade-off is attention rerouting: steering vectors alter query-key matc…

  5. arXiv cs.CL TIER_1 English(EN) · Chao Zhang ·

    Beyond Steering Vector: Flow-based Activation Steering for Inference-Time Intervention

    Activation steering has emerged as a promising alternative for controlling language-model behavior at inference time by modifying intermediate representations while keeping model parameters frozen. However, large-scale evaluations such as AxBench show that existing steering metho…
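
The "attention rerouting" concern raised in the Key-Orthogonal Projections abstract can be illustrated with a small linear-algebra sketch. This is not the SKOP algorithm, whose details the excerpts above do not give. It assumes, purely for illustration, that a steering vector is added directly to an attention query and that removing its components along a set of key vectors leaves the query-key dot products for those keys unchanged. The helpers `dot`, `orthonormal_basis`, and `project_out` are hypothetical.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def orthonormal_basis(vectors: List[Vector], tol: float = 1e-10) -> List[Vector]:
    """Gram-Schmidt: orthonormal basis for the span of `vectors`."""
    basis: List[Vector] = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = dot(w, w) ** 0.5
        if norm > tol:  # skip vectors already in the span
            basis.append([wi / norm for wi in w])
    return basis


def project_out(steer: Vector, keys: List[Vector]) -> Vector:
    """Remove from `steer` every component lying in the span of `keys`."""
    out = list(steer)
    for b in orthonormal_basis(keys):
        c = dot(out, b)
        out = [oi - c * bi for oi, bi in zip(out, b)]
    return out


# Toy data (assumed): two keys and one query in a 3-dimensional head space.
keys = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
query = [0.2, 0.4, 0.1]
steer = [0.5, -2.0, 3.0]

steer_perp = project_out(steer, keys)
naive_query = [q + s for q, s in zip(query, steer)]
safe_query = [q + s for q, s in zip(query, steer_perp)]

logits_before = [dot(query, k) for k in keys]
logits_naive = [dot(naive_query, k) for k in keys]  # changed by steering
logits_safe = [dot(safe_query, k) for k in keys]    # same as logits_before
```

With the unprojected vector the query-key scores shift, which is one way steering could redirect attention. With the projected vector the scores for these keys are preserved, while the component orthogonal to the keys is still applied.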