PulseAugur

Contrastive Activation Addition (CAA)

PulseAugur coverage of Contrastive Activation Addition (CAA): every cluster that mentions it across labs, papers, and developer communities, ranked by signal.

Total · 30d: 2 (2 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 2 (2 over 90d)
SENTIMENT · 30D: 2 days with sentiment data

RECENT · PAGE 1/1 · 2 TOTAL
  1. RESEARCH · CL_45701

    Nous Research steers LLM refusals by targeting 0.1% of neurons

    Researchers at Nous Research have developed a new method called Contrastive Neuron Attribution (CNA) to identify and steer specific neurons within large language models that are responsible for refusing harmful requests…

  2. RESEARCH · CL_41755

    Persona vectors reduce AI sycophancy, study finds

    Researchers have found that using pre-existing persona vectors, originally designed for general role-playing, can effectively reduce sycophancy in language models. These persona vectors, when steering models towards dou…