ENTITY
Contrastive Activation Addition (CAA)
PulseAugur coverage of Contrastive Activation Addition (CAA): every cluster mentioning CAA across labs, papers, and developer communities, ranked by signal.
Total · 30d: 2 (2 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 2 (2 over 90d)
SENTIMENT · 30D
2 days with sentiment data
RECENT · PAGE 1/1 · 2 TOTAL
- Nous Research steers LLM refusals by targeting 0.1% of neurons
Researchers at Nous Research have developed a new method called Contrastive Neuron Attribution (CNA) to identify and steer specific neurons within large language models that are responsible for refusing harmful requests…
- Persona vectors reduce AI sycophancy, study finds
Researchers have found that using pre-existing persona vectors, originally designed for general role-playing, can effectively reduce sycophancy in language models. These persona vectors, when steering models towards dou…