PulseAugur

eNTK eigenanalysis surfaces features in trained neural networks

Researchers have demonstrated that eigenanalysis of the empirical Neural Tangent Kernel (eNTK) can reveal feature directions within trained neural networks. The method was tested on a 1-layer MLP and a 1-layer Transformer, showing that the top eigenspaces of the eNTK align with ground-truth or otherwise interpretable features. On a pretrained language model, Gemma-3-270M, eNTK eigendirections aligned with grammatical features better than PCA on model activations, suggesting eNTK eigenanalysis as a tool for mechanistic interpretability.
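The computation at the heart of this result can be sketched in a few lines. The following is a minimal illustration, not the paper's actual setup: it forms the eNTK Gram matrix of a tiny randomly initialized 1-layer MLP by stacking per-example parameter gradients, then eigendecomposes it. All sizes, data, and names here are assumptions for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-layer MLP with scalar output: f(x) = v . relu(W x)
d, h, n = 4, 8, 16              # input dim, hidden width, num examples (arbitrary)
W = rng.normal(size=(h, d))
v = rng.normal(size=h)
X = rng.normal(size=(n, d))

def per_example_grad(x):
    """Gradient of f(x) w.r.t. all parameters (W and v), flattened."""
    pre = W @ x                      # pre-activations, shape (h,)
    act = np.maximum(pre, 0.0)       # relu
    mask = (pre > 0).astype(float)   # relu derivative
    g_v = act                        # df/dv
    g_W = np.outer(v * mask, x)      # df/dW
    return np.concatenate([g_W.ravel(), g_v])

# Empirical NTK: K[i, j] = <grad_theta f(x_i), grad_theta f(x_j)>
G = np.stack([per_example_grad(x) for x in X])   # (n, num_params)
K = G @ G.T                                      # (n, n) PSD Gram matrix

# Eigenanalysis: eigenvectors of K are directions over the dataset;
# the paper's claim is that the top eigenspaces align with learned features.
eigvals, eigvecs = np.linalg.eigh(K)             # ascending order
top3 = eigvecs[:, ::-1][:, :3]                   # top-3 eNTK eigendirections
print("top eigenvalues:", eigvals[::-1][:3])
```

Note that `np.linalg.eigh` returns eigenvalues in ascending order, hence the reversal; for realistic model sizes one would compute per-example gradients with an autodiff framework rather than by hand.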

Summary written by gemini-2.5-flash-lite from 1 source.

IMPACT Introduces a novel technique for understanding internal model representations, potentially aiding in interpretability research.

RANK_REASON Academic paper detailing a new method for analyzing neural network features.

Read on arXiv cs.LG →

COVERAGE [1]

  1. arXiv cs.LG TIER_1 · Jennifer Lin

    Feature Identification via the Empirical NTK

    arXiv:2510.00468v4 Announce Type: replace Abstract: We provide evidence that eigenanalysis of the empirical neural tangent kernel (eNTK) can surface feature directions in trained neural networks. Across three increasingly realistic settings -- a 1-layer MLP trained on modular add…