Researchers have introduced a theory of susceptibilities for interpreting neural networks, drawing parallels to statistical mechanics. The theory defines the susceptibility of an observable to a data perturbation as the derivative of its posterior expectation, which, via the fluctuation-dissipation theorem, equals a posterior covariance. Different choices of observable yield distinct diagnostic objects, such as the influence matrix for per-sample losses and the structural susceptibility matrix for component-localized observables, offering insight into model behavior and the geometry of the loss landscape.
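The covariance form of the susceptibility lends itself to a simple Monte Carlo estimator. The sketch below is a hypothetical illustration, not the paper's code: it assumes we already have posterior draws of per-sample losses (e.g. from an SGLD sampler), an observable evaluated on the same draws, and an inverse temperature `beta`; the function names `susceptibility` and `influence_matrix` are invented for this example.

```python
import numpy as np

def susceptibility(obs, losses, beta=1.0):
    """Susceptibility of `obs` to upweighting each sample's loss.

    Under the fluctuation-dissipation relation, the derivative of the
    posterior expectation E[obs] with respect to a perturbation
    eps_i * loss_i is -beta * Cov(obs, loss_i) under the unperturbed
    posterior, so we estimate it from centered posterior draws.
    """
    obs_c = obs - obs.mean()                 # center the observable
    loss_c = losses - losses.mean(axis=0)    # center each per-sample loss
    return -beta * (obs_c @ loss_c) / len(obs)

def influence_matrix(losses, beta=1.0):
    """Influence matrix: choosing the observables to be the per-sample
    losses themselves gives the matrix -beta * Cov(loss_i, loss_j)."""
    loss_c = losses - losses.mean(axis=0)
    return -beta * (loss_c.T @ loss_c) / losses.shape[0]

# Toy usage with synthetic "posterior draws": 1000 draws, 5 data points.
rng = np.random.default_rng(0)
losses = rng.normal(size=(1000, 5))
obs = losses[:, 0] + 0.1 * rng.normal(size=1000)  # observable tracks sample 0
chi = susceptibility(obs, losses)  # largest magnitude at index 0
M = influence_matrix(losses)       # symmetric, negative diagonal
```

With the observable constructed to track the first sample's loss, the estimated susceptibility is largest in magnitude for that sample, and the influence matrix is symmetric with a negative diagonal (each loss is maximally responsive to its own perturbation).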
Summary written by gemini-2.5-flash-lite from 1 source.
IMPACT Introduces a novel theoretical framework for understanding neural network behavior, potentially aiding in model interpretability and debugging.
RANK_REASON The cluster contains a single academic paper detailing a new theoretical framework for interpreting neural networks.