Brief

last 24h

[2/2] 224 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

RESEARCH · arXiv cs.AI English(EN) · 22h · [2 sources]

Adaptive inference and function vectors in deep transformers

Researchers propose a theory of the internal workings of deep transformers that treats them as mean-field interacting systems performing distributed inference. The theory introduces "function vectors" as internal state representations that let a transformer infer latent context variables at progressively finer scales as activations pass through its layers. The work argues that transformer depth and feedforward blocks enable more sophisticated in-context learning algorithms than previously understood.

IMPACT Provides a theoretical framework for understanding and potentially improving the in-context learning capabilities of deep transformer models.

RESEARCH · arXiv cs.CL English(EN) · 1w · [2 sources]

Fast & Faithful Function Vectors

Researchers have developed a new method for constructing function vectors (FVs) that steer Large Language Models (LLMs) during in-context learning. The study explores variations in how FVs are defined, focusing on two choices: which attention heads are selected and how the steering is applied. Using gradient-based attributions with Layer-wise Relevance Propagation (LRP) for head selection and a distributed approach for steering, the method is reported to improve both the efficiency and the accuracy of guiding LLMs.

IMPACT Introduces a more efficient and accurate method for controlling LLM behavior, potentially improving performance on various downstream tasks.
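
For readers unfamiliar with the term, the sketch below illustrates one common way a function vector is built and applied: average each selected attention head's output over a set of prompts, sum those averages, and add the result to a hidden state. This is a toy illustration and not the paper's method. The LRP-based head selection and the distributed steering described above are not reproduced here; the head set is simply given, and all names (`build_function_vector`, `steer`, `scale`) and numbers are hypothetical.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Head = Tuple[int, int]  # (layer index, head index); hypothetical identifiers


def mean_vector(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def build_function_vector(
    head_outputs: Dict[Head, List[Vector]], selected: List[Head]
) -> Vector:
    """Sum the per-head mean outputs over the selected heads.

    Assumption: the selected heads are supplied by some attribution
    method; this sketch does not perform the selection itself.
    """
    means = [mean_vector(head_outputs[head]) for head in selected]
    return [sum(column) for column in zip(*means)]


def steer(hidden: Vector, fv: Vector, scale: float = 1.0) -> Vector:
    """Add the scaled function vector to a single hidden state."""
    return [h + scale * f for h, f in zip(hidden, fv)]


# Toy data: each head has one 2-dimensional output per prompt (two prompts).
head_outputs = {
    (0, 1): [[1.0, 0.0], [3.0, 2.0]],
    (2, 0): [[0.0, 4.0], [2.0, 0.0]],
    (3, 5): [[9.0, 9.0], [9.0, 9.0]],  # not selected, so it is ignored
}
selected = [(0, 1), (2, 0)]

fv = build_function_vector(head_outputs, selected)
steered = steer([0.5, 0.5], fv, scale=0.5)
print(fv, steered)
```

In a real model the head outputs would come from forward passes over in-context prompts, and the vector would be added to the residual stream at one or more layers; both choices are exactly what the study varies.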