PulseAugur

New method recovers input text from LLM hidden states

A new study describes a method for recovering input text from the last-layer hidden states of decoder-only language models. The approach treats inversion as a continuous embedding-space optimization: a soft proxy is driven toward the leaked target, and projection onto hard tokens happens only at the end. The study finds that content-bearing tokens are recovered almost perfectly, while space-prefixed, high-frequency function words, which sit in dense regions of the embedding space, are the tokens most likely to break a reconstruction. Because the formulation is continuous, the optimization can be observed as it runs and failures can be detected. The study concludes that the last-layer hidden states of GPT-2 should be treated as being as sensitive as the original text.
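The idea of optimizing in embedding space and projecting to tokens only at the end can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy "model" (a causal mix followed by tanh), the embedding table, the squared-error loss, the plain gradient-descent loop, and the names hidden_states and invert. It is not GPT-2 and not the paper's actual procedure, which this summary does not spell out.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, DIM, SEQ = 50, 16, 6

# Toy stand-ins (assumptions, not GPT-2): a random embedding table and an
# orthogonal matrix playing the role of the network's weights.
EMB = 0.3 * rng.standard_normal((VOCAB, DIM))
W, _ = np.linalg.qr(rng.standard_normal((DIM, DIM)))
# Causal mixing: position t sees itself plus half of position t-1.
C = np.eye(SEQ) + 0.5 * np.eye(SEQ, k=-1)


def hidden_states(x):
    """Toy 'last-layer' map from input embeddings (SEQ, DIM) to hidden states."""
    return np.tanh(C @ x @ W)


def invert(target, steps=2000, lr=0.2):
    """Recover token ids from leaked hidden states by continuous optimization."""
    x = np.zeros((SEQ, DIM))  # soft proxy: free embeddings, not tied to tokens
    losses = []
    for _ in range(steps):
        h = hidden_states(x)
        resid = h - target
        losses.append(0.5 * float(np.sum(resid ** 2)))
        # Analytic gradient of the squared error through tanh(C @ x @ W).
        grad_z = resid * (1.0 - h ** 2)
        x -= lr * (C.T @ grad_z @ W.T)
    # Hard-token projection happens only once, after optimization finishes.
    dists = np.linalg.norm(x[:, None, :] - EMB[None, :, :], axis=-1)
    return dists.argmin(axis=1), losses


secret = np.array([3, 17, 42, 8, 25, 11])   # hypothetical input token ids
leaked = hidden_states(EMB[secret])         # what an attacker is assumed to see
recovered, losses = invert(leaked)
print("recovered:", recovered.tolist())
print("loss: %.3g -> %.3g" % (losses[0], losses[-1]))
```

Because the proxy stays continuous until the final nearest-neighbour step, the loss curve in losses can be inspected during the run, which is the sense in which the optimization is observable and a stalled reconstruction is detectable. In this toy setting, tokens whose embeddings lie close together would be the easiest to confuse at the projection step.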

IMPACT Highlights potential vulnerabilities in LLM privacy and security by demonstrating input text recovery from hidden states.

RANK_REASON Academic paper detailing a new method for recovering input text from language model hidden states.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. arXiv cs.AI TIER_1 English (EN) · Maciej Witold Majewski

    Recovering Input Text from Hidden States: Study of Gradient-Based Inversion of Decoder-Only Language Models

    This work studies the hidden-state inversion problem: recovering the original input token sequence of a decoder-only language model from its last-layer hidden states. Rather than treating inversion as a one-shot reconstruction, we study it as a continuous embedding-space optimisa…