Echelon: Auditable Aggregate-Only Language-Model Adaptation Across Privacy Boundaries
Two new research papers introduce methods for enhancing privacy in large language model (LLM) adaptation and generation. Echelon focuses on auditable, aggregate-only adaptation across privacy boundaries, ensuring that device-level model states are never exported. Privacy-Aware Decoding (PAD) is an inference-time technique that injects calibrated noise into token logits to prevent private information from leaking in Retrieval-Augmented Generation (RAG) systems. Both approaches aim to balance model utility with stringent privacy requirements without requiring model retraining.
AI IMPACT: These methods offer new ways to deploy LLMs in sensitive environments by addressing privacy concerns during adaptation and generation.
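
The idea of injecting noise into token logits at inference time can be illustrated with a minimal sketch. This is not the PAD algorithm from the paper: it assumes independent Gaussian noise with a fixed, hand-picked standard deviation `sigma`, whereas the summary above only says the noise is "calibrated" and does not describe how. The function names (`noisy_logits`, `sample_token`) and the example logits are hypothetical.

```python
import math
import random

def noisy_logits(logits, sigma, rng):
    """Return a copy of logits with independent Gaussian noise added to each entry.

    Assumption: fixed sigma; a real system would calibrate the noise scale.
    """
    return [x + rng.gauss(0.0, sigma) for x in logits]

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(logits, sigma, rng):
    """Perturb the logits, then sample a token index from the resulting distribution."""
    probs = softmax(noisy_logits(logits, sigma, rng))
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

# Hypothetical logits for a three-token vocabulary.
rng = random.Random(0)
logits = [2.0, 1.0, 0.1]
token = sample_token(logits, sigma=0.5, rng=rng)
```

A larger `sigma` flattens the influence of any single logit on the sampled token, which is the utility-versus-privacy trade-off the summary refers to. Only the decoding step changes here; the model weights are untouched.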