The Reservoir Attention Network: Cross-Pass State in Pretrained Transformers via Content-Addressable Reservoir Injection
Researchers have introduced the Reservoir Attention Network (RAN), an architecture that adds cross-pass state to pretrained transformers. RAN injects a fixed, randomly initialized reservoir into the mid-layer attention mechanism, so that state can be carried from one forward pass to the next. Experiments on models ranging from GPT-2 to Qwen2.5 suggest that untrained recurrent dynamics can manage cross-pass state. Trained recurrence, aimed at more complex agent capabilities, is left to future work.
AI IMPACT: Introduces a method for managing state across forward passes in transformers, which could improve agent capabilities.
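
To make the idea of a fixed reservoir that carries state between forward passes more concrete, here is a minimal toy sketch in plain Python. It is not the RAN implementation: the names `Reservoir`, `forward_pass`, `leak`, and `n_slots` are hypothetical, the leaky tanh update is borrowed from echo state networks as an assumption, and the "mid-layer injection" is reduced to adding a softmax-weighted read of stored reservoir states to each token vector. Nothing in the sketch is trained.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


class Reservoir:
    """Hypothetical fixed random reservoir; weights are drawn once and never trained."""

    def __init__(self, d_model, n_slots, leak=0.5, seed=0):
        rng = random.Random(seed)
        scale = 1.0 / math.sqrt(d_model)
        # Assumption: Gaussian recurrent and input weights, frozen after init.
        self.w_rec = [[rng.gauss(0.0, scale) for _ in range(d_model)]
                      for _ in range(d_model)]
        self.w_in = [[rng.gauss(0.0, scale) for _ in range(d_model)]
                     for _ in range(d_model)]
        self.leak = leak
        self.n_slots = n_slots
        self.state = [0.0] * d_model
        self.slots = []  # most recent reservoir states, oldest first

    def write(self, summary):
        """Leaky tanh update (echo-state style), then store the new state."""
        pre = [r + i for r, i in zip(matvec(self.w_rec, self.state),
                                     matvec(self.w_in, summary))]
        self.state = [(1.0 - self.leak) * s + self.leak * math.tanh(p)
                      for s, p in zip(self.state, pre)]
        self.slots.append(list(self.state))
        self.slots = self.slots[-self.n_slots:]

    def read(self, query):
        """Content-addressable read: softmax attention over stored states."""
        if not self.slots:
            return [0.0] * len(query)
        scale = math.sqrt(len(query))
        scores = [sum(q * k for q, k in zip(query, slot)) / scale
                  for slot in self.slots]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        return [sum(w * slot[j] for w, slot in zip(weights, self.slots)) / total
                for j in range(len(query))]


def forward_pass(hidden, reservoir):
    """Stand-in for one forward pass: inject the read, then update the reservoir."""
    injected = [[h + r for h, r in zip(token, reservoir.read(token))]
                for token in hidden]
    # Summarize this pass as the mean token vector and write it to the reservoir.
    summary = [sum(column) / len(injected) for column in zip(*injected)]
    reservoir.write(summary)
    return injected


reservoir = Reservoir(d_model=4, n_slots=8)
first = forward_pass([[1.0, 0.0, 0.0, 0.0]], reservoir)
second = forward_pass([[1.0, 0.0, 0.0, 0.0]], reservoir)
```

In this toy setup the first pass sees an empty reservoir and leaves its input unchanged, while the second pass on the same input is shifted by the state written during the first. That difference is the cross-pass state; in an actual pretrained transformer the read would feed into a mid-layer attention computation rather than a plain vector addition.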