Alibi
PulseAugur coverage of Alibi: every cluster mentioning Alibi across labs, papers, and developer communities, ranked by signal.
- Accumulated transformations improve LLM length extrapolation, but degrade at extremes
Researchers have investigated the extrapolation capabilities of accumulated transformations in attention mechanisms, specifically examining how replacing RoPE's position-indexed rotations with accumulated data-dependent…
- xFormers library enables memory-efficient Transformer models on GPUs
This tutorial demonstrates how to build memory-efficient Transformer models using the xFormers library on GPUs. It covers implementing and comparing memory-efficient attention with standard attention, analyzing techniqu…
- Jordan-RoPE: Non-Semisimple Relative Positional Encoding via Complex Jordan Blocks
Researchers have introduced Jordan-RoPE, a novel relative positional encoding method for transformer models that uses complex Jordan blocks. This approach generates oscillatory-polynomial features, enabling a distan…
- Eugene Yan shares guide to running weekly AI paper club for learning communities
Eugene Yan details a successful weekly paper club that has met for 18 months, discussing at least 80 AI-related papers. The club focuses on foundational concepts, models, training, and inference techniques within machin…