Researchers have developed a method called absorption-aware correction that improves the pruning of large language models (LLMs). The technique characterizes how different layers in LLMs respond to perturbations, finding that early layers tend to amplify them while later layers absorb them. Incorporating this finding into existing pruning techniques such as OWL and AlphaPruning yields a 7.13% reduction in perplexity and a 1.02% increase in zero-shot accuracy at 70% sparsity across various model families.
AI IMPACT Enhances LLM efficiency by improving pruning techniques, leading to better performance at higher sparsity levels.
RANK_REASON Academic paper published on arXiv detailing a new method for LLM pruning.
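The idea in the summary (measure whether a layer amplifies or absorbs a perturbation, then let that shape how aggressively each layer is pruned) can be illustrated with a toy sketch. This is not the paper's algorithm: the functions `layer_gains` and `corrected_sparsity`, the `strength` parameter, the log-gain score, and the tanh MLP are all hypothetical choices made for illustration, and the summary does not say how the actual correction is computed.

```python
import numpy as np

def layer_gains(weights, x, eps=1e-3, seed=0):
    """Per-layer ratio ||f(h + d) - f(h)|| / ||d|| for a toy tanh MLP (assumed setup)."""
    rng = np.random.default_rng(seed)
    gains = []
    h = x
    for W in weights:
        d = rng.standard_normal(h.shape)
        d *= eps / np.linalg.norm(d)           # random perturbation of norm eps
        clean = np.tanh(h @ W)
        noisy = np.tanh((h + d) @ W)
        gains.append(np.linalg.norm(noisy - clean) / eps)
        h = clean                              # propagate the clean activation
    return np.array(gains)

def corrected_sparsity(base, gains, target, strength=0.05, max_sparsity=0.95):
    """Hypothetical correction: prune absorbing layers (gain < 1) more and
    amplifying layers (gain > 1) less, keeping the mean sparsity at `target`."""
    score = -np.log(np.maximum(gains, 1e-12))  # positive when the layer absorbs
    score = score - score.mean()
    scale = np.abs(score).max()
    if scale > 0:
        score = score / scale                  # normalize to [-1, 1]
    s = base + strength * score
    s = s - s.mean() + target                  # restore the global sparsity budget
    # Clipping can shift the mean slightly if a layer hits a bound.
    return np.clip(s, 0.0, max_sparsity)

# Hypothetical toy stack; the weight scales are arbitrary.
rng = np.random.default_rng(1)
weights = [rng.standard_normal((16, 16)) * s for s in (0.6, 0.6, 0.1, 0.1)]
x = rng.standard_normal(16)

gains = layer_gains(weights, x)
sparsity = corrected_sparsity(np.full(4, 0.7), gains, target=0.7)
print("gains:   ", np.round(gains, 3))
print("sparsity:", np.round(sparsity, 3))
```

In this sketch the baseline allocation is uniform at 70%; with OWL or AlphaPruning the `base` array would instead hold the per-layer sparsities those methods produce, and the correction would be applied on top.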
- absorption-aware correction
- AlphaPruning
- alphaXiv
- arXiv
- CatalyzeX
- DagsHub
- Gotit.pub
- Hugging Face
- large language models
- ScienceCast
AI-generated summary · Google Gemini · from 1 source.