A new research paper explores the trade-offs among performance, energy consumption, and privacy when running large language models (LLMs) on mobile devices. The authors built an experimental pipeline to measure these factors on an Android device and used it to test eight LLMs. The findings indicate that model architecture, rather than quantization, is the main driver of energy efficiency, and that Mixture-of-Experts models show promise for balancing storage and power usage.
AI IMPACT Quantifies the energy and performance costs of running LLMs on edge devices, guiding future model optimization for mobile deployment.
RANK_REASON The cluster contains an academic paper detailing empirical research on LLM performance trade-offs.
AI-generated summary · Google Gemini · from 1 source.