Brief

last 24h

[3/3] 222 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.
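
The brief does not publish its scoring formula, so the following is only a minimal sketch of how four such components could be combined into a 0–100 score. The weights, the 24-hour half-life, and the function names are all hypothetical.

```python
# Hypothetical weights: the brief does not state how components are combined.
WEIGHTS = {"authority": 0.35, "cluster": 0.25, "headline": 0.15, "recency": 0.25}
HALF_LIFE_HOURS = 24.0  # assumed: the recency component halves every 24 hours


def recency(age_hours: float) -> float:
    """Exponential time decay in [0, 1]; 1.0 for a brand-new story."""
    return 0.5 ** (max(age_hours, 0.0) / HALF_LIFE_HOURS)


def score(authority: float, cluster: float, headline: float, age_hours: float) -> float:
    """Weighted sum of components (each clamped to [0, 1]), scaled to 0-100."""
    parts = {
        "authority": authority,
        "cluster": cluster,
        "headline": headline,
        "recency": recency(age_hours),
    }
    raw = sum(WEIGHTS[name] * min(max(value, 0.0), 1.0) for name, value in parts.items())
    return round(100.0 * raw, 1)
```

Under these assumptions, a story with identical authority, cluster strength, and headline signal scores lower at four days old than at four hours old, purely because of the decay term.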

TOOL · arXiv cs.AI English (EN) · 4d

Sustainability Is Not Linear: Quantifying Performance, Energy, and Privacy Trade-offs in On-Device Intelligence

A new paper examines the trade-offs among performance, energy consumption, and privacy when running large language models (LLMs) on mobile devices. The authors built an experimental pipeline to measure these factors on an Android device and tested eight LLMs. They report that model architecture, rather than quantization, is the main driver of energy efficiency, and that Mixture-of-Experts models show promise for balancing storage and power usage.

IMPACT Quantifies the energy and performance costs of running LLMs on edge devices, guiding future model optimization for mobile deployment.
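
For readers unfamiliar with how energy cost is usually normalized in this kind of measurement, here is a minimal sketch of the arithmetic: energy in joules is average power times duration, divided by the number of generated tokens. The function name and the example numbers are illustrative assumptions, not figures from the paper.

```python
def energy_per_token(avg_power_watts: float, duration_s: float, tokens_generated: int) -> float:
    """Joules per generated token: (watts * seconds) / tokens."""
    if tokens_generated <= 0:
        raise ValueError("tokens_generated must be positive")
    return avg_power_watts * duration_s / tokens_generated


# Made-up example inputs: 3.0 W average draw over 20 s while generating 120 tokens.
example = energy_per_token(3.0, 20.0, 120)  # 0.5 J per token
```

Normalizing per token makes runs of different lengths comparable, which matters when comparing models that generate at different speeds.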

TOOL · arXiv cs.LG English (EN) · 4d

MemReward: Graph-Based Experience Memory for LLM Reward Prediction with Limited Labels

Researchers have developed MemReward, a graph-based framework for improving reinforcement learning of large language models (LLMs) when labeled data is scarce. The method uses a graph neural network (GNN) to propagate reward signals from a small set of labeled examples to a larger pool of unlabeled ones. In experiments covering mathematics, question answering, and code generation, MemReward comes close to the performance of an oracle with access to fully labeled data while using labels for only 20% of the data.

IMPACT Enables more efficient fine-tuning of LLMs in data-scarce environments, potentially accelerating development across various AI applications.
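
To illustrate the general idea of spreading rewards from labeled to unlabeled examples over a graph, here is a minimal sketch. It deliberately swaps in plain iterative label propagation instead of a GNN, so it is not MemReward's method; the graph format, function name, and default values are assumptions made for illustration.

```python
def propagate_rewards(adj, labeled, iters=50, default=0.5):
    """Spread known rewards over a weighted similarity graph.

    adj:     dict mapping node -> {neighbor: edge weight}
    labeled: dict mapping node -> known reward (kept fixed on every step)
    Unlabeled nodes repeatedly take the weighted mean of their neighbors.
    """
    rewards = {node: labeled.get(node, default) for node in adj}
    for _ in range(iters):
        updated = {}
        for node, neighbors in adj.items():
            if node in labeled:
                updated[node] = labeled[node]  # clamp labeled nodes
                continue
            total = sum(neighbors.values())
            if total == 0:
                updated[node] = rewards[node]  # isolated node: keep current value
            else:
                updated[node] = sum(w * rewards[m] for m, w in neighbors.items()) / total
        rewards = updated
    return rewards


# Toy chain a - b - c - d with only the endpoints labeled.
chain = {
    "a": {"b": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"b": 1.0, "d": 1.0},
    "d": {"c": 1.0},
}
estimates = propagate_rewards(chain, {"a": 1.0, "d": 0.0})
```

On this toy chain the unlabeled nodes settle at values between their labeled neighbors, with `b` ending closer to the positive label than `c`. A learned GNN replaces the fixed averaging rule with trainable message passing.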

RESEARCH · arXiv cs.AI English (EN) · 5d · [2 sources]

Search-E1: Self-Distillation Drives Self-Evolution in Search-Augmented Reasoning

Researchers have introduced Search-E1, a self-evolution method for search-augmented reasoning agents that avoids complex external supervision. The approach combines vanilla GRPO with offline self-distillation (OFSD) so that agents can improve on their own outputs. With the Qwen2.5-3B model, it achieved an average exact-match (EM) score of 0.440 across seven QA benchmarks, outperforming existing open-source baselines.

IMPACT Simplifies training for search-augmented reasoning agents, potentially making them more accessible and efficient.
- Qwen2.5-3B
- Search-E1
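
As background for the two terms in the summary above, here is a minimal sketch of an exact-match check with the answer normalization commonly used in QA evaluation, and of the group-relative advantage that GRPO computes over several sampled answers to the same question. Using EM as the reward, the population standard deviation, and the `eps` value are assumptions; the summary does not describe Search-E1's reward or the OFSD step, which is not shown here.

```python
import re
import string
from statistics import mean, pstdev


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, golds: list[str]) -> float:
    """1.0 if the normalized prediction equals any normalized gold answer."""
    return float(any(normalize(prediction) == normalize(g) for g in golds))


def group_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """GRPO-style advantage: each reward standardized within its sample group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled answers to one question, scored by exact match (assumed reward).
golds = ["Eiffel Tower"]
samples = ["The Eiffel Tower.", "Louvre", "Arc de Triomphe", "eiffel tower"]
rewards = [exact_match(s, golds) for s in samples]
advantages = group_advantages(rewards)
```

Correct samples receive positive advantages and incorrect ones negative, so the policy update needs no separate value model. If every sample in a group earns the same reward, all advantages are zero and that group contributes no learning signal.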
