Brief · PulseAugur

RESEARCH · 量子位 (QbitAI) · Chinese (ZH) · 8h

Hugging Face's CEO strongly recommends it, and Bengio's team is betting on the same idea: why is this HRM model, trained for $1,500, so popular?

HRM-Text, a new language model developed by Sapient Intelligence, is drawing attention for an architecture that focuses on internal reasoning rather than on simply increasing model size or training data. With only 1 billion parameters and a training cost of approximately $1,500, the model has achieved strong scores on benchmarks such as MATH and GSM8K. The architecture, called the Hierarchical Reasoning Model (HRM), emphasizes latent reasoning: the model performs multi-round, hierarchical, recursive computation within its internal state before producing an output, a concept also explored in research by Yoshua Bengio's team.
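
To make the idea of latent reasoning concrete, here is a minimal toy sketch of a two-level recurrent loop that refines hidden state for several rounds and decodes only once at the end. It is an illustration under stated assumptions, not Sapient Intelligence's implementation: the function names (`latent_reasoning`, `matvec_tanh`, `random_matrix`), the fast/slow split between the two states, the `tanh` update rule, and the randomly initialized weights are all hypothetical stand-ins.

```python
import math
import random


def matvec_tanh(W, v):
    """Multiply matrix W (a list of rows) by vector v, then apply tanh."""
    return [math.tanh(sum(w * a for w, a in zip(row, v))) for row in W]


def random_matrix(rng, rows, cols, scale=0.3):
    """Random weights standing in for trained parameters (assumption)."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def latent_reasoning(x, n_cycles=3, low_steps=4, seed=0):
    """Toy hierarchical loop: iterate on hidden state, decode once at the end.

    x is a list of floats. Returns (scalar output, number of low-level steps).
    """
    dim = len(x)
    rng = random.Random(seed)
    W_low = random_matrix(rng, dim, 3 * dim)   # reads z_low, z_high, x
    W_high = random_matrix(rng, dim, 2 * dim)  # reads z_high, z_low
    w_out = [rng.gauss(0.0, 0.3) for _ in range(dim)]

    z_low = [0.0] * dim   # fast-changing low-level state
    z_high = [0.0] * dim  # slow-changing high-level state
    steps = 0
    for _ in range(n_cycles):          # multiple rounds of reasoning
        for _ in range(low_steps):     # inner loop under a fixed high-level state
            z_low = matvec_tanh(W_low, z_low + z_high + x)
            steps += 1
        # The high-level state updates once per cycle from the low-level result.
        z_high = matvec_tanh(W_high, z_high + z_low)

    # Only now is an output produced; no intermediate tokens were emitted.
    y = sum(w * z for w, z in zip(w_out, z_high))
    return y, steps


if __name__ == "__main__":
    y, steps = latent_reasoning([1.0, 0.5, -0.5, 0.0])
    print(f"output after {steps} latent steps: {y:.4f}")
```

The point of the sketch is that compute is spent inside the loop rather than in more parameters: raising `n_cycles` or `low_steps` adds reasoning rounds without changing the size of the weight matrices.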

IMPACT This model's focus on internal reasoning could shift future LLM development towards more efficient computation over sheer scale.

Yoshua Bengio
Transformer
GSM8K
DROP
ARC-Challenge
HRM-Text
Clem Delangue
GRAM
Sapient Intelligence
HRM-Symbolic