Praised by Hugging Face's CEO, and built on an idea Bengio's team is also betting on: why is this HRM model, trained for about $1,500, so popular?
HRM-Text, a new language model from Sapient Intelligence, is drawing attention for an architecture that invests in internal reasoning instead of simply scaling up parameters or training data. With only 1 billion parameters and a training cost of approximately $1,500, it has posted strong scores on math benchmarks such as MATH and GSM8K. The architecture, called the Hierarchical Reasoning Model (HRM), centers on latent reasoning: the model runs multiple rounds of hierarchical, recursive computation on its internal state before producing any output. Yoshua Bengio's team has explored the same concept in its own research.
IMPACT: The model's focus on internal reasoning could shift future LLM development toward more efficient computation rather than sheer scale.
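
To make "multi-round, hierarchical, recursive computation on an internal state" more concrete, here is a toy sketch. It is not Sapient Intelligence's implementation: the function name `latent_reasoning`, the scalar states, the two-timescale split, and every weight are assumptions chosen only for illustration. It shows a fast inner state that is refined several times per cycle, a slow outer state that updates once per cycle, and an output that is read only after all internal rounds have finished.

```python
import math


def latent_reasoning(x, n_cycles=3, steps_per_cycle=4):
    """Toy two-timescale recurrent loop over scalar states.

    All names and weights are hypothetical; a real model would use
    learned, high-dimensional modules instead of fixed scalars.
    """
    z_high = 0.0  # slow state, updated once per cycle
    z_low = 0.0   # fast state, updated on every inner step
    low_updates = 0

    for _ in range(n_cycles):
        for _ in range(steps_per_cycle):
            # The fast state is refined using the input and the current slow state.
            z_low = math.tanh(0.5 * z_low + 0.3 * z_high + x)
            low_updates += 1
        # The slow state absorbs the result of the inner refinement.
        z_high = math.tanh(0.5 * z_high + 0.8 * z_low)

    # Nothing is emitted until every internal round has completed.
    return z_high, low_updates


if __name__ == "__main__":
    output, updates = latent_reasoning(0.2)
    print(f"output={output:.4f} after {updates} internal updates")
```

In this sketch, raising `n_cycles` or `steps_per_cycle` adds computation without adding any parameters, which is the trade-off the summary describes: more internal work per answer in place of a larger model.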