
Tencent's Hunyuan-Large model outperforms competitors using less training data

Tencent's Hunyuan-Large model has reportedly surpassed key competitors such as DeepSeek-V2 and Meta's Llama3-405B in benchmark performance. Notably, Hunyuan-Large achieved these results while using significantly less training data, suggesting advances in data efficiency for large language models.

Summary written by gemini-2.5-flash-lite from 1 source.

RANK_REASON A new model release from a major tech company claiming superior benchmark performance.

Read on Smol AINews →

COVERAGE [1]

  1. Smol AINews TIER_1

    Tencent's Hunyuan-Large claims to beat DeepSeek-V2 and Llama3-405B with LESS Data

    **Tencent** released a notable >300B-parameter MoE model pretrained on **7T tokens**, including **1.5T tokens of synthetic data** generated via **Evol-Instruct**. The model introduces novel techniques such as "recycle routing" and expert-specific learning rates, alongside a compute-efficient …