ENTITY Belebele

PulseAugur coverage of Belebele — every cluster mentioning Belebele across labs, papers, and developer communities, ranked by signal.

Total · 30d: 3 (3 over 90d)

Releases · 30d: 0 (0 over 90d)

Papers · 30d: 3 (3 over 90d)

SENTIMENT · 30D

3 days with sentiment data

RECENT · PAGE 1/1 · 3 TOTAL

TOOL · CL_105152 · Jun 22 · 16:32

LangMAP tokenization improves multilingual model performance

Researchers have introduced LangMAP, a novel language-adaptive tokenization approach that generates language-specific tokenization from a single shared vocabulary. This method, based on the UnigramLM algorithm, can be a…

TOOL · CL_58715 · May 29 · 04:00

Multilingual Code-Switching Boosts LLM Performance Across Four Languages

Researchers have explored the impact of multilingual code-switching data (CSD) on large language models (LLMs) across four languages: English, Japanese, Korean, and Chinese. Their experiments demonstrated that incorpora…

RESEARCH · CL_55944 · May 27 · 11:01

New research tackles multilingual adaptation in Mixture-of-Experts models

Two new research papers explore the adaptation of Mixture-of-Experts (MoE) models for multilingual tasks. One paper analyzes how language specialization emerges in MoE models during continual pre-training, finding that …