PulseAugur / Brief

Brief

last 24h
[1/1] 224 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

  1. SkMTEB: Slovak Massive Text Embedding Benchmark and Model Adaptation

    Researchers have introduced SkMTEB, a benchmark for evaluating text embedding models on the Slovak language. It comprises 31 datasets across 7 task types, significantly expanding evaluation coverage for this low-resource language. The study found that large multilingual models performed best, while existing Slovak-specific NLU models did not transfer well to embedding tasks. To address this, the team developed two open-source Slovak embedding models, e5-sk-small and e5-sk-large, which offer performance competitive with proprietary APIs while being locally deployable.

    IMPACT Provides a new evaluation framework and open-source models for Slovak-language AI applications, potentially enabling better semantic search and retrieval-augmented generation (RAG).