PulseAugur
research · [4 sources]

New benchmarks advance tabular ML for imbalanced, string, and multimodal data

Researchers have introduced three new benchmarks to advance tabular machine learning. TILBench evaluates imbalanced learning methods across diverse data characteristics and finds that no single method is universally superior. STRABLE tackles the understudied area of tabular data containing strings, finding that simple string embeddings paired with advanced tabular learners perform well on categorical-dominant tables. MulTaBench focuses on multimodal tabular learning, evaluating text and image data alongside tabular information, and highlights the benefits of task-specific tuning for embeddings.

Summary written by gemini-2.5-flash-lite from 4 sources.

IMPACT Establishes new evaluation frameworks for tabular data, pushing research in imbalanced learning, string handling, and multimodal integration.

RANK_REASON Multiple research papers introduce new benchmarks for tabular machine learning tasks.


COVERAGE [4]

  1. arXiv cs.LG TIER_1 · Jiaqi Luo

    TILBench: A Systematic Benchmark for Tabular Imbalanced Learning Across Data Regimes

    Imbalanced learning remains a fundamental challenge in tabular data applications. Despite decades of research and numerous proposed algorithms, a systematic empirical understanding of how different imbalanced learning methods behave across diverse data characteristics is still la…

  2. arXiv cs.LG TIER_1 · Gaël Varoquaux

    STRABLE: Benchmarking Tabular Machine Learning with Strings

    Benchmarking tabular learning has revealed the benefit of dedicated architectures, pushing the state of the art. But real-world tables often contain string entries, beyond numbers, and these settings have been understudied due to a lack of a solid benchmarking suite. They lead to…

  3. Hugging Face Daily Papers TIER_1

    MulTaBench: Benchmarking Multimodal Tabular Learning with Text and Image

    Tabular Foundation Models have recently established the state of the art in supervised tabular learning, by leveraging pretraining to learn generalizable representations of numerical and categorical structured data. However, they lack native support for unstructured modalities su…

  4. arXiv cs.CV TIER_1 · Roi Reichart

    MulTaBench: Benchmarking Multimodal Tabular Learning with Text and Image

    Tabular Foundation Models have recently established the state of the art in supervised tabular learning, by leveraging pretraining to learn generalizable representations of numerical and categorical structured data. However, they lack native support for unstructured modalities su…