PulseAugur

Embedding models enhance race prediction for uncommon surnames

Researchers have developed an embedding-based approach to probabilistic race prediction that addresses a limitation of existing methods such as Bayesian Improved Surname Geocoding (BISG). Standard BISG relies on Census surname data that omit uncommon surnames, which degrades its performance for a significant portion of the population. The new method, eBISG, uses pre-trained text embeddings and neural networks to estimate race probabilities for names not covered by the Census, with substantial gains reported particularly for Hispanic and Asian individuals.
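To make the coverage gap concrete, here is a minimal sketch of the standard BISG update, which combines a surname-based prior with a geographic likelihood via Bayes' rule: P(race | surname, geo) is proportional to P(race | surname) × P(geo | race). Everything named here is an assumption for illustration: the function names, the category labels, and the toy probabilities are invented, and `uniform_fallback` is only a placeholder marking where an embedding-based model of the kind the summary describes would plug in. It is not the paper's model.

```python
from typing import Callable, Dict, Optional

# Illustrative category labels (an assumption, not taken from the paper).
RACES = ("white", "black", "hispanic", "asian", "other")

Probs = Dict[str, float]


def bisg_posterior(
    surname: str,
    geo_given_race: Probs,
    surname_table: Dict[str, Probs],
    fallback: Optional[Callable[[str], Probs]] = None,
) -> Probs:
    """Return P(race | surname, geo), i.e. P(race | surname) * P(geo | race), normalized."""
    key = surname.strip().upper()
    prior = surname_table.get(key)
    if prior is None:
        # The surname is missing from the table: this is the coverage gap.
        if fallback is None:
            raise KeyError(f"surname {key!r} not in table and no fallback given")
        prior = fallback(key)
    unnormalized = {r: prior[r] * geo_given_race[r] for r in RACES}
    total = sum(unnormalized.values())
    if total <= 0:
        raise ValueError("posterior is undefined: all products are zero")
    return {r: p / total for r, p in unnormalized.items()}


def uniform_fallback(surname: str) -> Probs:
    """Placeholder only; a model over name embeddings would go here (hypothetical)."""
    return {r: 1.0 / len(RACES) for r in RACES}


# Toy numbers invented for illustration; these are not Census figures.
toy_table = {
    "GARCIA": {"white": 0.05, "black": 0.01, "hispanic": 0.92, "asian": 0.01, "other": 0.01},
}
# P(geo | race) for one hypothetical area; it need not sum to 1 across races.
toy_geo = {"white": 0.2, "black": 0.1, "hispanic": 0.4, "asian": 0.2, "other": 0.1}

covered = bisg_posterior("Garcia", toy_geo, toy_table)
uncovered = bisg_posterior("Zzyzx", toy_geo, toy_table, fallback=uniform_fallback)
```

With the uniform placeholder, a surname missing from the table gets a posterior driven entirely by geography, which illustrates why a more informative name-based prior matters for uncommon surnames.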

Summary written by gemini-2.5-flash-lite from 2 sources.

IMPACT Improves race prediction for uncommon surnames not covered by Census data, which could make studies of racial disparity more accurate.

RANK_REASON This is a research paper detailing a new methodology for improving race prediction using embedding models.

COVERAGE [2]

  1. arXiv cs.CL TIER_1 · Noan Dasanaike, Kosuke Imai ·

    Using Embedding Models to Improve Probabilistic Race Prediction

    arXiv:2604.22555v1. Estimating racial disparity requires individual-level race data, which are often unavailable due to the sensitivity of collecting such information. To address this problem, many researchers utilize Bayesian Improved Surname Geocodin…

  2. arXiv cs.CL TIER_1 · Kosuke Imai ·

    Using Embedding Models to Improve Probabilistic Race Prediction

    Estimating racial disparity requires individual-level race data, which are often unavailable due to the sensitivity of collecting such information. To address this problem, many researchers utilize Bayesian Improved Surname Geocoding (BISG), which have critically relied on Census…