PulseAugur

Embedding models enhance race prediction for uncommon surnames

Researchers have developed an embedding-based approach to probabilistic race prediction that addresses a limitation of Bayesian Improved Surname Geocoding (BISG). Standard BISG relies on Census surname data that omit uncommon surnames, which degrades its performance for a significant portion of the population. The new method, eBISG, uses pre-trained text embeddings and neural networks to estimate race probabilities for names not covered by the Census, with substantial gains particularly for Hispanic and Asian individuals.
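To make the mechanism concrete, here is a minimal sketch of the Bayes-rule update that BISG is built on, with a hook for surnames missing from the lookup table. It is not the authors' implementation. The function names (`bisg_posterior`, `surname_prior`), the two-category labels, and every probability are hypothetical and chosen only for illustration. In eBISG the fallback would be the embedding-plus-neural-network model; here it is a stand-in that returns a fixed distribution.

```python
def bisg_posterior(p_race_given_surname, p_geo_given_race):
    """Combine surname and location evidence with Bayes' rule.

    P(race | surname, geo) is proportional to
    P(race | surname) * P(geo | race), under the usual BISG assumption
    that surname and location are independent given race.
    """
    unnormalized = {
        race: p_race_given_surname[race] * p_geo_given_race[race]
        for race in p_race_given_surname
    }
    total = sum(unnormalized.values())
    return {race: value / total for race, value in unnormalized.items()}


def surname_prior(surname, surname_table, fallback):
    """Return P(race | surname), using `fallback` for names not in the table."""
    key = surname.strip().upper()
    if key in surname_table:
        return surname_table[key]
    # Hypothetical hook: eBISG would call its embedding-based model here.
    return fallback(key)


# Illustrative numbers only; these are not Census figures.
surname_table = {"EXAMPLE": {"group_a": 0.5, "group_b": 0.5}}
placeholder_model = lambda name: {"group_a": 0.3, "group_b": 0.7}
p_geo_given_race = {"group_a": 0.2, "group_b": 0.6}

covered = bisg_posterior(
    surname_prior("Example", surname_table, placeholder_model), p_geo_given_race
)
uncovered = bisg_posterior(
    surname_prior("Rarename", surname_table, placeholder_model), p_geo_given_race
)
```

The only thing that changes between the two calls is where the surname prior comes from. The geographic update is identical, which is why a better prior for uncovered names can improve the final prediction.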

IMPACT Improves race prediction for uncommon surnames in demographic analysis, which may aid studies of racial disparity.

RANK_REASON This is a research paper detailing a new methodology for improving race prediction using embedding models.

AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

  1. arXiv cs.CL TIER_1 English(EN) · Noah Dasanaike, Kosuke Imai

    Using Embedding Models to Improve Probabilistic Race Prediction

    arXiv:2604.22555v1. Abstract: Estimating racial disparity requires individual-level race data, which are often unavailable due to the sensitivity of collecting such information. To address this problem, many researchers utilize Bayesian Improved Surname Geocodin…

  2. arXiv cs.CL TIER_1 English(EN) · Kosuke Imai

    Using Embedding Models to Improve Probabilistic Race Prediction

    Estimating racial disparity requires individual-level race data, which are often unavailable due to the sensitivity of collecting such information. To address this problem, many researchers utilize Bayesian Improved Surname Geocoding (BISG), which has critically relied on Census…