Protein language models fail to recover allergen epitopes, study finds

By PulseAugur Editorial · [1 source] · 2026-06-20 18:25

A new study finds that residue-level attributions from protein language models do not accurately recover allergen epitopes, even though the models perform robustly at protein-level allergenicity prediction. The researchers built a benchmark to evaluate attribution faithfulness and found that explanations from models such as ESM-2 and DeepPlantAllergy did not significantly align with annotated epitopes. The findings suggest that these models may rely on general sequence features rather than specific immunological mechanisms. The study cautions against treating attribution signals as direct immunological explanations in safety screening or hypoallergen design without quantitative validation.
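The summary does not say which alignment metric the benchmark uses, so the following is only a minimal sketch of what "quantitative validation" of attributions against annotated epitopes could look like. It assumes per-residue attribution scores and a binary epitope mask are already available, and it swaps in a generic AUROC plus a label-permutation test rather than the paper's own procedure. The function names and the toy data are hypothetical.

```python
import random


def auroc(scores, labels):
    """Chance that a random epitope residue outscores a random non-epitope one."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one epitope and one non-epitope residue")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def permutation_p_value(scores, labels, n_perm=2000, seed=0):
    """One-sided p-value: how often shuffled labels align at least as well."""
    rng = random.Random(seed)
    observed = auroc(scores, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if auroc(scores, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical toy protein: 10 residues, 3 annotated as epitope (assumed data).
attributions = [0.9, 0.1, 0.8, 0.2, 0.7, 0.3, 0.15, 0.05, 0.25, 0.12]
epitope_mask = [1, 0, 1, 0, 1, 0, 0, 0, 0, 0]

score = auroc(attributions, epitope_mask)
p_value = permutation_p_value(attributions, epitope_mask)
print(f"AUROC={score:.2f}, permutation p={p_value:.4f}")
```

An AUROC near 0.5 with a large p-value would mean the attributions rank epitope residues no better than chance, which is the kind of result the study reports. The toy data here is deliberately constructed to align well, so it illustrates the mechanics rather than the finding.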

IMPACT Highlights limitations in current protein language models' interpretability for safety screening and hypoallergen design.

RANK_REASON Research paper detailing a new benchmark for evaluating protein language models' attribution faithfulness.

paper
safety

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

Hugging Face Daily Papers TIER_1 English(EN) · 2026-06-20 18:25

Residue-Level Attributions in Protein Language Models Do Not Recover Allergen Epitopes

Deep allergenicity classifiers are increasingly used in safety screening of novel foods, and recent protein language models have substantially improved protein-level allergenicity prediction. However, whether their explanations capture biologically meaningful information remains …

