A new study finds that residue-level attributions from protein language models do not accurately recover allergen epitopes, even though the models perform robustly at protein-level allergenicity prediction. The researchers developed a benchmark to evaluate attribution faithfulness and found that explanations from models such as ESM-2 and DeepPlantAllergy did not align significantly with annotated epitopes. The findings suggest that these models may rely on general sequence features rather than specific immunological mechanisms, and they caution against treating attribution signals as direct immunological explanations for safety screening or hypoallergen design without quantitative validation.
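As an illustration of what "quantitative validation" of attributions against annotated epitopes could look like, here is a minimal sketch. It is not the study's benchmark: the metric (AUROC with a label-permutation test), the function names `auroc` and `permutation_p_value`, and the toy data are all assumptions chosen for illustration.

```python
import random


def auroc(scores, labels):
    """Probability that a random epitope residue outscores a random
    non-epitope residue; ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one epitope and one non-epitope residue")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def permutation_p_value(scores, labels, n_perm=2000, seed=0):
    """One-sided permutation test: how often does a random relabeling of
    residues reach an AUROC at least as high as the observed one?"""
    rng = random.Random(seed)
    observed = auroc(scores, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if auroc(scores, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy data: one attribution score per residue and a binary
# mask marking residues inside an annotated epitope (1) or outside (0).
attributions = [0.9, 0.8, 0.1, 0.2, 0.7, 0.05, 0.3, 0.15]
epitope_mask = [1, 1, 0, 0, 1, 0, 0, 0]

observed, p = permutation_p_value(attributions, epitope_mask)
print(f"AUROC = {observed:.2f}, permutation p = {p:.3f}")
```

An AUROC near 0.5 with a large p-value would indicate that the attributions rank epitope residues no better than chance. Shuffling labels independently per residue ignores the fact that epitopes are contiguous stretches, so a more careful null model would permute whole epitope segments instead.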
IMPACT Highlights limitations in current protein language models' interpretability for safety screening and hypoallergen design.
RANK_REASON Research paper detailing a new benchmark for evaluating protein language models' attribution faithfulness.
Read on Hugging Face Daily Papers →
AI-generated summary · Google Gemini · from 1 source.