PulseAugur

LLM research tackles skill poisoning, persona discovery, and fragility

Researchers have developed "S^2IT," a framework for tuning large language models to integrate syntactic structure knowledge into aspect sentiment quad prediction. Separately, a system called VeriLLMed uses interactive visual debugging with knowledge graphs to audit medical LLMs and identify recurring error patterns. Another study presents a deployed system that automates enterprise customer support workflows by learning selective autonomy from copilot feedback, reaching selective automation within two weeks for a new process and reducing handling times. Additionally, an analytical framework called "The Pragmatic Persona" uses bridging inference to discover LLM personas by modeling discourse-level structure rather than surface-level lexical or stylistic cues.

Summary written by gemini-2.5-flash-lite from 12 sources.

IMPACT New research covers tuning techniques, debugging tools for medical LLMs, automated customer support, and persona discovery, and could improve LLM reliability in deployed applications.

RANK_REASON The cluster contains multiple research papers detailing new methods and systems for LLM development and evaluation.


COVERAGE [12]

  1. arXiv cs.AI TIER_1 · Wenjie Xiao, Xuehai Tang, Biyu Zhou, Songlin Hu, Jizhong Han ·

    RouteGuard: Internal-Signal Detection of Skill Poisoning in LLM Agents

    arXiv:2604.22888v1 Announce Type: cross Abstract: Agent skills introduce a new and more severe form of indirect injection for LLM agents: unlike traditional indirect prompt injection, attackers can hide malicious instructions inside a dense, action-oriented skill that already fun…

  2. arXiv cs.CL TIER_1 · Jisoo Yang (Chung-Ang University), Jongwon Ryu (Chung-Ang University), Minuk Ma (University of British Columbia), Trung X. Pham (Van Lang University), Junyeong Kim (Chung-Ang University) ·

    The Pragmatic Persona: Discovering LLM Persona through Bridging Inference

    arXiv:2604.24079v1 Announce Type: new Abstract: Large Language Models (LLMs) reveal inherent and distinctive personas through dialogue. However, most existing persona discovery approaches rely on surface-level lexical or stylistic cues, treating dialogue as a flat sequence of tok…

  3. arXiv cs.CL TIER_1 · Erfan Baghaei Potraghloo, Seyedarmin Azizi, Souvik Kundu, Massoud Pedram ·

    One Token Away from Collapse: The Fragility of Instruction-Tuned Helpfulness

    arXiv:2604.13006v2 Announce Type: replace Abstract: Instruction-tuned large language models produce helpful, structured responses, but how robust is this helpfulness under trivial constraints? We show that simple lexical constraints (banning a single punctuation character or comm…

  4. arXiv cs.AI TIER_1 · Abid Talukder, Maruf Ahmed Mridul, Oshani Seneviratne ·

    Towards Automated Ontology Generation from Unstructured Text: A Multi-Agent LLM Approach

    arXiv:2604.23090v1 Announce Type: new Abstract: Automatically generating formal ontologies from unstructured natural language remains a central challenge in knowledge engineering. While large language models (LLMs) show promise, it remains unclear which architectural design choic…

  5. arXiv cs.AI TIER_1 · Ashmi Banerjee, Adithi Satish, Wolfgang Wörndl, Yashar Deldjoo ·

    Multi-Dimensional Evaluation of Sustainable City Trips with LLM-as-a-Judge and Human-in-the-Loop

    arXiv:2604.24158v1 Announce Type: new Abstract: Evaluating nuanced conversational travel recommendations is challenging when human annotations are costly and standard metrics ignore stakeholder-centric goals. We study LLMs-as-Judges for sustainable city-trip lists across four dim…

  6. arXiv cs.CL TIER_1 · Bingfeng Chen, Chenjie Qiu, Yifeng Xie, Boyan Xu, Ruichu Cai, Zhifeng Hao ·

    $\mathcal{S}^2$IT: Stepwise Syntax Integration Tuning for Large Language Models in Aspect Sentiment Quad Prediction

    arXiv:2604.23296v1 Announce Type: new Abstract: Aspect Sentiment Quad Prediction (ASQP) has seen significant advancements, largely driven by the powerful semantic understanding and generative capabilities of large language models (LLMs). However, while syntactic structure informa…

  7. arXiv cs.CL TIER_1 · Yurui Xiang, Xingyi Mao, Rui Sheng, Zixin Chen, Zelin Zang, Yuyang Wu, Haipeng Zeng, Huamin Qu, Yushi Sun, Yanna Lin ·

    VeriLLMed: Interactive Visual Debugging of Medical Large Language Models with Knowledge Graphs

    arXiv:2604.23356v1 Announce Type: new Abstract: Large language models (LLMs) show promise in medical diagnosis, but real-world deployment remains challenging due to high-stakes clinical decisions and imperfect reasoning reliability. As a result, careful inspection of model behavi…

  8. arXiv cs.CL TIER_1 · Nikita Borovkov, Elisei Rykov, Olga Tsymboi, Sergei Filimonov, Nikita Surnachev, Dmitry Bitman, Anatolii Potapov ·

    Learning Selective LLM Autonomy from Copilot Feedback in Enterprise Customer Support Workflows

    arXiv:2604.23855v1 Announce Type: new Abstract: We present a deployed system that automates end-to-end customer support workflows inside an enterprise Business Process Management (BPM) platform. The approach is scalable in production and reaches selective automation within two we…

  9. arXiv cs.CL TIER_1 · Junyeong Kim ·

    The Pragmatic Persona: Discovering LLM Persona through Bridging Inference

    Large Language Models (LLMs) reveal inherent and distinctive personas through dialogue. However, most existing persona discovery approaches rely on surface-level lexical or stylistic cues, treating dialogue as a flat sequence of tokens and failing to capture the deeper discourse-…

  10. arXiv cs.CL TIER_1 · Anatolii Potapov ·

    Learning Selective LLM Autonomy from Copilot Feedback in Enterprise Customer Support Workflows

    We present a deployed system that automates end-to-end customer support workflows inside an enterprise Business Process Management (BPM) platform. The approach is scalable in production and reaches selective automation within two weeks for a new process, leveraging supervision al…

  11. Hugging Face Daily Papers TIER_1 ·

    On Reasoning Behind Next Occupation Recommendation

    In this work, we develop a novel reasoning approach to enhance the performance of large language models (LLMs) in future occupation prediction. In this approach, a reason generator first derives a "reason" for a user using his/her past education and career history. The reason s…

  12. arXiv cs.CL TIER_1 · Ee-Peng Lim ·

    On Reasoning Behind Next Occupation Recommendation

    In this work, we develop a novel reasoning approach to enhance the performance of large language models (LLMs) in future occupation prediction. In this approach, a reason generator first derives a "reason" for a user using his/her past education and career history. The reason s…