Brief

last 24h

[2/2] 222 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.
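As a rough illustration of how a 0–100 score could combine the four named components, here is a minimal sketch. The brief does not say how the components are weighted or how decay is applied, so the weights, the 24-hour half-life, the multiplicative decay, and the function names below are all assumptions.

```python
# Hypothetical weights (they sum to 100); the brief names the components
# but does not publish how they are combined.
WEIGHTS = {"authority": 40.0, "cluster_strength": 30.0, "headline_signal": 30.0}
HALF_LIFE_HOURS = 24.0  # assumed: an item loses half its score every 24 hours


def clamp01(value):
    """Clamp a component signal into the [0, 1] range."""
    return max(0.0, min(1.0, value))


def time_decay(age_hours, half_life=HALF_LIFE_HOURS):
    """Exponential decay factor in (0, 1]; equals 1.0 for a brand-new item."""
    return 0.5 ** (max(0.0, age_hours) / half_life)


def score_item(authority, cluster_strength, headline_signal, age_hours):
    """Return a 0-100 score from three [0, 1] signals and the item's age."""
    base = (
        WEIGHTS["authority"] * clamp01(authority)
        + WEIGHTS["cluster_strength"] * clamp01(cluster_strength)
        + WEIGHTS["headline_signal"] * clamp01(headline_signal)
    )
    return round(base * time_decay(age_hours), 1)


if __name__ == "__main__":
    # Example: a strong item from a 9-hour-old cluster.
    print(score_item(authority=0.9, cluster_strength=0.6, headline_signal=0.7, age_hours=9))
```

Treating decay as a multiplier keeps the score bounded by the undecayed weighted sum; an additive penalty would be an equally plausible reading of the description.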

TOOL · arXiv cs.AI English(EN) · 9h

DataShield: Safety-degrading Data Filtering for LLM Benign Instruction Fine-Tuning

Researchers have developed DataShield, a method for identifying and filtering safety-degrading samples in the benign datasets used to fine-tune large language models. The approach quantifies each sample's contribution to the model's compliance behavior, which makes it possible to isolate high-risk subsets. In experiments on models such as Llama3 and Qwen2.5, DataShield pinpointed data that could inadvertently reduce LLM safety, particularly in open-ended question-answering tasks.

IMPACT Provides a data-centric approach to mitigate safety degradation during LLM fine-tuning, potentially improving model robustness.
- DataShield
- LLM
- Llama3-8B
- Llama3.1-8B
- Qwen2.5-7B
- Alpaca
- Dolly
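The filtering step described in the DataShield summary can be pictured as ranking samples by a per-sample risk score and setting aside the top slice. The sketch below assumes the scores have already been computed; the summary does not describe DataShield's actual scoring function, so `split_by_risk`, its `risky_fraction` parameter, and the example scores are hypothetical.

```python
def split_by_risk(samples, scores, risky_fraction=0.1):
    """Split samples into (kept, risky) by a precomputed per-sample score.

    Higher score is assumed to mean a larger contribution to unsafe
    compliance. This is an illustration, not DataShield's published procedure.
    """
    if len(samples) != len(scores):
        raise ValueError("samples and scores must have the same length")
    if not 0.0 <= risky_fraction <= 1.0:
        raise ValueError("risky_fraction must be between 0 and 1")

    n_risky = int(len(samples) * risky_fraction)
    # Indices sorted from highest to lowest score.
    order = sorted(range(len(samples)), key=lambda i: scores[i], reverse=True)
    risky_idx = set(order[:n_risky])

    risky = [samples[i] for i in order[:n_risky]]
    kept = [s for i, s in enumerate(samples) if i not in risky_idx]
    return kept, risky


if __name__ == "__main__":
    data = ["a", "b", "c", "d"]
    hypothetical_scores = [0.1, 0.9, 0.3, 0.8]
    kept, risky = split_by_risk(data, hypothetical_scores, risky_fraction=0.5)
    print(kept, risky)
```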
TOOL · arXiv cs.CL English(EN) · 9h

OncoReason: Structuring Clinical Reasoning in LLMs for Robust and Interpretable Survival Prediction

Researchers have developed OncoReason, a framework intended to improve the interpretability and accuracy of large language models (LLMs) in predicting cancer treatment outcomes. This multi-task learning approach trains LLMs to perform survival classification and time regression and to generate natural-language rationales for their predictions. In experiments with the LLaMa3-8B and Med42-8B models, Chain-of-Thought prompting and Group Relative Policy Optimization significantly improved predictive performance and interpretability, setting a new benchmark for trustworthy LLMs in oncology.

IMPACT Enhances LLM interpretability and accuracy for clinical decision support, potentially improving patient outcomes.
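For readers unfamiliar with Group Relative Policy Optimization, its core idea is to score each sampled response relative to the other responses drawn for the same prompt, rather than against a learned value function. The sketch below shows only that normalization step. The reward values are made up, the use of population standard deviation is an assumption (implementations differ), and nothing here reflects how OncoReason defines its rewards.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of responses to the same prompt.

    Each advantage is (reward - group mean) / (group std + eps), so responses
    that beat their group average get a positive value and the rest a negative one.
    """
    if not rewards:
        raise ValueError("rewards must not be empty")
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumed: population standard deviation
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Hypothetical rewards for four responses sampled from one prompt.
    print(group_relative_advantages([1.0, 0.0, 1.0, 0.0]))
```

When every response in a group earns the same reward, all advantages are zero and that group contributes no learning signal.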
