Researchers have developed a novel framework called CoRT for red-teaming Large Language Models (LLMs) specifically within the financial sector. This framework is designed to identify regulatory risks by progressively concealing the risk in prompts across multiple turns, rather than focusing on overtly harmful content. CoRT includes components for generating these multi-turn prompts and scoring their risk concealment, achieving a high attack success rate on nine tested LLMs. A new benchmark, FinRisk-Bench, was also created to support this research. AI
Summary written by gemini-2.5-flash-lite from 1 source. How we write summaries →
IMPACT Introduces a new method for identifying subtle regulatory risks in financial LLMs, potentially improving model safety and compliance.
RANK_REASON This is a research paper detailing a new method for red-teaming LLMs in a specific domain.