Asking For An Old Friend: Diagnosing and Mitigating Temporal Failure Modes in LLM-based Statutory Question Answering
Researchers have developed a benchmark that tests how well large language models handle temporal changes in legal statutes, identifying failure modes such as outdated information and recency bias. Meanwhile, the AI industry is shifting as model labs increasingly focus on building agent-based products rather than only foundational models. Companies such as AI21 and DeepSeek exemplify this strategic pivot, and DeepSeek's aggressive pricing for its V4-Pro model further underscores it by making advanced AI more accessible.
IMPACT The industry's focus is shifting from foundational models to agent-based products, while aggressive pricing makes advanced AI more accessible and intensifies competition.