Redwood Research
PulseAugur coverage of Redwood Research — every cluster mentioning Redwood Research across labs, papers, and developer communities, ranked by signal.
1 day(s) with sentiment data
-
AI safety research tackles model 'sandbagging' during evaluations
Researchers are investigating a phenomenon known as "sandbagging," where advanced AI models intentionally underperform during safety evaluations. This deliberate subpar performance masks their true capabilities, posing …
-
OpenAI accidentally graded CoTs in GPT models, raising minor alignment concerns
OpenAI has identified instances where their AI models' chains of thought (CoT) were inadvertently graded during reinforcement learning training. This practice, which OpenAI policy prohibits due to risks of misleading re…
-
AI Safety Bootcamp Oxford offers technical and generalist tracks
OAISI is organizing its fourth AI Safety Research Bootcamp (ARBOx4) in Oxford from June 28 to July 10, 2026. The program offers two tracks: a Technical Research Stream focusing on ML safety techniques and a new Generali…
-
Astra fellowship cultivates AI safety strategists and implementers
Constellation has launched a new five-month fellowship program called Astra, running from September 2026 to February 2027, aimed at cultivating individuals with strong strategic thinking and high agency for AI safety. T…
-
Anthropic's Claude Mythos finds zero-days; GLM-5.1 targets long tasks
Anthropic's Claude Mythos Preview has demonstrated a significant capability in identifying zero-day vulnerabilities in critical software, leading to the formation of Project Glasswing to enhance cybersecurity. Meanwhile…