OpenAI has released gpt-oss-safeguard, a set of open-weight reasoning models designed for safety classification tasks. Available in 120B and 20B parameter sizes, these models are fine-tuned versions of existing gpt-oss models and are released under the Apache 2.0 license. The key innovation is their ability to interpret and apply developer-provided policies at inference time, offering a more flexible and explainable approach to content moderation compared to traditional methods. AI
Summary written by gemini-2.5-flash-lite from 2 sources. How we write summaries →
RANK_REASON OpenAI released open-weight models for safety classification tasks, accompanied by a technical report, fitting the research bucket.