PulseAugur
LIVE 12:25:34
research · [2 sources] ·
0
research

OpenAI releases open-weight reasoning models for safety classification

OpenAI has released gpt-oss-safeguard, a set of open-weight reasoning models designed for safety classification tasks. Available in 120B and 20B parameter sizes, these models are fine-tuned versions of existing gpt-oss models and are released under the Apache 2.0 license. The key innovation is their ability to interpret and apply developer-provided policies at inference time, offering a more flexible and explainable approach to content moderation compared to traditional methods. AI

Summary written by gemini-2.5-flash-lite from 2 sources. How we write summaries →

RANK_REASON OpenAI released open-weight models for safety classification tasks, accompanied by a technical report, fitting the research bucket.

Read on OpenAI News →

OpenAI releases open-weight reasoning models for safety classification

COVERAGE [2]

  1. OpenAI News TIER_1 ·

    gpt-oss-safeguard technical report

    gpt-oss-safeguard-120b and gpt-oss-safeguard-20b are two open-weight reasoning models post-trained from the gpt-oss models and trained to reason from a provided policy in order to label content under that policy. In this report, we describe gpt-oss-safeguard’s capabilities and pr…

  2. OpenAI News TIER_1 ·

    Introducing gpt-oss-safeguard

    OpenAI introduces gpt-oss-safeguard—open-weight reasoning models for safety classification that let developers apply and iterate on custom policies.