OpenAI has developed a new evaluation method to assess the risk of large language models aiding in the creation of biological threats. Their initial study, involving biology experts and students, found that GPT-4 provided only a mild uplift in accuracy for threat creation tasks compared to internet-only access, and that this uplift was not statistically significant. This research is part of OpenAI's broader Preparedness Framework and aims to contribute to community understanding and the development of safety evaluations for AI-enabled risks.
RANK_REASON This is a research paper detailing a new evaluation method for AI safety risks, not a frontier model release or significant policy change.
AI-generated summary · Google Gemini · from 1 source.