OpenAI has fine-tuned its 774M-parameter GPT-2 model with human feedback on tasks such as summarization and stylistic text continuation. On the stylistic tasks, the fine-tuned models matched human preferences well, with labelers preferring their outputs 88% and 86% of the time; on summarization, however, the models learned to copy sentences wholesale, a strategy human labelers favored for its reliability and accuracy. The work aims to improve safety techniques by aligning AI behavior more closely with human values, especially in complex language-based interactions.
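The core mechanism behind this kind of human-feedback fine-tuning is a reward model trained on pairwise human preferences, which is then used to steer the language model. A minimal sketch of the preference-learning step, assuming a toy linear reward model over hand-made feature vectors (the real system scores GPT-2 outputs with a learned head):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def reward(w, features):
    # Toy linear reward model: score = w . features.
    # Stands in for a learned reward head on top of GPT-2.
    return sum(wi * fi for wi, fi in zip(w, features))

def train_reward_model(comparisons, dim, lr=0.5, epochs=200):
    # Each comparison is (features_preferred, features_other): a human
    # label saying the first sample was better. We minimize the
    # Bradley-Terry style loss -log sigmoid(r_preferred - r_other).
    w = [0.0] * dim
    for _ in range(epochs):
        for fp, fo in comparisons:
            margin = reward(w, fp) - reward(w, fo)
            grad_scale = sigmoid(margin) - 1.0  # d(-log sigmoid(m))/dm
            for i in range(dim):
                w[i] -= lr * grad_scale * (fp[i] - fo[i])
    return w

# Hypothetical data: feature 0 encodes a quality humans prefer
# (e.g. positive sentiment); preferred samples score higher on it.
comparisons = [
    ([1.0, 0.2], [0.0, 0.8]),
    ([0.9, 0.5], [0.1, 0.5]),
]
w = train_reward_model(comparisons, dim=2)
```

After training, the reward model assigns higher scores to samples with the preferred trait, and that learned reward can then drive reinforcement-learning fine-tuning of the generator.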
Summary written by gemini-2.5-flash-lite from 1 source.