POLARIS: Guiding Small Models to Write Long Stories
Researchers have developed POLARIS, a training method that improves the long-form creative writing of smaller open-weight language models. The method uses a frontier LLM as a judge, guided by a structured quality rubric, and treats human-written reference stories as high-reward anchors during training. Applied to Qwen3.5-9B, the resulting POLARIS-9B model performs competitively against larger models and follows length instructions more closely, even for stories longer than those it was trained on.
AI IMPACT Enhances the creative writing capabilities of smaller, more accessible language models, potentially democratizing advanced AI content generation.
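
The training signal described in the summary, rubric scores from an LLM judge plus human-written stories as high-reward anchors, might be sketched as below. This is only an illustration under stated assumptions, not the POLARIS implementation: the rubric criteria and weights, the anchor reward of 1.0, and the mean-baseline advantage are all hypothetical, and the judge scores are passed in as plain numbers rather than produced by a real model call.

```python
from statistics import mean

# Hypothetical rubric: criterion names and weights are illustrative only.
RUBRIC_WEIGHTS = {"plot": 0.4, "prose": 0.3, "coherence": 0.3}

# Assumption: a human-written reference story is given the top reward.
ANCHOR_REWARD = 1.0


def rubric_reward(scores):
    """Weighted sum of per-criterion judge scores, each expected in [0, 1]."""
    return sum(RUBRIC_WEIGHTS[name] * scores[name] for name in RUBRIC_WEIGHTS)


def group_advantages(model_scores, include_anchor=True):
    """Return reward-minus-mean advantages for one prompt's group of stories.

    model_scores holds one dict of judge scores per model-written story.
    When include_anchor is True, the human reference is appended last
    with ANCHOR_REWARD, so it raises the baseline for the model samples.
    """
    rewards = [rubric_reward(s) for s in model_scores]
    if include_anchor:
        rewards.append(ANCHOR_REWARD)
    baseline = mean(rewards)
    return [r - baseline for r in rewards]


if __name__ == "__main__":
    samples = [
        {"plot": 0.5, "prose": 0.5, "coherence": 0.5},
        {"plot": 1.0, "prose": 0.0, "coherence": 0.0},
    ]
    print(group_advantages(samples))
```

In this sketch the anchor always receives a non-negative advantage, so model samples are pushed toward the human reference whenever they score below it.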