Researchers have developed a reinforcement learning pipeline called Retell, Reward, Repeat (RRR) to improve the storytelling capabilities of large language models (LLMs). The method combines Structuralist Narratology with scalar narrativity to train LLMs to generate logical, rational narrative events, addressing shortcomings of current post-training techniques such as supervised fine-tuning (SFT). RRR uses a synthesized TimeTravel dataset and derives its training signals from textual features via d-RLAIF, so it needs no reference outputs. In the reported evaluations, RRR-trained LLMs outperform existing baselines on logic, rationality, and completeness, which the authors present as a cost-effective way to improve LLM storytelling.
AI IMPACT This research offers a method for improving the coherence and logic of LLM-generated narratives, which could benefit creative writing and interactive storytelling applications.
RANK_REASON The cluster contains an academic paper detailing a new method for training LLMs.
- arXiv
- David T. Liu
- d-RLAIF
- Hugging Face
- Retell, Reward, Repeat
- Structuralist Narratology
- TimeTravel dataset
AI-generated summary · Google Gemini · from 1 source.