Phase-Aware Guidance Injection for Recurrent MAPPO in Assembly-Line Disruption Recovery
Researchers have developed a new framework to improve decision-making for assembly line disruption recovery. This phase-aware guidance injection system augments a trained recurrent Multi-Agent Proximal Policy Optimization (RMAPPO) policy by biasing action choices at the logit level during evaluation. The framework allows for the integration of various external recovery knowledge sources, including rule-based, replay-based, and LLM-based guidance, and is activated only during abnormal or recovery phases of operation. Experiments demonstrated that rule-based guidance provided the most significant improvements, while LLM guidance offered useful intermediate gains.
AI IMPACT This research could lead to more efficient and adaptive recovery strategies in industrial settings, reducing downtime and improving delivery times.
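As a rough illustration of the logit-level biasing described in the summary, the sketch below adds a guidance bias to a policy's action logits only when the current phase is abnormal or recovery. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the phase labels, the additive form of the bias, and the `beta` strength parameter are all hypothetical, and the guidance vector stands in for any of the rule-based, replay-based, or LLM-based sources.

```python
import math

# Assumed phase labels; the summary only says guidance is active during
# "abnormal or recovery" phases, so these exact strings are hypothetical.
ACTIVE_PHASES = {"abnormal", "recovery"}


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def inject_guidance(logits, guidance_bias, phase, beta=1.0):
    """Return action logits, biased by guidance only in active phases.

    logits        -- per-action logits from the trained policy (unchanged weights)
    guidance_bias -- per-action scores from an external guidance source (assumed)
    phase         -- current operating phase label (assumed string labels)
    beta          -- hypothetical scaling factor for the guidance strength
    """
    if phase not in ACTIVE_PHASES:
        # Normal operation: the trained policy acts without any bias.
        return list(logits)
    return [l + beta * g for l, g in zip(logits, guidance_bias)]


def greedy_action(logits):
    """Index of the highest-logit action."""
    return max(range(len(logits)), key=lambda i: logits[i])


# Toy example: the policy prefers action 0, the guidance favours action 1.
policy_logits = [2.0, 1.0, 0.5]
rule_bias = [0.0, 3.0, 0.0]

normal_logits = inject_guidance(policy_logits, rule_bias, "normal")
recovery_logits = inject_guidance(policy_logits, rule_bias, "recovery")

normal_probs = softmax(normal_logits)
recovery_probs = softmax(recovery_logits)
```

Because the bias is applied to logits at evaluation time, the trained policy's weights stay untouched in this sketch, and setting `beta` to zero recovers the unguided policy in every phase.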