Graph-GRPO: Dependency-Aware Credit Assignment for Generative E-commerce Search Relevance
Researchers have developed Graph-GRPO, a framework that improves e-commerce search relevance by combining large language models with reinforcement learning. The method builds a dependency graph over the model's reasoning steps, so that credit can be assigned more accurately to the individual steps of a relevance judgment. Online A/B tests on a major e-commerce platform showed improvements in both relevance classification and user engagement metrics.
IMPACT: Enhances e-commerce search relevance, potentially improving user experience and sales through more accurate product matching.
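
To make the idea of dependency-aware credit assignment concrete, here is a minimal Python sketch. It is not the authors' implementation: the summary above does not give the actual algorithm, so everything below is an illustrative assumption. The first function is the standard group-relative advantage used in GRPO-style training (reward minus group mean, divided by group standard deviation). The graph part is hypothetical: the names `ancestors`, `step_advantages`, and `off_path_weight`, and the rule "steps the final verdict depends on get full credit, other steps get a reduced share", are invented here purely for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Standard GRPO-style advantage: normalize each reward within its group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]


def ancestors(deps, target):
    """Return every step that `target` depends on, directly or transitively.

    `deps` maps a step name to the list of steps it depends on
    (a hypothetical encoding of the reasoning dependency graph).
    """
    seen = set()
    stack = [target]
    while stack:
        node = stack.pop()
        for parent in deps.get(node, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def step_advantages(deps, final_step, advantage, off_path_weight=0.0):
    """Assumed credit rule: steps the final verdict depends on receive the
    full sequence-level advantage; all other steps receive a scaled share."""
    on_path = ancestors(deps, final_step) | {final_step}
    return {
        step: advantage if step in on_path else advantage * off_path_weight
        for step in deps
    }


# Two sampled responses for one query, with hypothetical rewards.
advantages = group_relative_advantages([1.0, 0.0])

# Hypothetical reasoning graph for the first response:
# the verdict depends on s2, which depends on s1; s3 is unused.
deps = {"s1": [], "s2": ["s1"], "s3": [], "verdict": ["s2"]}
credit = step_advantages(deps, "verdict", advantages[0])
```

In this toy setup the unused step `s3` receives no credit, while `s1`, `s2`, and `verdict` share the response's advantage. A real system would need its own way to extract the graph from model output and its own weighting rule.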