Researchers have developed new benchmarks and models to improve the performance of AI agents in real-world local life service scenarios. One benchmark, LocalSearchBench, includes over 1.3 million merchant entries and 900 multi-hop question-answering tasks, revealing that even state-of-the-art models struggle with accuracy and faithfulness. Another approach, LocalSUG, uses a city-preference-enhanced LLM to improve query suggestions on local-life platforms, demonstrating a reduction in low-result rates and an increase in click-through rates in live testing.
AI IMPACT These advancements aim to improve AI agent performance in specialized domains, potentially leading to more effective local service discovery and user interaction.
RANK_REASON The cluster contains two research papers introducing new benchmarks and models for AI in local life services.
AI-generated summary · Google Gemini · from 2 sources.