PulseAugur

New benchmarks and LLMs tackle AI challenges in local services

Researchers have developed new benchmarks and models to improve AI agents in real-world local-life service scenarios. One benchmark, LocalSearchBench, includes over 1.3 million merchant entries and 900 multi-hop question-answering tasks, and shows that even state-of-the-art models struggle with accuracy and faithfulness. Another approach, LocalSUG, uses a city-preference-enhanced LLM to improve query suggestions on local-life platforms; in live testing it reduced low-result rates and increased click-through rates.

IMPACT These advancements aim to improve AI agent performance in specialized domains, potentially leading to more effective local service discovery and user interaction.

AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

  1. arXiv cs.AI TIER_1 English(EN) · Hang He, Chuhuai Yue, Chengqi Dong, Mingxue Tian, Hao Chen, Zhenfeng Liu, Jiajun Chai, Xiaohan Wang, Yufei Zhang, Qun Liao, Guojun Yin, Wei Lin, Chengcheng Wan, Haiying Sun, Ting Su ·

    LocalSearchBench: Benchmarking Agentic Search in Real-World Local Life Services

    arXiv:2512.07436v3 Announce Type: replace Abstract: Recent advances in large reasoning models (LRMs) have enabled agentic search systems to perform complex multi-step reasoning across multiple sources. However, most studies focus on general information retrieval and rarely explore…

  2. arXiv cs.CL TIER_1 English(EN) · Jinwen Chen, Shiwen Zhang, Shuai Gong, Zheng Zhang, Yachao Zhao, Lingxiang Wang, Haibo Zhou, Wei Lin, Hainan Zhang ·

    LocalSUG: City-Preference-Enhanced LLM for Query Suggestion in Local-Life Services

    arXiv:2603.04946v2 Announce Type: replace Abstract: In local-life service platforms, query suggestion reduces user effort by generating candidate queries from input prefixes. Traditional multi-stage systems rely heavily on historical popular queries, limiting their ability to cap…