Brief

last 24h

[3/3] 224 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.
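The feed does not publish its scoring formula. The following is only a minimal sketch of how a 0–100 score could combine the four named factors, assuming a weighted sum and an exponential time decay. The weights, the 24-hour half-life, and all names (`Story`, `score`, `recency`) are hypothetical illustrations, not the feed's actual implementation.

```python
from dataclasses import dataclass

# Hypothetical weights (they sum to 1.0); the real feed's weights are unknown.
WEIGHTS = {"authority": 0.35, "cluster": 0.25, "headline": 0.20, "recency": 0.20}
HALF_LIFE_HOURS = 24.0  # assumed half-life for the time-decay term


@dataclass
class Story:
    authority: float         # source authority, expected in [0, 1]
    cluster_strength: float  # how strongly sources cluster on the story, in [0, 1]
    headline_signal: float   # headline relevance signal, in [0, 1]
    age_hours: float         # hours since publication


def clamp01(x: float) -> float:
    """Clamp a value into the [0, 1] range."""
    return max(0.0, min(1.0, x))


def recency(age_hours: float, half_life: float = HALF_LIFE_HOURS) -> float:
    """Exponential time decay: 1.0 when fresh, 0.5 after one half-life."""
    return 0.5 ** (max(age_hours, 0.0) / half_life)


def score(story: Story) -> float:
    """Combine the four factors into a single score in [0, 100]."""
    parts = {
        "authority": clamp01(story.authority),
        "cluster": clamp01(story.cluster_strength),
        "headline": clamp01(story.headline_signal),
        "recency": recency(story.age_hours),
    }
    return 100.0 * sum(WEIGHTS[name] * parts[name] for name in WEIGHTS)
```

Under these assumptions, two stories with identical authority, cluster strength, and headline signal differ only through the decay term, so the older one always scores lower.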

TOOL · arXiv cs.CL English(EN) · 12h

Quality Over Clicks: Iterative Reinforcement Learning for Early-Stage E-Commerce Query Suggestion

Researchers have developed QualEQS, a framework for improving e-commerce query suggestions in early-stage deployments where click data is scarce. Rather than relying solely on click-through rates, this quality-first iterative reinforcement learning approach focuses on answerability, factuality, and information gain. The system uses group-level disagreement among suggestions to identify ambiguous contexts and difficult training cases, leading to a 6.81% improvement in online performance in a real-world conversational shopping assistant.

IMPACT This framework offers a method for improving AI-driven e-commerce query suggestions in low-data environments, potentially enhancing user experience and conversion rates.
- Hugging Face
- DagsHub
- CatalyzeX
- Gotit.pub
- ScienceCast
- QualEQS
- Qi Sun
- EQS-Benchmark
- arXiv
RESEARCH · arXiv cs.AI English(EN) · 3d · [4 sources]

RetailBench: Benchmarking long horizon reasoning and coherent decision making of LLM agents in realistic retail environments

Researchers have developed new benchmarks to evaluate large language model (LLM) agents in complex, real-world scenarios. ShoppingBench and EComAgentBench focus on intricate shopping tasks that involve hidden intents, budget management, and multi-product sourcing, and they show that even advanced models such as GPT-4.1 struggle to achieve high success rates. Similarly, RetailBench assesses LLM agents in long-horizon retail management simulations, where their decision-making and policy consistency fall significantly short of optimal strategies.

IMPACT These benchmarks highlight the need for more robust LLM agents capable of handling complex, multi-step reasoning and decision-making in real-world applications.
- ScienceCast
- RetailBench
- arXiv
- DagsHub
- Gotit.pub
- Hugging Face
- alphaXiv
- CatalyzeX
- Amazon
- EComAgentBench
- arXivLabs
- ShoppingBench
- GPT-4.1
- Qi Sun
TOOL · arXiv cs.LG English(EN) · 1mo

Beyond Crash: Hijacking Your Autonomous Vehicle for Fun and Profit

Researchers have developed a framework called JackZebra that can hijack autonomous vehicles by subtly altering their routes over extended periods. Unlike previous attacks that cause immediate safety failures, this method gradually steers the vehicle to an attacker-chosen destination without triggering obvious errors. The system uses a physically plausible attacker vehicle equipped with a display and camera to convert adversarial patches into steering commands, and it successfully diverted victim vehicles in both simulated and real-world tests.

IMPACT Demonstrates a new class of long-horizon attacks against autonomous systems, necessitating more robust safety and security measures.
- Qi Sun
- arXiv
- JackZebra
