Brief

last 24h

221 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.
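One way to picture a 0–100 score built from these four signals is sketched below. The feed does not publish its formula, so the weights, the 24-hour half-life, and the name `score_item` are assumptions chosen only for illustration.

```python
# Hypothetical weights: the feed does not state how the signals are combined.
WEIGHTS = {"authority": 0.4, "cluster_strength": 0.3, "headline_signal": 0.3}
HALF_LIFE_HOURS = 24.0  # assumed half-life for the time-decay term


def score_item(authority, cluster_strength, headline_signal, age_hours):
    """Blend three signals in [0, 1], apply exponential time decay, return 0-100."""
    signals = {
        "authority": authority,
        "cluster_strength": cluster_strength,
        "headline_signal": headline_signal,
    }
    for name, value in signals.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    if age_hours < 0:
        raise ValueError("age_hours must be non-negative")

    # Weighted sum of the three static signals (weights sum to 1).
    base = sum(WEIGHTS[name] * value for name, value in signals.items())
    # The score halves every HALF_LIFE_HOURS.
    decay = 0.5 ** (age_hours / HALF_LIFE_HOURS)
    return round(100.0 * base * decay, 1)


if __name__ == "__main__":
    print(score_item(0.9, 0.5, 0.7, age_hours=6))
```

With this form, an item that maxes out every signal scores 100 when fresh and 50 one half-life later; a real ranking could combine the signals quite differently.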

TOOL · arXiv cs.AI English(EN) · 1d

Content-Aware Attack Detection in LLM Agent Tool-Call Traffic: An Empirical Study of Features, Architectures, and Evaluation Protocols

Researchers present a framework for detecting attacks in the tool-call traffic of large language model (LLM) agents. It represents each agent session as a graph and uses sentence-embedding features from tool arguments and responses to classify traffic as benign or malicious. The study finds that content-level features are crucial for effective detection, significantly outperforming metadata-only approaches, and it highlights a common evaluation pitfall that can inflate performance metrics.

IMPACT This research offers a more robust way to secure LLM agents by detecting malicious tool use, which could improve the safety and reliability of AI systems that interact with external services.
- Model Context Protocol
- LLM
- SBERT
- ATBench
- RAS-Eval
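
To make the session-as-graph idea in the item above concrete, here is a minimal sketch of a session structure whose nodes carry content features from tool arguments and responses. It is not the paper's implementation: the sequential edge layout, the `ToolCallNode` and `SessionGraph` names, and the hashed bag-of-words `toy_embed` (a stand-in for a real sentence-embedding model such as SBERT) are all assumptions.

```python
import hashlib
from dataclasses import dataclass, field

DIM = 16  # tiny vector size, for illustration only


def toy_embed(text, dim=DIM):
    """Stand-in for a sentence embedding: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * dim
    for token in text.lower().split():
        idx = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = sum(x * x for x in vec) ** 0.5
    return [x / norm for x in vec] if norm else vec


@dataclass
class ToolCallNode:
    tool: str        # metadata-level feature
    arg_vec: list    # content-level feature from the call arguments
    resp_vec: list   # content-level feature from the tool response


@dataclass
class SessionGraph:
    nodes: list = field(default_factory=list)
    edges: list = field(default_factory=list)  # (src, dst) node indices

    def add_call(self, tool, arguments, response):
        """Append a tool call and link it to the previous call (assumed layout)."""
        self.nodes.append(
            ToolCallNode(tool, toy_embed(arguments), toy_embed(response))
        )
        if len(self.nodes) > 1:
            self.edges.append((len(self.nodes) - 2, len(self.nodes) - 1))


if __name__ == "__main__":
    session = SessionGraph()
    session.add_call("read_file", "path=/tmp/report.txt", "quarterly revenue summary")
    session.add_call("http_post", "url=https://example.com body=summary", "200 OK")
    print(len(session.nodes), session.edges)
```

A classifier would then consume the per-node vectors and the edge list; dropping `arg_vec` and `resp_vec` leaves only the tool name, which corresponds to the metadata-only setting the summary contrasts against.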
TOOL · dev.to — LLM tag English(EN) · 6d

I Spent 6 Months Fixing RAG. Here's What I Found (And Built)

A developer spent six months debugging a Retrieval-Augmented Generation (RAG) system for document Q&A and identified two key failure modes: semantic drift in query reformulation, and context poisoning by chunks that are similar to the query but irrelevant. To address both, they built a framework called VORTEXRAG with a seven-layer architecture. Its key components include Tri-Vector Encoding for richer embeddings, a Vortex Retrieval Cone for improved document ranking, and a Semantic Drift Corrector that maintains query intent across multiple hops.

IMPACT This new framework offers a potential solution to common RAG system failures, which could improve the reliability of document Q&A and other LLM applications.
- GPT
- FAISS
- SBERT
- VORTEXRAG
