Brief

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.
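As a rough illustration of how four such signals could be folded into a single 0–100 score, here is a minimal sketch. The brief names the factors but not how they are combined, so the weights, the 24-hour half-life, and the function names below are all hypothetical.

```python
# Hypothetical weights: the brief lists the four factors but not their weighting.
WEIGHTS = {"authority": 0.35, "cluster": 0.30, "headline": 0.20, "recency": 0.15}


def recency(age_hours: float, half_life_hours: float = 24.0) -> float:
    """Exponential time decay in [0, 1]: 1.0 when fresh, 0.5 after one half-life."""
    return 0.5 ** (max(age_hours, 0.0) / half_life_hours)


def clamp01(x: float) -> float:
    """Clamp a signal to the [0, 1] range."""
    return min(max(x, 0.0), 1.0)


def story_score(authority: float, cluster: float, headline: float, age_hours: float) -> float:
    """Weighted sum of four signals in [0, 1], scaled to 0-100 (assumed scheme)."""
    parts = {
        "authority": clamp01(authority),
        "cluster": clamp01(cluster),
        "headline": clamp01(headline),
        "recency": recency(age_hours),
    }
    total = sum(WEIGHTS[name] * value for name, value in parts.items())
    return round(100.0 * total, 1)
```

Under these assumed weights, a story from a highly authoritative source loses at most 15 points to age alone, since recency carries a weight of 0.15.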

RESEARCH · arXiv cs.LG English(EN) · 3d · [2 sources]

PAWS: Preference Learning with Advantage-Weighted Segments

Researchers have introduced PAWS, a method for preference-based reinforcement learning that addresses a training-inference mismatch. By using segment-level advantage functions for policy updates, PAWS aligns utility training with policy optimization, preserving preference information and avoiding unreliable per-step signals. In experiments on robotic manipulation and locomotion tasks, PAWS outperforms existing approaches, underscoring the value of distribution-consistent preference learning.

IMPACT Improves temporal credit assignment and policy optimization in preference-based reinforcement learning through distribution-consistent preference learning.
- PAWS
- Aleksandar Taranovic
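To make "segment-level advantage functions for policy updates" more concrete, here is a minimal sketch of one way such weighting could look. It is not the paper's algorithm: the summary does not give the formulas, so the batch-mean baseline, the exponential weighting (borrowed from advantage-weighted regression), the clipping, and all function names are assumptions.

```python
import math
from typing import List


def segment_advantages(segment_utilities: List[List[float]]) -> List[float]:
    """Sum learned per-step utilities within each segment, then subtract the batch mean.

    The batch-mean baseline is an assumption, not taken from the paper.
    """
    returns = [sum(segment) for segment in segment_utilities]
    baseline = sum(returns) / len(returns)
    return [r - baseline for r in returns]


def advantage_weights(advantages: List[float], temperature: float = 1.0,
                      max_weight: float = 20.0) -> List[float]:
    """Exponentiated advantages, clipped for stability (AWR-style, assumed)."""
    return [min(math.exp(a / temperature), max_weight) for a in advantages]


def weighted_segment_loss(segment_log_probs: List[List[float]],
                          weights: List[float]) -> float:
    """Negative log-likelihood where every step in a segment shares that segment's weight."""
    total = 0.0
    for log_probs, weight in zip(segment_log_probs, weights):
        total -= weight * sum(log_probs)
    return total / len(weights)
```

Because each step inherits the weight of its whole segment, the policy update in this sketch never relies on a per-step utility estimate in isolation, which is the property the summary highlights.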

RESEARCH · arXiv cs.CL English(EN) · 1w · [2 sources]

When Does Complexity Conditioning Help a Frozen Sentence Embedding? A Controlled Study of Per-Sentence and Pair-Level Difficulty Adaptation

Researchers have investigated how to adapt frozen sentence embeddings to input complexity and found that per-sentence difficulty adaptation is largely ineffective. Their study, which uses a Qwen3-Embedding-0.6B encoder, indicates that complexity is a property of sentence pairs more than of individual sentences. A pair-level residual gated by a cross-encoder difficulty signal, by contrast, showed consistent gains on specific tasks such as STS-B and QQP.

IMPACT This research clarifies when and how adapting sentence embeddings to input complexity can improve performance on specific NLP tasks.
- PAWS
- Qwen3-Embedding-0.6B
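The idea of a pair-level residual gated by a difficulty signal can be sketched as follows. This is only an illustration of the general pattern: the summary does not describe the study's architecture, so the sigmoid gate, its scale and bias, and the function names are hypothetical, and the residual and difficulty values are passed in as plain numbers rather than produced by real models.

```python
import math
from typing import Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_pair_score(emb_a: Sequence[float], emb_b: Sequence[float],
                     difficulty: float, residual: float,
                     gate_scale: float = 4.0, gate_bias: float = -2.0) -> float:
    """Frozen-embedding cosine score plus a residual scaled by a difficulty gate.

    `difficulty` stands in for a cross-encoder's pair-level signal and `residual`
    for the output of a small learned head; both are assumed inputs here.
    """
    base = cosine(emb_a, emb_b)  # the frozen embeddings are never modified
    gate = sigmoid(gate_scale * difficulty + gate_bias)
    return base + gate * residual
```

With these assumed gate parameters, easy pairs (difficulty near 0) stay close to the plain cosine score, while hard pairs (difficulty near 1) receive most of the residual correction.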
