Brief

[4/4] 221 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

TOOL · arXiv cs.CL English(EN) · 4d

X-Token: Projection-Guided Cross-Tokenizer Knowledge Distillation

Researchers have developed X-Token, a knowledge distillation technique that lets a student model learn from teacher models that use a different tokenizer. The method addresses two limitations of existing logit-based distillation, the uncommon-token failure and over-conservative matching, which can suppress critical tokens or exclude near-equivalent ones. X-Token uses a sparse projection matrix to align student and teacher distributions. It outperforms current state-of-the-art methods on benchmarks such as GSM8k and reports significant further gains with multi-teacher setups.

IMPACT Improves cross-tokenizer knowledge transfer, potentially enabling more efficient training of diverse language models.
- GSM8k
- Phi-4-Mini
- Qwen3-4B
- Llama-3.2-1B
- X-Token
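
To make the idea of aligning distributions across tokenizers concrete, here is a minimal sketch of projecting a teacher distribution onto a student vocabulary through a sparse mapping and then scoring the student with a KL divergence. It is an illustration only, not the X-Token authors' implementation: the function names, the toy vocabularies, the projection weights, and the choice of KL as the loss are all assumptions made for this example.

```python
import math


def project_teacher_probs(teacher_probs, projection, student_vocab_size):
    """Map a teacher distribution onto the student vocabulary.

    `projection` is a hypothetical sparse matrix stored row-wise:
    teacher token id -> list of (student token id, weight).
    """
    projected = [0.0] * student_vocab_size
    for teacher_id, prob in enumerate(teacher_probs):
        for student_id, weight in projection.get(teacher_id, []):
            projected[student_id] += prob * weight
    total = sum(projected)
    if total == 0.0:
        raise ValueError("projection maps no teacher mass onto the student vocabulary")
    # Renormalize so the projected target is a valid distribution.
    return [value / total for value in projected]


def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted), skipping zero-probability target entries."""
    return sum(
        t * math.log(t / max(q, eps))
        for t, q in zip(target, predicted)
        if t > 0.0
    )


# Toy example (all values are made up): a 3-token teacher vocabulary
# and a 4-token student vocabulary.
teacher_probs = [0.7, 0.2, 0.1]
projection = {
    0: [(0, 1.0)],            # one-to-one match
    1: [(1, 0.5), (2, 0.5)],  # teacher token split across two student tokens
    2: [(3, 1.0)],
}
student_probs = [0.6, 0.15, 0.15, 0.1]

target = project_teacher_probs(teacher_probs, projection, student_vocab_size=4)
loss = kl_divergence(target, student_probs)
```

In this sketch each teacher token touches only a few student tokens, so the mapping stays sparse. How X-Token actually builds and applies its projection is described in the paper itself and is not reproduced here.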

RESEARCH · arXiv cs.LG English(EN) · 3d · [2 sources]

Unextractable Protocol Models: Collaborative Training and Inference without Weight Materialization

Researchers have introduced Unextractable Protocol Models (UPMs), a framework for collaborative training and inference of neural networks in which each participant processes only a subset of the model. By periodically injecting time-varying transforms, the approach ensures that a complete set of model weights is never available to any single entity. UPMs show minimal impact on perplexity and add only a small overhead in latency, bandwidth, and memory during inference and training.

IMPACT Enables secure collaborative AI development by preventing model extraction, potentially facilitating community-driven training initiatives.

TOOL · arXiv cs.IR (Information Retrieval) English(EN) · 3d

TubiFM: Unified Item, Carousel, and Search Ranking for Streaming Discovery

Researchers have developed TubiFM, a model that unifies item, carousel, and search ranking for streaming platforms. It represents each user journey as a single token sequence, called a "user story," and uses a Llama 3.2 1B base to perform next-token prediction across discovery tasks. This approach significantly improves search and carousel viewing time while reducing latency and simplifying the overall ranking system.

IMPACT Unified discovery models like TubiFM can improve user engagement and reduce operational costs for streaming platforms.

SIGNIFICANT · Together AI blog (SW) · 2mo

Mamba-3

Together AI has released Mamba-3, a state space model (SSM) that prioritizes inference efficiency over training speed. The model features a more expressive recurrence formula, complex-valued state tracking, and a multi-input, multi-output (MIMO) variant that improves accuracy without sacrificing decoding speed. At the 1.5B parameter scale, Mamba-3 SISO shows better prefill and decode latency than previous Mamba versions and even the Llama-3.2-1B Transformer model. The team has also open-sourced the model's kernels, developed collaboratively with researchers from Carnegie Mellon University, Princeton University, and Cartesia AI.

IMPACT Sets a new benchmark for inference efficiency in state space models, potentially influencing future LLM architectures and deployment strategies.
