Brief

last 24h

[50/9095] 224 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

RESEARCH · Hugging Face Daily Papers English(EN) · 4d · [8 sources]

UniPET: a universal network for high-quality PET image denoising across varied dose reduction factors

Researchers have developed two novel deep learning approaches for improving Positron Emission Tomography (PET) image denoising. UniPET utilizes domain generalization and region-aware learning to create a universal model capable of denoising images across various dose reduction factors, addressing issues of style misalignment and over-smoothing. U-TTT employs test-time training with dual-domain adaptation (spatial and frequency) to dynamically adjust model parameters during inference, enabling robust generalization even with unseen dose levels or scanner types.

IMPACT These advancements in AI-driven PET image denoising could lead to more accurate diagnoses with lower radiation exposure for patients.
RESEARCH · Hugging Face Daily Papers English(EN) · 4d · [3 sources]

WorldOlympiad: Can Your World Model Survive a Triathlon?

A new benchmark called WorldOlympiad has been introduced to evaluate video-based world models. It assesses physical faithfulness, geometric consistency, and interaction fidelity, going beyond typical metrics like visual quality. The benchmark aims to reveal shortcomings in current models' ability to adhere to physical laws and maintain coherent 3D structures over extended periods. Experiments using WorldOlympiad on state-of-the-art models have exposed significant gaps in their reasoning and interaction capabilities.

IMPACT This benchmark could drive improvements in generative models' understanding of physics and 3D consistency, crucial for applications like robotics and gaming.
RESEARCH · Hugging Face Daily Papers English(EN) · 4d · [4 sources]

Next Forcing: Causal World Modeling with Multi-Chunk Prediction

Researchers have introduced "Next Forcing," a novel multi-chunk prediction framework designed to enhance autoregressive video generation. This method addresses limitations in current models by providing explicit signals about future dynamics, leading to faster training convergence and improved accuracy, particularly at high frame rates. The framework also accelerates inference and demonstrates better adherence to physical laws in generated videos.

IMPACT Accelerates training and inference for autoregressive video models, potentially enabling more complex and realistic video generation.
RESEARCH · arXiv stat.ML English(EN) · 5d · [2 sources]

Weighted universal approximation of differentiable maps on infinite-dimensional manifolds

Researchers have published a paper detailing a generalized universal approximation theorem for neural networks. This new theorem extends previous work by enabling the approximation of not only functions but also their derivatives. The findings are applicable to differentiable maps on infinite-dimensional manifolds and have implications for approximating non-anticipative functionals and path space functionals.

IMPACT Extends theoretical understanding of neural network capabilities, potentially enabling more complex function and derivative approximations.
- Philipp Schmocker
RESEARCH · arXiv cs.AI Italiano(IT) · 5d · [2 sources]

Topological Neural Operators

Researchers have introduced Topological Neural Operators (TNOs), a new framework for learning operators on cell complexes. TNOs extend existing neural operators by modeling interactions through Discrete Exterior Calculus, allowing for explicit cross-dimensional coupling. This approach respects the geometric properties of physical quantities and can improve accuracy on partial differential equation (PDE) benchmarks, especially for complex flow problems.

IMPACT Introduces a novel framework for operator learning that respects geometric properties and improves PDE benchmark accuracy.
RESEARCH · arXiv cs.CV English(EN) · 5d · [2 sources]

End-to-End Optimization of Incoherent Imaging for Classification Under Detector-Limited Readout

Researchers have developed a theoretical framework to understand when optimizing optical front-ends with neural network back-ends improves imaging classification performance. The study found that these gains are most significant under constrained detector readout, such as limited measurements or coarse sampling, by enhancing class separability. However, under full detector readout, conventional lenses perform comparably, and joint optimization offers no empirical advantage. The research also highlights that these optical-neural network co-designs are most effective with low detector noise and when discriminative content is concentrated at lower spatial frequencies.

IMPACT Provides a theoretical basis for co-designing optics and AI, potentially leading to more efficient imaging systems for classification tasks.
- SVHN
- MNIST
RESEARCH · arXiv cs.CV English(EN) · 5d · [2 sources]

POTATR: A Lightweight Image-to-Graph Model for Page-Level Table Extraction

Researchers have developed POTATR, a new lightweight image-to-graph model for extracting tables from documents. The 29-million-parameter model significantly outperforms existing methods on the PubTables-v2 benchmark, achieving a GriTS_Con score of 0.964. POTATR is also considerably faster and more cost-effective than current large language models, and its output is spatially grounded, which supports verification and further integration.

IMPACT Sets a new standard for efficient and accurate table extraction, potentially accelerating document processing workflows.
RESEARCH · arXiv cs.LG English(EN) · 5d · [2 sources]

Zero Touch Predictive Orchestration: Automating Time-Series Models for the Cloud-Edge Continuum

Researchers have developed a new automated architecture for time-series prediction in volatile cloud-edge environments. This system addresses the "cold start" problem for newly discovered nodes by merging sparse local telemetry data with a high-resolution public dataset called TimeTrack. A Neural Architecture Search engine then generates accurate baseline models, significantly improving forecasting accuracy and convergence speed.

IMPACT Introduces a novel data-mixing methodology to improve time-series forecasting accuracy in volatile cloud-edge environments.
- Cloud-Edge Continuum
- Abd Elghani Meliani
RESEARCH · arXiv cs.AI English(EN) · 5d · [2 sources]

Who Earns the Safety? Intervention-Aware Quantum Predictive Control with Safety Attribution

Researchers have developed a new method called Intervention-Aware Variational Quantum Differentiable Predictive Control (IA-VQC-DPC) to better measure the safety contributions of AI policies versus their protective layers. This approach trains quantum circuit policies with a budget that penalizes over-reliance on safety filters. Evaluations on building control emulators demonstrated that IA-VQC-DPC significantly reduces pre-filter violations and reliance on safety layers, indicating improved policy-level safety.

IMPACT Introduces a novel framework for evaluating and improving the intrinsic safety of AI policies, moving beyond simple compliance.
RESEARCH · arXiv cs.CV English(EN) · 5d · [2 sources]

SemDINO: A DINOv3-Driven Network for Cross-Temporal Semantic Alignment in Change Detection

Researchers have developed SemDINO, a new network designed for semantic change detection in remote sensing imagery. This model integrates a dual-branch encoder using CNNs and frozen DINOv3 features, along with a multi-scale temporal interaction module. SemDINO also incorporates modules for semantic purification and change enhancement to improve accuracy and robustness against pseudo-changes.

IMPACT Introduces a novel architecture for improved semantic change detection in remote sensing, potentially aiding in land-cover analysis and monitoring.
- DINOv3
- SemDINO
RESEARCH · arXiv cs.LG English(EN) · 5d · [2 sources]

Discovering Functionally Selective Brain Regions with a Deep Topographic Multimodal Model

Researchers have developed Topo-Omni, a novel topographic multimodal model that integrates visual, auditory, and language processing onto a single in-silico sheet. This model, fine-tuned from a pretrained foundation model, demonstrates that a unified spatial principle can organize representations across different modalities and processing stages. The model's clusters align with human neuroimaging findings, and manipulating these clusters selectively impacts perception, offering testable hypotheses about cortical organization.

IMPACT This model offers a new framework for understanding brain organization and can generate testable hypotheses for neuroscience research.
- arXiv
- Topo-Omni
RESEARCH · arXiv cs.AI English(EN) · 5d · [2 sources]

Data Synthesis and Parameter-Efficient Fine-Tuning for Low-Resource NMT: A Case Study on Q'eqchi' Mayan

Researchers have developed a novel data synthesis method to create neural machine translation (NMT) models for low-resource Indigenous languages, specifically Q'eqchi' Mayan. By transforming dictionaries into a synthetic corpus and using Parameter-Efficient Fine-Tuning (PEFT) with LoRA adapters on an mT5-base model, they achieved strong structural acquisition. However, the resulting model showed a significant gap in lexical grounding compared to organic language, indicating that while synthetic data is effective for learning grammar, authentic data is crucial for semantic refinement.

IMPACT Demonstrates a viable method for creating translation models for endangered languages, preserving linguistic data sovereignty.
RESEARCH · arXiv cs.AI English(EN) · 5d · [4 sources]

Transition-Based Digital Twin Modelling for Alzheimer's Disease under Sparse Longitudinal Data

Researchers have developed advanced machine learning models to predict Alzheimer's disease severity and progression. One approach uses multimodal data, including MRI scans and clinical information, with an ordinal regression framework to improve accuracy and interpretability in staging the disease. Another method introduces a personalized digital twin framework that leverages sparse longitudinal data to model disease transitions, enabling patient-specific trajectory analysis and uncertainty quantification.

IMPACT These AI models offer improved tools for early detection, personalized monitoring, and clinical decision support in neurodegenerative disease research.
RESEARCH · arXiv cs.AI English(EN) · 5d · [3 sources]

ATN3D: Density-Aware LiDAR-Radar Early 3D Object Detection Under Extreme Sparsity

Researchers have developed ATN3D, a new LiDAR-Radar framework designed for improved 3D object detection in sparse sensing conditions, crucial for autonomous vehicles. The system addresses challenges in long-range detection by employing density-aware early fusion and occupancy-gated aggregation to reduce noise and optimize detection of distant objects. ATN3D demonstrated significant performance gains on the VoD benchmark, particularly in foggy conditions and for objects over 30 meters away, indicating more reliable early detection in challenging environments.

IMPACT Enhances perception systems for autonomous vehicles, enabling earlier and more reliable detection of distant objects in challenging weather and sparse sensing scenarios.
- VoD benchmark
- Radar
- LiDAR
- ATN3D
RESEARCH · arXiv cs.AI English(EN) · 5d · [2 sources]

I Was Scrolling and Then I Saw a Pregnant Strawberry

A new academic paper analyzes the phenomenon of "AI minidramas" or "fruit dramas," short generative AI video series popular on social media. The research argues these seemingly cute videos perpetuate harmful gendered narratives, associating female characters with transgression and reproductive themes. Furthermore, the paper suggests these narratives can also encode racialization, with the videos' aesthetic serving to mask their ideological content.

IMPACT Highlights how generative AI can be used to perpetuate harmful social narratives, necessitating critical analysis of AI-generated content.
RESEARCH · arXiv cs.AI English(EN) · 5d · [3 sources]

Optical Reasoning: Rethinking Images as an Expressive Reasoning Medium Beyond Text

Researchers have introduced "optical reasoning," a novel approach that utilizes images as the primary medium for AI reasoning, moving beyond traditional text-based methods. The technique comes in two variants: typographic-based optical reasoning for compact rationale rendering and graphical-based optical reasoning for structured visual rationales. Experiments show that optical reasoning can match or surpass text-based reasoning on various benchmarks while significantly reducing the number of reasoning tokens.

IMPACT This approach could lead to more efficient and versatile AI models by leveraging visual data for complex reasoning tasks.
RESEARCH · arXiv cs.AI English(EN) · 5d · [3 sources]

LLM-Orchestrated Conformance Checking in Stroke Care Without Computer-Interpretable Guidelines

Researchers have developed a framework using Large Language Models (LLMs) to check if patient care aligns with clinical guidelines, even when those guidelines aren't in a machine-readable format. This system extracts patient data from unstructured texts and interprets rules from guideline documents to assess compliance. In a study of stroke care at Alessandria Hospital, the LLM-based approach found that over 86% of patient traces adhered to the guidelines.

IMPACT Enables automated healthcare compliance checks using unstructured data, potentially improving quality and efficiency.
- Alessandria Hospital
- Large Language Models
RESEARCH · arXiv cs.CV English(EN) · 5d · [3 sources]

Optical Music Recognition for Real-World Manuscripts with Synthetic Data

Researchers have developed a new approach to Optical Music Recognition (OMR) specifically for real-world handwritten musical manuscripts. Existing OMR systems struggle with the diversity of manuscripts, which differ significantly from the digital formats they are typically trained on. By using synthetic data generated with fine-grained music notation graph annotations and a tool called Smashcima, the new method shows significant improvement in recognizing complex piano notation, even without in-domain symbol annotation.

IMPACT Improves AI's ability to digitize and preserve historical musical documents, making them more accessible.
- Smashcima
- MuNG
RESEARCH · dev.to — LLM tag English(EN) · 4d · [2 sources]

LoRA and QLoRA fine-tuning: what they actually do under the hood

This article provides a practical guide to fine-tuning large language models like Llama 3 using Parameter-Efficient Fine-Tuning (PEFT) methods, specifically LoRA and QLoRA. It explains that while base LLMs are general, fine-tuning can adapt them for specific tasks, tones, or knowledge. LoRA achieves this by training only a small set of adapter weights instead of the entire model, significantly reducing computational cost. QLoRA further optimizes this by incorporating 4-bit quantization, enabling fine-tuning of very large models on limited hardware.

IMPACT Enables developers to adapt large language models for specific tasks and tones with reduced computational resources.
- LoRA
- Dettmers et al.
- Hu et al.
- A100-80
- RTX 4090
- Llama
- LLM
- QLoRA
- GPT
- Llama 3
- DeepSeek
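The adapter idea in the summary above can be sketched in plain Python. This is a minimal illustration, assuming the standard LoRA formulation from Hu et al., in which a frozen weight matrix W (d × k) is augmented by a trainable low-rank product B·A scaled by alpha / r. The helper names and the toy dimensions are hypothetical and are not taken from the article; quantization (the QLoRA part) is not shown.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_effective_weight(w, a, b, alpha, r):
    """Return W + (alpha / r) * B @ A; the frozen W itself is not modified."""
    delta = matmul(b, a)  # (d x r) @ (r x k) -> (d x k)
    scale = alpha / r
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

def trainable_params(d, k, r):
    """Parameter counts: full fine-tuning (d*k) versus LoRA adapters (r*(d+k))."""
    return d * k, r * (d + k)

random.seed(0)
d, k, r, alpha = 6, 4, 2, 4  # toy sizes, chosen only for illustration
w = [[random.gauss(0, 1) for _ in range(k)] for _ in range(d)]     # frozen base weight
a = [[random.gauss(0, 0.01) for _ in range(k)] for _ in range(r)]  # trainable, small random init
b = [[0.0] * r for _ in range(d)]                                  # trainable, zero init

w_eff = lora_effective_weight(w, a, b, alpha, r)
full, lora = trainable_params(4096, 4096, 8)
print(full, lora)
```

Because B starts at zero, the effective weight equals the base weight before any training step, and for a 4096 × 4096 layer at rank 8 the adapter holds 65,536 trainable values against 16,777,216 for the full matrix.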
RESEARCH · arXiv cs.LG English(EN) · 4d · [2 sources]

Sleep EEG Signal Criticality as a Non-Invasive Predictor of Cognitive Decline in Dementia

Researchers have found that sleep EEG signal criticality, measured using Multifractal Detrended Fluctuation Analysis (MFDFA), can predict future cognitive decline in dementia. Analysis of longitudinal data showed that cognitively healthy individuals exhibited sleep dynamics closer to an optimally critical state compared to those who later developed dementia. These findings suggest that MFDFA measures could be integrated into automated sleep-based screening tools for earlier intervention.

IMPACT Highlights potential for AI-driven tools to enable earlier diagnosis and intervention for neurodegenerative diseases.
RESEARCH · IEEE Spectrum — AI English(EN) · 4d · [2 sources]

AI Can Help Track the World’s Shrinking Glaciers

Researchers have developed a new AI approach to automate the tracking of glacier calving fronts, a critical but labor-intensive task for monitoring climate change. By adapting a deep learning model with minimal new data, including hand-labeled images and geological maps, the system significantly reduces the error in identifying glacier boundaries. This advancement promises to enable more comprehensive global glacier monitoring, with initial applications already underway in Norway's Svalbard archipelago.

IMPACT Enables more comprehensive global glacier monitoring, crucial for understanding climate change impacts and sea level rise.
RESEARCH · Hugging Face Daily Papers English(EN) · 4d · [2 sources]

FOGO: Forgetting-aware Orthogonalization Optimizer

Researchers have introduced FOGO, a novel optimizer designed to combat forgetting during AI model training. FOGO addresses both short-term forgetting at each training step and long-term forgetting common in continual learning by detecting and resolving gradient interference. The optimizer uses spectral orthogonalization and a compact codebook memory to preserve past update directions. It shows improved convergence and knowledge retention across various tasks, including fine-tuning LLaVA-7B and pretraining GPT-2, and outperforms existing optimizers such as Adam and Muon.

IMPACT FOGO's ability to reduce forgetting could lead to more efficient and effective AI model training, particularly in continual learning scenarios.
- GPT-2
- Muon
- LLaVA-7B
- Adam
RESEARCH · Hugging Face Daily Papers English(EN) · 4d · [3 sources]

Flash-GMM: A Memory-Efficient Kernel for Scalable Soft Clustering

Researchers have developed Flash-GMM, a new fused Triton kernel designed for efficient Gaussian Mixture Model (GMM) computations on GPUs. This kernel significantly reduces memory requirements by avoiding the materialization of the full responsibility matrix, leading to a 20x speedup and enabling the processing of datasets 100x larger than previously possible on a single device. Flash-GMM has been integrated into approximate nearest-neighbor search, offering a viable alternative to k-means clustering and improving recall rates.

IMPACT Accelerates GMM clustering for large-scale data, potentially improving performance in applications like ANN search.
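The memory idea behind avoiding the full responsibility matrix can be illustrated without a GPU. The sketch below is plain Python for a one-dimensional GMM, not the Triton kernel and not the authors' implementation: it assumes the textbook EM updates and simply accumulates per-component sufficient statistics one point at a time, so the N × K matrix is never stored. All function names and the toy data are hypothetical.

```python
import math

def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def streaming_e_step(points, weights, means, variances):
    """Accumulate sufficient statistics; only O(K) values are alive per point."""
    k = len(weights)
    n_k, sum_x, sum_x2 = [0.0] * k, [0.0] * k, [0.0] * k
    log_lik = 0.0
    for x in points:
        logs = [math.log(weights[j]) + log_gauss(x, means[j], variances[j])
                for j in range(k)]
        top = max(logs)
        norm = top + math.log(sum(math.exp(v - top) for v in logs))  # log-sum-exp
        log_lik += norm
        for j in range(k):
            resp = math.exp(logs[j] - norm)  # responsibility, used then discarded
            n_k[j] += resp
            sum_x[j] += resp * x
            sum_x2[j] += resp * x * x
    return n_k, sum_x, sum_x2, log_lik

def m_step(n_k, sum_x, sum_x2, n):
    """Standard M-step computed from the accumulated statistics."""
    weights = [c / n for c in n_k]
    means = [s / c for s, c in zip(sum_x, n_k)]
    variances = [max(s2 / c - m * m, 1e-6)  # floor keeps the variance positive
                 for s2, c, m in zip(sum_x2, n_k, means)]
    return weights, means, variances

points = [-2.1, -1.9, -2.0, 2.0, 2.2, 1.8]  # toy data: two well-separated clusters
weights, means, variances = [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0]
for _ in range(20):
    n_k, sum_x, sum_x2, log_lik = streaming_e_step(points, weights, means, variances)
    weights, means, variances = m_step(n_k, sum_x, sum_x2, len(points))
print(means)
```

A fused GPU kernel would presumably apply the same accumulate-and-discard pattern per tile of points rather than per point; that mapping is an assumption here, not something the summary spells out.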
RESEARCH · Hugging Face Daily Papers English(EN) · 4d · [4 sources]

Attention Amnesia in Hybrid LLMs: When CoT Fine-Tuning Breaks Long-Range Recall, and How to Fix It

Researchers have identified that Chain-of-Thought (CoT) fine-tuning, while improving reasoning, significantly degrades long-context recall in hybrid linear-attention models. This issue, termed "attention amnesia," causes performance drops on tasks like Needle-In-A-Haystack. A new training-free method called QK-Restore has been proposed to fix this by restoring specific query-key projection weights from a pre-fine-tuning checkpoint, successfully recovering long-context capabilities without sacrificing reasoning performance.

IMPACT Addresses a critical issue in LLM fine-tuning, potentially enabling more robust long-context capabilities for advanced reasoning tasks.
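The restore step described above amounts to a selective checkpoint merge. The following is a minimal sketch, assuming checkpoints are flat name-to-tensor mappings and that query and key projections can be recognized by the substrings `q_proj` and `k_proj`. Both the naming scheme and the choice to restore every matching entry are assumptions made for illustration; the summary says only that "specific" projection weights are restored, so which layers the actual method selects is not shown here.

```python
def restore_qk(finetuned, pretrained, markers=("q_proj", "k_proj")):
    """Return a copy of `finetuned` with query/key projections taken from `pretrained`.

    `markers` is a hypothetical naming convention, not the paper's selection rule.
    """
    restored = dict(finetuned)  # shallow copy; the inputs are left untouched
    for name, tensor in pretrained.items():
        if name in restored and any(marker in name for marker in markers):
            restored[name] = tensor
    return restored

# Toy checkpoints with lists standing in for weight tensors.
pretrained = {
    "layers.0.attn.q_proj.weight": [1.0, 1.0],
    "layers.0.attn.k_proj.weight": [2.0, 2.0],
    "layers.0.attn.v_proj.weight": [3.0, 3.0],
    "layers.0.mlp.up.weight": [4.0, 4.0],
}
finetuned = {name: [value + 0.5 for value in tensor] for name, tensor in pretrained.items()}

merged = restore_qk(finetuned, pretrained)
print(sorted(name for name in merged if merged[name] == pretrained[name]))
```

No gradient step is involved, which matches the "training-free" description: the value and MLP weights keep their fine-tuned values, and only the matched projections revert.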
RESEARCH · Hugging Face Daily Papers English(EN) · 4d · [3 sources]

Kwai Keye-VL-2.0 Technical Report

Kwai has released Keye-VL-2.0-30B-A3B, an open-source multimodal foundation model designed for long-video understanding and agentic intelligence. This model utilizes DeepSeek Sparse Attention to process up to 256K context, capturing essential frames and temporal dependencies in hour-long videos. It also incorporates Cross-Modal Multi-Teacher On-Policy Distillation to enhance multi-task alignment and agent collaboration across various scenarios. Evaluations show state-of-the-art performance on video understanding and temporal localization benchmarks.

IMPACT Enables advanced agent collaboration and improved long-video comprehension, potentially accelerating development in multimodal AI applications.
RESEARCH · Latent Space (swyx) English(EN) · 4d · [4 sources]

[AINews] FrontierCode: Benchmarking for Code Quality over Slop

A new benchmark called UOJ-Bench has been developed to evaluate Large Language Models (LLMs) on code generation, hacking, and repair tasks, moving beyond simple problem-solving. Initial tests show that even top-tier models struggle with identifying errors in human-written code, with success rates below 50% in one-shot evaluations. While test-time scaling improves performance significantly, it incurs substantial computational costs, limiting practical deployment. However, the best models can still identify errors in a small percentage of full-score submissions, suggesting potential for LLMs to offer complementary insights to existing judging systems.

IMPACT New benchmarks like UOJ-Bench and FrontierCode are pushing LLM evaluations beyond simple problem-solving to assess more nuanced capabilities like code repair and maintainability, highlighting current limitations.
RESEARCH · arXiv cs.NE (Neural & Evolutionary) English(EN) · 5d · [2 sources]

Spiking Neural Network inference on FPGAs with hls4ml

Researchers have developed an extension for the hls4ml toolkit to enable the deployment of Spiking Neural Networks (SNNs) on Field-Programmable Gate Arrays (FPGAs). This new capability allows for clock-driven inference of SNNs trained in PyTorch, offering low-latency temporal processing. The system demonstrated inference times of approximately 34 microseconds on a quantized SNN, paving the way for streamlined optimization and deployment of SNN models for real-time applications.

IMPACT Enables low-latency inference for Spiking Neural Networks on FPGAs, potentially improving real-time processing capabilities.
RESEARCH · arXiv cs.AI English(EN) · 5d · [2 sources]

An Agency-Transferring Model-Free Policy Enhancement Technique

Researchers have developed a new technique to enhance reinforcement learning (RL) policies by leveraging existing suboptimal baseline policies. This method gradually transfers control from the baseline to a trainable learning policy, improving training efficiency and ultimately producing a standalone policy that outperforms the original baseline. The approach is formalized with theoretical analysis and demonstrated through empirical results on continuous-control benchmarks, showing high goal-reaching rates throughout the training process.

IMPACT Introduces a more efficient method for training reinforcement learning agents, potentially reducing computational costs and improving performance on complex control tasks.
RESEARCH · arXiv cs.CV English(EN) · 5d · [2 sources]

iMaC: Translating Actions into Motion and Contact Images for Embodied World Models

Researchers have introduced iMaC (Image as Action Control), a new paradigm for embodied world models in robotics. This approach uses raw visual images as action representations, moving away from traditional low-dimensional vectors. iMaC aims to improve generalization, dynamic modeling, and control for diverse robotic agents by treating visual manipulation as image-based action tokens.

IMPACT This new approach could enable more flexible and universal control for heterogeneous embodied agents in robotics.
- iMaC
- Embodied world models
RESEARCH · arXiv cs.CV English(EN) · 5d · [2 sources]

Beyond Spherical Harmonics: Rethinking Appearance Models for Radiance Reconstruction

Researchers have developed a new method for representing complex appearance effects in 3D scene reconstruction, moving beyond traditional Spherical Harmonics (SH). Their work introduces the Normalized Anisotropic Spherical Gabor function, which efficiently models high-frequency details like specular reflections and glints. This new formulation offers improved reconstruction quality while being significantly more memory-efficient and faster to evaluate than existing approaches.

IMPACT Introduces a more efficient and effective method for modeling complex visual phenomena in 3D reconstruction, potentially improving realism in generated scenes.
- Normalized Anisotropic Spherical Gabor function
RESEARCH · arXiv cs.CL English(EN) · 5d · [2 sources]

iOSWorld: A Benchmark for Personally Intelligent Phone Agents

Researchers have introduced iOSWorld, a new benchmark designed to evaluate the personalization capabilities of AI agents on mobile devices. This benchmark features a simulated iOS environment with 26 interconnected apps that store user-specific data like messages and financial records. It includes 133 tasks, ranging from single-app operations to complex multi-app scenarios requiring memory and personalization inference. Initial evaluations show that even advanced models struggle with these tasks, with the best configuration achieving only 52% overall accuracy.

IMPACT This benchmark will drive the development of more personalized and context-aware AI agents for mobile devices.
RESEARCH · arXiv cs.AI English(EN) · 5d · [2 sources]

Difference-Aware Retrieval Policies for Imitation Learning

Researchers have developed a new imitation learning method called Difference-Aware Retrieval Policies (DARP). This approach improves generalization by using training data during inference, predicting actions based on k-nearest neighbors and their relative distances to query states. DARP achieves significant performance gains over standard behavior cloning in various domains, including robotics and continuous control.

IMPACT Enhances generalization in imitation learning, potentially improving robotic control and autonomous systems.
- arXiv
- Difference-Aware Retrieval Policies for Imitation Learning
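The retrieval step in the DARP summary can be sketched as follows. This is a rough illustration only: it retrieves the k nearest demonstration states and computes each neighbor's offset to the query, but it replaces the learned action predictor with a fixed inverse-distance weighting of the neighbors' actions. That substitution, the function names, and the toy demonstrations are all assumptions, not the paper's method.

```python
import math

def k_nearest(query, demos, k):
    """Return the k (state, action) pairs closest to `query` in Euclidean distance."""
    return sorted(demos, key=lambda demo: math.dist(query, demo[0]))[:k]

def difference_features(query, neighbors):
    """Pair each neighbor action with the offset from that neighbor's state to the query."""
    return [([q - s for q, s in zip(query, state)], action)
            for state, action in neighbors]

def predict_action(query, demos, k=2, eps=1e-8):
    """Stand-in predictor: inverse-distance-weighted average of neighbor actions."""
    feats = difference_features(query, k_nearest(query, demos, k))
    weights = [1.0 / (math.hypot(*diff) + eps) for diff, _ in feats]
    total = sum(weights)
    dim = len(feats[0][1])
    return [sum(w * action[i] for w, (_, action) in zip(weights, feats)) / total
            for i in range(dim)]

# Hypothetical demonstrations: (state, action) pairs.
demos = [((0.0, 0.0), (0.0,)), ((1.0, 0.0), (1.0,)), ((5.0, 5.0), (9.0,))]
midpoint_action = predict_action((0.5, 0.0), demos, k=2)
exact_action = predict_action((1.0, 0.0), demos, k=1)
print(midpoint_action, exact_action)
```

In a learned variant, the `(offset, action)` pairs returned by `difference_features` would be the inputs to a trained model rather than to a hand-written weighting rule.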
RESEARCH · arXiv cs.LG English(EN) · 5d · [2 sources]

Perturbative Contrastive Physical Learning

Researchers have introduced Perturbative Contrastive Physical Learning (PCPL), a new framework where learning arises from contrasting how physical systems respond to slight variations. This approach unifies and extends existing methods like Equilibrium Propagation and Frequency Propagation. PCPL allows learning without centralized gradient computation, as the learning geometry emerges implicitly from the system's physical response.

IMPACT Introduces a novel learning paradigm that bypasses traditional gradient-based methods, potentially enabling new forms of physical AI systems.
RESEARCH · arXiv cs.AI English(EN) · 5d · [2 sources]

Hybrid Robustness Verification for Spatio-Temporal Neural Networks

Researchers have developed a new framework called Spatio-Temporal Bound Propagation (STBP) to improve the verification of neural networks used in safety-critical applications like autonomous driving and medical imaging. This method models adversarial perturbations with more realistic spatio-temporal constraints, leading to tighter approximations and better robustness guarantees than existing techniques. The framework also introduces ST-Bench, a new benchmark designed to systematically evaluate verifiable robustness in these domains.

IMPACT Enhances AI safety by providing more accurate robustness guarantees for models in critical systems.
RESEARCH · arXiv cs.LG English(EN) · 5d · [2 sources]

Learning Dynamics Reveal a Hierarchy of Weight-Induced Layerwise Gram Metrics

Researchers have developed a new framework for understanding the training dynamics of feed-forward ReLU neural networks. Their work rewrites gradient descent not as a weight-space dynamic, but as a collective dynamic on the training-set space. For deeper networks, this reveals a hierarchical structure of weight-induced operators that manage information flow between layers.

IMPACT Provides a new theoretical lens for analyzing and potentially optimizing neural network training.
RESEARCH · arXiv cs.LG English(EN) · 5d · [2 sources]

Tight Sample Complexity of Transformers

Researchers have tightly characterized the VC dimension of depth-L Transformers with W parameters, establishing an upper bound of O(LW log(TW)) and a nearly matching lower bound. The study also characterizes the sample complexity of chain-of-thought learning with these Transformers, showing that teacher forcing achieves O(LW log((T+T')W)) sample complexity, while any learning rule that uses chain-of-thought data requires at least Ω(LW log((T+T')W/L)) examples.

IMPACT Provides theoretical bounds on Transformer learning, potentially guiding future model design and efficiency.
RESEARCH · arXiv cs.LG English(EN) · 5d · [2 sources]

Disentanglement with Holographic Reduced Representations

Researchers have developed a novel unsupervised learning algorithm for neural disentanglement using holographic reduced representations (HRR). This approach treats disentangled representations as symbolic structures, moving away from continuous representations common in prior work. The HRR unbinding operation demonstrates an inductive bias for separating factors, achieving competitive results on disentanglement metrics and showing robustness to noise.

IMPACT Introduces a novel method for disentangling representations, potentially improving model interpretability and robustness.
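For readers unfamiliar with the binding and unbinding operations the summary refers to, here is a small sketch of the classical HRR operators: binding as circular convolution and unbinding as circular correlation. It shows only these standard operators on random vectors, not the paper's learning algorithm. The dimension, seed, and helper names are arbitrary choices for illustration.

```python
import math
import random

def bind(a, b):
    """Circular convolution: the HRR binding operator."""
    n = len(a)
    return [sum(a[j] * b[(i - j) % n] for j in range(n)) for i in range(n)]

def unbind(a, bound):
    """Circular correlation with `a`: an approximate inverse of bind(a, .)."""
    n = len(a)
    return [sum(a[j] * bound[(i + j) % n] for j in range(n)) for i in range(n)]

def random_vector(n, rng):
    """Gaussian entries with variance 1/n, so the expected norm is about 1."""
    return [rng.gauss(0.0, 1.0 / math.sqrt(n)) for _ in range(n)]

def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v)))

rng = random.Random(0)
n = 256
role, filler, unrelated = (random_vector(n, rng) for _ in range(3))

recovered = unbind(role, bind(role, filler))
sim_filler = cosine(recovered, filler)        # noisy but clearly aligned with the filler
sim_unrelated = cosine(recovered, unrelated)  # close to zero
print(round(sim_filler, 2), round(sim_unrelated, 2))
```

Unbinding is approximate: the recovered vector is the filler plus noise, which is why similarity to the true filler is well above that of an unrelated vector but below 1.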
RESEARCH · arXiv cs.AI English(EN) · 5d · [2 sources]

Proxy Reward Internalization and Mechanistic Exploitation: A Learned Precursor to Reward Hacking and Its Generalization

Researchers have introduced PRIME, a new capability that assesses task correctness and predicts proxy acceptance in AI models. This capability emerges before visible reward hacking occurs and can forecast the onset and severity of such issues. PRIME adapts to changing evaluators and can serve as an early warning signal for alignment risks in AI systems.

IMPACT Identifies a potential early-warning signal for AI alignment risks, enabling proactive mitigation strategies.
- arXiv
RESEARCH · arXiv cs.LG English(EN) · 5d · [2 sources]

When Do Local Score Models Extrapolate Across Size? A Diagnostic Theory and Benchmark

Researchers have developed a new diagnostic theory and benchmark to understand how well local score models can extrapolate across different system sizes. They found that architectural locality alone is insufficient for stable size extrapolation, which is instead governed by the quasi-locality of the Gaussian-smoothed score. The study introduces the Finite-Depth Local Flow (FDLF) benchmark to empirically validate these findings, demonstrating that stable extrapolation depends on the interplay between spatial mixing, score quasi-locality, and model receptive fields.

IMPACT Provides a theoretical framework and diagnostic tool to improve the reliability of AI models in scientific generative modeling tasks.
- Gaussian-smoothed score
- Finite-Depth Local Flow (FDLF)
RESEARCH · arXiv cs.AI English(EN) · 5d · [2 sources]

Learning to Attack and Defend: Adaptive Red Teaming of Language Models via GRPO

Researchers have developed AdvGRPO, a novel co-training framework designed to enhance the adaptive red teaming of language models. This method addresses the instability of GRPO in attacker-defender optimization by employing dense multi-channel rewards and decoupled advantage normalization. The training process follows a curriculum, starting with single-turn attacks and progressing to multi-turn scenarios before initiating co-training, ultimately producing more effective attacks and robust defenders.

IMPACT Introduces a more stable and effective method for testing and improving AI safety by simulating adversarial attacks and defenses.
RESEARCH · arXiv cs.LG English(EN) · 5d · [2 sources]

What the Eyes See, the LLMs Miss: Exploiting Human Perception for Adversarial Text Attacks

Researchers have developed a new method called Human-Perceptible Adversarial Attacks (HPAA) that exploits the difference between human and large language model (LLM) perception of harmful content. By using typographic manipulations like spacing and visual emphasis, these attacks can make harmful text easily recognizable to humans while remaining undetected by LLM-based moderation systems. In tests, HPAA achieved over 86% human recognition with less than 1% detection by moderation systems, revealing a significant vulnerability in current content moderation.

IMPACT Highlights a critical vulnerability in LLM-based content moderation, necessitating new approaches that better align with human perception.
RESEARCH · arXiv cs.CV English(EN) · 5d · [2 sources]

Cranio-Diff: Diffusion-based Cross-domain Craniofacial Reconstruction with 2D X-ray Skull Guidance and Structural Identity Constraints

Researchers have developed Cranio-Diff, a novel diffusion-based framework for reconstructing faces from 2D X-ray skull images. This method addresses limitations in existing generative models by integrating skull-conditioned structural guidance and biometric text conditioning to ensure semantic and structural alignment between the skull and the generated face. The framework was evaluated on a unique dataset of 120 subjects, generating synthesized faces across different age groups and BMI variations. It outperformed existing approaches in image quality and retrieval tasks, suggesting its utility in forensic investigations. AI

IMPACT This research offers a new tool for forensic investigations by improving the accuracy of facial reconstruction from skeletal remains.
RESEARCH · arXiv cs.AI English(EN) · 5d · [2 sources]

An 84-Format Numeric Catalog with Bit-Exact Conformance Vectors: A Vendor-Neutral Reference for FP8, BF16, MXFP4, and Microscaling Formats

A new paper introduces a comprehensive catalog of 84 numeric formats used in machine learning hardware, addressing the challenge of silent divergences when porting models across different accelerators. The catalog includes bit-exact conformance packs for various formats like FP8, BF16, and MXFP4, serving as a vendor-neutral reference. This work aims to provide a shared standard for engineers to diagnose and resolve discrepancies, ensuring greater consistency in model performance across diverse hardware. AI

IMPACT Standardizes numeric formats, potentially reducing model porting issues and improving cross-hardware compatibility for AI workloads.
- JAX
- FP8
- ml_dtypes
- MXFP4
- Google
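To illustrate what a bit-exact check looks like for one of the formats named above, here is a minimal sketch of float32 to BF16 conversion with round-to-nearest-even, using only the Python standard library. The function names are hypothetical, the NaN handling is one common convention, and the test values are hand-derived examples, not vectors from the paper's conformance packs.

```python
import struct


def float32_to_bf16_bits(x):
    """Convert a value to its 16-bit BF16 pattern, rounding to nearest even.

    The input is first packed as IEEE 754 float32; values outside the
    float32 range raise OverflowError from struct.pack.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    is_nan = (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF) != 0
    if is_nan:
        # Keep the sign and exponent, and force a quiet-NaN mantissa bit
        # so truncation cannot turn a NaN into infinity.
        return ((bits >> 16) | 0x0040) & 0xFFFF
    # Adding 0x7FFF plus the lowest kept bit rounds ties to the even pattern.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def bf16_bits_to_float(b):
    """Expand a 16-bit BF16 pattern back to a Python float (exact)."""
    return struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))[0]


one = float32_to_bf16_bits(1.0)              # 0x3F80
tie_down = float32_to_bf16_bits(1.00390625)  # 1 + 2**-8, a tie, rounds to 0x3F80
tie_up = float32_to_bf16_bits(1.01171875)    # 1 + 3 * 2**-8, a tie, rounds to 0x3F82
```

Comparing integer bit patterns instead of floating-point values is what makes such a check bit-exact: two implementations that round ties differently produce different integers even when the decoded values look close.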
RESEARCH · arXiv cs.CV English(EN) · 5d · [2 sources]

GenEyePose: Patient-Free, Knowledge-Based Saccadic Eye Movement Modeling for Digital Neurophysiologic Biomarker Development

Researchers have developed GenEyePose, a novel pipeline for generating synthetic eye movement data to train AI models for neurophysiologic biomarker development. This approach addresses the scarcity of real-world clinical data and privacy concerns associated with eye-tracking studies. A deep learning classifier trained on this synthetic data demonstrated promising performance in distinguishing normal from abnormal saccadic eye movements, showing potential for clinical applications in screening and localization of brain abnormalities. AI

IMPACT Synthetic data generation for AI models could accelerate the development of accessible diagnostic tools for neurological conditions.
- GenEyePose
- AI
RESEARCH · arXiv cs.CV English(EN) · 5d · [2 sources]

SoccerNet 2026 Player-Centric Ball-Action Spotting: Retraining and Post-Processing Extensions to the FOOTPASS Baselines

Researchers have developed an enhanced system for the SoccerNet 2026 Player-Centric Ball-Action Spotting Challenge. Their approach builds upon existing FOOTPASS baselines by incorporating gradient checkpointing for efficient fine-tuning, fusing graph neural network (GNN) outputs with visual features, and applying square-root frequency class weighting to counter class imbalance in the training data. The system achieved a Macro F1 score of 0.548 on the test set and 0.446 on the challenge set. AI

IMPACT This research advances AI capabilities in sports analytics by improving player action recognition in soccer.
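The square-root frequency class weighting mentioned in the summary can be sketched as follows. This assumes the common form in which each class weight is proportional to 1/sqrt(class count) and the weights are rescaled to average 1. The summary does not state the exact formula, and the function name and class labels are hypothetical.

```python
import math
from collections import Counter


def sqrt_frequency_weights(labels):
    """Return a per-class loss weight proportional to 1 / sqrt(class count).

    Weights are rescaled so their mean over classes is 1, which keeps the
    overall loss scale comparable to unweighted training.
    """
    counts = Counter(labels)
    raw = {cls: 1.0 / math.sqrt(n) for cls, n in counts.items()}
    mean_weight = sum(raw.values()) / len(raw)
    return {cls: w / mean_weight for cls, w in raw.items()}


# Hypothetical label distribution: 100 passes, 4 shots.
labels = ["pass"] * 100 + ["shot"] * 4
weights = sqrt_frequency_weights(labels)
```

With a 25:1 count ratio, the rare class is weighted 5 times the common one, where plain inverse-frequency weighting would give 25 times. The square root softens the correction so rare classes are boosted without dominating the loss.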
RESEARCH · arXiv cs.AI English(EN) · 5d · [2 sources]

MeCo: One-Step MeanFlow-based Corrector for Multi-Channel Speech Separation

Researchers have introduced MeCo, a novel one-step generative corrector for multi-channel speech separation. This method uses a MeanFlow-based approach to map estimated audio directly to clean speech, aiming to improve human listening quality beyond traditional discriminative models. MeCo incorporates Data-Space Optimization with an $\mathbf{x}_r$-loss and an Endpoint SI-SDR loss to enhance both signal fidelity and subjective listening experience. AI

IMPACT Improves audio processing quality and efficiency for speech separation tasks.
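SI-SDR (scale-invariant signal-to-distortion ratio), named in the summary, is a standard separation metric. Below is a minimal pure-Python sketch of the usual definition, in which the estimate is projected onto the reference and the residual is treated as distortion. How MeCo applies it at the "endpoint" is not described in the summary, so this shows only the generic metric; the function name and sample values are illustrative.

```python
import math


def si_sdr(estimate, reference, eps=1e-8):
    """Scale-invariant SDR in dB between two equal-length sample sequences."""
    n = len(reference)
    est_mean = sum(estimate) / n
    ref_mean = sum(reference) / n
    # Remove the mean so a constant offset does not count as signal.
    est = [x - est_mean for x in estimate]
    ref = [x - ref_mean for x in reference]
    # Project the estimate onto the reference to get the target component.
    dot = sum(e * r for e, r in zip(est, ref))
    ref_energy = sum(r * r for r in ref)
    scale = dot / (ref_energy + eps)
    target = [scale * r for r in ref]
    residual = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    residual_energy = sum(x * x for x in residual)
    return 10.0 * math.log10((target_energy + eps) / (residual_energy + eps))


reference = [0.0, 1.0, 0.0, -1.0, 0.5, -0.5]
noise = [0.1, -0.1, 0.1, -0.1, 0.1, -0.1]
mild = [r + d for r, d in zip(reference, noise)]
heavy = [r + 3.0 * d for r, d in zip(reference, noise)]
score_mild = si_sdr(mild, reference)
score_heavy = si_sdr(heavy, reference)
```

A training loss would typically be the negative of this value, so that minimizing the loss raises SI-SDR. Rescaling the estimate by any nonzero constant leaves the score essentially unchanged, which is the property the name refers to.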
RESEARCH · arXiv cs.CL English(EN) · 5d · [2 sources]

When Built-in Thinking Helps and Hurts: Constraint-Level Error Shifts in Instruction Following

A new research paper investigates how "thinking" mechanisms in large language models affect instruction following. The study found that while overall performance changes are minor, the "thinking" process shifts error patterns, improving compliance with some constraints while worsening it for others. Specifically, "Planning" constraints benefit from thinking, whereas "Precision" constraints consistently degrade. Analysis of model traces revealed differing correlations between trace relevance and final answer compliance across these constraint types. AI

IMPACT Reveals nuanced effects of internal reasoning mechanisms on LLM instruction following, impacting prompt engineering and model development.
- Sai Adith Senthil Kumar
- Qwen3
RESEARCH · arXiv cs.AI English(EN) · 5d · [2 sources]

FMplex: Model Virtualization for Serving Extensible Foundation Models

Researchers have developed FMplex, a novel system designed to optimize the serving of foundation models (FMs) by treating them as a virtualization substrate. This approach allows multiple downstream tasks to share a single physical FM instance, reducing memory waste and amortizing costs associated with batching and loading. FMplex enables task-specific extensions and isolation while improving efficiency, demonstrated by significant reductions in latency and increased task hosting capacity. AI

IMPACT Optimizes foundation model deployment, potentially reducing infrastructure costs and improving latency for AI applications.
- foundation models
- FMplex
RESEARCH · arXiv cs.LG English(EN) · 5d · [2 sources]

Constrained user-item allocation for e-commerce marketing campaigns

Researchers have developed a new method called auto-targeting to optimize e-commerce marketing campaigns by jointly selecting users and products. The approach uses constrained spectral biclustering, greedy local search, and a multi-armed bandit framework to create disjoint campaigns with strong user-item affinities. Evaluations on synthetic and real-world datasets indicate that biclustering yields high-quality, fair campaigns, while bandit methods offer scalability for larger datasets. AI

IMPACT Introduces a novel approach to personalize marketing by jointly optimizing user and item selection for better campaign performance.
RESEARCH · arXiv cs.CL English(EN) · 5d · [6 sources]

PriFT: Prior-Support Guided Supervised Fine-Tuning

Researchers have developed new methods to improve supervised fine-tuning (SFT) for large language models. One approach, FisherAdapTune, uses the Fisher information geometry to dynamically select parameter groups for adaptation, enhancing in-distribution performance and zero-shot transfer. Another set of methods, including Target-SFT and PriFT, reinterprets SFT as target distribution design. These techniques aim to create more stable and effective training objectives by better aligning the fine-tuning process with the model's pretrained knowledge, leading to state-of-the-art results on various reasoning and code generation tasks. AI

IMPACT These advancements in fine-tuning techniques could lead to more efficient and effective adaptation of large language models for specific downstream tasks.