Brief

last 24h

[50/9093] 224 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

RESEARCH · arXiv cs.LG English(EN) · 3d · [3 sources]

When to Align, When to Predict: A Phase Diagram for Multimodal Learning

Researchers have developed a unified framework to understand when cross-modal alignment (CA) and cross-modal prediction (CP) are effective for multimodal learning. Their model identifies four distinct regimes: Both, CA only, CP only, and Neither, based on signal-to-noise ratios and cross-modal correlations. A data-driven procedure allows practitioners to diagnose their specific multimodal problem and select the appropriate objective before commencing training, potentially avoiding harmful cross-modal training in the 'Neither' regime.

IMPACT Provides a diagnostic tool for practitioners to choose optimal multimodal learning objectives, potentially improving performance in scientific domains.
RESEARCH · Hugging Face Daily Papers English(EN) · 3d · [2 sources]

Sparse probes and murky physics: a case study of interpretability challenges in a foundation model for continuum dynamics

A new research paper explores the interpretability challenges of using generative AI models in scientific domains with established theories. The study focuses on the 'Walrus' foundation model for continuum dynamics, employing sparse autoencoders to analyze its internal mechanisms. Researchers found that while the model can reproduce known dynamics, its internal representations are not always consistent with established physics, leading to discrepancies in output.

IMPACT Highlights challenges in aligning AI model internal states with physical principles, crucial for trustworthy scientific AI.
- Katherine Rosenfeld
- Walrus
RESEARCH · arXiv stat.ML English(EN) · 3d · [2 sources]

GraphGP: Scalable Gaussian Processes with Vecchia's Approximation

Researchers have developed GraphGP, a GPU-accelerated algorithm designed to make Gaussian processes more scalable. This new method utilizes Vecchia's approximation to reduce the computational complexity from cubic to linear, enabling the handling of nearly a billion parameters. Key innovations include a novel bit-reversed k-d tree ordering for efficient neighbor searches and parallel processing, alongside a differentiable CUDA implementation that significantly outperforms existing JAX baselines in speed and memory usage.

IMPACT Enables larger-scale applications of Gaussian processes in machine learning and scientific modeling.
RESEARCH · Hugging Face Blog English(EN) · 3d · [2 sources]

Can Voice Agents Handle Bilingual Customers? Benchmarking Frontier ASR on Code-Switched Speech

Hugging Face has developed a benchmark to evaluate how well automatic speech recognition (ASR) systems handle code-switched speech, where speakers switch between languages mid-sentence. This matters for voice agents serving bilingual customer bases. The benchmark, covering language pairs like Spanish-English and French-English, uses HR and IT service management scenarios. Top-performing models identified include ElevenLabs Scribe V2, Gemini 3 Flash, and Assembly AI Universal 3-Pro, with results reported using Word Error Rate (WER), Semantic Word Error Rate (SWER), and Answer Error Rate (AER).

IMPACT Sets a new standard for evaluating voice agents in multilingual enterprise environments, potentially driving improvements in ASR for global customer service.
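For readers unfamiliar with the first of the three metrics, the following is a minimal sketch of the standard Word Error Rate: word-level edit distance divided by the number of reference words. The function name and the lowercase-and-split normalization are assumptions made for illustration; the benchmark's own text normalization, and its SWER and AER definitions, are not described in this summary and are not reproduced here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # reference word dropped by the ASR system
                curr[j - 1] + 1,     # extra word inserted by the ASR system
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

In a code-switched utterance, a transcript that renders one Spanish word in English counts as a single substitution, so a four-word reference with one such error scores 0.25.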
RESEARCH · arXiv cs.LG English(EN) · 3d · [3 sources]

Do Transformers Actually Help Intrusion Detection? A Temporal Sequence Evaluation on CIC-IDS2017

A new research paper questions the effectiveness of Transformer models in network intrusion detection, particularly on the CIC-IDS2017 dataset. The study found that evaluation methodology, specifically padding conventions and data splitting, significantly impacts reported performance, often overestimating the Transformer's capabilities. When evaluated under realistic, leakage-free conditions without padding, the Transformer's performance drops considerably, suggesting that architectural choices are less critical than rigorous evaluation practices.

IMPACT Highlights the critical need for standardized, leakage-free evaluation protocols in AI security research to accurately assess model capabilities.
- CIC-IDS2017
- LSTM
- Transformer
- Random Forest
- 1D-CNN
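One common way to avoid the kind of split-induced leakage the summary describes is to split by time, so that every test record is later than every training record. The sketch below illustrates that general idea only; the function name, the `train_fraction` parameter, and the choice of a chronological split are assumptions and may differ from the protocol used in the paper.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def chronological_split(
    records: Sequence[T],
    timestamps: Sequence[float],
    train_fraction: float = 0.8,
) -> Tuple[List[T], List[T]]:
    """Split records so that all training data precedes all test data in time."""
    if len(records) != len(timestamps):
        raise ValueError("records and timestamps must have the same length")
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")

    # Sort indices by timestamp instead of shuffling, so that windows cut
    # from the same traffic burst cannot land on both sides of the split.
    order = sorted(range(len(records)), key=lambda i: timestamps[i])
    cut = int(len(order) * train_fraction)
    train = [records[i] for i in order[:cut]]
    test = [records[i] for i in order[cut:]]
    return train, test
```

A random shuffle of overlapping sequence windows, by contrast, can place near-duplicate windows in both sets, which inflates reported scores.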
RESEARCH · arXiv cs.AI English(EN) · 3d · [3 sources]

Optimizing 2D Input Representations and Sub-phase Fusion Strategies for Differential Diagnosis of Asthma and COPD Using CNN- and GRU-Based Networks

Researchers have developed deep learning models, specifically CNNs and GRUs, to differentiate between asthma and COPD using pulmonary sound data. The study optimized input representations like MFCC matrices and log-mel spectrograms, finding MFCCs to be superior. Adaptive-length windowing was crucial for handling inconsistent temporal dimensions in spectrograms, leading to the best cycle-based F1-score of 0.877 and subject-based F1-score of 0.855.

IMPACT Novel deep learning approaches show promise for more accurate differential diagnosis of respiratory conditions using audio data.
- COPD
- Asthma
- CNN
- log-mel spectrograms
RESEARCH · arXiv cs.LG English(EN) · 3d · [2 sources]

NARRAS: Edge-Triggered Distributed Inference for CSI-Based Localization in Vehicular IoT Networks

Researchers have developed NARRAS, a novel system for CSI-based localization in vehicular IoT networks. NARRAS employs an Edge-Triggered Distributed Inference (ETDI) approach, allowing remote antenna arrays to intelligently decide which channel state information (CSI) to report to a fusion center. This method optimizes resource usage by only transmitting valuable data, improving localization accuracy compared to other sparse-reporting strategies at similar uplink activity levels.

IMPACT Enhances efficiency in vehicular networks by optimizing data transmission for localization tasks.
RESEARCH · arXiv cs.CV English(EN) · 3d · [2 sources]

Image Quality Assessment of Identity Cards Using Measures from Open Face Image Quality

Researchers have developed a method to assess the image quality of identity cards for remote verification systems. This approach adapts quality measures from the Open Face Image Quality (OFIQ) standard to ID card images. The study found that applying these OFIQ measures can significantly enhance the performance of presentation attack detection algorithms.

IMPACT This research could improve the accuracy and security of remote identity verification systems by enhancing image quality assessment.
- presentation attack detection (PAD) algorithms
- Open Face Image Quality (OFIQ)
RESEARCH · Hugging Face Daily Papers English(EN) · 3d · [4 sources]

Toward Generalist Autonomous Research via Hypothesis-Tree Refinement

Researchers have developed Arbor, a novel AI framework designed for autonomous scientific research. Arbor utilizes a persistent knowledge tree called Hypothesis Tree Refinement (HTR) to link hypotheses, evidence, and insights, enabling cumulative learning across long-term projects. In evaluations across six research tasks, Arbor outperformed Codex and Claude Code, achieving over 2.5 times their average relative gain and reaching 86.36% Any Medal on MLE-Bench Lite with GPT-5.5.

IMPACT Arbor's approach to cumulative learning and autonomous optimization could accelerate scientific discovery and development across various AI-related fields.
RESEARCH · Hugging Face Daily Papers English(EN) · 3d · [3 sources]

Redesign Mixture-of-Experts Routers with Manifold Power Iteration

Researchers have developed a new method called Manifold Power Iteration (MPI) to redesign the routers in Mixture-of-Experts (MoE) models. This technique aligns each router row with the principal singular direction of its associated expert, aiming to improve how tokens are routed to experts. Theoretical analysis suggests MPI drives router rows towards these principal directions, and empirical tests on MoE models ranging from 1B to 11B parameters show that this alignment leads to more effective models.

IMPACT This research could lead to more efficient and effective Mixture-of-Experts models by improving their routing mechanisms.
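As background for the "principal singular direction" mentioned in this item, the sketch below shows textbook power iteration on W^T W, which converges to the top right singular vector of a matrix W. It is not the paper's Manifold Power Iteration, whose update rule is not given in this summary; the function name, the uniform starting vector, and the fixed iteration count are illustrative assumptions.

```python
import math
from typing import List

Matrix = List[List[float]]


def principal_right_singular_vector(w: Matrix, iters: int = 100) -> List[float]:
    """Approximate the top right singular vector of w by power iteration."""
    n_rows, n_cols = len(w), len(w[0])
    # Assumed start: a uniform unit vector. If it happens to be orthogonal
    # to the principal direction, a random start would be needed instead.
    v = [1.0 / math.sqrt(n_cols)] * n_cols
    for _ in range(iters):
        # u = W v
        u = [sum(w[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
        # z = W^T u, so one loop pass applies W^T W to v
        z = [sum(w[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        norm = math.sqrt(sum(x * x for x in z))
        if norm == 0.0:
            break  # v lies in the null space of w; nothing more to refine
        v = [x / norm for x in z]
    return v
```

For a diagonal matrix with entries 3 and 1, the iteration settles on the first coordinate axis, the direction along which the matrix stretches inputs the most.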
RESEARCH · arXiv cs.LG English(EN) · 3d · [3 sources]

Generalized Conformal Predictive Systems Under Distributional Shifts

Researchers have developed generalized conformal predictive systems (CPS) capable of handling distributional shifts in data. These systems encode shifts using observation-specific permutation weights, enabling them to produce calibrated predictive bands that adapt to varying data distributions. The approach introduces weight-uncertainty boxes to ensure confidence guarantees and has demonstrated effectiveness in experiments involving covariate shift and biomolecular design.

IMPACT This research offers a method to improve the reliability and calibration of AI predictions when faced with changing data distributions, crucial for real-world applications.
- Conformal Predictive Systems
RESEARCH · arXiv cs.CL English(EN) · 3d · [3 sources]

Generative Archetype-Grounded Item Representations for Sequential Recommendation

Researchers have developed GenAIR, a new framework designed to improve sequential recommendation systems by creating more effective item representations. This approach uses large language models to infer an "Archetype" for each item, representing its ideal target audience, and then grounds these archetypes in actual user behavior through a calibration objective. Experiments show that GenAIR significantly enhances the performance of various recommendation models across multiple datasets, outperforming existing methods.

IMPACT GenAIR's approach could lead to more personalized and accurate recommendations by better understanding item appeal to specific user archetypes.
- Large Language Models
- arXiv
RESEARCH · arXiv cs.LG English(EN) · 3d · [2 sources]

Seeing Below the Limit of Detection: A Censored-Poisson Bayesian Latent-Growth Change-Point Detector (the Span Detector) for Serial ctDNA in HR+/HER2- Metastatic Breast Cancer

Researchers have developed a new Bayesian change-point detector called Span, designed to analyze serial circulating-tumour DNA (ctDNA) data. This method treats non-detects as left-censored observations, enabling the detection of drug resistance earlier than traditional methods. In simulations for metastatic breast cancer, Span approximately doubled the detection rate of impending progressions three months in advance compared to snapshot analyses.

IMPACT This new statistical method could improve early detection of drug resistance in cancer patients by leveraging intermittent ctDNA signals.
- Aarchi Singh Thakur
RESEARCH · Hugging Face Daily Papers English(EN) · 3d · [3 sources]

TreeSeeker: Tree-Structured Trial, Error, and Return in Deep Search

Researchers have introduced TreeSeeker, a novel framework designed to improve the efficiency of deep search agents. This system structures search processes as a tree, allowing agents to explore multiple potential paths for complex queries while managing trial-and-error effectively. By employing a branch-and-return strategy and utilizing signals for value, uncertainty, and risk, TreeSeeker aims to prevent agents from getting stuck on unproductive paths and ensures better synthesis of evidence. Experiments demonstrate that TreeSeeker surpasses existing open-source methods in deep search tasks.

IMPACT Enhances AI agent capabilities in complex web search and evidence synthesis.
RESEARCH · Hugging Face Daily Papers English(EN) · 3d · [3 sources]

On the Limits of LLM-as-Judge for Scientific Novelty Assessment

A new study published on arXiv evaluates the reliability of large language models (LLMs) in assessing the novelty of scientific research questions. Researchers developed a benchmark called RQ-Bench using recent arXiv papers to compare LLM-generated questions against author-anchored reference questions. The findings indicate that LLMs consistently overestimate the novelty of generated research questions, creating a "novelty mirage" that contradicts human expert evaluations. LLMs also tend to miss crucial dimensions like narrowness or source-binding in generated questions, raising concerns about their use in scientific evaluation.

IMPACT Raises concerns about the current capabilities of LLMs for nuanced scientific evaluation, potentially slowing adoption in research assessment.
- LLM
- RQ-Bench
- arXiv
- Hugging Face
- LLM-as-Judge
RESEARCH · Hugging Face Daily Papers English(EN) · 3d · [3 sources]

InternVideo3: Agentify Foundation Models with Multimodal Contextual Reasoning

Researchers have introduced InternVideo3, a new framework designed to improve long-horizon video understanding and agentic capabilities. The system utilizes Multimodal Contextual Reasoning (MCR) to process video content as an evolving context, enabling evidence accumulation and verification over extended periods. To maintain efficiency, InternVideo3 incorporates Multimodal Multi-head Latent Attention (M^2LA), which compresses key-value cache states without losing token information. The model has demonstrated strong performance on various video understanding benchmarks and has been adapted into a video agent capable of evidence-grounded retrieval tasks.

IMPACT Introduces novel methods for long-horizon video understanding and agentic behavior, potentially advancing multimodal AI capabilities.
RESEARCH · Hugging Face Daily Papers English(EN) · 3d · [3 sources]

World Model Self-Distillation: Training World Models to Solve General Tasks

Researchers have developed a new framework for training video diffusion models to solve general tasks by combining self-distillation and reinforcement learning. This method allows the models to learn task-solving abilities from unlabeled data, bypassing the need for costly, curated task-video supervision. The approach uses a vision-language model to generate tasks and solutions, which then guide a video diffusion model to learn execution, further enhanced by reinforcement learning from the vision-language model's feedback.

IMPACT Enables video diffusion models to perform complex tasks without explicit task-video data, potentially accelerating robotics and planning applications.
RESEARCH · arXiv cs.IR (Information Retrieval) English(EN) · 3d · [2 sources]

A PubMed-Scale Dataset of Structured Biomedical Abstracts

Researchers have introduced "Structured PubMed," a large dataset containing over 23.2 million biomedical abstracts from PubMed. This dataset aims to improve information retrieval and text mining by providing section-labeled abstracts. It includes both author-structured abstracts and those automatically labeled using a Large Language Model pipeline, offering a valuable resource for training classification models and benchmarking text-segmentation architectures.

IMPACT Enables more precise information extraction and knowledge synthesis from biomedical literature.
RESEARCH · arXiv cs.IR (Information Retrieval) English(EN) · 3d · [2 sources]

When More Documents Hurt RAG: Mitigating Vector Search Dilution with Domain-Scoped, Model-Agnostic Retrieval

A new research paper introduces MASDR-RAG, a method to combat "vector search dilution" in retrieval-augmented generation (RAG) systems. This dilution occurs when scaling RAG to large document sets, leading to decreased accuracy as similarity searches return irrelevant information. The proposed solution involves scoping retrieval to specific domains using organizational metadata, which significantly improved performance in tests.

IMPACT This research offers a practical solution to improve the accuracy and efficiency of RAG systems when dealing with large, diverse datasets.
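The general idea of scoping retrieval by metadata can be sketched as a filter applied before similarity ranking. The example below is a toy illustration under stated assumptions: the `domain` field, the dictionary layout of the corpus, and the function names are hypothetical, and MASDR-RAG's actual scoping logic is not described in this summary.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity, returning 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def scoped_search(
    query_vec: Sequence[float],
    query_domain: str,
    corpus: List[Dict],
    k: int = 3,
) -> List[str]:
    """Rank only documents whose (hypothetical) 'domain' tag matches the query."""
    # Filter first: documents from other domains never compete on similarity,
    # however close their embeddings are to the query.
    candidates = [doc for doc in corpus if doc["domain"] == query_domain]
    ranked = sorted(
        candidates,
        key=lambda doc: cosine(query_vec, doc["vector"]),
        reverse=True,
    )
    return [doc["id"] for doc in ranked[:k]]
```

Without the filter, a near-duplicate embedding from an unrelated domain can outrank every in-domain document, which is the dilution effect the summary describes.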
RESEARCH · arXiv stat.ML English(EN) · 3d · [2 sources]

Annealed Entropic Allocation for Ranking and Selection

Researchers have introduced Annealed Entropic Allocation, a novel framework for sequential budget allocation in ranking and selection problems. This method employs an annealed weighted soft-min approach to refine the maximin objective, improving performance when multiple options are closely matched. The framework incorporates a saddlepoint approximation for enhanced discrimination with finite budgets, while maintaining the original large-deviation target as the smoothing parameter is annealed.

IMPACT Introduces a new statistical method for optimizing sequential decision-making in ranking and selection tasks.
- Annealed Entropic Allocation
RESEARCH · arXiv cs.CV English(EN) · 3d · [2 sources]

AnyMod-LLVE: Low-Light Video Enhancement with Modality-Agnostic Inference

Researchers have developed AMNet, a novel multimodal framework for low-light video enhancement (LLVE) that can perform inference even when auxiliary data like infrared or event streams are unavailable. The system uses a Spatial-Spectral Dual-Gated Translator to generate implicit representations from RGB inputs, enabling robust enhancement. Extensive experiments show AMNet's superior performance in modality-absent conditions, with code and models publicly released.

IMPACT This framework could improve video analysis and capture in challenging lighting conditions, potentially impacting surveillance, autonomous driving, and photography.
RESEARCH · arXiv cs.AI English(EN) · 3d · [2 sources]

Piper: A Programmable Distributed Training System

Researchers have developed Piper, a novel distributed training system designed to simplify the complex process of composing various parallelism strategies for large-scale model training. This system decouples strategy declaration from runtime implementation, allowing users to define training approaches through model annotations and scheduling directives. Piper then compiles these directives into execution plans, maintaining performance parity with existing methods while enabling new efficiencies through joint scheduling of computation and communication.

IMPACT Simplifies complex distributed training setups, potentially accelerating research and deployment of large models.
- Piper
- DeepSeek-V3
RESEARCH · arXiv cs.CV English(EN) · 3d · [2 sources]

MOFA-VTON: More Fashion Possibilities with Fine-Grained Adaptations in Virtual Try-On

Researchers have introduced MOFA-VTON, a novel virtual try-on method designed to offer more fashion possibilities through fine-grained adaptations. Unlike existing methods that often result in monotonous outputs due to fixed dressing patterns, MOFA-VTON allows users to adjust clothing adaptations using simple sketches. The system employs a mask construction strategy and layout adjustment blocks with cross-attention to refine the spatial arrangement of clothing, enabling flexible and precise target clothing modifications.

IMPACT Enhances user control in virtual try-on, potentially leading to more personalized fashion experiences and e-commerce applications.
RESEARCH · arXiv cs.LG English(EN) · 3d · [2 sources]

OncoTraj: a public benchmark for longitudinal resistance prediction in EGFR-mutant non-small-cell lung cancer on osimertinib

Researchers have introduced OncoTraj, a new public benchmark dataset designed to advance the prediction of drug resistance in EGFR-mutant non-small-cell lung cancer. The dataset comprises longitudinal patient trajectories from 813 individuals treated with osimertinib, harmonized from three clinical-genomic sources. OncoTraj includes three defined tasks for model evaluation: predicting progression at 12 months, estimating time to progression, and classifying resistance mechanisms. Initial evaluations with various baseline models, including transformers, indicate that current single-timepoint snapshot features limit predictive accuracy, suggesting a need for serial ctDNA data in future iterations.

IMPACT Establishes a standardized benchmark for AI models in predicting cancer drug resistance, potentially accelerating clinical applications.
RESEARCH · arXiv cs.LG English(EN) · 3d · [2 sources]

DMT: Demographic Conditioning, Morphology-Enhanced Transformer for Cuffless Blood Pressure Estimation from PPG Signals

Researchers have developed a new Transformer-based model called DMT for estimating blood pressure from photoplethysmography (PPG) signals without a cuff. The model incorporates demographic information through feature modulation and uses an auxiliary morphology head to focus on relevant waveform features. Evaluated on the PulseDB dataset, DMT achieved mean absolute errors of 4.56 mmHg for systolic and 2.62 mmHg for diastolic blood pressure, significantly outperforming previous demographic-enhanced baselines.

IMPACT Introduces a novel approach for cuffless blood pressure estimation, potentially improving wearable health monitoring.
- PulseDB
- Transformer
- PPG
RESEARCH · arXiv cs.AI English(EN) · 3d · [2 sources]

Towards Autonomous Accelerator Design: FPGA Accelerator Generation with SECDA

Researchers have developed SECDA-DSE, a framework that integrates Large Language Models (LLMs) to automate the design of FPGA-based accelerators for AI workloads. This system uses LLMs for reasoning-guided exploration, generating candidate architectures and refining them through a feedback loop. The framework successfully produced and executed three distinct accelerator designs on FPGA hardware, demonstrating its ability to adapt configurations for diverse workloads and reduce manual design effort.

IMPACT Automates complex hardware design, potentially accelerating AI hardware development and deployment.
RESEARCH · arXiv cs.AI English(EN) · 3d · [3 sources]

Democratising Camera Trap AI: An Open-Source Model for Detecting UK Mammals

Researchers have developed an open-source object detection model specifically for identifying UK mammals and birds from camera trap images. This model, trained on over 48,000 labeled instances, aims to democratize wildlife monitoring by providing a free alternative to commercial platforms. It achieves high accuracy, with a mean Average Precision of 0.984, and is designed for ecologists with no prior machine-learning experience.

IMPACT Provides ecologists with a free, accessible tool for biodiversity monitoring, potentially accelerating wildlife research.
RESEARCH · arXiv cs.LG English(EN) · 3d · [2 sources]

Learning the Universe: Posterior Reliability of Neural Generative Models in High-Dimensional Field-Level Inference of Cosmic Initial Conditions

Two new arXiv papers explore the application of neural networks in cosmology. The first paper introduces a neural marking scheme to extract more cosmological information than traditional methods, significantly tightening constraints on key parameters like sigma8 and Omega_m. The second paper investigates the reliability of neural generative models for inferring cosmic initial conditions, highlighting that standard metrics do not guarantee accurate uncertainty estimation in high-dimensional settings.

IMPACT These papers demonstrate advanced AI techniques for extracting deeper insights from cosmological data and improving the reliability of scientific inference.
RESEARCH · arXiv cs.AI English(EN) · 3d · [2 sources]

PatchSTG: Scalable Spatiotemporal Graph Transformers for Traffic Forecasting on Irregular Sensor Networks

Two research papers introduce novel transformer-based architectures for traffic forecasting. The first, a lightweight and interpretable transformer, uses a mixed graph algorithm unrolling approach with ADMM to capture spatial and temporal correlations, drastically reducing parameter counts. The second, PatchSTG, addresses scalability issues in irregular sensor networks by employing a patch-based hierarchical spatial representation and dual attention mechanisms for efficient local and global dependency modeling.

IMPACT These new transformer architectures offer improved accuracy and computational efficiency for traffic forecasting, potentially benefiting intelligent transportation systems.
- PatchSTG
- arXiv
- Transformer
- Ji Qi
- Rhode Island
RESEARCH · Hugging Face Daily Papers English(EN) · 3d · [3 sources]

Grammar-Constrained Decoding Can Jailbreak LLMs into Generating Malicious Code

Researchers have identified a new vulnerability in Large Language Models (LLMs) where a technique designed to improve code generation reliability, Grammar-Constrained Decoding (GCD), can be exploited to produce malicious code. This attack, named CodeSpear, uses benign code grammar constraints to bypass LLM safety measures. To counter this, a new defense called CodeShield has been developed, which trains LLMs to generate harmless "honeypot" code under GCD, thus maintaining safety without sacrificing utility.

IMPACT New attack vector highlights security risks in LLM code generation, necessitating robust defenses like CodeShield.
RESEARCH · Hugging Face Daily Papers English(EN) · 3d · [3 sources]

Time-Series Foundation Model Embeddings for Remaining Useful Life Estimation

Researchers have developed a novel method for predicting the Remaining Useful Life (RUL) of industrial equipment by leveraging pre-trained time-series foundation models (TSFMs). This approach uses Chronos-2 as a frozen backbone to extract features, which are then fed into a lightweight regression neural network for RUL estimation. Experiments on real-world data demonstrate that this method significantly outperforms traditional baselines, offering a more data-efficient and practical solution for industrial predictive maintenance.

IMPACT This research offers a more data-efficient approach to predictive maintenance, potentially reducing downtime and costs in industrial settings.
- arXiv
- Time-series foundation model
RESEARCH · arXiv cs.AI English(EN) · 3d · [2 sources]

Flaws in the LLM Automation Narrative

A new research paper challenges the narrative that large language models consistently perform at expert human levels on knowledge economy tasks. The study highlights that current benchmarks often fail to account for training data overlap and do not adequately measure error magnitude or response reliability. By introducing a novel coding-based data analysis task, the research found that human experts outperformed frontier LLMs on average, exhibiting less performance variability and fewer significant errors.

IMPACT Highlights the need for more robust LLM evaluation methods beyond average performance metrics.
- arXiv
- Large Language Models
RESEARCH · arXiv cs.LG English(EN) · 3d · [2 sources]

Itô maps for any-step SDEs

Researchers have introduced the Itô map, a novel method for any-step stochastic differential equation (SDE) integration. This approach allows generative models to predict future states in a single pass by utilizing intermediate states and Brownian paths. The Itô map offers differentiable access to posterior samples, enabling improved inference-time control and demonstrating strong performance in synthetic and image-generation benchmarks.

IMPACT Introduces a new primitive for posterior sampling and stochastic control in generative models, potentially improving sampling efficiency and steering capabilities.
RESEARCH · arXiv cs.LG English(EN) · 3d · [2 sources]

Data assimilation for subsurface flow using latent diffusion model parameterization: performance of ensemble-Kalman and Monte Carlo techniques

Researchers have developed a new method for data assimilation in subsurface flow simulations by leveraging latent diffusion models (LDMs). This approach aims to improve the calibration of model parameters to match observed data while maintaining geological realism. The study compares ensemble-Kalman methods with Monte Carlo techniques in the LDM latent space, finding that while ensemble-Kalman methods reduce uncertainty, they can produce unrealistic models. Rigorous Monte Carlo sampling, however, shows promise in achieving both geological realism and improved uncertainty reduction.

IMPACT This research offers a novel approach to subsurface flow simulation, potentially improving resource exploration and management by enhancing data assimilation accuracy and geological realism.
RESEARCH · arXiv cs.LG English(EN) · 3d · [2 sources]

Robust Regression of General ReLUs with Queries

Researchers have developed a new computationally efficient algorithm for learning general ReLUs in an interactive setting. This algorithm significantly reduces the number of required label queries compared to passive learning methods. The study also establishes that query access is crucial for improving label complexity in active learning scenarios.

IMPACT Introduces a more efficient method for learning complex neural network components, potentially speeding up model development.
- arXiv
RESEARCH · arXiv cs.CL English(EN) · 3d · [2 sources]

Provenance-Grounded Gating and Adaptive Recovery in Synthetic Post-Training Data Curation

Researchers have developed a new method for curating the synthetic data used to post-train large language models. This approach focuses on ensuring the generated data is grounded in its source evidence and explores strategies for recovering discarded samples. The study found that using exact source provenance enhances faithfulness gating and that both a hallucination gate and a reward gate are necessary, because they reject different types of samples. An adaptive recovery pipeline was shown to improve yield and recall compared to simple resampling.

IMPACT Enhances the quality and efficiency of synthetic data used for fine-tuning LLMs, potentially leading to more capable models.
- LLM
- arXiv
RESEARCH · arXiv cs.LG English(EN) · 3d · [2 sources]

Overcoming Rank Collapse in Feedback Alignment

Researchers have identified a key limitation in Feedback Alignment (FA), a method for training neural networks that bypasses the biological implausibility of backpropagation. They found that FA's error signals have a lower rank than those used in backpropagation, restricting the exploration of the parameter space and hindering its scalability to deeper architectures. To address this, the study proposes two mechanisms: the Muon optimizer, which orthogonalizes weight updates, and hidden-activity normalization, which promotes activation orthogonality. These methods significantly improve FA's performance on benchmarks like CIFAR-100, suggesting that increasing the dimensionality of update geometry is crucial for scaling FA as an alternative to backpropagation.

IMPACT Introduces techniques to improve training efficiency and scalability for neural networks, potentially enabling more complex models.
RESEARCH · arXiv cs.AI English(EN) · 3d · [2 sources]

Monte Carlo Pass Search: Using Trajectory Generation for 3D Counterfactual Pass Evaluation in Football

Researchers have developed a new method called Monte Carlo Pass Search (MCPS) to evaluate player passes in football using 3D trajectory generation. This approach treats pass evaluation as a Monte Carlo Tree Search problem, incorporating a value model, a world model for multi-agent interactions, and a policy for generating pass variants. The system utilizes a high-fidelity dataset from the Bundesliga and adapts an autoregressive trajectory generator from autonomous driving to forecast outcomes and attribute pass success.

IMPACT Introduces a novel AI-driven methodology for objective player performance evaluation in football.
RESEARCH · arXiv cs.AI English(EN) · 3d · [2 sources]

FADA: Accessible fetal ultrasound interpretation and annotation with a selectively distilled unified vision-language model

Researchers have developed FADA, a unified vision-language model designed to interpret and annotate fetal ultrasounds, addressing a global shortage of trained sonographers. Built upon Qwen3.5-VL, FADA integrates interpretation, classification, detection, and segmentation into a single pipeline, eliminating the need for external labels. The model achieves high accuracy in segmentation and detection, with expert validation confirming clinically acceptable outputs. Notably, FADA can run offline on a smartphone, offering a practical solution for resource-constrained settings.

IMPACT Enables accessible, offline AI-assisted fetal ultrasound diagnostics in low-resource areas.
RESEARCH · arXiv cs.AI English(EN) · 3d · [3 sources]

Generative Explainability for Next-Generation Networks: LLM-Augmented XAI with Mutual Feature Interactions

Researchers have developed a new framework to improve the explainability of AI models used in network operations. This system augments traditional explainable AI (XAI) methods by incorporating mutual feature interaction data into prompts for a moderately sized large language model (LLM). The goal is to generate natural language explanations that are more understandable and actionable for non-specialists, enhancing operator trust in AI-driven network management. AI

IMPACT Enhances trust and actionability of AI insights for network operators, potentially accelerating AI adoption in critical infrastructure.
- SHAP
- XAI
- LLM
RESEARCH · dev.to — LLM tag English(EN) · 3d · [2 sources]

Deep Dive: 7 Capability Dimensions × 8 AI Models — Who Leads Where?

A comparative analysis of eight AI models across seven capability dimensions reveals no single all-around champion. GPT-5.5 excels in agentic tasks and long context, while Claude Opus 4.8 leads in coding and general knowledge. Gemini 3.5 Flash offers strong agentic value and multimodal capabilities, and DeepSeek V4 Pro demonstrates prowess in competitive programming and mathematics. AI

IMPACT Provides detailed performance comparisons across key AI model capabilities, aiding operators in selecting the most suitable model for specific use cases.
RESEARCH · arXiv cs.AI English(EN) · 3d · [2 sources]

Inside the Latent Flow: Causal Deciphering of Attention Dynamics in Audio Separation Foundation Models

Researchers have developed new methods to understand and manipulate the internal workings of large audio-language models. One technique, instruction-based vector steering, allows for the redirection of temporal attention within these models, enabling them to focus on specific sound events without retraining. Another approach uses causal intervention to decipher attention dynamics in audio separation models, revealing a dual-pathway text-conditioning mechanism and leading to an acceleration method called Layer-Selective Attention Caching. AI

IMPACT These studies offer new ways to interpret and control complex audio AI, potentially improving their performance and transparency in tasks like audio separation and event detection.
RESEARCH · Hugging Face Daily Papers English(EN) · 3d · [2 sources]

APEX: A Network-Native Time-Series Foundation Model for Forecasting and Anomaly Detection for Wireless Edge Operations

Researchers have developed APEX, a new network-native transformer model designed for time-series forecasting and anomaly detection in wireless network operations. Unlike generic models, APEX is specifically pre-trained on telemetry data from thousands of wireless networks, enabling it to better handle the unique characteristics of this data. The model, available in both large and edge versions, significantly outperforms existing baselines in predicting network degradations and identifying anomalies, with the edge version offering efficient on-device inference. AI

IMPACT Enhances proactive wireless network management by improving prediction accuracy and anomaly detection capabilities.
RESEARCH · arXiv cs.LG English(EN) · 3d · [2 sources]

First-Order Trajectory Matching: Fast Ensemble Predictions of Chaotic, Turbulent, Stochastic Systems

Researchers have developed a new surrogate-modeling method called First-Order Trajectory Matching (FTM) for predicting the behavior of chaotic, turbulent, and stochastic systems. FTM learns directly from system trajectories to model the transport of probability mass, enabling accurate ensemble predictions with low computational cost. The method's stability is analyzed by separating discretization error from sampling variance, ensuring reliable results when temporal resolution and sample size are balanced. AI

IMPACT Introduces a novel method for efficient prediction of complex dynamical systems, potentially impacting scientific simulation and forecasting.
- Benjamin Peherstorfer
- First-Order Trajectory Matching
RESEARCH · arXiv cs.LG English(EN) · 3d · [2 sources]

Multimodal Brain Tumour Classification Using Feature Fusion

Researchers have developed a multimodal deep learning network to classify brain tumors, aiming to replicate clinicians' reasoning by integrating MRI scans with radiomic features. This approach combines a CNN for image data and an MLP for radiomic features, fusing them through various strategies. The multimodal configurations consistently outperformed unimodal baselines, with gated fusion achieving a top accuracy of 96.13% on a dataset of 7,200 images. AI

IMPACT This research demonstrates a novel approach to medical image analysis, potentially improving diagnostic accuracy and efficiency in oncology.
- CT scans
- CNN
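
The summary above does not say how the gated fusion is formulated, so the following is only a sketch of one common form: a sigmoid gate computed from the concatenated features decides, per dimension, how much of the image embedding versus the radiomic embedding to keep. The function name, weights, and feature values are all hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gated_fusion(img_feat, rad_feat, gate_weights, gate_bias):
    """Blend two equal-length feature vectors with a per-dimension gate.

    gate_weights has one row per output dimension; each row is applied to
    the concatenation of img_feat and rad_feat.
    """
    joint = img_feat + rad_feat  # list concatenation
    fused = []
    for w_row, b, i, r in zip(gate_weights, gate_bias, img_feat, rad_feat):
        g = sigmoid(sum(w * x for w, x in zip(w_row, joint)) + b)
        # g near 1 trusts the image branch, g near 0 trusts the radiomic branch.
        fused.append(g * i + (1.0 - g) * r)
    return fused

# Hypothetical 2-dimensional embeddings from the CNN and MLP branches.
img = [1.0, 0.0]
rad = [0.0, 1.0]

# With zero weights and bias the gate is exactly 0.5, so the result is
# the plain average of the two branches.
zero_w = [[0.0] * 4, [0.0] * 4]
fused = gated_fusion(img, rad, zero_w, [0.0, 0.0])  # [0.5, 0.5]
```

In a trained network the gate weights would be learned, letting the model lean on whichever modality is more informative for a given input.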
RESEARCH · arXiv cs.LG English(EN) · 3d · [2 sources]

Limitations of Learning Tanh Neural Networks with Finite Precision

A new paper explores the inherent limitations of training $\tanh$ neural networks using finite-precision computations. The research demonstrates that under such conditions, adaptive randomized algorithms are bound by the Monte Carlo convergence rate. This limitation persists unless the computational budget scales exponentially with network size, highlighting fundamental constraints on learnability for networks with localized bump functions. AI

IMPACT Highlights theoretical constraints on training efficiency for certain neural network architectures.
- Berner, Grohs, and Voigtländer
- Tanh neural networks
RESEARCH · arXiv cs.AI English(EN) · 3d · [2 sources]

RoboNaldo: Accurate, Stable and Powerful Humanoid Soccer Shooting via Motion-Guided Curriculum Reinforcement Learning

Researchers have developed RoboNaldo, a novel three-stage reinforcement learning framework designed to enable humanoid robots to perform accurate and powerful soccer shots. This system guides the learning process using human motion data, progressively optimizing for shooting performance. In simulations, RoboNaldo significantly reduced shot error and increased velocity compared to existing methods. Real-world tests on a Unitree G1 robot demonstrated impressive accuracy and ball speed, approaching professional levels. AI

IMPACT Enables more sophisticated robotic control for complex physical tasks like sports.
- RoboNaldo
- Unitree G1
RESEARCH · arXiv cs.MA (Multiagent) English(EN) · 3d · [2 sources]

Phi-Actor-Critic: Steering General-Sum Games to Pareto-Efficient Correlated Equilibria

Researchers have developed a new framework called Phi-Actor-Critic ($\Phi$-AC) to address challenges in multi-agent reinforcement learning. This method aims to steer learning towards Pareto-efficient correlated equilibria in general-sum games, where individual incentives can conflict with collective welfare. $\Phi$-AC utilizes swap regret minimization and a centralized attention critic to make counterfactual regret estimation more tractable, enabling the learning of stable and efficient coordination strategies. AI

IMPACT Introduces a novel approach to improve coordination and efficiency in multi-agent AI systems.
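
For readers unfamiliar with swap regret, the quantity the summary says is minimized, here is a small tabular sketch of its standard definition for a single player. It is not the paper's estimator, which per the summary relies on a centralized attention critic. The function name and the toy play history are hypothetical.

```python
def swap_regret(actions, utilities):
    """Swap regret of one player over a finite play history.

    actions[t]      : index of the action played in round t
    utilities[t][b] : payoff action b would have earned in round t
    """
    n_actions = len(utilities[0])
    total = 0.0
    for a in range(n_actions):
        rounds = [t for t, played in enumerate(actions) if played == a]
        # Gain from replacing a with b in every round where a was played.
        gains = [
            sum(utilities[t][b] - utilities[t][a] for t in rounds)
            for b in range(n_actions)
        ]
        # Keeping a (b == a) gives a gain of 0, so each term is non-negative.
        total += max(gains)
    return total

# Toy history: the player picks action 0 twice, then action 1 once.
actions = [0, 0, 1]
utilities = [[1.0, 3.0], [1.0, 2.0], [0.0, 5.0]]

regret = swap_regret(actions, utilities)  # 3.0
```

When every player's average swap regret vanishes over time, the empirical distribution of joint play approaches the set of correlated equilibria, which is why this quantity is a natural training target in general-sum games.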
RESEARCH · arXiv cs.LG English(EN) · 3d · [2 sources]

GRAFT: Gain-Recalibrated Adapters for Transformer-Based Neural Population Activity Modeling

Researchers have developed GRAFT, a Transformer-based model for neural population activity modeling. The model separates reusable temporal dynamics from a recalibratable neuron interface, allowing better adaptation in brain-computer interfaces where neuron identities and statistics can change. GRAFT set a new state of the art on the NLB'21 protocol, reaching 0.3866 co-bps. It also demonstrated efficient cross-day recalibration by updating only a small percentage of its parameters. AI

IMPACT Sets new SOTA on neural population activity modeling, potentially improving brain-computer interfaces.
- Transformer
- NLB'21
- GRAFT
RESEARCH · arXiv cs.AI English(EN) · 3d · [2 sources]

Understanding and mitigating the risks of OpenClaw for non-technical users: A practical guide with Skill

A new research paper addresses the risks associated with the AI agent framework OpenClaw, particularly for users without technical expertise. The paper identifies seven core risks and provides plain-language explanations and actionable defensive strategies. To further assist non-technical users, it also introduces a companion OpenClaw Skill that automates security configurations. AI

IMPACT Provides practical guidance for non-technical users to safely engage with AI agent frameworks.
- arXiv
- OpenClaw