Brief

last 24h

[50/30520] 224 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

TOOL · arXiv cs.AI English(EN) · 2d

Small Experiments, Cheaper Decisions: A Case Study in Staged Promotion for Micro-Pretraining

Researchers have developed a staged promotion protocol for micro-pretraining to optimize experimental costs. This method uses progressively larger budgets to evaluate configurations, starting with very short runs and increasing to 12 hours. The protocol aims to make cheaper decisions by identifying promising configurations early, even when initial rankings are host-sensitive, ultimately leading to a more efficient allocation of GPU hours. AI

IMPACT This staged promotion protocol could lead to more cost-effective AI model development by reducing wasted computational resources on unpromising configurations.
- Felipe Chavarro Polania
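
A minimal sketch of the staged promotion idea summarized above, not the authors' protocol: every surviving configuration is scored at the current budget, and only the best fraction moves on to the next, larger budget. The function names, the `keep_fraction` value, the intermediate budgets, and the toy loss curve are assumptions for illustration; only the 12-hour final budget comes from the summary.

```python
from typing import Callable, Dict, List, Sequence


def staged_promotion(
    configs: Sequence[str],
    budgets: Sequence[float],
    evaluate: Callable[[str, float], float],
    keep_fraction: float = 0.5,
) -> List[str]:
    """Return the configs that survive every stage (lower loss is better)."""
    survivors = list(configs)
    for budget in budgets:
        # Score every surviving config at this stage's budget.
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget))
        # Promote only the best fraction, but always keep at least one.
        n_keep = max(1, int(len(ranked) * keep_fraction))
        survivors = ranked[:n_keep]
    return survivors


def toy_evaluate(config: str, budget_hours: float) -> float:
    # Hypothetical loss curve: each config approaches its own floor as budget grows.
    floors: Dict[str, float] = {"a": 2.0, "b": 1.5, "c": 1.8, "d": 2.5}
    return floors[config] + 1.0 / (1.0 + budget_hours)


# Budgets in hours: two cheap screening stages, then the 12-hour run.
winners = staged_promotion(["a", "b", "c", "d"], [0.1, 1.0, 12.0], toy_evaluate)
print(winners)
```

With this toy loss curve, only configuration `b` reaches the 12-hour stage, so the expensive budget is spent on a single candidate. A real protocol would also need to handle the host-sensitive early rankings the summary mentions, which this sketch ignores.
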
TOOL · arXiv cs.AI English(EN) · 2d

JailbreakOPT: Tool-Assisted Iterative Jailbreak Prompt Optimization

Researchers have developed JailbreakOPT, a new framework designed to improve iterative single-turn jailbreak prompt optimization for large language models. This method organizes various atomic jailbreak prompts into a library of attack tools, which are then composed to create more potent standalone attack prompts. By framing tool selection as a contextual bandit problem and using Thompson sampling, JailbreakOPT enhances attack success rates while reducing the number of queries needed. AI

IMPACT This research could lead to more robust LLM safety measures by improving the effectiveness of identifying and mitigating jailbreak vulnerabilities.
- LLMs
- JailbreakOPT
TOOL · arXiv cs.AI English(EN) · 2d

LSTM-Based Detection of Structural Breaks in Property Insurance Loss Reserving: A Climate-Informed Approach

Researchers have developed a new approach using Long Short-Term Memory (LSTM) neural networks to improve loss reserving in property insurance, particularly in the face of climate change-induced catastrophes. The study aims to test if LSTMs can detect and adapt to structural breaks in actuarial data more effectively than traditional methods like Chain Ladder. By incorporating climate data such as hurricane intensity and sea surface temperatures, the research anticipates a significant improvement in reserve accuracy for catastrophe-affected years. AI

IMPACT This research could lead to more accurate financial risk assessment for insurers facing climate-related events.
TOOL · arXiv cs.LG English(EN) · 2d

Beyond the Golden Teacher: Enhancing Graph Learning through LLM-GNN Co-teaching

Researchers have developed a new method called LLM-GNN Co-Teaching to improve few-shot graph learning. This approach avoids designating one model as a "golden teacher," instead allowing a Graph Neural Network (GNN) and a Large Language Model (LLM) to learn collaboratively. The models exchange confident pseudo-labels and update each other, with supervision derived from their agreement over time. This co-teaching framework consistently outperforms previous methods on six benchmarks, showing significant gains in accuracy for tasks like node classification. AI

IMPACT Enhances few-shot learning capabilities for graph-based AI systems, potentially improving performance in areas like recommendation engines and social network analysis.
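
The pseudo-label exchange described above could look roughly like the following single round. This is a sketch under stated assumptions, not the paper's algorithm: the 0.9 confidence threshold, the dictionary layout, and the function names are hypothetical, and the summary's agreement "over time" is reduced here to one round.

```python
from typing import Dict, List, Tuple

Probs = Dict[str, List[float]]   # node id -> class probabilities
Labels = Dict[str, int]          # node id -> pseudo-label


def confident_labels(probs_by_node: Probs, threshold: float) -> Labels:
    """Keep only nodes whose top class probability reaches the threshold."""
    labels: Labels = {}
    for node, probs in probs_by_node.items():
        top = max(range(len(probs)), key=lambda i: probs[i])
        if probs[top] >= threshold:
            labels[node] = top
    return labels


def exchange_round(
    gnn_probs: Probs, llm_probs: Probs, threshold: float = 0.9
) -> Tuple[Labels, Labels, Labels]:
    """Return (labels for the GNN, labels for the LLM, labels both agree on)."""
    gnn_conf = confident_labels(gnn_probs, threshold)
    llm_conf = confident_labels(llm_probs, threshold)
    # Neither model is the fixed teacher: each is taught by the other's confident labels.
    agreed = {n: y for n, y in gnn_conf.items() if llm_conf.get(n) == y}
    return llm_conf, gnn_conf, agreed


gnn = {"n1": [0.95, 0.05], "n2": [0.60, 0.40], "n3": [0.02, 0.98]}
llm = {"n1": [0.92, 0.08], "n2": [0.05, 0.95], "n3": [0.50, 0.50]}
for_gnn, for_llm, agreed = exchange_round(gnn, llm)
print(for_gnn, for_llm, agreed)
```

In this toy example the LLM supplies a label for `n2`, where the GNN is unsure, the GNN supplies one for `n3`, and only `n1` is confidently labeled the same way by both.
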
TOOL · arXiv cs.AI English(EN) · 2d

Privacy-Preserving Federated Autoencoder for ECG Anomaly Detection on Edge Devices

Researchers have developed a privacy-preserving federated autoencoder system for detecting anomalies in electrocardiogram (ECG) data on edge devices. The system combines federated learning with differential privacy and INT8 quantization to maintain patient confidentiality, enable real-time inference on constrained hardware like the Raspberry Pi 4, and achieve high detection quality even with non-IID data from different hospitals. The study found that federated learning matched or surpassed centralized baselines, and INT8 quantization significantly reduced model size and latency with minimal loss in accuracy, demonstrating that privacy and edge deployment can be achieved simultaneously. AI

IMPACT Enables privacy-preserving AI for sensitive health data on resource-constrained devices, potentially accelerating clinical adoption.
- DP-SGD
- Flower
- ECG
- Federated Autoencoder
- Edge Devices
- PTB-XL dataset
- GDPR
- HIPAA
- Raspberry Pi 4
- INT8 quantization
- Rényi-DP
- FedAvg
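
The tags above mention FedAvg, whose aggregation step can be sketched in a few lines. This is a plain-Python illustration only: it does not use the Flower API, and it leaves out the DP-SGD noise and INT8 quantization that the summary describes. The flat weight lists and client sizes are made-up values.

```python
from typing import List, Sequence


def fedavg(
    client_weights: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Average client parameters, weighting each client by its sample count."""
    if len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    averaged = [0.0] * len(client_weights[0])
    for weights, size in zip(client_weights, client_sizes):
        for i, w in enumerate(weights):
            # A client holding more data pulls the global model further toward its weights.
            averaged[i] += w * size / total
    return averaged


# Two hypothetical hospitals with 100 and 300 local ECG recordings.
global_weights = fedavg([[1.0, 2.0], [3.0, 4.0]], [100, 300])
print(global_weights)
```

Only model parameters cross the network in this scheme, which is why raw patient recordings can stay on each hospital's device.
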
TOOL · arXiv cs.AI English(EN) · 2d

T2S: A Rehearsal-Based Approach for Extraction-Resistant Model Watermarking

Researchers have developed a new method called T2S for embedding watermarks into AI models to protect intellectual property. This rehearsal-based approach simulates model extraction attacks during the watermarking process. By using the loss from a simulated stolen model as a training signal, T2S enhances the watermark's robustness against extraction and subsequent removal attempts. AI

IMPACT Enhances AI model IP protection by making watermarks more resilient to sophisticated extraction attacks.
- AI models
TOOL · arXiv cs.AI English(EN) · 2d

AI4Land: Scalable Deep Learning for Global High-Resolution Land Use Reconstruction

Researchers have introduced AI4Land, a novel deep learning framework designed to generate high-resolution land use reconstructions for climate modeling. The system utilizes a U-Net architecture to integrate coarse-resolution scenario data with static geophysical features, producing annual land use and land cover maps. Trained on Earth observation data and leveraging HPC infrastructure like MareNostrum5, AI4Land aims to reduce uncertainties in climate projections by providing realistic land surface conditions. AI

IMPACT Provides more accurate land surface data for climate simulations, potentially improving climate projection accuracy.
TOOL · arXiv cs.AI English(EN) · 2d

Offline Diffusion Policy for Multi-User Delay-Constrained Scheduling

Researchers have developed a new offline reinforcement learning algorithm called SOCD for delay-constrained scheduling in multi-user systems. This method utilizes a diffusion policy and a critic network to learn scheduling strategies solely from pre-collected data, avoiding the need for real-time system interaction. Experiments show SOCD effectively handles various system dynamics and outperforms existing scheduling approaches. AI

IMPACT This new algorithm could improve resource allocation in AI systems requiring real-time decision-making under delay constraints.
- Zhuoran Li
TOOL · arXiv cs.AI English(EN) · 2d

Synthetic Homes: A Multimodal Generative AI Pipeline for Residential Building Data Generation under Data Scarcity

Researchers have developed a multimodal generative AI pipeline called Synthetic Homes to create realistic residential building datasets. This framework addresses data scarcity in building energy modeling by integrating image, tabular, and simulation components. The system generates synthetic data from public records and images, demonstrating over 95% overlap with national datasets for key variables and outperforming GPT-based models in visual processing for building data. AI

IMPACT Enables scalable downstream tasks like energy modeling and urban simulation by reducing reliance on costly or restricted data sources.
TOOL · arXiv cs.AI English(EN) · 2d

Sonar-TS: Search-Then-Verify Natural Language Querying for Time Series Databases

Researchers have introduced Sonar-TS, a new framework designed to improve natural language querying for time series databases. This neuro-symbolic approach uses a Search-Then-Verify pipeline, first employing a feature index to identify candidate data windows via SQL, and then using generated Python programs to confirm these candidates against raw signals. To facilitate evaluation, the team also developed NLQTSBench, a novel benchmark for assessing natural language queries on large-scale time series data. AI

IMPACT Introduces a novel framework and benchmark for querying time series data, potentially improving data analysis for non-experts.
TOOL · arXiv cs.AI English(EN) · 2d

KAN-MLP-Mixer: A comprehensive investigation of the usage of Kolmogorov-Arnold Networks (KANs) for improving IMU-based Human Activity Recognition

Researchers have developed a hybrid neural network architecture, KAN-MLP-Mixer, that combines the precision of Kolmogorov-Arnold Networks (KANs) with the noise robustness and efficiency of Multi-Layer Perceptrons (MLPs). The architecture uses KAN modules for input embedding and classification, and MLPs for intermediate feature mixing. Tested across eight public datasets, KAN-MLP-Mixer demonstrated a 5.33% average improvement in macro F1 score over pure-MLP models, significantly outperforming standalone KAN and MLP baselines. AI

IMPACT This hybrid architecture offers improved accuracy and robustness for human activity recognition tasks using wearable sensors.
TOOL · arXiv cs.AI English(EN) · 2d

LSTM based IoT Device Identification

Researchers have developed a machine learning pipeline to identify IoT devices using Long Short-Term Memory (LSTM) networks. The system processes raw network packet captures into engineered features, which are then fed into the LSTM model as time-series sequences. The model achieved an accuracy of 79.85% and a macro-averaged F1-score of 75.70% across 27 device classes, with optimal performance observed at a sequence length of 18. AI

IMPACT Enhances IoT security by providing a more accurate method for device identification and vulnerability detection.
TOOL · arXiv cs.AI English(EN) · 2d

Range-Arithmetic: Verifiable Deep Learning Inference on an Untrusted Party

Researchers have developed a new framework called Range-Arithmetic for verifiable deep learning inference. This method allows computations to be offloaded to untrusted parties while ensuring their correctness without requiring re-execution. Range-Arithmetic achieves this by converting non-arithmetic operations into verifiable arithmetic steps, reducing computational and communication overhead compared to existing approaches. AI

IMPACT Enhances security and trust in decentralized AI systems by enabling verifiable outsourced computation.
- Ali Rahimi
- Range-Arithmetic
TOOL · arXiv cs.AI English(EN) · 2d

The Algorithm Is Not the Behavior: Learned Priors Override Look-Ahead in a Chess-Playing Neural Network

Researchers have discovered that a sophisticated neural network, Leela Chess Zero, can internally compute correct solutions to chess puzzles but ultimately override them in favor of safer, less aggressive moves. This phenomenon, termed "forgotten puzzles," demonstrates that the presence of an algorithm within a neural network does not guarantee its behavioral output. The study found that while the network's look-ahead capabilities correctly identified optimal moves, later layers prioritized defensive strategies, leading to the incorrect final output. By intervening to counteract this preference, researchers were able to recover a significant percentage of these "forgotten puzzles." AI

IMPACT Reveals a potential disconnect between an AI's internal reasoning and its final output, impacting trust and interpretability in complex decision-making systems.
- Elias Sandmann
- Leela Chess Zero
TOOL · arXiv cs.AI English(EN) · 2d

When Researchers Say Mental Model/Theory of Mind of AI, What Are They Really Talking About?

A new paper from Xiaoyun Yin argues that when researchers discuss AI's "mental models" or "theory of mind," they are often misinterpreting sophisticated pattern matching as genuine cognition. The paper contends that current evaluations, which show LLMs performing well on human cognitive tasks, only demonstrate behavioral mimicry. Yin proposes a shift towards analyzing the interactive dynamics between humans and AI, rather than testing AI in isolation. AI

IMPACT Challenges current benchmarks for AI cognition, suggesting a need for new evaluation frameworks focused on human-AI interaction.
- LLMs
- Xiaoyun Yin
- AI
TOOL · arXiv cs.AI English(EN) · 2d

Towards Deep Learning Surrogate for the Forward Problem in Electrocardiology: A Scalable Alternative to Physics-Based Models

Researchers have developed a deep learning model to efficiently simulate body surface potentials from cardiac electrical activity, offering a scalable alternative to traditional physics-based methods. This new framework utilizes a time-dependent, attention-based sequence-to-sequence architecture to predict electrocardiogram (ECG) signals. The model achieved high accuracy in simulations, demonstrating its potential for clinical applications and digital twins. AI

IMPACT This deep learning approach could significantly speed up cardiac simulations, enabling real-time analysis and broader clinical adoption.
- Shaheim Ogbomo-Harmitt PhD
- arXiv
TOOL · arXiv cs.AI English(EN) · 2d

CoVar: Confidence-Variance-Guided Pseudo-Label Selection for Semi-Supervised Learning

Researchers have developed CoVar, a new framework for semi-supervised learning that improves pseudo-label selection by considering both confidence and variance. This method addresses the limitations of relying solely on confidence, which can be unreliable due to model overconfidence and class imbalance. CoVar jointly models maximum confidence and residual-class variance to assess the reliability of pseudo-labels, leading to improved performance on various segmentation and classification benchmarks. AI

IMPACT Enhances semi-supervised learning techniques by providing a more robust method for pseudo-label selection, potentially improving model performance with less labeled data.
- Jinshi Liu
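
To make the confidence-plus-variance idea above concrete, here is a small sketch of one possible selection rule. It is not the CoVar formulation: the hard thresholds, the use of population variance over the non-top classes, and the assumption that a low residual variance marks a more reliable pseudo-label are all illustrative choices.

```python
from statistics import pvariance
from typing import List, Sequence, Tuple


def confidence_and_residual_variance(probs: Sequence[float]) -> Tuple[float, float]:
    """Return the top-class probability and the variance of the remaining classes."""
    top = max(range(len(probs)), key=lambda i: probs[i])
    residual = [p for i, p in enumerate(probs) if i != top]
    return probs[top], pvariance(residual)


def select_pseudo_labels(
    batch: Sequence[Sequence[float]],
    min_conf: float = 0.8,        # hypothetical threshold
    max_res_var: float = 0.005,   # hypothetical threshold
) -> List[int]:
    """Indices of samples that pass both the confidence and the variance check."""
    selected = []
    for idx, probs in enumerate(batch):
        conf, res_var = confidence_and_residual_variance(probs)
        if conf >= min_conf and res_var <= max_res_var:
            selected.append(idx)
    return selected


batch = [
    [0.90, 0.05, 0.05],  # confident, leftover mass spread evenly
    [0.80, 0.20, 0.00],  # confident, but one rival class holds all leftover mass
    [0.40, 0.30, 0.30],  # not confident
]
chosen = select_pseudo_labels(batch)
print(chosen)
```

The second sample would pass a confidence-only filter at 0.8, but it is rejected here because a single competing class absorbs all of the remaining probability.
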
TOOL · arXiv cs.AI English(EN) · 2d

Improving Detection of Rare Nodes in Hierarchical Multi-Label Learning

Researchers have developed a new weighted loss objective for neural networks to improve the detection of rare nodes in hierarchical multi-label learning. This approach combines node-wise imbalance weighting with focal weighting components, which leverage ensemble uncertainties. The method aims to address the challenge of fine-grained classifications by emphasizing rare nodes and focusing on uncertain nodes during training. Experiments on benchmark datasets showed improvements in recall by up to five times and statistically significant gains in F1 score. AI

IMPACT Enhances model performance on fine-grained classification tasks by improving the detection of rare categories.
- arXiv
- Hugging Face
TOOL · arXiv cs.CV English(EN) · 2d

Precision-Aware Illumination-Disentangled Vision Transformer for Spacecraft 6D Pose Estimation

Researchers have developed a new Vision Transformer model, PAID-ViT, designed to improve the accuracy of 6D pose estimation for spacecraft. This model is particularly effective in challenging conditions like varying illumination, reflections, and weak textures. PAID-ViT achieves this by separating structural information from illumination-dependent visual data and incorporating patch reliability estimation. AI

IMPACT This model could improve the precision and reliability of autonomous spacecraft operations.
TOOL · arXiv cs.CV English(EN) · 2d

On Aligning Hierarchical Standardized Embedding for Audio-visual Generalized Zero-shot Learning

Researchers have introduced a new method called Aligning Hierarchical Standardized Embedding (AHSE) to improve audio-visual generalized zero-shot learning. AHSE addresses the limitations of existing methods by standardizing and hierarchically aligning audio-visual and textual embeddings. This approach aims to reduce distributional mismatches and preserve semantic and class relationships within a shared embedding space. Experiments on benchmark datasets show AHSE achieves competitive performance in zero-shot learning tasks. AI

IMPACT This research could lead to more robust and accurate classification systems that integrate multiple data modalities.
TOOL · arXiv cs.CV English(EN) · 2d

FreqKD: Frequency-Decoupled Cross-Modal Knowledge Distillation for Infrared Object Detection

Researchers have developed FreqKD, a novel knowledge distillation framework designed to improve object detection in infrared imagery by leveraging large-scale RGB foundation models. The method addresses the challenge of modality differences by analyzing and decoupling spatial frequencies, applying distinct supervision strategies to low-frequency (structural) and high-frequency (textural) components. This approach enhances cross-modal consistency and leads to significant performance gains on various datasets and architectures, outperforming baseline methods. AI

IMPACT Enhances transfer learning for specialized imaging tasks, potentially improving autonomous systems and surveillance.
TOOL · arXiv cs.CV English(EN) · 2d

How Auxiliary Reasoning Unleashes GUI Grounding in VLMs

Researchers have developed three zero-shot auxiliary reasoning methods to improve GUI grounding, the ability of vision-language models (VLMs) to locate interface elements in graphical user interfaces (GUIs). The methods add explicit spatial cues such as axes, grids, and labeled intersections to the input image, which helps VLMs articulate their implicit spatial understanding without costly fine-tuning. Experiments across four GUI grounding benchmarks and seven VLMs demonstrated significant performance gains, with one method, Mark-Grid Scaffold, boosting Gemini-3.1-Pro's accuracy on ScreenSpot-v2 from 11.72% to 95.20% and achieving state-of-the-art results on ScreenSpot. AI

IMPACT Enhances VLM capabilities for GUI interaction, potentially accelerating the development of autonomous agents.
TOOL · arXiv cs.CV English(EN) · 2d

SceneMiner: Identity-Preserving Multi-Task Fine-Tuning for Unified BEV Scene Mining

Researchers have developed SceneMiner, a novel pipeline for identifying challenging driving scenarios from video logs. This camera-only system utilizes a frozen vision-language backbone to generate multiple signals, including a retrieval embedding for text-based search, scene tags, and a physics-based risk score. A key innovation is "identity-preserving multi-task fine-tuning," which prevents interference between different tasks by carefully initializing and freezing parameters, allowing for efficient training of new sub-modules. AI

IMPACT Introduces a new method for identifying safety-critical driving scenarios, potentially improving autonomous vehicle training data.
- Abdalmalek Aburaddaha
- SceneMiner
TOOL · arXiv cs.CV English(EN) · 2d

CountZES: Counting via Zero-Shot Exemplar Selection

Researchers have developed CountZES, a novel approach for zero-shot object counting in complex scenes. This method improves upon existing techniques by refining exemplar selection through three synergistic stages: Detection-Anchored Exemplar, Density-Guided Exemplar, and Feature-Consensus Exemplar. These stages work together to ensure exemplars are textually grounded, consistent in count, and visually representative, leading to more accurate estimations. AI

IMPACT Introduces a new methodology for zero-shot object counting, potentially improving AI systems' ability to identify and quantify unseen objects in diverse environments.
- CountZES
- Muhammad Ibraheem Siddiqui
TOOL · arXiv cs.CV English(EN) · 2d

Semantic Segmentation of Node and Edge Diagrams for Assistive Technology

Researchers have developed new deep learning models for the semantic segmentation of node-link diagrams, which are commonly used to represent complex relationships and flowcharts. These diagrams are often inaccessible to visually impaired users when presented as images. The developed models achieve over 93% per-pixel accuracy on a synthetic dataset, offering a significant improvement for assistive technologies. AI

IMPACT Improves accessibility of visual data representations for assistive technologies.
TOOL · arXiv cs.CV English(EN) · 2d

TRON: Tracing Rays to Orchestrate a Neural Renderer for 3D Gaussian Reconstructions

Researchers have developed TRON, a new rendering framework that merges 3D Gaussian ray tracing with neural rendering for realistic scene manipulation. This approach addresses limitations in existing methods by combining explicit 3D scene knowledge with neural rendering capabilities. TRON enables dynamic editing of lighting, object motion, and material properties in real-world 3D environments, outperforming prior methods in realism, editability, and speed. AI

IMPACT Enables more realistic and interactive editing of 3D scenes, potentially impacting virtual reality and content creation tools.
- TRON
- neural rendering
TOOL · arXiv stat.ML English(EN) · 2d

Discovery and inference beyond linearity for epidemiological data by integrating Bayesian regression, tree ensembles and Shapley values

Researchers have developed a new framework called RuleSHAP to improve statistical inference for machine learning models in epidemiology. This framework integrates Bayesian regression, tree ensembles, and Shapley values to provide uncertainty quantification for feature effects, which is often lacking in current ML applications. RuleSHAP can detect nonlinear and interaction effects, offering individual-level uncertainty estimates, and has been demonstrated on simulated data and an epidemiological cohort to identify effects related to high cholesterol and blood pressure. AI

IMPACT Enhances the reliability of machine learning models for discovering health risk factors and improving epidemiological research.
TOOL · arXiv stat.ML English(EN) · 2d

Projected random forests and conformal prediction of circular data

Researchers have developed a new method for predicting outcomes in regression problems involving circular data, such as time of day or direction. This approach utilizes conformal prediction techniques to generate prediction sets with guaranteed coverage and adaptive arc lengths. By projecting existing linear-response regression models onto a circular space, the method can leverage high-performance models designed for linear data. AI

IMPACT Introduces a novel statistical technique for handling circular data in machine learning predictions.
- Paulo C. Marques F.
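
A rough sketch of how conformal prediction can be adapted to angles, assuming split conformal calibration with circular distance as the nonconformity score. It is not the paper's method: wrapping a linear prediction with a modulo is a stand-in for the projection step, and the resulting arcs have a fixed width rather than the adaptive arc lengths the summary mentions. All names and data are hypothetical.

```python
import math
from typing import Sequence, Tuple

TWO_PI = 2.0 * math.pi


def circular_distance(a: float, b: float) -> float:
    """Shortest angular distance between two angles given in radians."""
    d = abs(a - b) % TWO_PI
    return min(d, TWO_PI - d)


def conformal_radius(
    cal_preds: Sequence[float], cal_truth: Sequence[float], alpha: float
) -> float:
    """Half-width of the arc, taken from held-out calibration residuals."""
    scores = sorted(
        circular_distance(p % TWO_PI, y) for p, y in zip(cal_preds, cal_truth)
    )
    n = len(scores)
    rank = math.ceil((n + 1) * (1.0 - alpha))
    if rank > n:
        return math.pi  # too few calibration points: fall back to the full circle
    return scores[rank - 1]


def prediction_arc(pred: float, radius: float) -> Tuple[float, float]:
    """Arc endpoints (start, end) in [0, 2*pi); start > end means it crosses zero."""
    center = pred % TWO_PI
    return ((center - radius) % TWO_PI, (center + radius) % TWO_PI)


# Hypothetical calibration set: predictions from a linear-response model and true angles.
radius = conformal_radius([0.0, 1.0, 2.0, 3.0], [0.1, 1.2, 2.3, 3.4], alpha=0.5)
start, end = prediction_arc(0.1, radius)
print(radius, start, end)
```

Because the arc is defined on the circle, a prediction near zero produces an interval that wraps around, which an ordinary linear prediction interval cannot express.
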
TOOL · arXiv cs.CV English(EN) · 2d

Motion Reinforces Appearance: RGB-Skeleton Gated Residual Fusion for Micro-Gesture Online Recognition

Researchers have developed a new dual-stream framework called DyFADet+ for recognizing micro-gestures in untrimmed videos. This method fuses RGB and skeleton data through a gated residual module, allowing skeleton motion to enhance the RGB representation. The system achieved an F1 score of 40.88 on the SMG dataset, securing second place in the Micro-gesture Online Recognition track of the 4th EI-MiGA-IJCAI Challenge. AI

IMPACT Introduces a novel fusion technique for multimodal gesture recognition, potentially improving human-computer interaction systems.
TOOL · arXiv cs.CV English(EN) · 2d

Periodic-MAE: Periodic Video Masked Autoencoder for rPPG Estimation

Researchers have developed Periodic-MAE, a novel self-supervised learning framework for estimating remote photoplethysmography (rPPG) from facial videos. This method utilizes a masked autoencoder to learn general spatio-temporal representations without direct rPPG supervision. By incorporating periodicity-aware frame masking and physiological bandlimit constraints, Periodic-MAE effectively captures quasi-periodic patterns relevant to pulse signal estimation. The framework demonstrates improved performance across multiple benchmark datasets and challenging real-world conditions. AI

IMPACT This self-supervised approach could enable more accessible and robust physiological monitoring from everyday video sources.
- Jiho Choi
- Periodic-MAE
TOOL · arXiv stat.ML English(EN) · 2d

Provable Recovery of Locally Important Signed Features and Interactions from Random Forest

Researchers have developed a new method to identify and interpret important features and their interactions within Random Forests, particularly for individual predictions. This approach focuses on co-occurrences of features along decision paths, offering insights into whether specific feature values drive a prediction. The method is theoretically proven to consistently recover true local signals under a specific model assumption and has been demonstrated through simulations and a real-world example. AI

IMPACT Enhances interpretability of ensemble models, potentially improving trust and debugging in AI applications.
- Random Forest
- Merle Behr
TOOL · arXiv stat.ML English(EN) · 2d

Weighted Random Dot Product Graphs

Researchers have introduced a nonparametric weighted (W)RDPG model that extends the Random Dot Product Graph (RDPG) framework to handle weighted graphs. This new model assigns latent positions to nodes, with inner products of these vectors defining the distribution moments of incident edge weights. The WRDPG can differentiate weight distributions with identical means but different higher-order moments, offering enhanced analytical capabilities. The paper also details statistical guarantees for an estimator of nodal latent positions and provides a generative framework for sampling weighted graphs. AI

IMPACT Introduces a novel statistical model for analyzing complex relational patterns in weighted graphs, potentially improving machine learning applications involving network data.
TOOL · arXiv cs.CV English(EN) · 2d

Towards Conditional Feature Alignment for Cross-Domain Counting

Researchers have developed a new framework called Conditional Feature Alignment (CFA) to improve object counting models when applied to different datasets. Standard methods often fail because they try to make all data look the same, which can remove important variations. CFA instead aligns features based on specific conditions, such as foreground or background elements, allowing the model to better handle shifts in density and environmental factors. Experiments on crowd and cell counting benchmarks demonstrated significant performance improvements, particularly in challenging scenarios with large domain shifts. AI

IMPACT Improves the robustness of AI models for object counting across different datasets, enabling more reliable real-world applications.
TOOL · arXiv cs.AI English(EN) · 2d

Geometric Metrics and LLMs: What They Measure and When They Work

Researchers have conducted a comprehensive stress-test of geometric metrics used for evaluating Large Language Models (LLMs). Their analysis revealed that some metrics, like Schatten Norm and MOM, primarily reflect output length rather than genuine quality. While geometric metrics offer a modest improvement over text statistics alone for generator identification, they show only a weak association with lexical diversity. The study recommends specific use cases and identifies failure detection as a promising application for these metrics. AI

IMPACT Identifies limitations of current LLM evaluation methods and suggests new applications for geometric metrics in failure detection.
TOOL · arXiv cs.AI English(EN) · 2d

Noise-Guided Transport for Imitation Learning

Researchers have developed Noise-Guided Transport (NGT), a new imitation learning method designed for scenarios with limited expert demonstrations. NGT frames imitation as an optimal transport problem solved through adversarial training, requiring no pretraining or specialized architectures. This efficient and easy-to-implement approach demonstrates strong performance on complex continuous control tasks, even with as few as 20 transitions. AI

IMPACT Provides a more data-efficient approach for imitation learning, potentially enabling broader application in robotics and autonomous systems.
- Lionel Blondé
- Noise-Guided Transport
TOOL · arXiv cs.CV English(EN) · 2d

PIGEON: VLM-Driven Object Navigation via Points of Interest Selection

Researchers have developed PIGEON, a new framework for object navigation in unseen indoor environments. PIGEON leverages Vision-Language Models (VLMs) by formulating navigation as a sparse decision problem, using "Points of Interest" (PoIs) to couple executable waypoints with visual observations. This approach allows VLMs to select critical PoIs, such as exploration frontiers or target objects, while low-level planners handle continuous motion. Experiments on Habitat ObjectNav benchmarks show PIGEON achieves state-of-the-art zero-shot performance and demonstrates robustness on physical robots. AI

IMPACT This framework could improve robotic navigation efficiency and adaptability in complex, unseen environments.
TOOL · arXiv cs.AI English(EN) · 2d

Physics-informed generative AI for semiconductor manufacturing: Enforcing hard physical constraints in generative models by construction

A new perspective paper proposes that generative AI models used in semiconductor manufacturing must be designed with physics principles integrated from the start, rather than relying on post-hoc filtering. The paper surveys existing architectural tools like physics-informed diffusion and PDE-constrained variational models, highlighting their application in areas such as lithography and process simulation. It argues that for physical systems where validity is paramount, generative models that enforce constraints by construction will outperform those that merely filter for them, with semiconductor fabrication serving as the most critical test case. AI

IMPACT This research could lead to more reliable AI-driven design and control in complex physical industries like semiconductor manufacturing.
TOOL · arXiv cs.AI English(EN) · 2d

When Poison Fails After Retrieval: Revisiting Corpus Poisoning under Chunking and Reranking Pipelines

Researchers have developed CRCP, a corpus poisoning attack framework built to stay effective against realistic Retrieval-Augmented Generation (RAG) pipelines. Existing attacks often fail once a pipeline includes chunking and reranking stages. CRCP aims to overcome this by optimizing for retrieval relevance, reranker consistency, and robustness across chunk boundaries, and it demonstrates significantly higher attack success rates in experiments. AI

IMPACT Highlights a realism gap in RAG security evaluations, suggesting new methods are needed to defend against sophisticated poisoning attacks.
TOOL · arXiv cs.AI English(EN) · 2d

Dual-Stance Evaluation of Sycophancy: The Structure of Agreement and the Limits of Intervention

Researchers have developed a new method called dual-stance evaluation to assess large language models' sycophancy. This technique tests whether interventions designed to reduce agreement with false, sycophantic statements also impact agreement with factual statements. Experiments on Llama-3-8B-Instruct revealed that while sycophantic and factual agreement are represented in distinct internal subspaces, a single intervention direction affects both equally, hindering the ability to selectively reduce sycophancy without compromising factual accuracy. AI

IMPACT Introduces a novel evaluation framework that could lead to more nuanced LLM safety testing and development.
- arXiv
- Llama-3-8B-Instruct
TOOL · arXiv cs.AI English(EN) · 2d

Towards Fully Automated Exam Grading: Fairness-Aware Recognition of Handwritten Answers with Foundation Models

Researchers have developed a method for fully automated exam grading using vision-language foundation models (VLMs). These models can accurately recognize handwritten answers, achieving 98.4% accuracy on a benchmark dataset, significantly improving upon previous automated approaches. The study emphasizes fairness, particularly minimizing false negatives, and demonstrates that a targeted prompt can reduce the false-negative rate to 0.58%. This approach makes automated grading of paper-based exams defensible at scale, with a self-review step catching most grading discrepancies. AI

IMPACT Automated grading systems could become more accurate and fair, potentially impacting educational institutions and assessment processes.
- arXiv
- Foundation Models
TOOL · arXiv cs.LG English(EN) · 2d

Tensor Methods: A Unified and Interpretable Approach for Material Design

Researchers have introduced tensor completion methods as a unified and interpretable approach for material design, addressing limitations of traditional machine learning models. These tensor methods not only compete with standard ML in predictive accuracy but also offer interpretable factors that can reveal underlying physical phenomena. Experiments show these factors can guide experimentalists in identifying novel patterns, and specialized tensor models improve generalization on non-uniformly sampled data, outperforming baseline ML methods. AI

IMPACT Introduces interpretable AI techniques that could accelerate discovery and design in materials science.
- Shaan Pakala
- Machine Learning
TOOL · arXiv cs.AI English(EN) · 2d

Forecasting Future Behavior as a Learning Task

Researchers have developed a new method for predicting the behavior of large reasoning models (LRMs) by training specialized "Behavior Forecasters." These forecasters learn directly from a model's reasoning trajectory, bypassing the need for traditional explanations. The approach proved more accurate than existing models like GPT-5.4 and Claude Opus-4.6 in predicting answer repetition and the impact of input changes, while also being more cost-efficient. AI

IMPACT This approach could lead to more reliable AI systems by enabling better prediction of their behavior without complex, potentially inaccurate, explanations.
TOOL · arXiv cs.AI English(EN) · 2d

MA-DLE: Speech-based Automatic Depression Level Estimation via Memory Augmentation

Researchers have developed a new method called MA-DLE for estimating depression levels using speech analysis. This approach augments standard GRU-extracted features with a memory bank that selectively integrates historical temporal and dynamic memory features. A Hierarchical Attention Fusion module then combines these augmented features with GRU outputs. The MA-DLE method has demonstrated state-of-the-art performance on the DAIC-WOZ and E-DAIC datasets. AI

IMPACT This research could lead to more accessible and scalable tools for mental health assessment.
- MA-DLE
- DAIC-WOZ
- E-DAIC
TOOL · arXiv cs.AI English(EN) · 2d

On the Study of Biometric Spoofing Detection using Deep Learning

This research paper investigates the use of deep learning models, specifically MobileNetV2, DenseNet-121, and Inception-v3, for detecting spoofing attacks in facial recognition systems. Using the CelebA-Spoof dataset, the study found MobileNetV2 to be the most effective, achieving 92% accuracy while maintaining computational efficiency. The paper also highlights the challenges of generalization for other models and suggests future work on domain adaptation and hybrid architectures to improve biometric security. AI

IMPACT Enhances understanding of deep learning's role in securing biometric systems against sophisticated spoofing techniques.
TOOL · arXiv cs.LG English(EN) · 2d

SPADE: Split-and-Delay Embeddings for Autoregressive High-Granularity Calorimeter Simulation

Researchers have developed SPADE, a novel autoregressive transformer model designed for simulating high-granularity calorimeter data in particle physics. Unlike previous methods that embed multiple features jointly, SPADE embeds them independently and introduces a delay between feature streams. This approach allows the standard self-attention mechanism to learn intra-token correlations effectively. SPADE demonstrates competitive performance against existing models for photon shower generation in the ILD detector and offers a new pathway for applying LLM-style pretraining to complex, multi-feature datasets. AI

IMPACT Introduces a new transformer architecture applicable to complex scientific simulation, potentially enabling LLM-style pretraining for high-dimensional data.
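
A delay between feature streams can be pictured with a minimal sketch like the one below, which shifts stream k right by k steps so that, at any position, a later feature of a token sits after the earlier features of that same token. The one-step-per-stream offset, the padding value, and the function names are assumptions for illustration; the summary does not specify how SPADE sets its delays.

```python
def delay_streams(streams, pad=None):
    """Shift stream k right by k positions, padding so all rows share a length.

    streams: list of equal-length lists, one per feature (hypothetical layout).
    """
    n_streams = len(streams)
    length = len(streams[0])
    if any(len(s) != length for s in streams):
        raise ValueError("all streams must have the same length")
    total = length + n_streams - 1
    return [
        [pad] * k + list(s) + [pad] * (total - k - length)
        for k, s in enumerate(streams)
    ]


def undelay_streams(delayed):
    """Invert delay_streams by slicing each row back to its original span."""
    n_streams = len(delayed)
    length = len(delayed[0]) - n_streams + 1
    return [row[k:k + length] for k, row in enumerate(delayed)]
```

With this layout, a causal model reading the columns left to right has already seen feature 0 of a token when it predicts feature 1 of that token, which is one way ordinary self-attention could pick up correlations within a token.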
TOOL · arXiv cs.LG English(EN) · 2d

Family-Aware Residual Architecture for Predicting Quantum Circuit Simulation Performance

Researchers have developed a novel neural network architecture designed to predict the performance of quantum circuit simulations. This family-aware residual architecture leverages a pretrained classifier to identify the algorithmic family of a quantum circuit, enabling more accurate predictions of simulation cost and fidelity thresholds. The system can predict these parameters in milliseconds, significantly reducing the need for time-consuming trial-and-error simulations that can take hours. AI

IMPACT This AI model could significantly speed up quantum circuit design and experimentation by reducing simulation time.
- Family-Aware Residual Architecture
- arXiv
TOOL · arXiv cs.CL English(EN) · 2d

Energy-Efficient On-Device RAG on a Mobile NPU: System Design and Benchmark on Snapdragon X Elite

Researchers have developed an energy-efficient Retrieval-Augmented Generation (RAG) pipeline that runs entirely on a mobile Neural Processing Unit (NPU), specifically the Qualcomm Hexagon NPU found in the Snapdragon X Elite. This system significantly outperforms CPU and GPU baselines in terms of speed, energy consumption, and latency for both indexing and query processing. Evaluations indicate that the NPU-accelerated RAG achieves comparable answer quality to CPU and GPU methods, suggesting a viable path for private, low-latency, and sustainable on-device AI applications. AI

IMPACT Enables practical, private, and low-latency AI applications on edge devices without compromising quality.
TOOL · arXiv cs.LG English(EN) · 2d

Learning Patterns and Abstractions from Perceptual Sequences

A new research paper explores how humans and models learn from sequences by breaking them into smaller parts, a process called chunking. The research proposes chunking as a rational strategy for discovering recurring patterns and nested hierarchies, enabling efficient sequence factorization. The paper also introduces a model that learns both chunks and abstract variables, uncovering invariant symbolic patterns and showing similarities to human learning. AI

IMPACT Proposes a new computational principle for structured knowledge acquisition in sequences, potentially influencing future AI model architectures.
- Shuchen Wu
TOOL · arXiv cs.CL English(EN) · 2d

A Geometric Profile of Semantic Information in Text: Frame-Conditional Uniqueness and a Trade-Off Triangle for Scalar Summaries

Researchers have developed a geometric framework for measuring the semantic information contained in a text. It yields a three-coordinate semantic profile that captures the novelty, breadth, and integration of ideas. The study also proves that no single scalar summary can simultaneously satisfy analytic stability, ordinal robustness, and cross-representation comparability, which gives rise to a trade-off triangle for scalar summaries. AI

IMPACT Provides a novel theoretical lens for evaluating text quality and understanding model outputs.
- Shannon
- BERTScore
- arXiv
TOOL · arXiv cs.LG English(EN) · 2d

Persistent Homology as a Theory of Emergent Structure

A new research paper proposes persistent homology as a mathematical framework to understand emergent structures across various systems. The theory suggests that persistent, non-trivial homology classes represent macro-features that remain stable despite underlying microscopic changes. This approach frames emergence as a measurement problem, using tools like contractive-similarity graphs and Hodge decomposition to predict robustness and hierarchical organization in phenomena ranging from fluid dynamics to neural networks and social systems. AI
- Xin Li