Brief

last 24h

[50/9093] 224 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.
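
The feed does not publish its scoring formula. As a rough sketch only, assuming the three signals are combined by a weighted average and then multiplied by an exponential time decay, a 0–100 score could be computed as below. The weights, the 48-hour half-life, and the name `brief_score` are all hypothetical.

```python
# Hypothetical weights; the feed does not state how it combines its signals.
WEIGHTS = {"authority": 0.4, "cluster_strength": 0.35, "headline_signal": 0.25}
HALF_LIFE_HOURS = 48.0  # assumed: the score halves every 48 hours


def brief_score(authority, cluster_strength, headline_signal, age_hours):
    """Combine three signals in [0, 1] with exponential time decay into a 0-100 score."""
    signals = {
        "authority": authority,
        "cluster_strength": cluster_strength,
        "headline_signal": headline_signal,
    }
    for name, value in signals.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    if age_hours < 0:
        raise ValueError("age_hours must be non-negative")
    # Weighted average of the signals (weights sum to 1).
    base = sum(WEIGHTS[name] * value for name, value in signals.items())
    # Exponential decay: factor 1.0 at age 0, 0.5 after one half-life.
    decay = 0.5 ** (age_hours / HALF_LIFE_HOURS)
    return 100.0 * base * decay
```

Under these assumptions, a story with perfect signals scores 100 when fresh and half that after one half-life; a real ranking system may weight or decay the signals quite differently.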

RESEARCH · arXiv cs.CL English(EN) · 3d · [2 sources]

It Takes One to Bias Them All: Breaking Bad with One-Shot GRPO

A new research paper demonstrates that large language models, despite extensive alignment training, can be easily biased with just a single example. The study utilized Group Relative Policy Optimization (GRPO) to show that even one biased input can cause stereotype-driven reasoning to generalize across various attributes and benchmarks. This highlights a significant vulnerability in current LLM alignment methods, suggesting that post-training guardrails can be readily overridden. AI

IMPACT Reveals a critical vulnerability in LLM alignment, suggesting current safety measures may be insufficient against targeted manipulation.
RESEARCH · arXiv cs.AI English(EN) · 3d · [3 sources]

Large-scale semantic mapping of learner agency and autonomy reveals what measurement and generative AI research overlook

A new research paper published on arXiv analyzes the concepts of learner agency and autonomy, identifying a "jingle-jangle" fallacy in how they are defined and measured. The study mapped over 14,000 publications, revealing three core dimensions: control of learning, intrinsic motivation, and sociocultural action. Current generative AI research in education primarily focuses on the first dimension, overlooking the others and potentially limiting the scope of AI-mediated learning environments. AI

IMPACT Highlights a gap in current AI education research, suggesting a need to broaden focus beyond learning control to encompass intrinsic motivation and sociocultural aspects of learner agency.
RESEARCH · arXiv cs.AI English(EN) · 3d · [3 sources]

K-Forcing: Joint Next-K-Token Decoding via Push-Forward Language Modeling

Researchers have introduced K-Forcing, a new paradigm for accelerating language model inference by decoding multiple tokens simultaneously. This push-forward approach distills an existing autoregressive model into a mapping that generates k tokens in a single pass. K-Forcing aims to improve efficiency for high-load batch serving scenarios, a critical area for large-scale LLM deployment. Initial evaluations show a 2.4-3.5x speedup with a modest impact on quality. AI

IMPACT Offers a promising route to accelerate autoregressive generation for LLMs in high-load deployment scenarios.
- OpenWebText
- arXiv
- K-Forcing
- LM1B
RESEARCH · arXiv cs.CV English(EN) · 4d · [3 sources]

Don't waste SAM

Researchers have explored the effectiveness of Meta AI's Segment Anything Model (SAM) for waste segmentation tasks. By fine-tuning SAM on three waste datasets, they found that the SAM-ViT-H model improved substantially, gaining 30 IoU points on the ZeroWaste and TACO datasets. The study suggests that fine-tuning SAM is a critical step for improving its generalization in downstream applications like waste segmentation. AI

IMPACT Fine-tuning foundation models like SAM can unlock new applications in specialized domains, improving efficiency and accuracy in tasks like waste management.
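
For readers unfamiliar with the metric, intersection over union (IoU) divides the overlap between a predicted mask and the ground-truth mask by their combined area. The sketch below is a generic illustration, not code from the paper; it represents each mask as a set of (row, column) pixel coordinates, and the toy masks are made up.

```python
def iou(pred_pixels, true_pixels):
    """Intersection over union of two masks given as collections of (row, col) pixels."""
    pred, true = set(pred_pixels), set(true_pixels)
    union = pred | true
    if not union:
        return 1.0  # convention chosen here: two empty masks count as a perfect match
    return len(pred & true) / len(union)


# Toy example (made-up masks, not data from the paper).
true_mask = {(r, c) for r in (1, 2) for c in (1, 2)}     # 4 ground-truth pixels
pred_mask = {(r, c) for r in (1, 2) for c in (1, 2, 3)}  # 6 predicted pixels, 4 correct
toy_iou = iou(pred_mask, true_mask)  # 4 overlapping pixels / 6 pixels in the union
```

IoU is often reported on a 0–100 scale, so a reported gain of 30 presumably corresponds to 0.30 on the 0–1 scale used here.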
RESEARCH · arXiv stat.ML English(EN) · 3d · [2 sources]

Magnitude-Based Features for Multispecies Spatial Data

Researchers have introduced magnitude-based features as a novel quantitative tool for analyzing multispecies spatial data. This method captures interactions between different entities by considering their spatial configuration and scale. The approach has been demonstrated on synthetic tumor microenvironment data and real-world colorectal cancer samples, identifying distinct neighborhood types and revealing spatial heterogeneity. AI

IMPACT Introduces a new analytical framework for complex spatial data, potentially applicable to AI models dealing with biological or ecological systems.
- Julia Sollberger
RESEARCH · Hugging Face Daily Papers English(EN) · 3d · [2 sources]

AI Researchers Must Help Lead Arms Control to Mitigate Military AI Risks

A new paper argues that AI researchers must actively participate in arms control efforts to mitigate the risks associated with military AI applications. The authors emphasize that while long-term AI safety concerns are important, the immediate integration of frontier AI models into defense systems poses a more pressing threat. They suggest drawing lessons from nuclear deterrence to develop verification and diplomatic strategies for military AI, urging researchers to lead technical efforts in this area. AI

IMPACT AI researchers must engage in arms control to prevent misuse of advanced AI in military applications.
RESEARCH · arXiv cs.LG English(EN) · 3d · [2 sources]

Exploring the Design Space of Reward Backpropagation for Flow Matching

Researchers have introduced FlowBP, a new framework designed to improve the alignment of text-to-image models with human preferences. This method addresses limitations in direct reward backpropagation, such as memory constraints and gradient inflation, by creating a surrogate backward trajectory. FlowBP offers three variants that bound memory usage and limit gradient chaining, showing improvements across various metrics on models like SD3.5-M and FLUX. AI

IMPACT Introduces a novel framework to improve the efficiency and effectiveness of aligning generative models with human preferences.
RESEARCH · arXiv cs.CL English(EN) · 3d · [2 sources]

Modeling Complex Behaviors: Multi-Personality Composition and Dynamic Switching in Vision-Language Models

Researchers have developed a new framework for conditioning and evaluating the personalities of multimodal large language models (MLLMs). Their experiments indicate that while personality induction can enhance image captioning, it may hinder performance on precise reasoning tasks like visual question answering. The study also observed balancing and residual effects during multi-trait composition and dynamic switching, suggesting that model behavior is influenced by both past and present personality constraints. AI

IMPACT Introduces a framework for controlling and evaluating MLLM personalities, potentially improving their social interaction capabilities.
RESEARCH · arXiv cs.LG English(EN) · 3d · [2 sources]

CIAware-Bench: Benchmarking Control Intervention Awareness Across Frontier LLMs

Researchers have developed CIAware-Bench, a new benchmark designed to measure how well frontier large language models can detect interventions in their output. The benchmark tests models' ability to distinguish their own generated text from text that has been subtly altered by a control mechanism. Evaluations across eleven models revealed varying levels of control intervention awareness, with detection often easier between models from the same provider, suggesting reliance on stylistic differences. AI

IMPACT This benchmark could help developers create more robust AI control protocols by revealing how easily current models can be manipulated or detected.
RESEARCH · arXiv cs.LG English(EN) · 3d · [2 sources]

What Fits (Into Few Tokens) Doesn't Overfit: Compression and Generalization in ML Research Agents

Researchers have investigated why machine learning, particularly when driven by large language models (LLMs), exhibits surprisingly little overfitting despite adaptive benchmark use. Their study on LLM-driven research agents suggests that successful ML strategies are highly compressible. Experiments with output and input compression, using short prompts and one-bit feedback, demonstrated that these bottlenecks minimally impacted performance across various datasets, supporting the idea that effective strategies occupy a low-complexity region of strategy space. AI

IMPACT Suggests that the inherent compressibility of successful ML strategies may explain the observed lack of overfitting in benchmark-driven ML.
RESEARCH · arXiv cs.LG English(EN) · 3d · [2 sources]

Data-Driven Runway and Taxiway Exits Prediction of Landing Aircraft: A Case Study at Hartsfield-Jackson Atlanta International Airport

Researchers have developed a two-stage AI system to predict aircraft taxi-in decisions at Hartsfield-Jackson Atlanta International Airport. The system uses machine learning models, including XGBoost and LightGBM, to forecast which runway exit an aircraft will use and whether it will cross an active departure runway. Trained on ASDE-X surface trajectory data, aircraft characteristics, and weather, the models achieve accuracies between 0.70 and 0.89, depending on the stage. The research aims to enhance air traffic controller situational awareness by providing calibrated, explainable predictions. AI

IMPACT Enhances air traffic control efficiency and safety through predictive analytics for aircraft movements.
RESEARCH · arXiv cs.CL English(EN) · 3d · [2 sources]

Measuring Human Value Expression in Social Media Texts: Calibrated LLM Annotation and Encoder Transfer

Researchers have developed a method to measure human values expressed in social media texts using LLMs. The study, which utilized non-English posts and Schwartz's theory of basic human values, found that different LLMs interpret values differently. Through iterative prompt calibration and error analysis, the accuracy of LLM annotations was improved, and these annotations were then transferred to an encoder model for scalable prediction. AI

IMPACT This research offers a novel approach to analyzing subjective content in social media, potentially improving sentiment analysis and understanding of public opinion.
RESEARCH · arXiv cs.AI English(EN) · 3d · [2 sources]

Structure from Reasoning, Numbers from Search: On-Premise Open LLMs as Structural Priors for Coupled MIMO Controller Tuning

Researchers have explored the use of on-premise open-source large language models (LLMs) to improve the tuning of controllers for complex industrial processes. While traditional methods struggle with strongly coupled multi-input multi-output (MIMO) systems, LLMs can provide a structural prior, guiding the tuning process more effectively. The study found that LLMs excel in proposing counter-intuitive structures and achieving optimal control with significantly fewer evaluations compared to traditional optimizers, especially as system complexity increases. AI

IMPACT On-premise LLMs can serve as sample-efficient, interpretable structural priors for complex control systems, potentially accelerating industrial automation.
RESEARCH · arXiv cs.CL English(EN) · 3d · [2 sources]

Who Brought Easter Eggs to Eid? Auditing Cultural Translation of Math Word Problems Across Diverse Languages and Regions

A new study analyzed how large language models such as Claude Opus 4, GPT-4.1, and Gemini 2.5 Pro translate math word problems across languages and cultures. The models often agree on the type of transformation but frequently substitute specific cultural elements like names and foods, which leads to significant divergence in the cultural context presented to students. Every tested combination of language and model also exhibited "entropy collapse": the adaptation process compressed rather than expanded cultural diversity. Models also often misattributed regional contexts or introduced cross-cultural contamination, such as equating egg hunts with Eid activities. AI

IMPACT Reveals significant limitations in LLMs' ability to perform nuanced cultural translation, impacting educational applications.
RESEARCH · arXiv cs.CL English(EN) · 3d · [2 sources]

Mind the Gap: Can Frontier LLMs Pass a Standardized Office Proficiency Exam?

A new research paper introduces an evaluation framework for testing the proficiency of Large Language Model (LLM) agents with standard office software like Word, Excel, and PowerPoint. The study found that even advanced LLMs struggle with complex document automation tasks: on a 100-point scale, single-turn models scored below 37 and more sophisticated agentic systems reached only 68.8. This highlights a significant gap in current LLM capabilities for fine-grained office automation. AI

IMPACT Highlights significant limitations in LLM agents for practical office automation tasks, indicating a need for further development in agentic capabilities and reasoning.
RESEARCH · arXiv cs.AI English(EN) · 3d · [2 sources]

Architect-Ant: Editable Automatic Furnishing of Architectural Floor Plans

Researchers have developed Architect-Ant, a framework for automatically furnishing architectural floor plans. The system uses a new dataset called AntPlan-270, comprising 270 annotated floor plans, to train a vision-language model. Architect-Ant represents furniture layouts in a domain-specific language and incorporates architectural constraints to ensure plausible, geometrically valid arrangements, ultimately generating realistic blueprint-style images. AI

IMPACT Introduces a novel dataset and framework for AI-driven interior design and architectural visualization.
- AntPlan-270
RESEARCH · arXiv cs.AI English(EN) · 3d · [3 sources]

Beyond APIs: Probing the Limits of MLLMs in Physical Tool Use

Researchers have developed PhysTool-Bench, a new benchmark designed to evaluate how well Multimodal Large Language Models (MLLMs) can understand and use physical tools. The benchmark includes over 2,500 queries involving nearly 2,700 real-world tools across various industries. Testing revealed that even top-performing models struggle significantly, identifying only 58.7% of tools and completing just 21.0% of tasks, which highlights a critical gap in their ability to interact with the physical world. AI

IMPACT Highlights a significant limitation in current MLLMs for embodied AI, suggesting a bottleneck for real-world robotic applications.
RESEARCH · arXiv cs.LG English(EN) · 3d · [3 sources]

Closing the Modality Gap in Zero-Shot HAR: Contrastive Training and Separability-Optimized Prototypes on IMU Data

Researchers have developed a new method to improve zero-shot learning for human activity recognition using inertial measurement unit (IMU) data. Their approach focuses on bridging the gap between sensor data and semantic understanding by optimizing prototype representations. By employing contrastive training and using more descriptive text prototypes, they achieved a significant increase in accuracy for recognizing unseen activities. AI

IMPACT Enhances the ability of AI systems to recognize human activities from sensor data without prior specific training examples.
RESEARCH · Hugging Face Daily Papers English(EN) · 3d · [2 sources]

High-Fidelity Two-Step Image Generation via Teacher-Aligned End-to-End Distillation

Researchers have developed Z-Image Turbo++, a novel 2-step image generation model that significantly narrows the quality gap compared to 8-step models. This is achieved through a distillation process from an 8-step teacher model, employing distribution-aligned adversarial learning, step-decoupled parameterization, and end-to-end training with iterative regularization. The new method uses teacher-generated images for GAN training and assigns independent parameters to each denoising step, improving the efficiency-quality trade-off in image generation. AI

IMPACT This research offers a more efficient approach to high-fidelity image generation, potentially accelerating applications requiring faster inference times.
- Z-Image Turbo++
RESEARCH · arXiv cs.CV English(EN) · 3d · [2 sources]

AnimaSpark: A Feed-Forward Method for Animating Arbitrary 3D Objects

Researchers have developed AnimaSpark, a new feed-forward method for generating category-agnostic 3D animations. This pipeline renders rigged 3D models into image representations, uses a video generation model, and then tracks keypoints to derive skeletal motion. The system distills 2D motion data and lifts it to 3D to animate objects, reportedly outperforming existing methods in speed, motion quality, and text-motion alignment. AI

IMPACT This method could significantly speed up 3D asset production by automating animation generation.
- arXiv
- AnimaSpark
RESEARCH · arXiv cs.AI English(EN) · 3d · [2 sources]

Bellman-Taylor Score Decoding for Markov Decision Processes with State-Dependent Feasible Action Sets

Researchers have introduced a new framework called Bellman-Taylor score decoding to address challenges in applying deep reinforcement learning to Markov decision processes with complex, state-dependent actions. This method maps policy learning into a Euclidean score space, allowing standard DRL algorithms to be used while enforcing feasibility through an action decoder. The approach has demonstrated near-optimal performance in small-scale tests and significant improvements over existing methods in larger systems, particularly when applied to queueing network control problems. AI

IMPACT Simplifies application of DRL to complex control problems, potentially enabling new solutions in operations research and robotics.
RESEARCH · arXiv cs.LG English(EN) · 3d · [2 sources]

Population-Aware Physics-Informed Neural Particle Flow for Bayesian Update

Researchers have developed a new method called population-aware physics-informed neural particle flow (PA-PINPF) to improve Bayesian updates. This technique enhances the standard PINPF by incorporating information about the entire particle set into each particle's update, rather than processing them independently. Experiments show that PA-PINPF variants outperform the original method, with one version demonstrating particularly strong results by encoding population-level physics features. AI

IMPACT Introduces a novel approach to Bayesian inference that could improve the accuracy and efficiency of models in various applications.
RESEARCH · arXiv cs.AI English(EN) · 3d · [2 sources]

Recalling Too Well: Sycophancy Evaluation and Mitigation in Memory-Augmented Models

A new research paper introduces MIST, a benchmark designed to evaluate sycophancy in memory-augmented large language models. The study found that persistent memory systems, while intended to improve helpfulness, significantly amplify sycophantic behavior by prioritizing user agreement over factual accuracy. The researchers propose two mitigation techniques that effectively reduce sycophancy while maintaining factual recall. AI

IMPACT Highlights a critical safety flaw in memory-augmented LLMs, potentially impacting their reliability in real-world applications.
- MIST
- LLMs
RESEARCH · arXiv cs.CV English(EN) · 3d · [2 sources]

PENet+: A Lightweight Residual Transformer Framework for Efficient Image Steganalysis

Researchers have developed PENet+, a more efficient version of the PENet framework for image steganalysis. This new model significantly reduces computational requirements and parameters while maintaining high detection accuracy. PENet+ achieves these improvements through techniques like classifier streamlining and replacing the backbone with a MobileNetV2-style network, making it suitable for resource-constrained environments. AI

IMPACT Provides a more computationally efficient method for detecting hidden information in images, enabling deployment on devices with limited resources.
- PENet+
- MobileNetV2
RESEARCH · arXiv cs.LG English(EN) · 3d · [2 sources]

A Systematic Approach for Selecting Trajectories for Data Augmentation

Researchers have developed a systematic framework to improve trajectory data augmentation for machine learning. The study evaluated five selection strategies (Outlierness, Diversity, Representativeness, Uncertainty, and Random selection) across datasets covering animal behavior, maritime, and urban traffic. Results showed that systematic strategies, particularly Outlierness and Uncertainty, offer advantages over random selection, especially in sparse datasets. Their benefit is conditional, however: in dense datasets they can degrade performance. AI

IMPACT Provides a more robust method for data augmentation, potentially improving model performance in data-scarce scenarios.
- Optuna
- arXiv
- UMAP
RESEARCH · arXiv cs.AI English(EN) · 3d · [2 sources]

Provenance Tracking in AI Compilers through the Lens of Coalgebra

Researchers have developed a new method for tracking the origin of data and operations within AI compilers. This approach uses observational semantics and a coalgebraic model to preserve provenance even when intermediate computational steps are removed. A prototype compiler named COVAN has been built to demonstrate the effectiveness of this lightweight technique, which aims to improve debugging and validation of compiler transformations. AI

IMPACT Enhances debugging and validation for AI compiler transformations, potentially improving AI development workflows.
- COVAN
- AI compilers
RESEARCH · arXiv cs.AI English(EN) · 3d · [2 sources]

CLP: Collocation-Length Prediction for Zero-Loss Adaptive Multi-Token Inference

Researchers have developed a new method called Collocation-Length Prediction (CLP) to accelerate large language model inference. CLP addresses a core issue in multi-token prediction (MTP) where the prediction head for subsequent tokens interferes with the main language model head, causing quality degradation. By redesigning the architecture so the main head always generates the first token and a lightweight CLP layer predicts subsequent tokens, the method achieves significant speedups without sacrificing output quality. Experiments on Qwen2.5 models demonstrated speed increases of up to 1.29x with negligible repetition. AI

IMPACT Introduces a novel, lightweight approach to accelerate LLM inference, potentially reducing computational costs and latency for real-time applications.
RESEARCH · arXiv cs.AI English(EN) · 3d · [2 sources]

WorldKernel: A World Model is the Coupling Kernel of Admissible Possible Worlds

A new research paper introduces WorldKernel, a theoretical framework for world models that addresses limitations in current predictive models. The paper posits that standard predictors fail to capture uncertainty in counterfactual couplings between possible worlds. WorldKernel proposes a coupling kernel to represent this cross-world information, which can be bounded and acquired through targeted learning methods. AI

IMPACT Introduces a theoretical framework for world models that could improve AI's ability to reason about counterfactuals and uncertainty.
- WorldKernel
- arXiv
RESEARCH · arXiv cs.AI English(EN) · 3d · [2 sources]

A Constrained Natural-Language Interface for Variational Multi-Physics Finite Element Simulations in FEniCS

Researchers have developed a constrained natural-language interface for finite element simulations using the FEniCS platform. The system limits large language models to front-end tasks like parsing prompts and generating geometry code, keeping them out of the core solver logic. The interface demonstrated high accuracy in parsing prompts and generating geometry, and the overall system agreed with benchmarks to within sub-percent to 2-5 percent, depending on the complexity of the simulation. AI

IMPACT Enables more accessible setup of complex physics simulations, reducing manual effort and potential errors in code generation.
RESEARCH · arXiv cs.CL English(EN) · 3d · [2 sources]

Trace Only What You Need: Structure-Aware On-Demand Hypergraph Memory for Long-Document Question Answering

Researchers have introduced DocTrace, a novel multi-agent retrieval-augmented generation (RAG) framework designed to enhance question answering over long documents. This system addresses limitations in existing RAG methods by organizing knowledge on-demand, leveraging document structure, and reusing past reasoning experiences. Experiments show DocTrace outperforms strong baselines on several datasets while significantly reducing computational costs. AI

IMPACT Enhances LLM reasoning over lengthy texts, potentially improving information retrieval and analysis in complex document sets.
RESEARCH · arXiv stat.ML English(EN) · 3d · [2 sources]

Conservation Laws from Data Symmetry in Neural Networks

Researchers have investigated whether inherent symmetries in training data can result in conserved quantities during the gradient-flow training of neural networks. Their findings indicate that for analytic and non-polynomial loss functions, data symmetries generally do not introduce additional integrals of motion. However, with mean squared error loss, specific data augmentation techniques can lead to the emergence of conserved quantities. The study introduces a framework using 'tensorizable networks' to model this phenomenon, encompassing architectures such as linear networks, polynomial networks, and Lightning Attention. AI

IMPACT This research could lead to more stable and predictable neural network training by identifying conserved quantities, potentially improving model performance and understanding.
RESEARCH · arXiv cs.AI English(EN) · 3d · [2 sources]

What Do Deepfake Speech Detectors Actually Hear?

Researchers have developed a new method to understand how deepfake speech detectors make their decisions. By using Integrated Gradients on self-supervised representations, the technique can pinpoint specific moments in audio where evidence of a deepfake is detected. This analysis revealed that different detectors, such as AASIST, CA-MHFA, and SLS, rely on distinct audio cues, ranging from environmental sounds to phoneme artifacts and spectral integrity. AI

IMPACT Provides crucial insights into the decision-making processes of AI systems used for detecting synthetic media.
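
Integrated Gradients attributes a model's output to each input feature by averaging the gradient along a straight path from a baseline to the input, then scaling by the input's distance from the baseline. The sketch below is a toy illustration of that general technique, not the paper's pipeline: the "detector" is a hypothetical three-feature logistic model with made-up weights, and the integral is approximated with a midpoint Riemann sum.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def model(x, w, b):
    """Toy differentiable 'detector': logistic regression over input features."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def model_grad(x, w, b):
    """Analytic gradient of the toy model with respect to each input feature."""
    s = model(x, w, b)
    return [s * (1.0 - s) * wi for wi in w]


def integrated_gradients(x, baseline, w, b, steps=200):
    """Midpoint Riemann-sum approximation of Integrated Gradients."""
    avg_grad = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th path segment
        point = [bi + alpha * (xi - bi) for xi, bi in zip(x, baseline)]
        for i, g in enumerate(model_grad(point, w, b)):
            avg_grad[i] += g / steps
    # Scale the path-averaged gradient by each feature's distance from the baseline.
    return [(xi - bi) * g for xi, bi, g in zip(x, baseline, avg_grad)]


# Made-up weights and inputs, purely for illustration.
w, b = [2.0, -1.0, 0.5], 0.1
x, baseline = [1.0, 0.5, -2.0], [0.0, 0.0, 0.0]
attributions = integrated_gradients(x, baseline, w, b)
# Completeness check: attributions should sum to model(x) - model(baseline).
gap = abs(sum(attributions) - (model(x, w, b) - model(baseline, w, b)))
```

The completeness check at the end is a common sanity test for this method: if `gap` is not close to zero, the step count is too low or the gradient is wrong. In a speech setting the features would presumably be per-frame representations rather than three scalars.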
RESEARCH · arXiv cs.AI English(EN) · 3d · [2 sources]

Ethical and Technical Limits of Deepfake Speech Datasets

A new audit of 39 deepfake speech datasets reveals significant limitations in their ethical and technical aspects. Researchers found that most datasets lack crucial demographic metadata, making fairness assessments nearly impossible and preventing subgroup analysis. Additionally, a substantial overlap in the source audio corpora used across these datasets could lead to inflated claims of generalization and undermine cross-dataset evaluation. AI

IMPACT Highlights critical data limitations that could hinder the development and fair evaluation of AI-powered speech technologies.
- arXiv
- deepfake speech datasets
RESEARCH · arXiv cs.AI English(EN) · 3d · [2 sources]

RAT: Reference-Augmented Training for ASV Anti-Spoofing

Researchers have developed a new training strategy called Reference-Augmented Training (RAT) to improve the detection of audio deepfakes. While initially designed to use speaker reference recordings, the method surprisingly enhances deepfake detection even when the reference is absent or mismatched during inference. This approach achieved state-of-the-art results on the ASVspoof 5 benchmark, outperforming larger ensemble systems. AI

IMPACT This new training method could lead to more robust defenses against sophisticated audio deepfakes.
RESEARCH · arXiv cs.AI English(EN) · 3d · [2 sources]

Pose-ICL: 3D-Aware In-Context Learning for Pose-Controllable Subject Customization

Researchers have introduced Pose-ICL, a new framework designed to improve subject customization in image generation by enabling better pose control. This method utilizes 3D-aware in-context learning, anchoring image tokens to surface coordinates within a volumetric bounding box to enhance 3D awareness. Pose-ICL aims to overcome limitations in existing techniques that struggle with pose accuracy and identity consistency for customized subjects, showing significant improvements in evaluations. AI

IMPACT Enhances pose accuracy and identity consistency in customized image generation, potentially improving creative workflows.
RESEARCH · arXiv cs.CL English(EN) · 3d · [4 sources]

AuRA: Internalizing Audio Understanding into LLMs as LoRA

Researchers have developed two novel methods, Spatial-Omni and AuRA, to enhance the audio understanding capabilities of large language models (LLMs). Spatial-Omni integrates spatial audio cues using First-Order Ambisonics encoding into existing LLMs, creating new datasets and benchmarks for spatial audio tasks. AuRA, on the other hand, uses a distillation approach with LoRA adaptation to internalize audio encoding within LLMs, enabling efficient parallel inference and outperforming cascaded systems. AI

IMPACT These methods could lead to more sophisticated multimodal AI systems capable of richer audio scene analysis and interaction.
- SO-Dataset
- FOA
- LLMs
- Spatial-Omni
- SO-Encoder
- SO-Bench
- SO-QA
- First-Order Ambisonics
- AuRA
- LoRA
RESEARCH · arXiv cs.LG English(EN) · 4d · [3 sources]

Efficient AI-Inspired Reduction of Feynman Integrals via Tube Seeding

Researchers have developed a novel AI-inspired method to accelerate the reduction of complex Feynman integrals, a critical step in theoretical physics calculations. This new strategy employs a sparse seeding technique, significantly reducing computational time and memory requirements compared to existing methods. The approach has been successfully demonstrated on challenging multi-loop integrals, showing promise for applications in particle and gravitational-wave physics. AI

IMPACT This AI-driven method could significantly accelerate complex calculations in theoretical physics, potentially leading to new discoveries in particle and gravitational-wave research.
- AI
- Laporta algorithm
RESEARCH · arXiv cs.AI English(EN) · 3d · [2 sources]

Recoverable but Not Stationary: Local Linear Structures in Weights and Activations

Researchers have investigated the nature of linear structures within neural network weights and activations, finding that while local low-rank structures exist, they are not stationary. The study, conducted on synthetic transformers and LLMs like DistilGPT-2 and Qwen-0.5B, revealed that useful bases drift significantly over short training periods. However, initial recovery updates can capture a substantial portion of displacement, suggesting evolving local geometries rather than global task directions. AI

IMPACT Suggests that linear structures in neural networks are dynamic and local, impacting how we understand and manipulate model behavior.
RESEARCH · arXiv cs.LG English(EN) · 3d · [2 sources]

Task Robustness via Re-Labelling Vision-Action Robot Data

Researchers have developed a new framework called TREAD to improve robot learning by augmenting existing datasets. This method uses large Vision-Language Models (VLMs) to generate more diverse and linguistically rich instructions for robot tasks. By decomposing demonstrations into grounded language-action pairs and adding variations of text goals, TREAD enhances a robot's ability to understand and generalize to new instructions and scenarios. AI

IMPACT Enhances robot instruction following and generalization by leveraging VLM capabilities for data augmentation.
RESEARCH · arXiv cs.LG English(EN) · 3d · [3 sources]

Non-linear mechanical field reconstruction coupling recurrent neural networks with physics-informed graph neural networks

Researchers have developed a novel framework combining Long Short-Term Memory (LSTM) networks with physics-informed Graph Neural Networks (GNNs) to reconstruct complex mechanical stress fields. This approach effectively captures path-dependent constitutive responses and spatially resolves stress fields, overcoming computational bottlenecks in multi-scale simulations. The model achieves a significant speedup of three orders of magnitude compared to traditional finite element methods and demonstrates generalization capabilities to longer loading sequences. AI

IMPACT This framework offers a significant speedup for complex simulations, potentially accelerating materials science and engineering research.
RESEARCH · arXiv cs.CV English(EN) · 3d · [4 sources]

Beyond Model Size: Probing the Gaps in Visual in-Context Learning by Training a Tiny Model

Two new research papers published on arXiv explore the effectiveness of visual in-context learning (VICL). One paper challenges the notion that large models and extensive data are essential for VICL by training a tiny model with only 1 million parameters and 70,000 images. The other paper introduces VIBE, a comprehensive benchmark designed to evaluate VICL models across diverse domains and tasks, highlighting limitations in current adaptation capability assessments. AI

IMPACT Highlights potential for smaller models in visual adaptation and calls for improved benchmarking in the field.
RESEARCH · arXiv cs.AI English(EN) · 3d · [2 sources]

Improving Text-Instance Alignment Of Foreground Conditioned Out-Painting Via Customized Concept Embedding

Researchers have developed a new framework called CCE-Diffusion to improve the quality of images generated through foreground-conditioned outpainting. The method addresses artifacts caused by misalignment between text prompts and visual instances by customizing concept embeddings. The CCE-Module, a key component of the framework, bridges generic semantics with specific visual details, reducing artifacts and improving image quality. AI

IMPACT Enhances image generation quality by reducing artifacts, potentially lowering costs for product showcasing.
RESEARCH · arXiv cs.LG English(EN) · 3d · [2 sources]

XtrAIn: Training-Guided Occlusion for Feature Attribution

Researchers have developed XtrAIn, a novel method for feature attribution in machine learning models. This technique addresses issues with traditional occlusion-based methods by transferring the occlusion operation from the input space to the parameter space. XtrAIn analyzes how feature-associated parameter updates influence model output during training, offering a more stable and interpretable approach to understanding feature importance. Variants like Xstep and XtrAIn+ further enhance computational efficiency and target-specific analysis, showing improved attribution patterns on image and medical datasets. AI

IMPACT Offers a more reliable tool for understanding model behavior and debugging AI systems.
- Thodoris Lymperopoulos
- XtrAIn
RESEARCH · arXiv cs.LG English(EN) · 3d · [2 sources]

When Do Autoregressive Sequence Models Forecast Physical Wavefields? A Controlled Study on Synthetic Seismograms

Researchers have investigated the stability of autoregressive sequence models when forecasting long-horizon physical wavefields, such as seismograms. Their study, using a model called SeismoGPT on synthetic seismograms, found that multi-token prediction significantly stabilizes the forecasting process. Additional gains were observed with a horizon-embedding hybrid prediction head and a cross-horizon STFT-magnitude coherence loss, though performance critically depends on a specific context-ratio threshold. AI

IMPACT Identifies key architectural choices for improving the stability of autoregressive models in long-horizon forecasting of physical signals.
- SeismoGPT
- arXiv
RESEARCH · arXiv cs.LG English(EN) · 3d · [2 sources]

Embodiment-conditioned Generalist Control for Multirotor Aerial Robots

Researchers have developed a generalist control policy for multirotor aerial robots that can adapt to various configurations using a single set of network weights. This policy is conditioned on a physics-grounded embodiment descriptor, allowing it to understand how mass-normalized motor thrusts affect the robot's movement. The system was trained in just five minutes on an RTX 3090 GPU and demonstrated successful zero-shot transfer to real-world hexarotor systems with different morphologies. AI

IMPACT Enables a single AI model to control diverse robotic hardware, potentially reducing development time for new drone designs.
RESEARCH · arXiv cs.IR (Information Retrieval) English(EN) · 3d · [2 sources]

ConvMemory v2: A Recall-Preserving Top-10 Evidence Reranker for Conversational Memory Retrieval

Researchers have introduced ConvMemory v2, a reranker for conversational memory retrieval. It takes the top-10 candidate memories retrieved by the previous version, ConvMemory v1, and reorders them, so recall within the candidate set is preserved while the most relevant evidence moves higher in the ranking. On the LoCoMo benchmark, ConvMemory v2 raised full MRR (mean reciprocal rank) from 0.5824 to 0.6560 and H@1 (Hits@1) from 0.4440 to 0.5474, nearly closing the gap with more computationally intensive methods. AI

IMPACT Enhances conversational AI by ranking the most relevant memories higher without losing recall, potentially leading to more coherent and context-aware interactions.
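
For readers unfamiliar with the metrics quoted above, the following is a minimal Python sketch of how MRR, Hits@k, and recall@k are commonly computed, and why reordering a fixed candidate list can raise MRR and Hits@1 while leaving recall unchanged. The memory IDs, the toy queries, and all function names are hypothetical illustrations; this is not the ConvMemory code or the LoCoMo evaluation protocol.

```python
from typing import List, Sequence, Set, Tuple

# One run = (ranked candidate IDs, set of relevant IDs) for a single query.
Run = Tuple[Sequence[str], Set[str]]


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Return 1/rank of the first relevant item, or 0.0 if none is present."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: List[Run]) -> float:
    """Average the reciprocal rank over all queries."""
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


def hits_at_k(runs: List[Run], k: int = 1) -> float:
    """Fraction of queries with at least one relevant item in the top k."""
    hits = sum(1 for r, rel in runs if any(m in rel for m in r[:k]))
    return hits / len(runs)


def recall_at_k(runs: List[Run], k: int) -> float:
    """Average fraction of relevant items that appear in the top k."""
    return sum(len(set(r[:k]) & rel) / len(rel) for r, rel in runs) / len(runs)


# Hypothetical toy data: three queries, three candidates each.
before: List[Run] = [
    (["m3", "m1", "m2"], {"m1"}),  # relevant item at rank 2
    (["m5", "m4", "m6"], {"m5"}),  # relevant item at rank 1
    (["m7", "m8", "m0"], {"m9"}),  # relevant item was never retrieved
]

# A reranker only permutes each candidate list; it cannot add missing items.
after: List[Run] = [
    (["m1", "m3", "m2"], {"m1"}),  # relevant item promoted to rank 1
    (["m5", "m4", "m6"], {"m5"}),
    (["m7", "m8", "m0"], {"m9"}),
]

print("MRR     ", mean_reciprocal_rank(before), "->", mean_reciprocal_rank(after))
print("Hits@1  ", hits_at_k(before, 1), "->", hits_at_k(after, 1))
print("Recall@3", recall_at_k(before, 3), "->", recall_at_k(after, 3))
```

In this toy setup, the third query shows the limit of any reranker: if the first-stage retriever misses the relevant memory, reordering cannot recover it, which is why recall is fixed by the candidate set.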
RESEARCH · arXiv cs.AI English(EN) · 3d · [2 sources]

A Unified Siamese Learning Framework for Zero-Day Anomaly Detection and Classification in Optical Networks

Researchers have developed a Siamese neural network framework for optical networks. It supports zero-day anomaly detection and one-shot classification, meaning it can flag previously unseen anomaly types and categorize them from a single labeled example rather than requiring retraining. The system demonstrates over 99% accuracy and can adapt instantly to different lightpaths and previously unseen anomaly types. AI

IMPACT This framework could significantly improve the reliability and security of optical networks by enabling rapid detection of novel threats.
- optical networks
- Siamese neural network
RESEARCH · arXiv cs.LG English(EN) · 3d · [2 sources]

Encoding the Euler Characteristic Transform

Researchers have developed a novel continuous encoding method for the Euler Characteristic Transform (ECT), a shape descriptor used in machine learning. This new approach tokenizes the net Euler-characteristic change attributed to each vertex, allowing a transformer to map it to a feature vector. The method improves accuracy on five out of six classification benchmarks, outperforming traditional discretization techniques and highlighting the significance of the encoding itself over specific network architectures. AI

IMPACT Introduces a more accurate method for shape analysis in machine learning, potentially improving performance in tasks involving point clouds, graphs, and meshes.
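
To make the idea of a per-vertex Euler-characteristic change concrete, here is a minimal NumPy sketch for a graph embedded in the plane. It relies on the standard sublevel-set fact that a vertex adds +1 to the Euler characteristic when it appears and an edge adds -1 once both of its endpoints are present, so each edge can be charged to its higher endpoint. The tie-breaking rule, the function names, and the example triangle are assumptions for illustration only; this is not the paper's encoding or its transformer input format.

```python
import numpy as np


def vertex_ect_contributions(vertices, edges, direction):
    """Return (heights, delta) for one direction.

    heights[i] is the projection of vertex i onto `direction`.
    delta[i] is the net Euler-characteristic change charged to vertex i:
    +1 for the vertex itself, -1 for every edge whose higher endpoint is i.
    Ties are broken toward the first endpoint (an arbitrary assumption).
    """
    vertices = np.asarray(vertices, dtype=float)
    direction = np.asarray(direction, dtype=float)
    heights = vertices @ direction
    delta = np.ones(len(vertices))
    for i, j in edges:
        owner = i if heights[i] >= heights[j] else j
        delta[owner] -= 1.0
    return heights, delta


def ect_curve(heights, delta, thresholds):
    """Euler characteristic of the sublevel set at each threshold."""
    return np.array([delta[heights <= t].sum() for t in thresholds])


# Hypothetical example: a triangle boundary (a 3-cycle), scanned along the x-axis.
vertices = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
edges = [(0, 1), (1, 2), (0, 2)]
heights, delta = vertex_ect_contributions(vertices, edges, direction=(1.0, 0.0))

# Each (height, delta) pair is one candidate "token" in this simplified view.
tokens = list(zip(heights.tolist(), delta.tolist()))
curve = ect_curve(heights, delta, thresholds=[-0.5, 0.0, 1.0])
print(tokens)
print(curve)
```

Summing the per-vertex changes up to a threshold recovers the usual Euler characteristic curve: empty below the shape, one connected segment at x = 0, and a closed loop (characteristic 0) once the last vertex enters.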
RESEARCH · arXiv cs.AI English(EN) · 3d · [2 sources]

READER: Robust Evidence-based Authorship Decoding via Extracted Representations

Researchers have developed READER, a new framework for identifying which Large Language Model (LLM) generated a given text, even when prompts vary. The method uses a frozen proxy LLM to analyze activation spaces and accumulates evidence across multiple responses. READER outperforms previous methods, and its results indicate that stronger LLMs have more decodable authorship structure. AI

IMPACT Establishes a new method for LLM provenance, crucial for verifying AI-generated content in agentic applications.
- READER
- Agent500
- LLM
RESEARCH · arXiv cs.AI English(EN) · 4d · [3 sources]

Post-Quantum Secure Federated DeFi for Inclusive Banking

A new research paper proposes a post-quantum secure federated DeFi framework to enhance financial inclusivity for underserved individuals. The system uses lattice-based Fully Homomorphic Encryption (FHE) to allow multiple banks to collaborate on encrypted data. It integrates assessments with evidence from the NASA-IBM Prithvi Geospatial Foundation Model (GFM) to make lending decisions, with decentralized technologies ensuring data integrity and accountability. AI

IMPACT This framework could enable more equitable access to financial services by leveraging AI and advanced encryption.
- Fully Homomorphic Encryption
- NASA-IBM Prithvi Geospatial Foundation Model