Brief

last 24h

[50/424] 186 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

TOOL · arXiv cs.LG · 1d

Learning fMRI activations dictionaries across individual geometries via optimal transport

Researchers have developed a new dictionary learning method for fMRI data that accounts for individual brain geometry variations. This approach utilizes the optimal transport-based Fused Gromov-Wasserstein (FGW) distance to compare graphs with differing structures and features. To manage computational costs, they employ amortized optimization with a neural network to approximate optimal transport plans, enabling the learning of dictionary atoms that balance feature alignment and structural consistency. Experiments on the HCP dataset show this method effectively captures geometric variability and retains crucial information. AI

IMPACT Introduces a novel computational method for analyzing complex neuroimaging data, potentially improving brain state classification and population-level studies.
TOOL · arXiv cs.CV · 1d

ProCrit: Self-Elicited Multi-Perspective Reasoning with Critic-Guided Revision for Multimodal Sarcasm Detection

Researchers have introduced ProCrit, a novel framework for detecting multimodal sarcasm by employing a two-agent system. This system includes a proposal agent that generates diverse analytical perspectives and a critic agent that evaluates and guides revisions. To address the lack of detailed reasoning data, ProCrit synthesizes process-level annotations using a dynamic-role agentic rollout, creating sequences that preserve cross-perspective dependencies. The framework then refines both agents through a dual-stage reinforcement learning process, demonstrating effectiveness on multiple benchmarks. AI

IMPACT Introduces a novel agentic approach for multimodal reasoning, potentially improving AI's ability to understand nuanced language like sarcasm.
- arXiv
TOOL · arXiv cs.LG · 1d

CIG: Exploration via Conditional Information Gain

Researchers have introduced Conditional Information Gain (CIG), a novel reward mechanism for reinforcement learning designed to improve exploration strategies. CIG addresses limitations of existing methods by providing a tractable surrogate for trajectory-level information gain, allowing it to scale to high-dimensional state spaces. Tested across twelve tasks in both discrete and continuous control environments, CIG demonstrated competitive or superior performance compared to previous exploration techniques, even in the presence of stochastic distractors. AI

IMPACT Introduces a more robust exploration strategy for reinforcement learning agents, potentially improving performance in complex and noisy environments.
TOOL · arXiv cs.AI · 1d

Governance by Construction for Generalist Agents

Researchers have developed a policy system called CUGA designed to provide governance for generalist AI agents operating in enterprise environments. This system acts as a modular, policy-as-code layer that integrates with existing LLM agents without requiring model fine-tuning. CUGA enforces governance through five checkpoints: intent guarding, steering reasoning via playbooks, enforcing tool usage, human-in-the-loop approvals for risky actions, and output formatting. The system aims to ensure predictable, auditable, and compliance-aware behavior in complex workflows, as demonstrated in a healthcare scenario. AI

IMPACT Introduces a novel policy-as-code framework to enhance safety and compliance for enterprise AI agents without model retraining.
- LLM
TOOL · arXiv cs.AI · 1d

CAdam: Context-Adaptive Moment Estimation for 3D Gaussian Densification in Generative Distillation

Researchers have developed CAdam, a new framework for generative distillation in 3D Gaussian Splatting that addresses limitations in adaptive densification. CAdam reinterprets densification as a signal verification problem, using gradient moments to distinguish consistent geometric signals from generative noise. This approach significantly reduces the number of Gaussian primitives needed while maintaining perceptual quality, improving memory efficiency in generative 3D tasks. AI

IMPACT Improves memory efficiency and representation quality in 3D generative models by reducing redundant primitives.
TOOL · arXiv cs.LG · 1d

PlexRL: Cluster-Level Orchestration of Serviceized LLM Execution for RLVR

Researchers have developed PlexRL, a cluster-level runtime designed to improve the efficiency of training large language models (LLMs) for reinforcement learning with verifiable rewards (RLVR). RLVR training is often inefficient due to idle time caused by long-tailed rollouts and tool-induced stalls. PlexRL addresses this by multiplexing LLM services across multiple RLVR jobs, filling idle periods by time-slicing model execution without costly migrations. Evaluations show PlexRL can reduce GPU hour costs by up to 37.58% while maintaining algorithmic flexibility and adding minimal overhead. AI

IMPACT Optimizes LLM training infrastructure, potentially lowering costs and increasing throughput for RLVR applications.
- LLM
- RLVR
- PlexRL
TOOL · arXiv cs.LG · 1d

Genetic Programming with Transformer-Based Mutation for Approximate Circuit Design

Researchers have developed a new method for designing approximate arithmetic circuits using genetic programming enhanced by a transformer-based mutation operator. This hybrid approach aims to overcome stagnation in the evolutionary design process by integrating a standard mutation operator with the novel transformer-based one. The system was trained on a large dataset of genetic programming chromosomes representing approximate multipliers, and it has demonstrated the ability to achieve better trade-offs between error and performance compared to existing state-of-the-art libraries. AI

IMPACT Introduces a novel transformer-based mutation for genetic programming, potentially improving automated circuit design and leading to new, patentable designs.
TOOL · arXiv cs.AI · 1d

DISC: Decoupling Instruction from State-Conditioned Control via Policy Generation

Researchers have developed a new method called DISC that decouples language instructions from state-conditioned control in robotics. Unlike previous approaches that share network parameters, DISC uses a hypernetwork to generate task-specific policies directly from instructions, preventing observation leakage. This novel approach significantly outperforms existing methods on benchmarks like LIBERO-90 and Meta-World, demonstrating its effectiveness in complex, long-horizon tasks and real-world applications. AI

IMPACT Introduces a novel architecture for language-conditioned robotics that mitigates common failure modes and improves performance on complex tasks.
- $\pi_0$
- LIBERO-90
TOOL · arXiv cs.LG · 1d

Activation-Free Backbones for Image Recognition: Polynomial Alternatives within MetaFormer-Style Vision Models

Researchers have developed new activation-free backbone architectures for vision models, utilizing polynomial functions instead of traditional pointwise nonlinearities like ReLU or GELU. These novel modules, integrated into the MetaFormer framework, demonstrate competitive or superior performance compared to activation-based models on tasks such as ImageNet classification and semantic segmentation. The study also shows these polynomial variants outperform prior specialized polynomial networks while requiring less computational cost. AI

IMPACT Introduces a new architectural approach for vision models that could lead to more efficient and robust image recognition systems.
- ImageNet
- ReLU
- GELU
- MetaFormer
- ADE20K
- PolyNeXt
TOOL · arXiv cs.AI · 1d

USV: Towards Understanding the User-generated Short-form Videos

Researchers have introduced USV, a new dataset comprising approximately 224,000 user-generated short-form videos. This dataset is designed to advance the understanding of high-level semantic information in videos, moving beyond instance-level recognition. To facilitate research, the paper also establishes topic recognition and video-text retrieval tasks on USV, proposing baseline methods like MMF-Net and VTCL. AI

IMPACT Introduces a new dataset and baseline methods to advance research in understanding user-generated short-form videos.
- MMF-Net
TOOL · arXiv cs.CV · 1d

HyDAR-Pano3D: A Hybrid Disentangled Anatomical Recovery Framework for Panoramic-to-3D Reconstruction

Researchers have developed HyDAR-Pano3D, a novel framework for reconstructing detailed 3D dental anatomy from 2D panoramic radiographs. This two-stage approach disentangles the learning process, first creating a normalized canonical volume using radiographic features and semantic priors from SAM, and then restoring patient-specific variations. The method significantly outperforms existing techniques, achieving high scores in PSNR, SSIM, and Dice for anatomical reconstruction, and enabling accurate downstream segmentation tasks. AI

IMPACT Enables more accurate 3D dental reconstructions from standard 2D X-rays, potentially reducing the need for CBCT scans and improving diagnostic capabilities.
- SAM
- HyDAR-Pano3D
TOOL · arXiv cs.LG · 1d

Markovian Circuit Tracing for Transformer State Dynamic

Researchers have developed a new framework called Markovian Circuit Tracing (MCT) to analyze the internal state dynamics of transformer models. This method uses synthetic Hidden Markov Model (HMM) tasks to test if transformer activations exhibit coarse state-transition structures. The findings indicate that transformers can learn near-Bayes next-token predictors and that residual activations contain partial Bayesian belief information, with state patching significantly improving accuracy. AI

IMPACT Introduces a new benchmark and evaluation framework for transformer interpretability, potentially aiding in understanding model behavior.
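
For context, the "near-Bayes" reference point for an HMM task is the predictor that tracks a belief over hidden states with the standard forward filter. The following is a minimal sketch under stated assumptions: the two-state transition matrix `T`, emission matrix `E`, and the function names are illustrative and are not taken from the paper.

```python
def normalize(v):
    """Scale a non-negative vector so it sums to 1."""
    s = sum(v)
    return [x / s for x in v]


def belief_update(belief, token, T, E):
    """One forward-filter step.

    belief[i] is P(state_t = i | earlier tokens). The step conditions on the
    observed token, then propagates through the transition matrix T, giving
    P(state_{t+1} | tokens up to t).
    """
    n = len(belief)
    posterior = normalize([belief[i] * E[i][token] for i in range(n)])
    return [sum(posterior[i] * T[i][j] for i in range(n)) for j in range(n)]


def next_token_probs(belief, E):
    """Bayes-optimal next-token distribution implied by the current belief."""
    n_tokens = len(E[0])
    return [sum(belief[i] * E[i][k] for i in range(len(belief)))
            for k in range(n_tokens)]


# Hypothetical toy HMM: rows of T and E each sum to 1.
T = [[0.9, 0.1], [0.2, 0.8]]   # T[i][j] = P(next state j | state i)
E = [[0.8, 0.2], [0.3, 0.7]]   # E[i][k] = P(token k | state i)

belief = [0.5, 0.5]
for token in [0, 0, 1]:
    belief = belief_update(belief, token, T, E)
probs = next_token_probs(belief, E)
```

Comparing a transformer's next-token output against `probs`, or probing its activations for `belief`, is one way to read the "partial Bayesian belief information" claim; the paper's own protocol may differ.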
TOOL · arXiv cs.AI · 1d

GraphRAG on Consumer Hardware: Benchmarking Local LLMs for Healthcare EHR Schema Retrieval

Researchers evaluated the GraphRAG pipeline for retrieving information from Electronic Health Record (EHR) schemas using open-source large language models deployed on consumer hardware. The study benchmarked models like Llama 3.1, Mistral, Qwen 2.5, and Phi-4-mini on a single GPU, assessing indexing efficiency, knowledge graph construction, latency, and answer quality. Results indicated that models below approximately 7 billion parameters frequently produce structured-output errors, and that local retrieval generally outperformed global summarization in both speed and factual accuracy. AI

IMPACT Demonstrates the feasibility of using smaller, locally deployed LLMs for complex tasks like EHR schema retrieval, potentially improving privacy and reducing costs in healthcare.
- Llama 3.1
- LLMs
- Ollama
- Phi-4-mini
- Qwen 2.5
- EHR
- GraphRAG
TOOL · arXiv cs.CL · 1d

Assessing socio-economic climate impacts from text data

A new paper on arXiv proposes guidelines for using text data to assess the socio-economic impacts of climate change. The research addresses the fragmentation and methodological complexity in the field, offering recommendations for defining impacts, handling biases, and selecting modeling strategies. The goal is to support the creation of more accurate datasets for disaster risk management and attribution studies. AI

IMPACT Provides a framework for using NLP and LLMs to analyze climate impact data, potentially improving disaster risk management.
- arXiv
- Brielen Madureira
RESEARCH · arXiv stat.ML · 1d · [2 sources]

Spectral bandits for smooth graph functions with applications in recommender systems

Researchers have developed new bandit algorithms designed for scenarios where payoffs are smooth across graph-connected data. These algorithms are particularly applicable to online learning problems like content-based recommendation, where items are nodes and their expected ratings are influenced by neighbors. The proposed methods aim to minimize cumulative regret by introducing an 'effective dimension' concept, showing that user preferences for thousands of items can be estimated from just tens of evaluations. AI

IMPACT Introduces novel algorithms for graph-based online learning, potentially improving recommendation system efficiency.
- arXiv
- Spectral bandits for smooth graph functions with applications in recommender systems
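
A payoff function that is "smooth across graph-connected data" is commonly quantified with the graph Laplacian quadratic form, f^T L f, which equals the sum over edges of w_ij (f_i - f_j)^2. The sketch below computes only that quantity; the toy path graph and rating vectors are hypothetical, and it does not implement the bandit algorithms themselves.

```python
def laplacian_smoothness(edges, f):
    """Return f^T L f for an undirected weighted graph.

    edges is a list of (i, j, weight) triples and f[i] is the value at node i.
    Small values mean neighbouring nodes have similar values.
    """
    return sum(w * (f[i] - f[j]) ** 2 for i, j, w in edges)


# Hypothetical path graph over four items: 0 - 1 - 2 - 3
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)]

smooth_ratings = [4.0, 4.1, 4.2, 4.3]   # neighbours rated similarly
rough_ratings = [4.0, 1.0, 5.0, 2.0]    # neighbours rated very differently

smooth_value = laplacian_smoothness(edges, smooth_ratings)
rough_value = laplacian_smoothness(edges, rough_ratings)
```

When expected ratings have a small quadratic form like `smooth_value`, they are well approximated by a few low-frequency Laplacian eigenvectors, which is the intuition behind needing far fewer evaluations than there are items.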
RESEARCH · arXiv stat.ML · 1d · [2 sources]

Latent Process Generator Matching

Researchers have introduced a new framework called latent process generator matching for generative models. This approach generalizes existing generator matching theory by treating the observed generative state as a deterministic image of a tractable Markov process. The method allows for learning a generator of a stochastic process that matches the one-time marginal distributions of the projected process, extending previous work on static latent variables to time-dependent conditional processes. AI

IMPACT Introduces a generalized framework for generative models, potentially improving training and generation processes for flow-matching and diffusion models.
TOOL · arXiv cs.CV · 1d

Spatial Gram Alignment for Ultra-High-Resolution Image Synthesis

Researchers have introduced Spatial Gram Alignment (SGA), a new framework designed to improve ultra-high-resolution image synthesis using large-scale pre-trained Latent Diffusion Models (LDMs). Traditional methods struggle with extreme resolutions due to a conflict between learnability and fidelity, where direct feature distillation can degrade generation quality. SGA addresses this by aligning self-similarities of generative features with foundation model priors, preserving microscopic pixel-level fidelity while ensuring macroscopic structural coherence. AI

IMPACT Enables more detailed and structurally coherent ultra-high-resolution image generation, potentially improving applications in digital art and media.
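
One way to read "aligning self-similarities" is that the loss compares Gram matrices of pairwise similarities between spatial locations, rather than comparing the raw features. The sketch below assumes cosine similarity and a mean-squared-error penalty; both choices and all names are illustrative, and the actual SGA objective may differ.

```python
import math


def gram(features):
    """Cosine self-similarity matrix for a list of per-location feature vectors."""
    unit = []
    for v in features:
        norm = math.sqrt(sum(x * x for x in v)) or 1.0  # guard against zero vectors
        unit.append([x / norm for x in v])
    return [[sum(a * b for a, b in zip(u, w)) for w in unit] for u in unit]


def gram_alignment_loss(gen_features, prior_features):
    """Mean squared difference between the two self-similarity matrices."""
    g, p = gram(gen_features), gram(prior_features)
    n = len(g)
    return sum((g[i][j] - p[i][j]) ** 2
               for i in range(n) for j in range(n)) / (n * n)


# Two spatial locations; the feature widths differ (2 vs 3), which is fine
# because only the location-by-location similarity structure is compared.
prior = [[2.0, 0.0, 0.0], [0.0, 3.0, 0.0]]      # orthogonal locations
matching_gen = [[1.0, 0.0], [0.0, 1.0]]         # also orthogonal
collapsed_gen = [[1.0, 0.0], [1.0, 0.0]]        # both locations identical

loss_matching = gram_alignment_loss(matching_gen, prior)
loss_collapsed = gram_alignment_loss(collapsed_gen, prior)
```

Because the loss only sees relative structure, the generator is free to keep its own feature values, which is one plausible reason such an objective would interfere less with pixel-level fidelity than direct feature distillation.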
TOOL · arXiv cs.CV · 1d

Decomposing Subject-Driven Image Generation via Intermediate Structural Prediction

Researchers have developed a new two-stage framework for subject-driven text-to-image generation that first predicts a structural map (like a Canny edge map) and then renders the final image using both appearance and structure. This approach aims to better preserve high-frequency details such as logos, patterns, and text, which are often degraded in existing methods. To enhance text handling, they also created a large dataset of 100,000 image pairs with textual consistency, and evaluations using GPT-4.1 showed significant improvements over baseline methods. AI

IMPACT This research offers a novel approach to improving the fidelity of text-to-image generation, particularly for preserving fine details and text.
- GPT-4.1
TOOL · arXiv cs.CL · 1d

Most Transformer Modifications Still Do Not Transfer at 1-3B: A 2020-2026 Update to Narang et al. (2021) with Downstream Evaluation and a Noise Floor

A recent study re-evaluated the effectiveness of Transformer model modifications, finding that most still do not yield significant improvements when scaled to 1-3 billion parameters. Researchers tested 20 modifications introduced after 2021, using downstream evaluation metrics and controlling for variables like data, compute, and training recipes. The findings largely echo a 2021 study, with only a couple of modifications showing benefits, and one of those proving unstable at the larger scale. The research emphasizes the need for rigorous reporting, downstream evaluation, and cross-scale stability testing for architecture comparisons. AI

IMPACT Confirms that architectural innovations in large language models often fail to scale effectively, suggesting a need for more robust evaluation methods.
TOOL · arXiv cs.AI · 1d

ELSA: An ELastic SNN Inference Architecture for Efficient Neuromorphic Computing

Researchers have introduced ELSA, a novel architecture designed to enhance the efficiency of neuromorphic computing using spiking neural networks (SNNs). ELSA enables true elastic inference by processing data in a fine-grained, token-wise pipeline, allowing for immediate forwarding of results and reduced latency. The architecture incorporates optimizations like a bundled address event representation protocol and mini-batch spiking Gustavson-product to minimize memory access and communication traffic. Experiments demonstrate that ELSA significantly outperforms existing accelerators, both those for quantized artificial neural networks and other SNN accelerators, in speed and energy efficiency. AI

IMPACT Introduces a new architecture that significantly improves speed and energy efficiency for neuromorphic computing, potentially accelerating the adoption of SNNs.
TOOL · arXiv cs.LG · 1d

Beyond Numerical Features: CNN-Driven Algorithm Selection via Contour Plots for Continuous Black-Box Optimization

Researchers have developed a novel method for algorithm selection in continuous black-box optimization that utilizes contour plots instead of traditional numerical features. A Convolutional Neural Network (CNN) analyzes these contour visualizations of probed landscapes to predict the performance of different solvers. This image-based approach demonstrated significant improvements over the single best solver (SBS) on the BBOB 2009 benchmark and showed competitiveness with existing feature-based methods. AI

IMPACT Introduces a novel image-based approach for algorithm selection in optimization, potentially improving efficiency without relying on traditional numerical features.
- CNN
- BBOB 2009
RESEARCH · arXiv stat.ML · 1d · [2 sources]

Sample Complexity of Transfer Learning: An Optimal Transport Approach

Researchers have theoretically analyzed the benefits of transfer learning using an optimal transport framework. Their findings suggest that for data dimensions greater than three, transfer learning offers improved sample efficiency compared to direct learning, particularly for complex models with non-smooth activation functions. This theoretical advantage was numerically demonstrated using image classification tasks, showing significant performance gains in data-scarce scenarios. AI

IMPACT Provides theoretical backing for transfer learning's effectiveness in data-hungry AI models.
TOOL · arXiv cs.AI · 1d

Tunable MAGMAX: Preference-Aware Model Merging for Continual Learning

Researchers have developed Tunable MAGMAX, a new framework for continual learning that allows for preference-aware model merging. This method enables control over task-specific performance in merged models, adapting them to different deployment needs and user preferences. By using a preference vector and leveraging target environment data, the system can automatically construct optimal vectors without manual input. Experiments show Tunable MAGMAX effectively manages task-wise performance and adapts merged models to various environments, outperforming or matching baseline methods. AI

IMPACT Enables more flexible deployment of continual learning models by allowing customization of task performance.
- MAGMAX
- Tunable MAGMAX
TOOL · arXiv cs.CV · 1d

What Semantics Survive the Connector? Diagnosing VLM-to-DiT Alignment in Video Editing

Researchers have developed a new diagnostic dataset and protocol called TRACE-Edit to evaluate how well semantic information is preserved when Vision-Language Models (VLMs) are used for video editing. Their findings indicate that the alignment process between VLMs and Diffusion Transformer models (DiTs) can significantly degrade fine-grained structural details, challenging the assumption of lossless semantic transfer. This research identifies the VLM-to-DiT alignment as a critical bottleneck and provides a foundation for developing improved multi-modal alignment architectures. AI

IMPACT Identifies a key bottleneck in current video editing models, potentially guiding future research towards more semantically faithful multi-modal alignment.
- VLM
TOOL · arXiv cs.AI · 1d

Interaction Locality in Hierarchical Recursive Reasoning

Researchers have introduced a new framework called interaction locality to measure how information flows within AI models during spatial reasoning tasks. This framework analyzes whether computations remain confined to nearby areas or semantic segments, or if they cross these boundaries. The study applied this to models like HRM, TRM, and MTU3D, finding that high-level states in recursive models tend to write information locally, accumulating into broader structures, while embodied models concentrate causal spatial structure at module boundaries. AI

IMPACT Introduces a novel measurement framework for analyzing spatial reasoning in AI, potentially leading to more efficient and interpretable models.
TOOL · arXiv cs.CV · 1d

AttriStory: Fine-grained Attribute Realization for Visual Storytelling with Diffusion Models

Researchers have introduced AttriStory, a new benchmark and method for improving fine-grained attribute realization in visual storytelling generated by diffusion models. The system addresses the challenge of ensuring specific attributes like clothing color and textures are accurately depicted across narrative scenes. AttriStory utilizes a plug-and-play latent optimization module and a novel AttriLoss objective to guide the diffusion model during the early stages of image generation, enhancing attribute control without altering existing story generation pipelines. AI

IMPACT Enhances control over specific visual details in AI-generated narratives, moving towards more precise attribute-driven storytelling.
RESEARCH · arXiv stat.ML · 1d · [2 sources]

Axiomatizing Neural Networks via Pursuit of Subspaces

Researchers have introduced a new theoretical framework called the Pursuit of Subspaces (PoS) hypothesis to better understand the inner workings of deep neural networks. This axiomatic approach uses geometric postulates to explain representation, computation, and generalization in neural network architectures. The PoS hypothesis aims to bridge the gap between the empirical success of neural networks and the current lack of theoretical understanding, offering a principled foundation for deep learning. AI

IMPACT Provides a new theoretical lens for understanding and potentially improving neural network architectures and generalization.
TOOL · arXiv cs.LG · 1d

Cumulative Meta-Learning from Active Learning Queries for Robustness to Spurious Correlations

Researchers have developed a new active learning framework called Cumulative Active Meta-Learning (CAML) to improve the robustness of machine learning models against spurious correlations. CAML treats each active learning round as a meta-learning task, using queried samples to refine the model's inductive bias rather than just updating its likelihood. This cumulative approach captures sequential dependencies between learning rounds, leading to significant accuracy improvements for minority groups on various benchmarks. AI

IMPACT Enhances model reliability and fairness by addressing spurious correlations, potentially improving performance in sensitive applications.
TOOL · arXiv cs.LG · 1d

Causal Machine Learning Is Not a Panacea: A Roadmap for Observational Causal Inference in Health

A new roadmap paper highlights the limitations of causal machine learning (ML) in health research, despite its growing use with large observational clinical datasets. The authors emphasize the need for careful assessment of validity assumptions and responsible application by both clinical experts and ML practitioners. Without these precautions, causal ML approaches risk producing biased or misleading results, potentially impacting clinical research and patient care. AI

IMPACT Provides a framework for responsible application of causal ML in healthcare, aiming to improve the rigor and interpretability of clinical research.
TOOL · arXiv cs.LG · 1d

Learning to Think in Physics: Breaking Shortcut Learning in Scientific Diffusion via Representation Alignment

Researchers have developed a new framework called REPA-P to improve the accuracy and robustness of physics-informed diffusion models. This method aligns intermediate model representations with physical states during training by using lightweight projection heads that are removed during inference, thus adding no computational overhead. Experiments across four different physics tasks demonstrated that REPA-P can accelerate convergence, reduce physics residuals, and enhance out-of-distribution performance. AI

IMPACT Enhances the accuracy and robustness of scientific diffusion models, potentially improving their application in fields like fluid dynamics and electromagnetism.
TOOL · arXiv cs.CV · 1d

Diffuse to Detect: Bi-Level Sample Rebalancing with Pseudo-Label Diffusion for Point-Supervised Infrared Small-Target Detection

Researchers have developed a new framework for infrared small-target detection using point supervision, addressing challenges of unstable pseudo-labels and sample imbalance. Their approach utilizes a physics-induced annotation strategy based on heat diffusion to generate reliable pseudo-masks from single-point labels. A bi-level dual-update framework optimizes detector weights, sample weights, and diffusion parameters, enhancing supervision and adapting to sample distribution. AI

IMPACT Introduces a novel method for improving the accuracy and efficiency of infrared small-target detection using physics-informed AI.
- Pseudo-labels
- Point supervision
TOOL · arXiv cs.LG · 1d

ShapeBench: A Scalable Benchmark and Diagnostic Suite for Standardized Evaluation in Aerodynamic Shape Optimization

Researchers have introduced ShapeBench, a new open-source benchmark designed to standardize evaluations in aerodynamic shape optimization. This benchmark includes 103 tasks across eight shape categories, featuring validated surrogates for rapid testing and optional high-fidelity CFD pipelines for verification. ShapeBench aims to enable fair comparisons between various optimization methods, including classical, general-purpose, and LLM-driven approaches, by using a consistent budget metric and highlighting the variance in optimizer performance across different tasks. AI

IMPACT Provides a standardized framework for evaluating and comparing AI-driven methods in aerodynamic shape optimization.
TOOL · arXiv cs.AI · 1d

VBFDD-Agent for Electric Vehicle Battery Fault Detection and Diagnosis: Descriptive Text Modeling of Battery Digital Signals

Researchers have developed VBFDD-Agent, a novel system designed for detecting and diagnosing faults in electric vehicle batteries. This agent utilizes a descriptive text modeling approach, transforming raw battery data into natural language descriptions to create a specialized corpus. By integrating this corpus with maintenance manuals and large language model reasoning, VBFDD-Agent provides structured diagnostic results and actionable maintenance recommendations, enhancing human-AI collaboration in battery health management. AI

IMPACT Introduces a new method for AI-driven diagnostics in electric vehicles, potentially improving safety and maintenance efficiency.
TOOL · arXiv cs.CL · 1d

The Illusion of Intervention: Your LLM-Simulated Experiment is an Observational Study

Researchers have identified a critical flaw in using large language models (LLMs) to simulate human behavior for experimental studies. Because LLMs are trained on observational data, interventions can inadvertently alter the simulated users' underlying attributes, leading to "user drift." This drift can distort the estimated effects of interventions, making the experimental results unreliable. The study proposes methods to diagnose this confounding using negative control outcomes and mitigate it by adjusting LLM personas with relevant confounders. AI

IMPACT Highlights a potential pitfall in using LLMs for experimental research, impacting the reliability of findings in behavioral science and AI studies.
TOOL · arXiv cs.CV · 1d

SpineContextResUNet: A Computationally Efficient Residual UNet for Spine CT Segmentation

Researchers have developed SpineContextResUNet, a new 3D Residual U-Net architecture designed for efficient segmentation of spinal CT scans. This model addresses the high computational demands of existing methods by using a lightweight Context Block with parallel multi-dilated convolutions, avoiding the need for resource-intensive Transformers or RNNs. SpineContextResUNet achieves high accuracy on public benchmarks and demonstrates viable inference performance on commodity hardware, making it suitable for point-of-care diagnostics and edge devices. AI

IMPACT Enables more accessible AI-driven medical diagnostics on low-resource hardware.
TOOL · arXiv cs.AI · 1d

The Devil is in the Condition Numbers: Why is GLU Better than non-GLU Structure?

A new paper analyzes the effectiveness of Gated Linear Units (GLU) in large language models, finding that they improve training speed by reshaping the neural tangent kernel (NTK) spectrum. Researchers observed that GLU structures lead to a smaller condition number and faster convergence, a phenomenon sometimes resulting in loss-crossing between GLU and non-GLU models. However, the study also indicated that GLU's benefit is primarily in optimization acceleration rather than reducing the generalization gap. AI

IMPACT Explains a key architectural advantage of modern LLMs, potentially guiding future model design for faster training.
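
For readers unfamiliar with the two structures being compared, the sketch below contrasts a plain feed-forward block with a gated one. It assumes a sigmoid gate, as in the original GLU formulation (many LLMs use a SiLU gate instead); the weights are toy values, and the code does not reproduce the paper's NTK analysis.

```python
import math


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def ffn_plain(x, W_up, W_down):
    """Non-GLU block: W_down @ act(W_up @ x)."""
    hidden = [sigmoid(h) for h in matvec(W_up, x)]
    return matvec(W_down, hidden)


def ffn_glu(x, W_gate, W_up, W_down):
    """GLU block: W_down @ (act(W_gate @ x) * (W_up @ x)), element-wise product."""
    gate = [sigmoid(g) for g in matvec(W_gate, x)]
    value = matvec(W_up, x)
    hidden = [g * v for g, v in zip(gate, value)]
    return matvec(W_down, hidden)


# Toy weights (hypothetical): a zero gate matrix makes every gate equal 0.5.
x = [1.0, 2.0]
W_gate = [[0.0, 0.0], [0.0, 0.0]]
W_up = [[1.0, 0.0], [0.0, 1.0]]
W_down = [[1.0, 1.0]]

glu_out = ffn_glu(x, W_gate, W_up, W_down)
plain_out = ffn_plain(x, W_up, W_down)
```

The structural difference is the extra multiplicative path: in the GLU block the hidden units are a product of two linear projections of the input, which is the part the paper ties to the reshaped NTK spectrum.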
TOOL · arXiv cs.AI · 1d

Conflict-Aware Additive Guidance for Flow Models under Compositional Rewards

Researchers have developed a new method called Conflict-Aware Additive Guidance ($g^\text{car}$) to improve the control and fidelity of generative models, particularly when dealing with multiple, potentially conflicting, constraints. This technique addresses issues where combining constraints can lead to deviations from the natural data distribution. $g^\text{car}$ dynamically detects and resolves these gradient conflicts, demonstrating effectiveness across various applications including image editing and decision-making for planning and control, while maintaining efficient computation. AI

IMPACT Enhances control and fidelity in generative models for complex, multi-constraint tasks.
TOOL · arXiv cs.AI · 1d

PACD-Net: Pseudo-Augmented Contrastive Distillation for Glycemic Control Estimation from SMBG

Researchers have developed PACD-Net, a novel self-supervised framework designed to estimate glycemic control metrics from sparse self-monitoring of blood glucose (SMBG) data. This approach uses pseudo-SMBG samples as teacher signals and contrastive learning to ensure consistent representations across different sampling patterns. The model, which employs a hybrid Swin Transformer-CNN backbone, demonstrates superior accuracy and stability compared to existing methods for estimating Time Above Range, Time in Range, and Time Below Range from real-world SMBG data, particularly under extremely sparse conditions. AI

IMPACT Offers a practical tool for interpreting clinical SMBG data and a generalizable method for learning from sparse sensor data.
- PACD-Net
- Swin Transformer-CNN
TOOL · arXiv cs.AI · 1d

The Hidden Signal of Verifier Strictness: Controlling and Improving Step-Wise Verification via Selective Latent Steering

Researchers have developed a new method called VerifySteer to control the strictness of generative verifiers in step-wise verification processes. This technique identifies a hidden signal within the verification paragraph's hidden state that indicates the verifier's tendency to accept or reject a step. By selectively steering this signal, VerifySteer can modulate verifier strictness without requiring fine-tuning, offering a way to balance error detection and correctness certification. AI

IMPACT Improves the reliability and efficiency of AI verification systems, potentially reducing computational costs for ensuring AI correctness.
TOOL · arXiv cs.CV · 1d

STAR-IOD: Scale-decoupled Topology Alignment with Pseudo-label Refinement for Remote Sensing Incremental Object Detection

Researchers have developed STAR-IOD, a new framework designed to improve incremental object detection in remote sensing imagery. This method addresses challenges like intra-class scale variations and missing annotations, which hinder knowledge transfer and preservation in existing detectors. STAR-IOD utilizes a Subspace-decoupled Topology Distillation module for structural knowledge transfer and a Clustering-driven Pseudo-label Generator to accurately distinguish targets from background noise. The framework also introduces two new datasets, DIOR-IOD and DOTA-IOD, and demonstrates superior performance over state-of-the-art approaches. AI

IMPACT Introduces novel techniques for incremental object detection in remote sensing, potentially improving autonomous systems and data analysis in this domain.
TOOL · arXiv cs.LG · 1d

Lowering the Barrier to IREX Participation: Open-Source Algorithms, Toolkit, and Benchmarking for Iris Recognition

Researchers have introduced two novel open-source iris recognition algorithms, TripletIris and ArcIris, designed to lower participation barriers for the IREX X program. The paper details Python and IREX-compliant C++ implementations, enabling broader assessment of open-source solutions. Additionally, it provides open-source tools for iris segmentation and circle estimation, facilitating the development and integration of new recognition methods. AI

IMPACT Provides open-source tools and algorithms that could accelerate research and development in iris recognition systems.
RESEARCH · arXiv stat.ML · 1d · [2 sources]

Tippett-minimum Fusion of Representation-space Diffusion Models for Multi-Encoder Out-of-Distribution Detection

Researchers have developed a novel method for detecting out-of-distribution (OOD) data by fusing multiple diffusion models. This approach, termed EncMin2L, statistically identifies each encoder's sensitivity to different types of distribution shifts using only in-distribution data. The system then combines these per-encoder scores to produce a robust OOD signal, outperforming existing methods while using fewer parameters. AI

IMPACT This new method for out-of-distribution detection could improve the reliability and safety of AI systems by better identifying unfamiliar or adversarial inputs.
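
The title refers to Tippett's minimum-p-value rule. A minimal sketch of that classical rule follows: each encoder's score is turned into an empirical p-value against in-distribution calibration scores, and the smallest p-value is corrected for the number of encoders. The function names, the convention that higher scores are more anomalous, and the independence assumption behind the correction are all assumptions here; the paper's exact procedure may differ.

```python
def empirical_p_value(score, calibration_scores):
    """Share of in-distribution calibration scores at least as large as `score`.

    Assumes higher scores mean "more anomalous". The +1 terms keep p above zero.
    """
    exceed = sum(1 for c in calibration_scores if c >= score)
    return (exceed + 1) / (len(calibration_scores) + 1)


def tippett_fusion(p_values):
    """Tippett's combined p-value: 1 - (1 - min p)^k for k independent tests."""
    k = len(p_values)
    return 1.0 - (1.0 - min(p_values)) ** k


if __name__ == "__main__":
    calib_a = [0.1, 0.2, 0.3, 0.4]  # hypothetical in-distribution scores, encoder A
    calib_b = [1.0, 1.5, 2.0, 2.5]  # hypothetical in-distribution scores, encoder B
    p = [empirical_p_value(0.9, calib_a), empirical_p_value(1.2, calib_b)]
    print(tippett_fusion(p))  # a small value suggests an out-of-distribution input
```

Taking the minimum lets a single encoder that is sensitive to a given shift drive the decision, even when the other encoders see nothing unusual.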
TOOL · arXiv cs.CL · 1d

Post-Hoc Understanding of Metaphor Processing in Decoder-Only Language Models via Conditional Scale Entropy

Researchers have developed a new metric called conditional scale entropy (CSE) to analyze how decoder-only language models process metaphors. CSE measures the breadth of computational engagement across different frequency scales within a transformer's layers. Studies using CSE revealed that metaphorical tokens consistently activate a wider range of computational scales compared to literal tokens in models ranging from 124 million to 20 billion parameters, including architectures like GPT-2, LLaMA-2, and GPT-oss. AI

IMPACT Introduces a novel metric for understanding metaphorical processing in LLMs, potentially aiding in the development of more nuanced language understanding capabilities.
TOOL · arXiv cs.AI · 1d

How to Build Marcus's Algebraic Mind: Algebro-Deterministic Substrate over Galois Fields

Researchers have developed a new hyperdimensional computing architecture called PyVaCoAl/VaCoAl, which is built around the XOR-and-shift operation over Galois Fields. This architecture aims to fulfill Gary Marcus's three core requirements for cognitive architectures: operations over variables, recursively structured representations, and a distinction between individuals and kinds. The system demonstrates reversible variable binding, non-commutative compositional bundling for distinguishing sentence structures, and address-space separation, potentially offering a functional neural substrate that more closely aligns with Marcus's specifications than previous approaches. AI

IMPACT Proposes a novel computational substrate that could enable more sophisticated AI architectures, potentially addressing limitations in current models.
- Gary Marcus
- PyVaCoAl/VaCoAl
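
Why XOR gives reversible binding, and why a shift breaks commutativity, can be seen in a few lines. The sketch below is a generic illustration on 16-bit words, where XOR is addition over GF(2) applied bitwise; it is not the PyVaCoAl/VaCoAl implementation, and the word size and all function names are hypothetical.

```python
WIDTH = 16                # hypothetical word size for the illustration
MASK = (1 << WIDTH) - 1


def rotl(x, r=1):
    """Rotate a WIDTH-bit word left by r bits."""
    r %= WIDTH
    return ((x << r) | (x >> (WIDTH - r))) & MASK


def bind(role, filler):
    """Bind a variable (role) to a value (filler) with XOR."""
    return role ^ filler


def unbind(bound, role):
    """XOR is its own inverse, so XOR-ing with the role recovers the filler."""
    return bound ^ role


def ordered_pair(a, b):
    """Shift the second argument before XOR so that (a, b) differs from (b, a)."""
    return a ^ rotl(b)


if __name__ == "__main__":
    bound = bind(0x1234, 0xBEEF)
    print(hex(unbind(bound, 0x1234)))
    print(ordered_pair(0x0001, 0x0002), ordered_pair(0x0002, 0x0001))
```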
TOOL · arXiv cs.AI · 1d

Closed Loop Dynamic Driving Data Mixture for Real-Synthetic Co-Training

Researchers have developed AutoScale, a novel closed-loop system designed to optimize the mixture of real and synthetic data for training autonomous driving models. This system dynamically adjusts the data mixture based on performance feedback, addressing the challenges of scene bias and inefficient data utilization in current co-training methods. AutoScale employs a Graph Regularized AutoEncoder for scene representation and Cluster-aware Gradient Ascent for reweighting, and it demonstrates improved performance with fewer synthetic samples under budget constraints. AI

IMPACT This approach could lead to more efficient and effective training of autonomous driving systems by optimizing data usage.
TOOL · arXiv cs.CL · 1d

Draw2Think: Harnessing Geometry Reasoning through Constraint Engine Interaction

Researchers have developed Draw2Think, a new framework that enhances geometric reasoning in vision-language models by interacting with the GeoGebra constraint engine. This system uses a Propose-Draw-Verify loop to externalize hypotheses onto an executable canvas, ensuring geometric accuracy and allowing for auditable checks on both model construction and engine measurements. Draw2Think significantly improves both geometric problem-solving accuracy and rendering scores on various benchmarks. AI

IMPACT Improves geometric reasoning capabilities in vision-language models, potentially leading to more accurate AI systems for tasks involving spatial understanding.
TOOL · arXiv cs.CV · 1d

Resolving Long-Tail Ambiguity in Unsupervised 3D Point Cloud Segmentation with Language Priors

Researchers have developed LangTail, a new framework designed to improve unsupervised 3D point cloud segmentation by addressing the issue of long-tail ambiguity. This problem occurs when minor object classes are overlooked in favor of dominant ones during the segmentation process. LangTail integrates semantic knowledge from language models to create a more balanced understanding of categories, which is then used to guide the segmentation, leading to better identification of underrepresented classes. Experiments show significant improvements in mean Intersection over Union (mIoU) scores on benchmark datasets. AI

IMPACT Enhances representation of minority classes in 3D data, potentially improving AI's understanding of complex environments.
- nuScenes
- S3DIS
- LangTail
- ScanNet-v2
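
The mIoU metric mentioned above averages the Intersection over Union of every class with equal weight, so a rare class counts as much as a dominant one. That is why it is sensitive to long-tail errors. The sketch below shows the standard per-point computation; the function name and the choice to skip classes absent from both prediction and ground truth are assumptions, and benchmark toolkits may handle that case differently.

```python
def mean_iou(pred, target, num_classes):
    """Mean Intersection over Union across classes for per-point labels."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes that appear in neither labeling
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    pred = [0, 0, 1, 1]    # hypothetical predicted labels
    target = [0, 1, 1, 1]  # hypothetical ground-truth labels
    print(mean_iou(pred, target, num_classes=2))
```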
TOOL · arXiv cs.AI · 1d

An Application-Layer Multi-Modal Covert-Channel Reference Monitor for LLM Agent Egress

Researchers have developed a novel reference monitor designed to detect and prevent covert channels used by compromised Large Language Model (LLM) agents to leak data. The system employs a multi-stage text processing pipeline and media scrambling techniques for audio and images to eliminate hidden data transmission. It uses cryptographic attestations to distinguish legitimate media from data disguised as media, and measures residual capacity to ensure covert channels are destroyed or bounded. AI

IMPACT Introduces a novel security mechanism to protect against data exfiltration by compromised AI agents.
- LLM
- arXiv
TOOL · arXiv cs.CV · 1d

A Non-Reference Diffusion-Based Restoration Framework for Landsat 7 ETM+ SLC-off Imagery in Antarctica

Researchers have developed DiffGF, a novel framework designed to restore corrupted Landsat 7 satellite imagery from Antarctica. This method utilizes a diffusion-based approach in latent and pixel spaces, eliminating the need for external reference data, which is often unavailable or unreliable for the rapidly changing Antarctic landscape. A new dataset, SLCANT, was created to train and evaluate DiffGF, demonstrating its effectiveness in high-fidelity image restoration and its utility in downstream applications like crevasse segmentation. AI

IMPACT Enables better utilization of historical satellite data for environmental monitoring and research in challenging regions.
- Antarctica
- SLCANT
- DiffGF
TOOL · arXiv cs.CV · 1d

Sketch2MinSurf: Vision-Language Guided Generation of Editable Minimal Surfaces from Hand-Drawn Sketches

Researchers have developed Sketch2MinSurf, a novel framework for generating editable 3D minimal surfaces from hand-drawn sketches. This approach combines vision-language guidance with geometric optimization, addressing the challenges of non-Euclidean surface representation and topological consistency. The system utilizes a spatial-topological encoding and a specialized loss function to ensure both accurate reconstruction and coherent topology, producing artifact-free, editable manifolds suitable for design workflows. AI

IMPACT Enables more intuitive and direct creation of complex 3D models for design and art applications.
- arXiv
- Sketch2MinSurf