Brief

last 24h

[50/10532] 224 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.
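
The feed does not publish its scoring formula. As a rough sketch of how four such signals could combine into a 0–100 score, the example below assumes a weighted sum of three signals in [0, 1] multiplied by an exponential time decay. The weights, the 24-hour half-life, and the name `score_item` are all hypothetical.

```python
# Hypothetical weights and half-life: the feed does not document its real formula.
WEIGHTS = {"authority": 0.4, "cluster_strength": 0.35, "headline_signal": 0.25}
HALF_LIFE_HOURS = 24.0


def score_item(authority, cluster_strength, headline_signal, age_hours):
    """Combine three signals in [0, 1] and apply exponential time decay.

    Returns a score on a 0-100 scale, rounded to one decimal place.
    """
    base = (
        WEIGHTS["authority"] * authority
        + WEIGHTS["cluster_strength"] * cluster_strength
        + WEIGHTS["headline_signal"] * headline_signal
    )
    # The score halves every HALF_LIFE_HOURS.
    decay = 0.5 ** (age_hours / HALF_LIFE_HOURS)
    return round(100.0 * base * decay, 1)
```

Under these assumptions, an item with perfect signals scores 100 when fresh and 50 after one half-life.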

TOOL · arXiv cs.AI English(EN) · 3d

Tractogram foundation model

Researchers have developed TractFM, a novel foundation model designed to learn representations directly from diffusion MRI tractograms. This model uniquely combines a local streamline encoder with a permutation-equivariant tractogram encoder, enabling it to process all streamlines from a subject simultaneously. By pretraining on anatomical parcellation, TractFM generates reusable embeddings for both individual streamlines and compact subject-level descriptors. The model demonstrates strong generalization capabilities, achieving accurate tract parcellation and predicting subject phenotypes like age and sex across different tractography algorithms and datasets. AI

IMPACT Enables more robust and generalizable analysis of brain white-matter pathways, potentially improving diagnostic and research capabilities in neuroscience.
- TractFM
- Human Brain
TOOL · arXiv cs.AI English(EN) · 3d

When Attribution Patching Lies: Diagnosis and a Second-Order Correction

Researchers have developed a new method to improve the accuracy of attribution patching, a technique used to understand how different parts of a language model contribute to its behavior. The current method, a first-order approximation, can be unreliable due to network non-linearities. The new approach introduces a second-order correction using Hessian-vector products, which significantly enhances the fidelity of circuit recovery. This method is computationally feasible for larger models and offers practical tools for detecting untrustworthy estimates and quantifying errors. AI

IMPACT Improves interpretability of AI models, enabling more reliable circuit identification and debugging.
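
To illustrate why a first-order estimate can mislead and how a Hessian-vector product helps, here is a minimal sketch on a toy two-dimensional "activation". It is a plain Taylor expansion, not the paper's exact correction. The toy `metric`, the activation values, and the finite-difference `hvp` are all assumptions made for illustration; a real implementation would obtain gradients and Hessian-vector products from an autodiff framework.

```python
import math


def metric(a):
    """Toy non-linear metric standing in for a model's output."""
    return math.tanh(a[0]) * a[1]


def grad(a):
    """Analytic gradient of the toy metric."""
    t = math.tanh(a[0])
    return [(1.0 - t * t) * a[1], t]


def hvp(a, v, eps=1e-5):
    """Hessian-vector product H(a) @ v via central differences of the gradient."""
    plus = grad([x + eps * d for x, d in zip(a, v)])
    minus = grad([x - eps * d for x, d in zip(a, v)])
    return [(p - m) / (2.0 * eps) for p, m in zip(plus, minus)]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


clean = [0.2, 1.0]     # hypothetical clean activation
patched = [1.2, 0.5]   # hypothetical patched-in activation
delta = [p - c for p, c in zip(patched, clean)]

true_effect = metric(patched) - metric(clean)
# First-order attribution patching: gradient times activation difference.
first_order = dot(grad(clean), delta)
# Second-order correction: add 0.5 * delta^T H delta using one HVP.
second_order = first_order + 0.5 * dot(delta, hvp(clean, delta))
```

On this toy example the first-order estimate overshoots the true effect by a wide margin, while the second-order estimate lands much closer. A large gap between the two estimates is one possible signal that the first-order value should not be trusted.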
TOOL · arXiv cs.AI English(EN) · 3d

Bypassing Copyright Protection in Diffusion-based Customization via Two-Stage Latent Feature Optimization

Researchers have developed a new method called Two-Stage Latent Feature Optimization (TS-LFO) to bypass copyright protection in diffusion-based image customization. This technique addresses existing defenses by restoring the mapping between input images and their latent representations. TS-LFO uses a two-stage optimization process to suppress noise and refine latent features, demonstrating effectiveness against state-of-the-art copyright protection methods. AI

IMPACT This research highlights potential vulnerabilities in current AI image copyright defenses, suggesting a need for more robust protection mechanisms.
TOOL · arXiv cs.AI English(EN) · 3d

The Bioelectrical Information Theory: Investigating the theoretical compression limit of bioelectrical signals under artificial intelligence

Researchers have introduced a new theoretical framework for compressing bioelectrical signals, moving beyond traditional waveform preservation methods. This "Bioelectrical Information Theory" considers physiological structure, model capacity, and task requirements to determine compression limits. The approach involves reducing noise, creating structured representations, and discarding task-irrelevant information, ultimately reframing compression as a model- and task-conditioned quantity. AI

IMPACT This new theoretical framework could enable more efficient compression of bioelectrical data for AI-driven applications like brain-computer interfaces.
- artificial intelligence
- Bioelectrical Information Theory
TOOL · arXiv cs.AI English(EN) · 3d

Sigma-Branch: Hierarchical Single-Path Network Reconstruction for Dynamic Inference with Reduced Active Parameters

Researchers have introduced Sigma-Branch (SigmaB), a novel framework designed to optimize deep neural networks for memory-constrained edge devices. SigmaB restructures dense networks into a hierarchical tree with shared backbones, routers, and specialized leaves, enabling dynamic inference. This approach significantly reduces the number of active parameters per inference by executing only a single root-to-leaf path, thereby minimizing off-chip weight transfers without sacrificing overall model capacity. AI

IMPACT Reduces per-inference active parameters by up to 60%, enabling more efficient AI deployment on edge devices with limited memory.
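
The single-path idea can be sketched as a tree in which each inference touches only the nodes on one root-to-leaf route. The tree shape, parameter counts, and router rules below are hypothetical and chosen only to make the arithmetic visible; they are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Node:
    name: str
    n_params: int
    children: List["Node"] = field(default_factory=list)
    # Router maps an input to a child index; leaves have no router.
    router: Optional[Callable[[float], int]] = None


def total_params(node):
    """Parameters stored in the whole tree."""
    return node.n_params + sum(total_params(c) for c in node.children)


def active_path(node, x):
    """Follow routers from the root to a single leaf and return the visited nodes."""
    path = [node]
    while node.children:
        node = node.children[node.router(x)]
        path.append(node)
    return path


def by_magnitude(x):
    # Hypothetical second-level router.
    return 0 if abs(x) < 1.0 else 1


def by_sign(x):
    # Hypothetical root router.
    return 0 if x < 0 else 1


left = Node("branch-neg", 50, [Node("leaf-0", 25), Node("leaf-1", 25)], by_magnitude)
right = Node("branch-pos", 50, [Node("leaf-2", 25), Node("leaf-3", 25)], by_magnitude)
root = Node("backbone", 100, [left, right], by_sign)

path = active_path(root, 2.5)
active = sum(n.n_params for n in path)
total = total_params(root)
```

In this toy tree, one inference activates 175 of 300 stored parameters, so only that path's weights would need to be fetched.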
TOOL · arXiv cs.AI English(EN) · 3d

A Note on the Strategic Confinement Problem

Researchers have introduced the "strategic confinement problem," which addresses how to prevent programs processing confidential data from leaking it when interacting with strategic agents. This problem arises because these agents can concentrate residual communication capacity on specific, high-impact data predicates, allowing for significant harm even with negligible information leakage. The paper argues that AI systems, due to their unpredictable learned conventions and potential for covert communication, naturally instantiate this challenge, shifting the focus from information flow to the strategic outcomes achievable by agents. AI

IMPACT Highlights potential new avenues for AI safety research concerning strategic agent interactions and information leakage.
- Lampson
- Christian Schroeder de Witt
TOOL · arXiv cs.LG English(EN) · 3d

What Demonstration Curation Metrics Do to Your Policy

Researchers have found that metrics used to curate training data for AI policies do not necessarily improve the performance of those policies. In experiments on a pick-and-place benchmark, a metric that was highly effective at detecting defects actually resulted in the worst-performing policy. Conversely, a metric with lower defect detection accuracy produced a policy that was nearly as good as one trained on perfect data. The study also revealed that many metrics incorrectly use episode length as a proxy for defects, inflating their apparent accuracy. AI

IMPACT Highlights the need to evaluate data curation methods based on resulting policy performance rather than defect detection accuracy alone.
- LIBERO
TOOL · arXiv cs.AI English(EN) · 3d

Geometry-Aware Anisotropic Boundary Correction for Aerodynamic Simulation

Researchers have developed a new framework called GeoABC to improve the accuracy of neural operators in aerodynamic simulations. This method explicitly models the anisotropic nature of flow near boundaries, where behavior differs along the wall versus perpendicular to it. By incorporating boundary geometry as a directional prior, GeoABC enhances predictions and significantly reduces errors, making neural operators more suitable for high-fidelity aerodynamic simulations. AI

IMPACT Improves the accuracy of AI models used in engineering simulations, potentially speeding up design processes.
- GeoABC
- neural operators
TOOL · arXiv cs.AI English(EN) · 3d

Divide-and-Conquer Modeling for the CTF-4-Science Lorenz Benchmark

Researchers have developed a novel divide-and-conquer modeling strategy specifically for the CTF-4-Science Lorenz benchmark. This approach tailors different model classes to distinct prediction tasks within the benchmark, rather than using a single model for all scenarios. The system achieved a final public score of 79.63 by employing techniques like smoothing-based reconstruction for denoising, NG-RC/NVAR models for long-time forecasting, and a fitted Lorenz transition correction for short-time prediction, demonstrating the effectiveness of scenario-specific updates. AI

IMPACT Introduces a specialized approach to chaotic system prediction, potentially improving forecasting accuracy in complex dynamic systems.
- CTF-4-Science Lorenz benchmark
- Lorenz
TOOL · arXiv cs.AI English(EN) · 3d

Unsupervised Style Representation Learning for AI-Text Detection via Paraphrase Inversion

Researchers have developed a new method for detecting AI-generated text by learning style representations without needing authorship labels. This approach uses a style encoder to reconstruct human text from its machine-generated paraphrase, effectively capturing non-semantic stylistic features. The learned representations perform competitively in both few-shot and zero-shot detection scenarios, even generalizing to unseen language models and tasks like authorship verification. AI

IMPACT This unsupervised approach could improve the robustness and applicability of AI text detection systems, aiding in combating misinformation and plagiarism.
TOOL · arXiv cs.AI English(EN) · 3d

Emotion Profiling in LLM-Based Literary Translation: Systematic Shifts Across MT and Post-Editing

A new research paper explores the emotional characteristics of translations produced by Large Language Models (LLMs). The study compares LLM translations of Margaret Atwood's "Oryx and Crake" with human translations and post-edited versions. Findings indicate that LLMs imprint distinct emotional patterns on their translations, which can obscure the original author's voice and are only partially corrected by human post-editing. AI

IMPACT Reveals how LLMs may alter authorial voice in translation, impacting literary authenticity and the effectiveness of post-editing.
TOOL · arXiv cs.AI English(EN) · 3d

Pareto-Guided Teacher Alignment for Fair Personalized Text Generation

Researchers have developed a Pareto-guided teacher alignment framework to address fairness issues in personalized text generation. This framework aims to reduce demographic disparities while maintaining personalization fidelity by combining several techniques, including candidate generation, feasibility gating, and Pareto-style selection. Evaluations on persuasion tasks revealed that different alignment strategies occupy distinct regions of a fairness-personalization Pareto frontier, highlighting the objective-dependent nature of fairness mitigation and the need for multi-audit model selection. AI

IMPACT Introduces a novel approach to balance personalization and fairness in text generation, potentially influencing future model development and evaluation.
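
As a minimal sketch of feasibility gating followed by Pareto-style selection, the example below treats each candidate as a (name, disparity, fidelity) triple, where lower disparity and higher fidelity are better. The candidate values, the `min_fidelity` threshold, and the function names are hypothetical; the paper's actual gating criteria and objectives may differ.

```python
def feasibility_gate(candidates, min_fidelity):
    """Keep only candidates whose personalization fidelity meets a threshold."""
    return [c for c in candidates if c[2] >= min_fidelity]


def pareto_front(candidates):
    """Return names of candidates not dominated by any other candidate.

    A candidate is dominated if another one is at least as good on both
    objectives (lower disparity, higher fidelity) and strictly better on one.
    """
    front = []
    for name, disp, fid in candidates:
        dominated = any(
            d2 <= disp and f2 >= fid and (d2 < disp or f2 > fid)
            for _, d2, f2 in candidates
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical (name, demographic disparity, personalization fidelity) triples.
candidates = [
    ("A", 0.10, 0.70),
    ("B", 0.20, 0.90),
    ("C", 0.25, 0.85),
    ("D", 0.05, 0.60),
]

front_all = pareto_front(candidates)
front_gated = pareto_front(feasibility_gate(candidates, min_fidelity=0.65))
```

Candidate C is dominated by B, and D survives only when the fidelity gate is absent. The frontier therefore depends on which constraints are applied first, in line with the summary's point that fairness mitigation is objective-dependent.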
TOOL · arXiv cs.AI English(EN) · 3d

MMClima: A Framework for Multimodal Climate Science Data and Evaluation

Researchers have developed MMClima, a new framework designed to advance AI capabilities in climate science. This framework includes a large dataset of over 104,000 expert-validated question-answer pairs that integrate text, video transcripts, and scientific figures across five key climate domains. MMClima also provides an evaluation pipeline and a domain-adapted baseline model, mmclima-70b-txt, to facilitate standardized multimodal AI assessment in climate research. AI

IMPACT Enables more robust AI evaluation for climate science, potentially accelerating research and solutions.
- MMClima
TOOL · arXiv cs.AI English(EN) · 3d

Integral Field Unit Spectroscopy with One Fiber

Researchers have developed a novel probabilistic foundation model capable of predicting detailed spectral data for galaxies using only broadband images. This model, trained on extensive data from the Dark Energy Spectroscopic Instrument (DESI), bypasses the need for traditional Integral Field Unit (IFU) spectroscopy, which is observationally expensive. The system achieves IFU-like spectral resolution and spatial mapping capabilities by leveraging a masked autoencoder framework and incorporating fiber- and redshift-aware encodings, demonstrating performance comparable to supervised methods trained directly on IFU data from the MaNGA survey. AI


IMPACT Enables more efficient and cost-effective analysis of astronomical data, potentially accelerating galaxy evolution studies.
- MaNGA
TOOL · arXiv cs.AI English(EN) · 3d

An Improved Generative Adversarial Network for Micro-Resistivity Imaging Logging Restoration

Researchers have developed an improved Generative Adversarial Network (GAN) specifically for restoring partially missing micro-resistivity imaging logs. The proposed method incorporates a Feature Pyramid Network (FPN) as the generative backbone, enhanced with depthwise separable convolutional residual blocks and Inception modules to better capture pixel and semantic information across multiple scales. Experimental results show an average structural similarity (SSIM) of 0.903 on test data, outperforming other methods by approximately 0.3 and improving semantic structure coherence and texture details for subsequent interpretation. AI

IMPACT Enhances image restoration techniques for specialized geological logging, potentially improving data interpretation accuracy.
TOOL · arXiv cs.AI English(EN) · 3d

Exploration of Foundation Model-Based Robots in Patient and Elderly Care

A new perspective paper explores the integration of foundation models into robots for patient and elderly care. While these models offer potential for personalized assistance and flexible communication, current systems face challenges with reliability, such as hallucinations and conversational breakdowns. The paper highlights that while usability and engagement benefits are reported, evidence for significant clinical or care-related outcomes remains limited, suggesting a need for care-specific evaluation standards and better integration into existing workflows. AI

IMPACT Highlights the gap between current AI capabilities in robots and the stringent requirements for reliable patient care, suggesting future research directions.
TOOL · arXiv cs.AI English(EN) · 3d

A Source Domain is All You Need: Source-Only Cross-OS Transfer Learning for APT Anomaly Detection via Semantic Alignment and Optimal Transport

Researchers have developed a novel framework for detecting advanced persistent threats (APTs) across different operating systems without requiring any labeled data from the target system. The approach uses natural language processing to describe process behavior, embeds these descriptions using pre-trained language models, and then applies optimal transport methods to quantify deviations from normal behavior learned from a source operating system. Evaluations on multiple APT scenarios and operating systems demonstrated improved detection accuracy compared to existing source-only methods. AI

IMPACT This research offers a new method for cybersecurity that could improve threat detection capabilities across diverse systems.
TOOL · arXiv cs.AI English(EN) · 3d

Dual-Branch Gated Fusion for Open-Set Audio Deepfake Source Tracing

Researchers have developed a new dual-branch gated fusion framework to improve the tracing of audio deepfake sources. This system combines XLSR-53 with a novel 66-dimensional descriptor called CORES, which captures a wider range of synthesis artifacts than previous methods. An input-conditioned gate adaptively weights these two branches to overcome representational imbalance and enhance performance on out-of-domain datasets. AI

IMPACT This research could lead to more robust detection of audio deepfakes, enhancing security and trust in digital communications.
- MLAAD
- XLSR-53
- CORES
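
An input-conditioned gate of the kind described can be sketched as a sigmoid over the concatenated branch features that sets the mixing weight between them. The sketch assumes both branches have already been projected to the same dimension and uses a single scalar gate. The paper may use a per-dimension gate or a different projection, and `gated_fusion` and its parameters are hypothetical names.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(ssl_feat, descriptor_feat, gate_weights, gate_bias):
    """Blend two equal-length feature vectors with an input-dependent scalar gate.

    ssl_feat:        features from the self-supervised branch (e.g. XLSR-53), projected
    descriptor_feat: features from the handcrafted-descriptor branch, projected
    gate_weights:    one weight per element of the concatenated input
    """
    gate_input = list(ssl_feat) + list(descriptor_feat)
    g = sigmoid(sum(w * x for w, x in zip(gate_weights, gate_input)) + gate_bias)
    # g near 1 favours the self-supervised branch, g near 0 the descriptor branch.
    fused = [g * a + (1.0 - g) * b for a, b in zip(ssl_feat, descriptor_feat)]
    return g, fused


g, fused = gated_fusion([1.0, 2.0], [3.0, 4.0], [0.0, 0.0, 0.0, 0.0], 0.0)
```

With zero gate weights and bias the gate sits at 0.5 and the output is the plain average of the two branches. Learned weights would shift that balance per input.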
TOOL · arXiv cs.AI English(EN) · 3d

A Practical Recipe Towards Improving Sim-and-Real Correlation for VLA Evaluation

Researchers have conducted a systematic study of how to improve the correlation between simulation and real-world robot evaluations for vision-language-action (VLA) policies. The study analyzes how well simulation platforms preserve real-world conclusions regarding policy ranking and performance. It also offers guidance on leveraging simulation for policy improvement, including when simulator-based fine-tuning is beneficial and how data volume impacts alignment. AI

IMPACT Provides a framework to enhance the reliability of simulation for developing and evaluating AI-driven robot policies.
- VLA policies
TOOL · arXiv cs.AI English(EN) · 3d

Test-time Adversarial Takeover: A Real-time Hijacking Interface against Robotic Diffusion Policies

Researchers have developed a novel attack method called Test-time Adversarial Takeover (TAKO) that allows real-time hijacking of robotic systems controlled by diffusion-based policies. This attack manipulates the visual conditioning input to the robot, enabling an attacker to steer the robot's actions and achieve custom objectives. TAKO utilizes universal patches learned through diffusion inference, proving effective across various robotic tasks, visual encoders, and generative inference models, with human operators achieving 100% takeover success in all tested scenarios. AI

IMPACT Demonstrates a significant new vulnerability in embodied AI systems, potentially impacting the safety and security of deployed robots.
TOOL · arXiv cs.AI English(EN) · 3d

The Distributed Detectability Band Against Marginal-Preserving Attacks

Researchers have introduced the Distributed Detectability Band to counter AI sabotage attacks that are designed to evade detection. These attacks distribute harmful actions across many seemingly benign steps, making them difficult for standard AI monitors to identify. Because the harm is encoded in the temporal correlation structure of actions rather than in any individual action score, the proposed technique analyzes that structure to detect these sophisticated, sub-threshold sabotage attempts. AI

IMPACT Introduces a novel approach to detecting sophisticated AI sabotage, potentially improving the security and reliability of AI systems.
- Distributed Detectability Band
- AI
TOOL · arXiv cs.AI English(EN) · 3d

Visual-TCAV: Concept-based Attribution and Saliency Maps for Post-hoc Explainability in Image Classification

Researchers have developed Visual-TCAV, a new framework for explaining image classification models. This method combines local saliency maps with concept-based attribution, addressing limitations of existing techniques. Visual-TCAV can pinpoint where a specific concept is recognized within an image and quantify its contribution to a prediction, demonstrating improved faithfulness over prior methods. AI

IMPACT Provides enhanced interpretability for AI image classification, potentially aiding debugging and trust.
TOOL · arXiv cs.AI English(EN) · 3d

Assessment of Personality Dimensions Across Situations in Dyadic Role-Play Scenarios

Researchers have developed a method to assess personality dimensions in dyadic role-play scenarios, moving beyond static trait assumptions. Their findings indicate that perceived personality varies significantly across situations, such as neutral interviews versus stressful client interactions. Acoustic and non-verbal features proved more effective than speaker embeddings at predicting these perceived traits, with stress specifically correlating with neuroticism. AI

IMPACT This research could lead to more adaptive and context-aware AI assistants that better align with user personalities in varying situations.
- Alice Zhang
TOOL · arXiv cs.CV English(EN) · 3d

Sketch-to-Layout: A Human-Centric Computational Agent for Constraint-Aware Synthesis of Modular Photobioreactors

Researchers have developed a computational agent designed to simplify the design and maintenance of building-integrated photobioreactors (PBRs). This agent uses a Constraint Satisfaction Problem (CSP) solver to translate user sketches into fabrication-ready configurations, successfully handling complex layouts on grids of up to 15x15. Additionally, a computer vision system has been created to monitor algae health from photographs, enabling autonomous maintenance and supporting the goal of scalable carbon capture systems. AI

IMPACT This system could democratize the design and maintenance of carbon-neutral architectural elements, potentially accelerating the adoption of sustainable building technologies.
TOOL · arXiv cs.CV English(EN) · 3d

SPARX: Secure and Privacy-Aware Approximate CNN Acceleration with Edge RISC-V SoC

Researchers have developed SPARX, a framework for accelerating Convolutional Neural Networks (CNNs) on edge devices. This system integrates approximate computing with security and privacy features within a RISC-V System-on-Chip. SPARX utilizes a custom RISC-V instruction extension and an approximate logarithmic CNN accelerator, enhanced by a differential-noise privacy engine and authentication mechanisms. Evaluations show significant reductions in area and power, alongside improved throughput, with a minimal impact on accuracy for specific CNN models. AI

IMPACT Enables more efficient and secure AI inference on resource-constrained edge devices.
- CNN
- SPARX
- RISC-V
- SoC
- ResNet-20
- CIFAR-10
TOOL · arXiv cs.CV English(EN) · 3d

Automatic Labelling for Low-Light Pedestrian Detection

Researchers have developed an automated pipeline to generate labels for low-light pedestrian detection using infrared and RGB cameras. This method involves detecting pedestrians in infrared images and then transferring those labels to corresponding RGB images. Models trained with these generated labels outperformed those trained on ground truth in several key metrics, suggesting a scalable approach for creating large low-light pedestrian datasets. AI

IMPACT Automated labeling could accelerate the development of safer autonomous driving systems by improving low-light pedestrian detection.
TOOL · arXiv cs.CV English(EN) · 3d

GeoLoom: High-quality Geometric Diagram Generation from Textual Input

Researchers have developed GeoLoom, a new framework designed to generate high-quality geometric diagrams from textual descriptions. This system translates natural language into a formal language called GeoLingua, which is then used by a coordinate solver to produce precise diagrams. To support this, a new dataset called GeoNF has been created, and a novel evaluation metric has been proposed to assess structural accuracy. AI

IMPACT Introduces a novel method for generating precise geometric diagrams from text, potentially aiding in technical documentation and education.
- Ting Zhang
- GeoLoom
- GeoNF
- GeoLingua
TOOL · arXiv cs.CV English(EN) · 3d

Training Set Augmentation and Biology-Aware Harmonization Improve Radiomic Models for Lung Cancer Prediction in Indeterminate Nodules

Researchers have developed improved radiomic models for predicting lung cancer in indeterminate pulmonary nodules. By augmenting the training data with nodules from later development stages and employing biology-aware harmonization techniques, the models showed significantly better performance. This approach addresses challenges posed by low malignancy rates in early nodules and variability in image acquisition protocols, leading to higher accuracy in cancer prediction. AI

IMPACT Enhances AI's diagnostic capabilities in medical imaging, potentially leading to earlier lung cancer detection.
- Claire Huchthausen
TOOL · arXiv cs.CV English(EN) · 3d

Unleashing Correlation and Continuity for Hyperspectral Reconstruction from RGB Images

Researchers have developed a new method for reconstructing hyperspectral images (HSI) from standard RGB images, a process that can significantly reduce costs while maintaining high spatial resolution. The proposed Correlation and Continuity Network (CCNet) leverages both local spectral correlations and global spectral continuity to improve reconstruction quality. This approach has demonstrated state-of-the-art performance on benchmark datasets, outperforming existing spectral reconstruction algorithms. AI

IMPACT This method could enable more cost-effective hyperspectral imaging applications by leveraging standard RGB cameras.
TOOL · arXiv cs.CV English(EN) · 3d

Cyst-X: A Multi-Center MRI Benchmark and Federated Learning Framework for Malignancy-Risk Stratification of Pancreatic Cystic Neoplasm

Researchers have introduced Cyst-X, a new benchmark dataset and federated learning framework designed to improve the early detection of pancreatic cystic neoplasms. This initiative aims to address the challenges in stratifying malignancy risk, which currently leads to either unnecessary surgeries or missed diagnoses. The Cyst-X dataset includes 1,461 MRI scans from seven international centers, and the developed deep learning model achieved an AUC of 0.85 in distinguishing high-risk lesions. Notably, the federated learning approach allowed for distributed training across institutions without sharing raw patient data, maintaining performance while preserving privacy. AI

IMPACT Enhances early detection of pancreatic cancer precursors by providing a robust benchmark and privacy-preserving training method.
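
The summary does not say which aggregation rule the framework uses. As one common possibility, the sketch below shows sample-size-weighted parameter averaging (often called federated averaging), in which each center shares only model parameters and its sample count, never raw scans. The function name and the toy numbers are hypothetical.

```python
def federated_average(client_params, client_sizes):
    """Average per-client parameter vectors, weighted by local sample counts.

    client_params: list of equal-length parameter lists, one per center
    client_sizes:  number of training samples held by each center
    """
    total = sum(client_sizes)
    n_params = len(client_params[0])
    return [
        sum(params[i] * size for params, size in zip(client_params, client_sizes)) / total
        for i in range(n_params)
    ]


# Two hypothetical centers: one with 1 sample, one with 3.
global_params = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

The larger center pulls the global parameters toward its own values, which is the intended effect of weighting by sample count.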
TOOL · arXiv cs.AI English(EN) · 3d

Learning-Guided Integration Contours Construction for Fast Large-Scale Generalized Eigensolvers

Researchers have developed Deepcontour, a new framework that uses deep learning to optimize the construction of integration contours for solving large-scale Generalized Eigenvalue Problems (GEPs). This method employs a deep learning-based spectral predictor and Kernel Density Estimation to automatically design efficient contours, leading to significant speedups. The framework achieved up to a 5.63x performance increase on various scientific datasets while maintaining numerical accuracy. AI

IMPACT Introduces a novel deep learning approach to accelerate scientific computing tasks, potentially impacting fields reliant on solving large-scale eigenvalue problems.
TOOL · arXiv cs.CV English(EN) · 3d

Selective Disk Bispectrum: A Complete and Rotation Invariant Image Descriptor

Researchers have introduced the selective disk bispectrum (SDB), a new image descriptor designed to be rotation-invariant and preserve all information except orientation. This descriptor aims to bridge the gap between hand-crafted and learned image representations by offering a balance of descriptive power, efficiency, and interpretability. The SDB has theoretical guarantees for its accuracy and invariance, and has been empirically validated on classification tasks and image alignment. AI

IMPACT This new descriptor could improve the performance and efficiency of computer vision models that require rotation invariance.
- Adele Lantow
- Selective Disk Bispectrum
TOOL · arXiv cs.AI English(EN) · 3d

GCA Framework: A GCC Countries-Grounded Dataset and Agentic Pipeline for Climate Decision Support

Researchers have developed the GCA framework, which includes a new dataset called GCA-DS and an agent named Gulf Climate Agent (GCA). This framework is designed to support climate decision-making specifically for the GCC states by integrating regional climate knowledge with geospatial and forecasting tools. The GCA agent utilizes a modular pipeline that processes real-time data and geospatial information to generate insights and visualizations, demonstrating improved reliability over general-purpose LLMs through domain fine-tuning and tool integration. AI

IMPACT This framework could improve the accuracy and actionability of AI-driven climate change analysis in specific regions.
TOOL · arXiv cs.AI English(EN) · 3d

HydraCIL: Decoupled Class-Incremental Learning through Prototype-Guided Multi-Head Classifiers

Researchers have introduced HydraCIL, a novel approach to class-incremental learning designed for resource-constrained environments like embedded systems. This method decouples feature extraction from classifier training, allowing for lightweight, task-specific classifier heads to be created without extensive backbone retraining. Experiments demonstrate that HydraCIL achieves performance comparable to state-of-the-art methods while significantly reducing training time and energy consumption. AI

IMPACT Enables more efficient and sustainable AI model adaptation in resource-limited edge devices.
TOOL · arXiv cs.CL English(EN) · 3d

AI Application Gives Users Real-Time Feedback on the Level of Peace in the Social Media Videos They Watch

Researchers have developed an AI application designed to provide users with real-time feedback on the perceived level of peace in social media videos. The system utilizes large language models to analyze YouTube video transcripts, measuring five social dimensions identified in peace studies. This approach proved more effective than traditional sentiment analysis in correlating with human coders' assessments of peace-related content. AI

IMPACT This research could lead to tools that help users and creators understand the impact of language on conflict and peace in media.
TOOL · arXiv cs.CL English(EN) · 3d

HarDBench: A Benchmark for Draft-Based Co-Authoring Jailbreak Attacks for Safe Human-LLM Collaborative Writing

Researchers have introduced HarDBench, a new benchmark designed to evaluate the safety of large language models (LLMs) when used in collaborative writing scenarios. The benchmark focuses on "draft-based co-authoring jailbreak attacks," where malicious users could prompt LLMs to generate harmful content within incomplete drafts. HarDBench covers high-risk domains like explosives, drugs, and weapons, and includes realistic prompts to test model susceptibility. The researchers also developed a safety-utility balanced alignment approach to mitigate these risks without compromising the LLM's helpfulness on benign tasks. AI

IMPACT Introduces a new method for evaluating LLM safety in collaborative writing, potentially leading to more robust AI assistants.
TOOL · arXiv cs.CL English(EN) · 3d

Who Gets Flagged? The Pluralistic Evaluation Gap in AI Content Watermarking

A new arXiv paper highlights significant biases in current AI content watermarking techniques. The research indicates that the effectiveness and detectability of watermarks vary considerably based on the statistical properties of the content itself, leading to disparities across languages, cultural visual traditions, and demographic groups. The authors propose a framework for more inclusive benchmarking, emphasizing cross-lingual detection parity, culturally diverse content coverage, and demographic disaggregation of metrics, arguing that these fairness evaluations should precede widespread deployment. AI

IMPACT Highlights potential biases in AI content authentication, urging fairer evaluation methods before widespread adoption.
- arXiv
- Alexander Nemecek
TOOL · arXiv cs.LG English(EN) · 3d

Spatiotemporal Graph Transformer for 3D Neighborhood Interaction and Quality Prediction in Metal Additive Manufacturing

Researchers have developed a novel spatiotemporal graph transformer designed to model complex interactions in metal additive manufacturing. This framework represents the manufacturing process as a network, allowing for the integration of multimodal data and capturing both within-node feature dependencies and cross-node neighborhood interactions. Experiments demonstrate that this approach significantly outperforms existing models in characterizing process-quality relationships, with cross-layer interactions proving critical for accurate quality prediction. AI
- Spatiotemporal Graph Transformer
TOOL · arXiv stat.ML English(EN) · 3d

Exact Functional ANOVA Decomposition for Categorical Inputs Models

Researchers have developed a new method for interpreting machine learning models with categorical inputs. This approach, based on functional ANOVA decomposition, provides a closed-form solution that is computationally efficient and works even with dependent features. The new framework also offers a natural generalization of SHAP values for categorical data, addressing a long-standing limitation in model explainability. AI

IMPACT Provides a more efficient and accurate way to understand model behavior with categorical data, potentially improving trust and debugging.
- Baptiste Ferrere
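
For intuition, the classical functional ANOVA decomposition of a model with two categorical inputs can be computed in closed form when the inputs are assumed independent and uniformly distributed. That is a simpler setting than the dependent-feature case the paper reportedly handles, so this sketch is not the authors' method. The toy `model` table and the function name are hypothetical.

```python
from itertools import product


def fanova_two_factors(f, levels_a, levels_b):
    """Split f(a, b) into a mean, two main effects, and an interaction term.

    Assumes independent, uniformly distributed categorical inputs.
    """
    cells = list(product(levels_a, levels_b))
    f0 = sum(f[c] for c in cells) / len(cells)
    main_a = {
        a: sum(f[(a, b)] for b in levels_b) / len(levels_b) - f0 for a in levels_a
    }
    main_b = {
        b: sum(f[(a, b)] for a in levels_a) / len(levels_a) - f0 for b in levels_b
    }
    interaction = {
        (a, b): f[(a, b)] - f0 - main_a[a] - main_b[b] for a, b in cells
    }
    return f0, main_a, main_b, interaction


# Hypothetical model output for every combination of two categorical features.
model = {
    ("red", "s"): 1.0,
    ("red", "l"): 3.0,
    ("blue", "s"): 2.0,
    ("blue", "l"): 6.0,
}
f0, main_a, main_b, interaction = fanova_two_factors(model, ["red", "blue"], ["s", "l"])
```

Each cell of the table is recovered exactly as `f0 + main_a[a] + main_b[b] + interaction[(a, b)]`, and every effect averages to zero over its levels.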
TOOL · arXiv cs.CV English(EN) · 3d

ClinReadNet: A clinical reading-inspired network for low-dose abdominal CT image quality assessment

Researchers have developed ClinReadNet, a novel deep learning framework designed to assess the quality of low-dose abdominal CT images. The network mimics radiologists' reading processes by incorporating modules that focus on both local details and overall image context, and by using attention mechanisms to identify regions of interest. Experiments on the LDCTIQAG2023 dataset show ClinReadNet achieves state-of-the-art performance in image quality assessment. AI

IMPACT This model could improve diagnostic accuracy by ensuring higher quality CT scans, potentially reducing the need for repeat scans and radiation exposure.
- LDCTIQAG2023
- ClinReadNet
TOOL · arXiv cs.CV English(EN) · 3d

PF-Trans: Physics-Embedded Frequency-Aware Transformer for Spectral Reconstruction

Researchers have developed PF-Trans, a novel Transformer model designed for spectral reconstruction in remote sensing. This model integrates physics-based principles and frequency-domain analysis to effectively address spectral aliasing, a common issue in snapshot broadband filter array imaging. PF-Trans achieves state-of-the-art performance, demonstrated by a Peak Signal-to-Noise Ratio of up to 48.50 dB on a specific dataset. AI

IMPACT Introduces a new method for spectral reconstruction, potentially improving remote sensing data quality and analysis.
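
For readers unfamiliar with the metric quoted above, here is a minimal sketch of how Peak Signal-to-Noise Ratio is commonly computed. The function name `psnr` and the `max_value=1.0` intensity range are illustrative assumptions; the brief does not say how the paper normalizes its data or averages over spectral bands.

```python
import numpy as np

def psnr(reference, estimate, max_value=1.0):
    """Peak Signal-to-Noise Ratio in dB between two arrays of equal shape.

    max_value is the assumed peak intensity (1.0 for data scaled to [0, 1]).
    """
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0.0:
        # Identical inputs: the ratio is unbounded.
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / mse))

# Toy hyperspectral cube (height, width, bands) with a constant error of 0.1.
reference_cube = np.zeros((4, 4, 3))
estimated_cube = np.full((4, 4, 3), 0.1)
example_psnr = psnr(reference_cube, estimated_cube)
```

Because the scale is logarithmic, each additional 10 dB corresponds to a tenfold reduction in mean squared error.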
TOOL · arXiv cs.LG English(EN) · 3d

DUET -- Dual User Embedding Transformers for Offsite Conversion Prediction

Researchers have developed DUET, a novel framework using dual transformer encoders to improve offsite conversion rate prediction in recommendation systems. This approach tailors specific transformer architectures to distinct user behavioral data streams: one for dense click signals and another for sparse, delayed conversion signals. The framework's complementary embeddings are then combined for downstream ranking, demonstrating up to a 0.38% reduction in normalized entropy and leading to consistent improvements in conversion prediction accuracy during A/B testing. AI

IMPACT Introduces a specialized dual-transformer architecture to improve prediction accuracy in recommendation systems.
- arXiv
- Reazul Hasan Russel
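
The brief does not define normalized entropy. A minimal sketch, assuming the definition commonly used in ads and conversion prediction (average log loss divided by the entropy of the empirical base rate), might look like the following. The function name and the `eps` clipping constant are illustrative.

```python
import numpy as np

def normalized_entropy(labels, probs, eps=1e-12):
    """Average log loss divided by the entropy of the empirical base rate.

    A value of 1.0 matches a model that always predicts the base rate;
    lower is better. This is an assumed definition, not taken from the paper.
    """
    y = np.asarray(labels, dtype=np.float64)
    p = np.clip(np.asarray(probs, dtype=np.float64), eps, 1.0 - eps)
    log_loss = -np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
    base = np.clip(y.mean(), eps, 1.0 - eps)
    base_entropy = -(base * np.log(base) + (1.0 - base) * np.log(1.0 - base))
    return float(log_loss / base_entropy)

labels = [1, 0, 0, 0]
ne_base_rate = normalized_entropy(labels, [0.25, 0.25, 0.25, 0.25])
ne_informed = normalized_entropy(labels, [0.9, 0.1, 0.1, 0.1])
```

Under this definition, a relative reduction such as the 0.38% reported above is measured against a baseline model's normalized entropy.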
TOOL · arXiv cs.LG English(EN) · 3d

Privacy-Preserving Credit Risk Prediction with Alternative Data

Researchers have developed a new machine learning method called PrivacyCredit to address the challenge of using alternative data for credit risk prediction without compromising consumer privacy. This method allows financial institutions to build accurate credit risk models by securely incorporating data such as mobile phone communications, while ensuring that sensitive borrower information remains protected. Experiments on a real-world dataset demonstrated that PrivacyCredit achieves the same predictive performance as traditional methods that would require insecure data sharing. AI

IMPACT Enables more accurate credit risk assessment by securely integrating sensitive alternative data sources.
- arXiv
- PrivacyCredit
TOOL · arXiv cs.AI English(EN) · 3d

Deep Slice Interpolation for Reducing Through-Plane Anisotropy and Noise in Head CT

Researchers have developed a deep learning system that improves the through-plane resolution of head CT scans by interpolating intermediate slices. The method effectively halves the through-plane spacing, enhancing 3D visualization and reducing noise. Of the loss functions compared, MS-SSIM+L1 gave the most promising results, outperforming traditional interpolation techniques. AI

IMPACT Improves medical imaging quality and diagnostic accuracy through advanced AI techniques.
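
To make the comparison concrete, here is a minimal sketch of the kind of traditional baseline the learned model is measured against: linear interpolation that inserts one averaged slice between each adjacent pair, halving the through-plane spacing. It is not the paper's method. The function name and the assumption that slices are stacked along axis 0 are illustrative.

```python
import numpy as np

def linear_slice_interpolation(volume):
    """Insert the mean of each adjacent slice pair along axis 0.

    An input with n slices yields 2n - 1 slices, halving the slice spacing.
    """
    volume = np.asarray(volume, dtype=np.float64)
    n = volume.shape[0]
    out = np.empty((2 * n - 1,) + volume.shape[1:], dtype=np.float64)
    out[0::2] = volume                              # original slices
    out[1::2] = 0.5 * (volume[:-1] + volume[1:])    # interpolated slices
    return out

# Toy volume: three 2x2 slices with constant intensities 0, 1, 2.
toy_volume = np.arange(3, dtype=np.float64)[:, None, None] * np.ones((3, 2, 2))
upsampled = linear_slice_interpolation(toy_volume)
```

A learned interpolator replaces the fixed averaging step with a network prediction of each intermediate slice.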
TOOL · arXiv cs.CV English(EN) · 3d

Is Task-Specific Training Necessary for Anomaly Detection?

Researchers have introduced Retrieval-based Anomaly Detection (RAD), a novel framework that eliminates the need for task-specific training in anomaly detection. Unlike current methods that rely on costly encoder-decoder models for reconstruction, RAD utilizes a training-free memory-based retrieval system. This approach stores anomaly-free features and detects anomalies by matching test patches against this memory, demonstrating state-of-the-art performance on multiple benchmarks, even in few-shot settings. AI

IMPACT Challenges the necessity of task-specific training in anomaly detection, potentially simplifying deployment and improving efficiency.
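
The retrieval idea described above can be sketched in a few lines: keep a memory of anomaly-free patch features, then score each test patch by its distance to the nearest stored feature. This is a simplified illustration under assumed choices (Euclidean distance, brute-force search, hypothetical function names), not the RAD implementation. In practice the features would come from a pretrained encoder.

```python
import numpy as np

def build_memory(normal_features):
    """Stack anomaly-free patch features into an (N, D) memory bank."""
    return np.asarray(normal_features, dtype=np.float64)

def anomaly_scores(memory, test_patches):
    """Score each test patch by its distance to the nearest memory entry."""
    test = np.asarray(test_patches, dtype=np.float64)
    # Pairwise differences via broadcasting: (num_test, num_memory, D).
    diffs = test[:, None, :] - memory[None, :, :]
    squared = np.sum(diffs ** 2, axis=-1)
    # Nearest-neighbor distance per test patch; larger means more anomalous.
    return np.sqrt(squared.min(axis=1))

memory = build_memory([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
scores = anomaly_scores(memory, [[0.1, 0.0], [5.0, 5.0]])
```

No gradient step is involved: adding a new product category only requires appending its anomaly-free features to the memory.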
TOOL · arXiv cs.CV English(EN) · 3d

Seal-Robust KCR: A Robust Kuzushiji Character Recognition Framework under Seal Interference

Researchers have developed a new framework to improve the recognition of Kuzushiji, an ancient Japanese cursive script, even when historical documents are obscured by seals. The proposed system integrates document restoration techniques to counteract seal interference, enhancing character detection, classification, and ordering. This approach significantly reduces character error rates compared to existing methods, particularly on synthetic test data designed to simulate severe seal interference. AI

IMPACT Enhances OCR capabilities for historical documents, potentially aiding in the digitization and accessibility of ancient Japanese texts.
TOOL · arXiv cs.LG English(EN) · 3d

MinhwaNet: Faithful but Insufficient Object Grounding in Korean Folk Painting

Researchers have developed MinhwaNet, a system designed to analyze Korean folk paintings. The study found that although the model can accurately identify and localize auspicious symbols within the paintings, this object grounding alone is insufficient for predicting a painting's genre. Instead, the model performs better when it fuses visual information with textual descriptions, indicating that the arrangement of symbols, rather than their mere presence, is key to genre classification. AI

IMPACT This research highlights the limitations of object recognition in complex cultural contexts and the importance of multimodal approaches for accurate classification.
- MinhwaNet
- Korean folk painting
TOOL · arXiv cs.LG English(EN) · 3d

Profy: Interpretable Visualization of Expertise-Dependent Motor Skills Toward Supporting Piano Practice

Researchers have developed Profy, a weakly supervised system designed to aid piano practice by providing interpretable visualizations of motor skills. The system learns from aggregated listener ratings to identify time-aligned passages that differentiate expert and amateur performances. Profy outputs clip-level predictions with evidence scores, enabling learners to focus on specific sections for review and replay, rather than relying on a single global performance score. AI

IMPACT Provides a novel approach to skill visualization in practice, potentially improving learning outcomes for musicians.
- arXiv
TOOL · arXiv cs.LG English(EN) · 3d

Finer is Better (with the Right Scaling)

A new arXiv paper investigates the paradox where smaller block sizes in LLM quantization can degrade model quality. Researchers found this is not an inherent limitation but stems from how statistical clustering interacts with scaling factors. The study proposes solutions like preventing scaling factor underflow and using targeted heuristics such as the 4-over-6 methodology to improve quality, emphasizing the need for tight coupling between hardware and software design for next-generation ML accelerators. AI

IMPACT Optimizes LLM performance on next-gen hardware by addressing quantization paradoxes, potentially improving efficiency and accessibility.
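
As background for the block-size discussion above, here is a minimal sketch of block-wise quantization with one scaling factor per block. It assumes a simple symmetric integer grid with absmax scaling, which is not necessarily the number format studied in the paper, and it does not implement the paper's underflow fix or the 4-over-6 heuristic. All names and the `qmax=7` grid are illustrative.

```python
import numpy as np

def blockwise_quantize(x, block_size, qmax=7):
    """Quantize and dequantize x with one absmax scale per block.

    Values are mapped to integers in [-qmax, qmax] (a 4-bit-like grid).
    """
    x = np.asarray(x, dtype=np.float64)
    assert x.size % block_size == 0, "length must be a multiple of block_size"
    blocks = x.reshape(-1, block_size)
    scales = np.abs(blocks).max(axis=1, keepdims=True) / qmax
    # Guard all-zero blocks so the division below is well defined.
    safe_scales = np.where(scales == 0.0, 1.0, scales)
    q = np.clip(np.round(blocks / safe_scales), -qmax, qmax)
    return (q * safe_scales).reshape(x.shape)

weights = np.array([0.1, 0.3, 2.0, 7.0])
coarse = blockwise_quantize(weights, block_size=4)
fine = blockwise_quantize(weights, block_size=2)
coarse_error = float(np.mean(np.abs(weights - coarse)))
fine_error = float(np.mean(np.abs(weights - fine)))
```

In this toy case the finer blocks give a smaller error because the small values get their own scale. The paper's point is that this expected benefit can reverse in practice, for example when the scaling factors themselves lose precision.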
TOOL · arXiv cs.LG English(EN) · 3d

Pushing the limits of one-dimensional NMR spectroscopy for automated structure elucidation using artificial intelligence

Researchers have developed a deep learning framework capable of elucidating molecular structures from one-dimensional NMR spectra. This AI model, inspired by natural language processing techniques, can accurately predict molecular structures with up to 40 non-hydrogen atoms, covering a significant portion of drug-like chemical space. The transformer-based architecture achieves 60.4% accuracy within the top 15 predictions, demonstrating a novel approach to overcoming the combinatorial complexity of structure generation from spectral data. AI

IMPACT This AI approach could significantly accelerate drug discovery and chemical research by automating complex structure elucidation processes.