CIFAR-10 · PulseAugur

RESEARCH · CL_20296 · May 6 · 13:32

LLMs accelerate neural architecture search with novel delta-based code generation

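Researchers are exploring novel ways to use large language models (LLMs) for neural architecture search (NAS). One method, SPARK, aims to improve LLM knowledge integration by explicitly selecting which functional factors to modify, reducing unintended side effects and improving efficiency. Another technique, Delta-Code Generation, fine-tunes LLMs to generate compact code diffs that refine existing architectures rather than generating them from scratch, significantly reducing code redundancy and computational cost. A survey also categorizes NAS methods by efficiency, robustness, and continual learning, …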

RESEARCH · CL_18836 · May 6 · 04:00

Researchers accelerate discrete autoregressive models with Wasserstein flow and Jacobi decoding

Researchers have developed a new method to accelerate the inference of discrete autoregressive normalizing flows, a type of generative model. The proposed technique, Selective Jacobi Decoding, allows for parallel iterat…

RESEARCH · CL_18735 · May 6 · 04:00

AI research tackles layer free-riding and enhances data privacy for models

Researchers have identified a phenomenon in Forward-Forward networks called layer free-riding, where later layers can inherit tasks already partially handled by earlier layers, leading to gradient decay. Three loca…

RESEARCH · CL_18341 · May 5 · 13:33

GEM-FI: Gated Evidential Mixtures with Fisher Modulation

Researchers have introduced GEM-FI, a novel family of models designed to improve uncertainty estimation in deep learning. This approach addresses limitations of existing Evidential Deep Learning methods, which can be ov…

TOOL · CL_26961 · May 5 · 13:08

New AI framework learns classification losses without real data

Researchers have developed a new framework called Evolutionary Dynamic Loss (EDL) for pretraining classification losses without using real data. EDL learns a transferable loss function by generating synthetic prediction…

RESEARCH · CL_18343 · May 5 · 13:08

Researchers develop Evolutionary Dynamic Loss for distribution-free pretraining

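Researchers have developed a new framework called Evolutionary Dynamic Loss (EDL) for pretraining classification losses. EDL learns a transferable loss function from synthetic data, avoiding the need for real samples during the main pretraining stage. The framework optimizes the loss as a lightweight network via an evolution strategy and incorporates chaotic mutation to enhance exploration and improve convergence. Experiments on CIFAR-10 show that EDL can effectively replace cross-entropy and achieve comparable or better accuracy.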

RESEARCH · CL_21948 · May 5 · 04:00

New AI unlearning methods balance data removal with model utility

Researchers have developed new methods for machine unlearning, a process that removes specific data from AI models without full retraining. One approach, SHRED, uses self-distillation and logit demotion to identify and …

RESEARCH · CL_16055 · May 5 · 04:00

New research explores ensemble models for improved AI performance and robustness

Two new research papers introduce novel methods for improving ensemble models in machine learning. The first, PACE, combines pruning and compression techniques to create more efficient and interpretable ensembles, outpe…

TOOL · CL_15706 · May 5 · 04:00

Checkerboard attack offers efficient, learning-free backdoor for deep learning models

Researchers have developed a new method called Checkerboard for launching clean-label backdoor attacks on deep learning models. This learning-free technique uses a closed-form checkerboard trigger derived from linear se…

TOOL · CL_15651 · May 5 · 04:00

Researchers develop DUNE, a dual-branch method to create robust unlearnable examples for AI models

Researchers have developed DUNE, a novel dual-branch approach to create robust unlearnable examples for AI model training. This method optimizes perturbations in both spatial and color domains to degrade model generaliz…

TOOL · CL_15639 · May 5 · 04:00

New HyCAS defense bridges gap between certified and empirical adversarial robustness

Researchers have developed a new adversarial defense technique called Hybrid Convolutions with Attention Stochasticity (HyCAS). This method aims to bridge the gap between theoretical robustness guarantees and practical …

RESEARCH · CL_14418 · May 4 · 04:00

Kernel Hopfield networks show high storage capacity, stability limits analyzed

Researchers have analyzed the geometric properties and storage capacity limits of kernel Hopfield networks trained with Kernel Logistic Regression (KLR). Their experiments, using random sequences and CIFAR-10 image embe…

RESEARCH · CL_14406 · May 4 · 04:00

ROSA optical neural network architecture boosts efficiency and robustness

Researchers have introduced ROSA, a novel microring-based optical neural network architecture designed for enhanced robustness and energy efficiency. This design incorporates an optical shift-and-add module and a layer-…

RESEARCH · CL_14337 · May 4 · 04:00

Vision Transformers use DCT to improve attention and efficiency

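Researchers have developed a novel approach that uses the discrete cosine transform (DCT) to enhance Vision Transformers. It includes a DCT-based self-attention initialization strategy that improves classification accuracy on benchmarks such as CIFAR-10 and ImageNet-1K. In addition, a DCT-based attention compression technique reduces computational overhead by truncating the high-frequency components of input patches while maintaining performance in models such as Swin Transformer.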

RESEARCH · CL_11892 · May 1 · 04:00

New method corrects subsampling bias in drifting generative models

Researchers have developed Analytical Bias Correction (ABC), a method to address subsampling bias in drifting models, which are used for one-step generative tasks. The bias arises from using minibatches to estimate cent…

RESEARCH · CL_11881 · May 1 · 04:00

New research reveals implicit bias drives neural scaling laws in deep learning

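Researchers have discovered two new dynamical scaling laws that describe how neural network performance changes with a complexity measure over the course of training. The laws are observed across architectures such as CNNs and Vision Transformers and across multiple datasets, and at convergence they recover established test-error scaling laws. Analytical work on a single-layer perceptron supports these findings and explains the phenomenon through the implicit bias introduced by gradient-based training.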

RESEARCH · CL_11404 · Apr 30 · 14:01

Decoupled Descent: Exact Test Error Tracking Via Approximate Message Passing

Researchers have developed a new training algorithm called Decoupled Descent (DD) that aims to eliminate the generalization gap in parametric models. DD uses approximate message passing theory to cancel biases caused by…

RESEARCH · CL_11405 · Apr 30 · 11:32

Linear-Core Surrogates offer smooth loss functions with linear rates for classification

Researchers have introduced Linear-Core (LC) Surrogates, a novel family of convex loss functions designed to combine the benefits of smooth and piecewise-linear losses in machine learning. These surrogates are different…

RESEARCH · CL_10213 · Apr 30 · 04:00

New federated learning method improves robustness against adversarial attacks

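Researchers have developed a new robust federated learning method that can withstand adversarial attacks. The method, called loss-based client clustering, needs only two honest participants, the server and one client, to operate effectively. Theoretical analysis shows a bounded optimality gap even under strong Byzantine attacks, and experiments show that it significantly outperforms both standard and robust federated learning baselines across multiple benchmarks.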

RESEARCH · CL_11689 · Apr 30 · 01:13

New DALS framework optimizes learning rates for neural network training

Researchers have introduced a new framework called Discriminative Adaptive Layer Scaling (DALS) to optimize learning rates in neural networks. DALS categorizes the evolution of learning rate strategies into five generat…