PulseAugur

multilayer perceptron

PulseAugur coverage of multilayer perceptron — every cluster mentioning multilayer perceptron across labs, papers, and developer communities, ranked by signal.

Total · 30 days: 50 (90 days: 50)
Releases · 30 days: 0 (90 days: 0)
Papers · 30 days: 50 (90 days: 50)
Tier distribution · 90 days
Sentiment · 30 days: 6 days with sentiment data

Recent · Page 1 of 3 · 50 items
  1. TOOL · CL_45000 ·

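    Neural network weight drift identified as a training-dynamics problem

    Researchers have identified a phenomenon in neural networks called "weight drift," in which the optimization process inadvertently pushes weights toward negative values. The drift is independent of the training data and appears with standard loss functions and common activation functions such as ReLU and GELU. The study shows that the drift causes significant activation sparsity, may affect model accuracy, and amplifies activation spikes in Transformer layers.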
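
    The link between negative weight drift and activation sparsity can be illustrated with a toy ReLU layer. The sketch below is not the paper's experiment: the weights, the inputs, and the uniform drift `delta` are hypothetical values chosen for illustration, and it assumes non-negative inputs (for example, outputs of an earlier ReLU).

```python
def relu(z):
    return max(0.0, z)

def pre_activations(weights, inputs):
    # One pre-activation per neuron (row of weights).
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

def activation_sparsity(weights, inputs):
    # Fraction of neurons whose ReLU output is exactly zero.
    outs = [relu(z) for z in pre_activations(weights, inputs)]
    return sum(1 for o in outs if o == 0.0) / len(outs)

def apply_drift(weights, delta):
    # Hypothetical model of drift: every weight moves by -delta.
    return [[w - delta for w in row] for row in weights]

# Illustrative values only (assumption, not taken from the source).
inputs = [1.0, 0.5, 2.0]
weights = [
    [0.2, -0.1, 0.3],
    [-0.4, 0.6, 0.1],
    [0.1, 0.2, -0.3],
    [0.5, 0.4, 0.2],
]

before = activation_sparsity(weights, inputs)
after = activation_sparsity(apply_drift(weights, 0.1), inputs)
print(before, after)  # 0.25 0.5
```

    With non-negative inputs, lowering every weight lowers every pre-activation, so more neurons fall below zero and are silenced by ReLU.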

  2. TOOL · CL_44953 ·

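    New quadratic ReLU replacement speeds up FHE neural network inference

    Researchers have developed a new method for replacing the ReLU activation function in neural networks with a quadratic polynomial, aimed specifically at fully homomorphic encryption (FHE). By using low-degree polynomials, the method seeks to reduce the computational cost of FHE-only inference while preserving classification accuracy on a calibration dataset. It frames the replacement as a linear separation problem and uses a convex hull relaxation to extend it to cases with misclassified samples, achieving faster inference than existing methods.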
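
    To make the idea of a quadratic stand-in for ReLU concrete, here is a minimal sketch that fits a + b*x + c*x^2 to ReLU by ordinary least squares on a fixed interval. This is a swapped-in baseline, not the linear-separation or convex-hull method described above; the interval [-1, 1], the grid size, and all function names are assumptions made for illustration.

```python
def relu(x):
    return max(0.0, x)

def solve(A, y):
    # Gaussian elimination with partial pivoting for a small dense system.
    n = len(y)
    M = [row[:] + [rhs] for row, rhs in zip(A, y)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_quadratic_relu(lo, hi, n):
    # Least-squares fit of a + b*x + c*x^2 to ReLU on an n-point uniform grid.
    xs = [lo + (hi - lo) * k / (n - 1) for k in range(n)]
    A = [[sum(x ** (i + j) for x in xs) for j in range(3)] for i in range(3)]
    y = [sum(relu(x) * x ** i for x in xs) for i in range(3)]
    return solve(A, y)

def quadratic(coeffs, x):
    a, b, c = coeffs
    return a + b * x + c * x * x

coeffs = fit_quadratic_relu(-1.0, 1.0, 2001)
print(coeffs)
```

    On a symmetric interval the linear coefficient comes out at 0.5, and the other two land close to the continuous least-squares solution 3/32 + x/2 + (15/32)x^2. A degree-2 polynomial needs only additions and multiplications, which is why such replacements suit FHE; accuracy outside the fitted interval degrades quickly.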

  3. RESEARCH · CL_44706 ·

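    Weight decay governs Transformer training regimes, revealing new diagnostics

    Researchers found that weight decay is the key parameter controlling the Transformer training regime on modular arithmetic tasks. They introduce two new low-cost online diagnostics, mean pairwise attention-head cosine similarity and entropy standard deviation, to monitor the training dynamics of attention activations. Applied across a range of experimental conditions and model sizes, the diagnostics effectively distinguish memorization, generalization (grokking), and collapse, and identify specific transition points at the memorization-to-development boundary.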
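
    The two diagnostics named above can be sketched as follows. The summary does not give the exact formulas, so this is one plausible reading under stated assumptions: each head is represented by its attention matrix (rows are probability distributions over keys), similarity is taken between flattened matrices, and the standard deviation is taken over the Shannon entropies of all attention rows. All function names are hypothetical.

```python
import math
from itertools import combinations

def flatten(matrix):
    return [x for row in matrix for x in row]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def mean_pairwise_head_cosine(heads):
    # Average cosine similarity over all unordered pairs of heads.
    flats = [flatten(h) for h in heads]
    pairs = list(combinations(flats, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)

def row_entropy(p):
    # Shannon entropy (nats) of one attention row; 0 * log 0 is treated as 0.
    return -sum(x * math.log(x) for x in p if x > 0)

def entropy_std(heads):
    # Population standard deviation of row entropies across all heads.
    ents = [row_entropy(row) for h in heads for row in h]
    mean = sum(ents) / len(ents)
    return math.sqrt(sum((e - mean) ** 2 for e in ents) / len(ents))

# Toy example (assumed values): one sharply peaked head, one uniform head.
head_a = [[1.0, 0.0], [0.0, 1.0]]
head_b = [[0.5, 0.5], [0.5, 0.5]]
heads = [head_a, head_b]

similarity = mean_pairwise_head_cosine(heads)
spread = entropy_std(heads)
print(similarity, spread)
```

    Both quantities need only the attention probabilities already computed in the forward pass, which is consistent with the "low-cost, online" description. High similarity suggests heads are doing redundant work, while the entropy spread indicates how differently focused the attention rows are.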

  4. TOOL · CL_38420 ·

    Bayesian wind tunnels reveal transformer geometric design for inference

    Researchers have developed "Bayesian wind tunnels" to rigorously study how transformers perform Bayesian reasoning. These controlled environments allow for the verification of Bayesian posteriors with high accuracy in s…

  5. TOOL · CL_49348 ·

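    Digital twin simulates changes in visitor flow after mobility measures are introduced

    Researchers have developed a framework that uses a digital twin of pedestrian flow to predict the impact of introducing mobility measures. The digital twin relies on a multi-agent simulator in which individual agents learn decision models from factors such as location, attraction appeal, and trip volume. By varying parameters such as the distance between points or the appeal of attractions, the system can simulate changes in visitor circulation and visitor numbers. An evaluation using data from Wakayama Castle Park in Japan shows that the framework, with a multilayer perceptron decision model, reproduces flow changes with a cosine similarity above 0.7.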

  6. RESEARCH · CL_36595 ·

    New research advances federated learning with proactive client selection and privacy analysis

    Researchers are exploring new methods to improve federated learning, a technique for training models across decentralized data sources while preserving privacy. One approach, "Choose Wisely and Privately," uses mutual i…

  7. TOOL · CL_36622 ·

    New theory explains Transformer generalization delay via Bayesian inference

    Researchers have proposed a new theory explaining why Transformer models delay generalization after memorizing training data. The theory frames attention mechanisms as implicit Bayesian posteriors over task dependency g…

  8. TOOL · CL_49378 ·

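    Newborn AI units face weak gradient signals during structural growth

    Researchers have identified a key challenge for structural plasticity in deep learning models, specifically when new units are added during training. These "newborn" units typically receive much weaker gradient signals than existing units, which hinders their integration and effectiveness, especially on complex image classification tasks. Interventions can improve the adaptive performance of these growing networks, but they do not automatically guarantee a better final subnetwork. The study suggests that successful structural growth in deep learning depends largely on how stably new units are integrated into the ongoing training process.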

  9. TOOL · CL_30823 ·

    New STAIR training method boosts simple models for time series forecasting

    Researchers have introduced STAIR, a novel training paradigm designed to enhance the performance of simple models in long-term time series forecasting. This method decomposes the forecasting process into three stages: l…

  10. TOOL · CL_29409 ·

    New theory suggests transformers use geometric memorization

    Researchers have proposed a new theory of how transformer language models memorize factual information, suggesting a 'geometric' form of memorization rather than traditional associative memory. This model posits that le…

  11. TOOL · CL_28284 ·

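    Soccer machine-learning explanations fail to transfer from elite leagues to college leagues

    A new study published on arXiv examines how well machine-learning explanations transfer in soccer performance analysis. The researchers found that performance determinants learned from elite European leagues did not transfer reliably to college-level soccer. When the models were applied to college data, key performance indicators were substantially reordered and explanation stability declined, suggesting that interpretability is domain-dependent and may signal structural ambiguity in the target domain.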

  12. TOOL · CL_28341 ·

    New DLR-Lock method secures open-weight language models

    Researchers have developed a new method called DLR-Lock to prevent unauthorized modifications of open-weight language models. This technique replaces standard MLPs with deep low-rank residual networks, which increase me…

  13. TOOL · CL_22424 ·

    Masked Language Prompting enhances few-shot fashion style recognition

    Researchers have developed a new method called Masked Language Prompting (MLP) to improve generative data augmentation for few-shot fashion style recognition. This technique masks words in reference captions and uses la…

  14. TOOL · CL_21901 ·

    Learned token routing in transformers adapts computation depth for efficiency

    Researchers have developed a new technique called Token-Selective Attention (TSA) for transformer models that allows them to dynamically adjust the computation depth for each token. This method uses a lightweight, learn…

  15. RESEARCH · CL_25812 ·

    Neural networks possess finite sample complexity, paper shows

    A new paper demonstrates that a wide range of feedforward neural network architectures possess finite sample complexity. This means they can learn effectively in the PAC model, even with unbounded parameters. The findin…

  16. TOOL · CL_20767 ·

    LEGO framework uses LoRA to detect synthetic images with greater accuracy

    Researchers have developed LEGO, a novel framework designed to detect synthetic images by focusing on generator-specific artifacts. This approach utilizes Low-Rank Adaptation (LoRA) modules, each trained to identify uni…

  17. TOOL · CL_20744 ·

    New ALDA4Rec method improves recommendation systems with graph-based learning

    Researchers have developed a new method called ALDA4Rec to improve recommendation systems by addressing noise and static representations in graph-based models. The approach constructs an item-item graph, filters noise u…

  18. TOOL · CL_20548 ·

    Norm Anchors Stabilize LLM Edits, Extending Usable Horizon by 4x

    Researchers have developed a new technique called Norm-Anchor Scaling (NAS) to improve the longevity of model edits in large language models. Existing methods for sequential model editing can degrade performance over ti…

  19. TOOL · CL_20537 ·

    eNTK eigenanalysis surfaces features in trained neural networks

    Researchers have demonstrated that analyzing the empirical Neural Tangent Kernel (eNTK) can reveal feature directions within trained neural networks. This method was tested on a 1-layer MLP and a 1-layer Transformer, sh…

  20. TOOL · CL_20389 ·

    LoRA-MoE deep learning framework aids Alzheimer's diagnosis via handwriting

    Researchers have developed a new deep learning framework called Low-Rank Mixture of Experts (LoRA-MoE) for diagnosing Alzheimer's disease using handwriting analysis. This approach utilizes specialized experts within the…