Vision Transformers

PulseAugur coverage of Vision Transformers: every cluster that mentions Vision Transformers across labs, papers, and developer communities, ranked by signal.

Total · 30 days

37 in 90 days

Releases · 30 days

0 in 90 days

Papers · 30 days

36 in 90 days

Tier breakdown · 90 days

research 13
tool 23
commentary 1

Relationships

used by Imagenet 1k 70%

Timeline

2026-05-22 research_milestone A new paper proposes a method to improve Vision Transformer performance on dense prediction tasks by addressing semantic diffusion. Source
2026-05-22 research_milestone A new paper proposes a method to improve Vision Transformer performance on dense prediction tasks. Source
2026-05-22 research_milestone A new paper introduces stabilized Vision Transformers and a training recipe that achieves state-of-the-art results on the Apple Dense Material Segmentation benchmark. Source

Sentiment · 30 days

4 days with sentiment data

Recent · Page 2 of 2 · 37 items total

RESEARCH · CL_21807 · May 7 · 13:45

Spark3R accelerates 3D reconstruction with asymmetric token reduction

Researchers have developed Spark3R, a novel framework designed to accelerate feed-forward 3D reconstruction models that utilize Vision Transformers. The method addresses the computational challenge posed by processing e…
RESEARCH · CL_21820 · May 7 · 12:14

Vision models' metonymy undermines attention-based interpretability, study finds

A new research paper published on arXiv introduces the concept of "visual metonymy" in vision models, where parts of an object encode information about the whole object. This phenomenon undermines the interpretability o…
TOOL · CL_20500 · May 7 · 04:00

New Sparse Backdoor attack hides undetectable compromises in image classifiers

Researchers have developed a novel supply-chain attack called Sparse Backdoor, capable of embedding a provably undetectable backdoor into pre-trained image classifiers like convolutional networks and Vision Transformers…
TOOL · CL_26994 · May 5 · 17:21

RD-ViT cuts data needs for vision segmentation tasks

Researchers have developed RD-ViT, a new Vision Transformer architecture designed for semantic segmentation that significantly reduces data dependency. By employing a recurrent-depth approach with a single shared block …
TOOL · CL_15656 · May 5 · 04:00

Researchers optimize Vision Transformers for semiconductor inspection

Researchers have developed a novel framework to optimize Vision Transformers (ViTs) for deployment in resource-constrained industrial settings. This approach simultaneously optimizes architecture, token compression, and…
TOOL · CL_15617 · May 5 · 04:00

Colinearity Decay trains Vision Transformers for better low-bit quantization

Researchers have developed a new training technique called Colinearity Decay (CD) to make Vision Transformers (ViTs) more amenable to low-bit quantization. This method acts as a structural regularizer, penalizing alignm…
RESEARCH · CL_14337 · May 4 · 04:00

Vision Transformers leverage DCT for improved attention and efficiency

Researchers have developed a novel approach using the Discrete Cosine Transform (DCT) to enhance Vision Transformers. This method includes a DCT-based initialization strategy for self-attention, which improves classific…
RESEARCH · CL_11881 · May 1 · 04:00

New research reveals implicit bias drives neural scaling laws in deep learning

Researchers have identified two new dynamical scaling laws that describe how neural network performance changes with complexity measures throughout training. These laws, observed across various architectures like CNNs a…
RESEARCH · CL_11809 · May 1 · 04:00

HighFM foundation model learns from high-frequency Earth Observation data

Researchers have developed HighFM, a novel foundation model designed to learn from high-frequency Earth Observation data. This model utilizes over 2 terabytes of SEVIRI imagery from the Meteosat Second Generation platfo…
RESEARCH · CL_14095 · Apr 30 · 23:41

Vision Transformers optimize spatio-temporal vegetation classification efficiency

Researchers have developed an optimized Vision Transformer (ViT) approach for classifying vegetation pixels over time, addressing computational challenges in plant phenology monitoring. This new method offers significan…
RESEARCH · CL_10156 · Apr 30 · 04:00

Researchers revisit human-in-the-loop object retrieval using Vision Transformers

Researchers have revisited the task of Human-in-the-Loop Object Retrieval, a method for iteratively finding images with specific objects using user feedback. The process involves a system learning to distinguish relevan…
RESEARCH · CL_18799 · Apr 28 · 04:00

New research explores AI contribution measurement, RL optimization, and OOD detection

Researchers have developed CoTrace, a framework to measure and expose goal-level contributions in human-AI collaboration, revealing that while AI accounts for a smaller percentage of overall goal-shaping, it significant…
RESEARCH · CL_06541 · Apr 28 · 04:00

FOCUS framework enhances hyperspectral imaging interpretability for Vision Transformers

Researchers have developed FOCUS, a novel framework designed to enhance the interpretability of Vision Transformers (ViTs) when applied to hyperspectral imaging (HSI). This method addresses challenges in understanding V…
RESEARCH · CL_06469 · Apr 28 · 04:00

Vision Transformers learn spatial hierarchy mirroring primate visual cortex

Researchers have investigated how Vision Transformers (ViTs) encode spatial information without explicit spatial supervision during pretraining. By probing a ViT-B/16 model, they found that boundary structure is decodab…
RESEARCH · CL_06456 · Apr 28 · 04:00

KAConvNet integrates Kolmogorov-Arnold theorem with CNNs for vision tasks

Researchers have introduced KAConvNet, a novel convolutional neural network architecture that integrates the Kolmogorov-Arnold representation theorem. This new approach aims to enhance interpretability and efficiency by…
RESEARCH · CL_06414 · Apr 28 · 04:00

Vision Transformers offer new methods for face image quality assessment

Two new research papers propose novel methods for assessing face image quality using Vision Transformers (ViTs). The first, ATTN-FIQA, leverages pre-softmax attention scores from pre-trained ViTs to infer image quality …
RESEARCH · CL_03094 · Apr 21 · 17:48

Benign overfitting in adversarial training boosts Vision Transformer robustness

Researchers have theoretically analyzed adversarial training for Vision Transformers (ViTs), finding it can achieve near-zero robust training loss and generalization error under specific conditions. This defense strateg…
