PulseAugur

vision transformer

PulseAugur coverage of vision transformer: every cluster mentioning vision transformers across labs, papers, and developer communities, ranked by signal.

Total · 30 days: 38 (90 days: 38)
Releases · 30 days: 0 (90 days: 0)
Papers · 30 days: 38 (90 days: 38)
Sentiment · 30 days: 7 days with sentiment data

Recent · page 1 of 2 · 38 items
  1. RESEARCH · CL_48255

    New Vision Transformer baseline sets SOTA on material segmentation

    Researchers have revived the Apple Dense Material Segmentation (DMS) benchmark by establishing a new Vision Transformer baseline. They identified that standard training methods struggle with amorphous textures due to hi…

  2. TOOL · CL_45058

    New depthwise convolution speeds up vision foundation models

    Researchers have developed a new method to speed up vision foundation models by replacing certain attention heads in Vision Transformer (ViT) backbones with efficient depthwise convolution layers. This drop-in replaceme…

  3. TOOL · CL_42546

    Fully Ternary Vision Transformer Achieves High Compression for Microcontrollers

    Researchers have developed FTerViT, a fully ternary Vision Transformer that compresses all weight matrices and normalization parameters. This approach significantly reduces the model's memory footprint, making it more f…

  4. TOOL · CL_40922

    New anomaly detection uses vision transformers for autonomous driving

    Researchers have developed a new anomaly detection method for autonomous driving that uses pre-trained vision transformer embeddings. This approach models normality from a single reference image, avoiding the need for e…

  5. TOOL · CL_38387

    CutMix training protocol induces spatial locality in Vision Transformers

    Researchers have found that specific training techniques can encourage spatial locality in Vision Transformers. By using a 'Modern' protocol involving data augmentation like CutMix and ColorJitter, along with label smoo…

  6. TOOL · CL_38820

    LESSViT architecture improves hyperspectral model generalization across sensors

    Researchers have developed LESSViT, a novel architecture for hyperspectral imagery that addresses the challenge of generalizing models across different sensors. This Low-rank Efficient Spatial-Spectral ViT uses a struct…

  7. TOOL · CL_37948

    TokenMask improves vision transformer segmentation efficiency

    Researchers have developed TokenMask, a novel approach for vision transformer segmentation that bypasses the need for explicit image-space reconstruction. This method computes mask logits directly from query-token affin…

  8. TOOL · CL_38007

    New GLIA framework enhances Vision Transformer use in image quality assessment

    Researchers have developed a new framework called the Global-Local Interaction Adapter (GLIA) to improve Blind Image Quality Assessment (BIQA). This method leverages pre-trained Vision Transformers by using a dual-strea…

  9. TOOL · CL_31312

    VoxCor method enables training-free volumetric features for medical imaging

    Researchers have developed VoxCor, a novel method for creating reusable volumetric feature representations from pre-trained 2D Vision Transformer models. This training-free approach combines triplanar inference with a w…

  10. TOOL · CL_29284

    What-Where Transformer separates object appearance from location

    Researchers have introduced the What-Where Transformer (WWT), a novel visual backbone designed to better separate object appearance from spatial location. This new architecture uses a slot-based design where tokens repr…

  11. TOOL · CL_27971

    Diffusion augmentation boosts Bangla character recognition accuracy

    Researchers have developed a confidence-guided diffusion augmentation method to improve the recognition of handwritten Bangla compound characters. This approach uses diffusion models to generate high-quality synthetic c…

  12. TOOL · CL_27505

    Foundation model learns from Dutch satellite data for global benchmarks

    Researchers have developed a new foundation model for high-resolution remote sensing data, specifically trained on satellite images of the Netherlands. This model combines Convolutional Neural Networks and Vision Transf…

  13. TOOL · CL_22428

    LC4-DViT uses generative AI and transformers for accurate land-cover mapping

    Researchers have developed LC4-DViT, a novel framework for land-cover classification using a deformable Vision Transformer. This approach combines generative data creation with a deformation-aware backbone to improve ac…

  14. TOOL · CL_22391

    New framework fuses facial and physiological signals for better emotion recognition

    Researchers have developed a new framework for video-based emotion recognition that combines facial expressions with physiological signals from remote photoplethysmography (rPPG). Their method uses prompt tuning to inte…

  15. TOOL · CL_21919

    Researchers develop robust foundation model for conservation laws using recurrent Vision Transformers

    Researchers have developed a new architecture that enhances Flux Neural Operators (Flux NO) by incorporating context through Recurrent Vision Transformers. This hypernetwork model extracts solution dynamics over time, e…

  16. RESEARCH · CL_20294

    DART vision-language model offers comprehensive rope condition monitoring

    Researchers have developed DART, a vision-language foundation model designed for comprehensive rope condition monitoring. This model integrates a Vision Transformer with Llama-3.2-3B-Instruct to handle the entire inspec…

  17. TOOL · CL_18721

    Hebbian Fast Weights enhance Vision Transformers for few-shot character recognition

    Researchers have developed a new approach to few-shot character recognition by integrating Hebbian Fast-Weight (HFW) modules into Vision Transformer architectures. This method aims to mimic biological neural systems' ab…

  18. RESEARCH · CL_18667

    RD-ViT cuts data needs for segmentation, outperforming standard ViT with fewer parameters

    Researchers have developed RD-ViT, a novel Recurrent-Depth Vision Transformer designed for semantic segmentation tasks. This architecture significantly reduces data dependence by using a single, shared transformer block…

  19. RESEARCH · CL_18682

    OneTrackerV2 unifies multimodal visual tracking with Dual Mixture-of-Experts

    Researchers have developed a new event-based visual object tracking framework that addresses limitations of existing methods by explicitly modeling event density variations across multiple temporal scales. This approach…

  20. TOOL · CL_16148

    Researchers develop AI framework for fluid-structure interaction prediction

    Researchers have developed a new machine learning framework for predicting fluid-structure interactions (FSI) over long periods on deforming meshes. The system integrates a graph neural operator with a vision Transforme…