Interpolation between Convolution and Attention via K-Nearest Neighbors
Researchers have introduced Convolutional Nearest Neighbors (ConvNN), a novel framework that unifies convolutional neural networks (CNNs) and transformers. The paper argues that both architectures are special cases of k-nearest neighbor aggregation, differing in how neighbors are selected: CNNs use spatial proximity, while transformers use feature similarity. ConvNN allows for a continuous spectrum between local and global aggregation by configuring similarity functions and neighbor selection strategies.
AI IMPACT: This research proposes a unified framework for computer vision architectures, potentially simplifying model design and enabling new hybrid approaches.
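
The neighbor-selection idea in the summary can be illustrated with a small sketch. This is not the paper's implementation: it assumes tokens on a 1D grid, uses a dot product as the feature similarity, and aggregates with a plain mean, where a real convolution would apply learned weights and attention would apply softmax weights. The function names (`spatial_neighbors`, `feature_neighbors`, `knn_aggregate`) are hypothetical and chosen only for this example.

```python
def spatial_neighbors(i, n, k):
    """Indices of the k grid positions closest to i (ties broken by index).

    Assumption: a 1D grid with no padding, so windows near the edges
    shift inward instead of being zero-padded.
    """
    return sorted(range(n), key=lambda j: (abs(j - i), j))[:k]


def feature_neighbors(i, feats, k):
    """Indices of the k tokens most similar to token i by dot product."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    return sorted(range(len(feats)), key=lambda j: (-dot(feats[i], feats[j]), j))[:k]


def knn_aggregate(feats, k, select="spatial"):
    """Replace each token by the unweighted mean of its k selected neighbors.

    select="spatial" picks neighbors by position (convolution-like);
    any other value picks them by feature similarity (attention-like).
    """
    n, d = len(feats), len(feats[0])
    out = []
    for i in range(n):
        if select == "spatial":
            idx = spatial_neighbors(i, n, k)
        else:
            idx = feature_neighbors(i, feats, k)
        out.append([sum(feats[j][c] for j in idx) / k for c in range(d)])
    return out


# Four tokens whose features alternate between two patterns.
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]

local_mix = knn_aggregate(feats, k=3, select="spatial")    # mixes adjacent tokens
similar_mix = knn_aggregate(feats, k=2, select="feature")  # mixes look-alike tokens
global_mix = knn_aggregate(feats, k=4, select="feature")   # k = N: every token sees all
```

In this toy setting, spatial selection blends each token with its immediate neighbors regardless of content, while feature selection groups tokens that look alike regardless of position. Setting `k` to the number of tokens makes both strategies select everything, so the two modes coincide; smaller `k` and different selection rules give the intermediate points between local and global aggregation that the summary describes.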