ITNet: A Learnable Integral Transform That Subsumes Convolution, Attention, and Recurrence
Researchers have introduced ITNet, a neural network architecture that expresses convolution, attention, and recurrence as special cases of a single learnable integral transform. A learnable kernel, implemented as an MLP, models pairwise interactions, so the model's behavior is learned from data rather than fixed in advance. By adjusting its parameters, ITNet can recover the functionality of existing architectures, including LSTMs, GRUs, S4, Mamba, and self-attention. The model is reported to achieve competitive or superior performance on benchmarks including ImageNet-1K, GLUE, ModelNet40, VQA v2, and NLVR2.
AI IMPACT: Unifies disparate neural network architectures, potentially simplifying model design and improving performance across tasks.
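To make the idea of a learnable integral transform concrete, here is a minimal NumPy sketch of a discretized version, y_i = (1/n) Σ_j k(x_i, x_j, p_i − p_j) x_j, where the kernel k is a small MLP over pairwise features. This is an illustration under assumptions, not ITNet's actual implementation: the function names (`init_kernel_mlp`, `kernel_mlp`, `integral_transform`), the choice of kernel inputs, the hidden size, and the 1/n normalization are all hypothetical.

```python
import numpy as np

def init_kernel_mlp(rng, in_dim, hidden=16):
    """Random weights for a tiny two-layer MLP kernel (sizes are illustrative)."""
    return {
        "W1": rng.normal(0.0, 1.0 / np.sqrt(in_dim), (in_dim, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, 1.0 / np.sqrt(hidden), (hidden, 1)),
        "b2": np.zeros(1),
    }

def kernel_mlp(params, pair_feats):
    """Map pairwise features of shape (n, n, f) to scalar weights of shape (n, n)."""
    h = np.tanh(pair_feats @ params["W1"] + params["b1"])
    return (h @ params["W2"] + params["b2"])[..., 0]

def integral_transform(params, x, pos):
    """Discretized transform: y_i = (1/n) * sum_j k(x_i, x_j, pos_i - pos_j) * x_j.

    Assumption: the kernel sees both token features and their relative position.
    """
    n, d = x.shape
    xi = np.broadcast_to(x[:, None, :], (n, n, d))   # features of the output index i
    xj = np.broadcast_to(x[None, :, :], (n, n, d))   # features of the input index j
    rel = (pos[:, None] - pos[None, :])[..., None]   # relative positions, (n, n, 1)
    feats = np.concatenate([xi, xj, rel], axis=-1)   # (n, n, 2d + 1)
    k = kernel_mlp(params, feats)                    # pairwise kernel values, (n, n)
    return (k @ x) / n                               # quadrature sum over j

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))                # 8 tokens with 4 features each
pos = np.arange(8, dtype=float)            # 1-D positions
params = init_kernel_mlp(rng, in_dim=2 * 4 + 1)
y = integral_transform(params, x, pos)
print(y.shape)
```

In this sketch, a kernel that depended only on the relative position `pos_i - pos_j` would act like a convolution, while one that depended on `x_i` and `x_j` and was normalized over `j` would resemble attention. How ITNet itself parameterizes and constrains the kernel is not specified in the summary above.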