Mixture-of-Transformers
PulseAugur coverage of Mixture-of-Transformers — every cluster mentioning Mixture-of-Transformers across labs, papers, and developer communities, ranked by signal.
-
Robots learn to "see" touch using motion correlation in new tactile sensors
Researchers have developed a new method for robots to "see" touch by analyzing the correlation between transient and cumulative motion in tactile sensors. This approach aims to overcome the perception ambiguity of exist…
-
Mural integrates frozen LLMs into image generation via Mixture-of-Transformers
Researchers have developed a new method called Mural that integrates frozen Large Language Models (LLMs) with diffusion-based image generators. This approach utilizes a Mixture-of-Transformers (MoT) architecture to tran…
-
Symbiotic-MoE framework enhances multimodal AI by merging generation and understanding
Researchers have developed Symbiotic-MoE, a new pre-training framework designed to improve Large Multimodal Models (LMMs) by enabling them to perform both image generation and understanding tasks without catastrophic fo…
-
Vera layered diffusion model enhances video editing with content preservation
Researchers have introduced Vera, a novel layered diffusion framework designed for content-preserving video editing. Unlike existing methods that regenerate entire videos, Vera focuses on generating an edit layer and an…
-
New AI frameworks enhance video editing with content preservation and real-time capabilities
Researchers have developed new frameworks for video editing, addressing limitations in current automated systems. VideoAgent offers an all-in-one solution for diverse video comprehension and editing tasks, utilizing a m…
-
New AI models tackle long-horizon planning for autonomous driving
Researchers are developing advanced AI models for autonomous driving, focusing on improving trajectory planning and long-horizon decision-making. Several new frameworks, including ParkingTransformer, TerraTransfer, Alig…
-
MaskWAM model unifies masks for enhanced robotic control
Researchers have developed MaskWAM, a novel object-centric world-action model designed to improve robotic control through video prediction. By integrating masks as both inputs and predictions using a Mixture of Transfor…
-
Intel and NVIDIA advance AI hardware and models
Intel is focusing on agentic AI to drive a CPU renaissance and aims to establish a full-stack AI computing platform. Meanwhile, NVIDIA has launched Cosmos 3, an open physical AI model built on a Mixture-of-Transformers …
-
NVIDIA launches Cosmos 3 omnimodal model and Nemotron 3 LLM
NVIDIA has launched Cosmos 3, an omnimodal world model that unifies language, image, video, audio, and action using a Mixture-of-Transformers architecture. This release includes open weights, code, and datasets, with fi…
-
X Square Robot releases 4B VLA model with open code, real-robot tests
X Square Robot has released Wall-OSS-0.5, a 4-billion-parameter vision-language-action (VLA) model. The model is built upon a 3-billion-parameter vision-language model backbone and incorporates action experts using a Mi…
-
Ant Group's LingBot-VA robot control model accepted to RSS 2026
Ant Group's LingBot-VA, a causal world modeling framework for robot control, has been accepted to the Robotics: Science and Systems (RSS) 2026 conference. This framework enables robots to predict environme…
-
Mix3R combines feed-forward and generative AI for improved 3D reconstruction and pose estimation
Researchers have developed Mix3R, a novel method for 3D reconstruction that combines feed-forward and generative approaches. This technique generates 3D shapes in two stages, producing aligned sparse voxels and point ma…
-
SpatialFusion enhances image generation with 3D geometric awareness, outperforming GPT-4o
Researchers have developed SpatialFusion, a new framework designed to improve the 3D geometric understanding of image generation models. By integrating a spatial transformer with a Mixture-of-Transformers architecture, Sp…