Together AI
PulseAugur coverage of Together AI — every cluster mentioning Together AI across labs, papers, and developer communities, ranked by signal.
- uses Gemma-4-31B-it-Pearl 90%
- uses Deepgram 90%
- founded by Vipul Ved Prakash 90%
- partners with Pearl Research Labs 90%
- uses Nvidia Blackwell B200 90%
- developed Together Code Interpreter 90%
- used by NVIDIA Parakeet-TDT 0.6B v3 90%
- developed Gemma-4-31B-it-Pearl 90%
- employs Dan Fu 90%
- partners with MiniMax AI 80%
- used by MiniMax AI 75%
- used by DeepSeek-R1 70%
- 2026-06-13 product_launch Together AI launched the MiniMax-M3 multimodal model. source
- 2026-06-12 research_milestone Together AI released benchmarks showing significant performance gains on Blackwell hardware for AI agent infrastructure. source
- 2026-06-10 research_milestone Together AI achieved ISO 27001:2022 certification for its Information Security Management System after a successful audit. source
- 2026-06-09 partnership Together AI partnered with Pax8 to offer AI infrastructure and models to small and medium-sized businesses. source
- 2026-06-01 product_launch Together AI announced a new model called M3. source
- 2026-05-29 product_launch Together AI is now serving the two fastest speech-to-text models, including NVIDIA Parakeet-TDT 0.6B v3. source
- 2026-05-29 product_launch Together AI launched a new open-source AI translation application. source
- 2026-05-22 product_launch Together AI launched updates to its Fine-Tuning Platform, adding support for new LLMs and extending context lengths. source
- 2026-05-22 product_launch Together AI announced the addition of 1,000 NVIDIA H100 and H200 GPUs to its infrastructure. source
- 2026-05-22 product_launch Together AI launched GPU clusters on the NVIDIA Blackwell platform with an optimized kernel collection, achieving significant performance gains. source
- 2026-05-22 product_launch Together AI launched major upgrades to its Batch Inference API. source
- 2026-05-22 product_launch Together AI released FlashAttention-3 and FlashAttention-4, optimized attention mechanisms for GPUs. source
- 2026-05-22 product_launch Together AI launched access to the Qwen3.7-Max model. source
- 2026-05-15 partnership Together AI and Pearl Research Labs formed a partnership to integrate blockchain for AI inference cost reduction. source
Together AI's ATLAS system demonstrates inference speed on par with specialized hardware
Together AI's newly launched ATLAS system, an adaptive-learning inference engine, reaches up to 500 tokens per second (TPS) on DeepSeek-V3.1. This rivals specialized hardware like Groq, suggesting Together AI is optimizing LLM inference beyond what standard GPU serving typically delivers.
Together AI significantly bolsters inference capacity with H100/H200 GPU expansion
The addition of one thousand NVIDIA H100 and H200 GPUs to Together AI's infrastructure represents a substantial investment in inference capabilities. This move directly supports the growing demand for high-throughput AI model serving and is likely intended to power both their internal services and external customer workloads.
Together AI to offer ATLAS as a distinct inference optimization service
Given the significant performance gains demonstrated by ATLAS, Together AI may soon offer this adaptive-learning inference system as a standalone service or an add-on feature for their existing GPU offerings. This would allow customers to leverage ATLAS's dynamic optimization without needing to manage the underlying infrastructure themselves.
Together AI to integrate NVIDIA Blackwell features into all core services
The 90% training speed boost achieved with NVIDIA Blackwell and custom kernels indicates a deep integration. It's likely Together AI will leverage Blackwell's capabilities across their entire platform, including their new instant clusters and fine-tuning services, to offer a performance edge over competitors.
Together AI's ATLAS system shows strong performance against specialized hardware
The reported performance of Together AI's ATLAS system, achieving up to 500 TPS on DeepSeek-V3.1 and outperforming specialized hardware like Groq, is a significant technical achievement. This suggests their adaptive inference approach is highly effective and could set a new benchmark for LLM inference speed and efficiency.
-
Together AI partners with Pearl Research Labs to cut AI inference costs with crypto
Together AI has partnered with Pearl Research Labs to use blockchain technology to reduce AI inference costs. This collaboration introduces a new inference endpoint for the Gemma-4-31B-it-Pearl model, offering a discou…
-
Together AI models lead speech-to-text speed benchmarks
Together AI's speech-to-text models have achieved the top two positions on the Artificial Analysis leaderboard for transcription speed. The NVIDIA Parakeet-TDT 0.6B v3 model, running on Together AI, is currently ranked …
-
Together AI releases Violin, an open-source video translation tool
Together AI has launched Violin, an open-source video translation tool designed to make online video content accessible across language barriers. The system utilizes advanced AI, including speech recognition, large lang…
-
Together AI launches Voice Finder for 600+ TTS voices
Together AI has launched Voice Finder, a new tool designed to help developers quickly select the most suitable voice for their applications from a catalog of over 600 options. The tool allows users to search for voices …
-
DeepSeek-V4's 1M-token context window is an inference systems challenge
Together AI has detailed the architectural innovations behind DeepSeek-V4's ability to handle a 1 million token context window. The model employs a hybrid attention design that compresses context before storing it in th…
-
Together AI simplifies model deployment with agent-driven CLI
Together AI has launched a new feature allowing developers to deploy and run any model from Hugging Face using their Dedicated Container Inference (DCI) infrastructure. This is facilitated by an agent-based CLI tool cal…
-
Artificial Analysis offers MiniMax-M2.7 with SambaNovaAI leading inference speed
Artificial Analysis has made its MiniMax-M2.7 model available through six different inference providers, highlighting significant differences in speed and cost. SambaNovaAI leads in performance, achieving 435 tokens per…
-
Together AI partners with Adaption to streamline model fine-tuning
Together AI has partnered with Adaption, a company co-founded by former Cohere and Google DeepMind leaders Sara Hooker and Sudip Roy. This collaboration integrates Adaption's data optimization tools with Together AI's f…
-
Together AI disables Linux kernel crypto interface to fix Copy Fail bug
Together AI has detailed its rapid response to a critical Linux kernel vulnerability, dubbed Copy Fail (CVE-2026-31431), which allows unprivileged local users to gain root access. The company treated the issue as a flee…
-
DeepSeek-V4 Pro model with 1.6T parameters now on Together AI
DeepSeek-V4 Pro, a large Mixture-of-Experts model with 1.6 trillion parameters, is now accessible on the Together AI platform. This model is designed for long-context reasoning, supporting up to a 512K-token context win…
-
Together AI releases open-source Reddit-like interface for AI model queries
Together AI has released an open-source, Reddit-like interface designed for asking AI models simple questions. Inspired by the r/explainlikeimfive subreddit, the tool aims to make AI interactions more accessible. It is …
-
Together AI announces open-source inference models at ICLR conference
Together AI announced its new inference and open-source model at ICLR. The company highlighted the model's capabilities and encouraged attendees to learn more.
-
Together powers Yutori's Delegate, an always-on agent for web monitoring and action
Together AI has launched a new inference service designed for continuous operation, specifically for AI agents. This service powers Delegate, a new product from Yutori AI that functions as an always-on agent. Delegate c…
-
Together AI scales inference to 300T tokens per month, a 10,000x increase YoY
Together AI has announced a significant increase in its inference capacity, processing 300 trillion tokens per month. This represents a 10,000-fold growth from its previous capacity of 30 billion tokens per month. The compa…
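The headline multiplier can be checked directly from the two monthly figures quoted in this item. A minimal sketch of that arithmetic follows; the variable names are illustrative and the only inputs are the numbers stated above.

```python
# Monthly token volumes as quoted in the item above.
previous_tokens_per_month = 30 * 10**9    # 30 billion
current_tokens_per_month = 300 * 10**12   # 300 trillion

# Integer division is exact here because both figures are whole multiples of 10**9.
growth_factor = current_tokens_per_month // previous_tokens_per_month
print(f"{growth_factor:,}x")  # prints "10,000x"
```

The result matches the 10,000x year-over-year figure in the headline.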
-
Together AI enables AI labs and enterprises with infrastructure
Together AI has announced its platform is now available on NVIDIA Cloud.
-
Together AI unveils Parcae, a stable looped model architecture
Together AI has introduced Parcae, a novel stable architecture for looped language models. This new design allows models to achieve the quality of larger Transformers while using significantly fewer parameters, by incre…
-
Together AI launches Wan 2.7 video suite with enhanced creative controls
Together AI has launched the Wan 2.7 model suite, offering advanced video generation and editing capabilities. This suite includes text-to-video generation and will soon expand to image-to-video, reference-to-video, and…
-
Together AI kernels team optimizes GPUs with FlashAttention
The Together AI kernels team, including researchers Dan Fu and Tri Dao, developed FlashAttention, a software layer that significantly optimizes GPU performance for AI models. This breakthrough, achieved by applying data…
-
Smaller LLMs match GPT-4o on long context with "Divide and Conquer"
Researchers at Together AI have developed a "Divide and Conquer" framework that enables smaller language models to effectively handle long context tasks. Their study, presented at ICLR 2026, demonstrates that by breakin…
-
Together AI enhances fine-tuning for agents, reasoning, and vision
Together AI has enhanced its fine-tuning service to better support advanced AI workflows. The update includes native support for tool call, reasoning, and vision-language model fine-tuning, addressing common issues like…