Together AI
PulseAugur coverage of Together AI — every cluster mentioning Together AI across labs, papers, and developer communities, ranked by signal.
- uses Gemma-4-31B-it-Pearl 90%
- uses Deepgram 90%
- founded by Vipul Ved Prakash 90%
- partners with Pearl Research Labs 90%
- uses Nvidia Blackwell B200 90%
- developed Together Code Interpreter 90%
- serves NVIDIA Parakeet-TDT 0.6B v3 90%
- developed Gemma-4-31B-it-Pearl 90%
- employs Dan Fu 90%
- partners with MiniMax AI 80%
- used by MiniMax AI 75%
- used by DeepSeek-R1 70%
- 2026-06-13 product_launch Together AI launched the MiniMax-M3 multimodal model.
- 2026-06-12 research_milestone Together AI released benchmarks showing significant performance gains on Blackwell hardware for AI agent infrastructure.
- 2026-06-10 research_milestone Together AI achieved ISO 27001:2022 certification for its Information Security Management System after a successful audit.
- 2026-06-09 partnership Together AI partnered with Pax8 to offer AI infrastructure and models to small and medium-sized businesses.
- 2026-06-01 product_launch Together AI announced a new model called M3.
- 2026-05-29 product_launch Together AI is now serving the two fastest speech-to-text models, including NVIDIA Parakeet-TDT 0.6B v3.
- 2026-05-29 product_launch Together AI launched a new open-source AI translation application.
- 2026-05-22 product_launch Together AI launched updates to its Fine-Tuning Platform, adding support for new LLMs and extending context lengths.
- 2026-05-22 product_launch Together AI announced the addition of 1,000 NVIDIA H100 and H200 GPUs to its infrastructure.
- 2026-05-22 product_launch Together AI launched GPU clusters with the NVIDIA Blackwell platform and an optimized kernel collection, achieving significant performance gains.
- 2026-05-22 product_launch Together AI launched major upgrades to its Batch Inference API.
- 2026-05-22 product_launch Together AI released FlashAttention-3 and FlashAttention-4, optimized attention mechanisms for GPUs.
- 2026-05-22 product_launch Together AI launched access to the Qwen3.7-Max model.
- 2026-05-15 partnership Together AI and Pearl Research Labs formed a partnership to integrate blockchain for AI inference cost reduction.
Together AI's ATLAS system demonstrates inference speed on par with specialized hardware
Together AI's newly launched ATLAS system, an adaptive-learning inference engine, reportedly achieves up to 500 tokens per second (TPS) on DeepSeek-V3.1. That rivals specialized hardware such as Groq and suggests Together AI is optimizing LLM inference beyond standard GPU capabilities.
Together AI significantly bolsters inference capacity with H100/H200 GPU expansion
The addition of 1,000 NVIDIA H100 and H200 GPUs to Together AI's infrastructure is a substantial investment in inference capacity. The move supports growing demand for high-throughput AI model serving and is likely intended to power both the company's internal services and external customer workloads.
Together AI to offer ATLAS as a distinct inference optimization service
Given the performance gains demonstrated by ATLAS, Together AI may soon offer this adaptive-learning inference system as a standalone service or as an add-on to its existing GPU offerings. Customers could then use ATLAS's dynamic optimization without managing the underlying infrastructure themselves.
Together AI to integrate NVIDIA Blackwell features into all core services
The 90% training speed boost achieved with NVIDIA Blackwell and custom kernels indicates deep integration. Together AI will likely use Blackwell's capabilities across its entire platform, including its new instant clusters and fine-tuning services, to offer a performance edge over competitors.
Together AI's ATLAS system shows strong performance against specialized hardware
The reported performance of Together AI's ATLAS system, achieving up to 500 TPS on DeepSeek-V3.1 and outperforming specialized hardware like Groq, is a significant technical achievement. This suggests their adaptive inference approach is highly effective and could set a new benchmark for LLM inference speed and efficiency.
-
Together AI releases Mamba-3, prioritizing inference speed over training
Together AI has released Mamba-3, a new state space model (SSM) prioritizing inference efficiency over training speed. This model features a more expressive recurrence formula, complex-valued state tracking, and a multi…
-
Together AI launches NVIDIA's multimodal and 1M-context Nemotron 3 models
Together AI has launched NVIDIA's Nemotron 3 models, including the multimodal Nano Omni and the large-context Super, on its platform. Nemotron 3 Nano Omni, a 30B parameter model, excels at reasoning across video, images…
-
Together AI enhances GPU clusters with multi-tenancy and autoscaling
Together AI has enhanced its GPU clusters with new features aimed at improving efficiency and manageability for AI-native teams. The platform now supports multi-tenancy, allowing different teams to share compute resourc…
-
New methods tackle LLM KV cache compression for long contexts
Multiple research papers released in May and June 2026 propose novel methods for compressing the Key-Value (KV) cache in large language models (LLMs). These techniques aim to reduce the significant memory overhead assoc…
-
Speech models fail on street names, especially for non-native speakers
Researchers at Together AI have found that current state-of-the-art speech recognition models transcribe street names with an average error rate of 39%, particularly for non-native English spe…
-
Together AI expands LLM fine-tuning, adds longer contexts
Together AI has enhanced its fine-tuning platform to support a wider array of large language models, including recent releases from DeepSeek, Qwen, and Meta, alongside OpenAI's gpt-oss. The platform now offers expanded …
-
Together AI launches Rime V3 models for natural voice code-switching
Together AI has launched two new Rime models, V3 Turbo and V3, designed for natural code-switching in voice agents. V3 Turbo offers English-Spanish switching with a time-to-first-audio of approximately 120ms on dedicate…
-
DSGym framework standardizes data science agent evaluation and training
Researchers have introduced DSGym, a new framework designed to standardize the evaluation and training of data science agents. This system addresses limitations in current benchmarks by providing a unified API and self-…
-
Together AI rebrands, focuses on efficient AI inference infrastructure
Together AI has launched a brand refresh, emphasizing its role as an "AI Native Cloud" designed for builders of AI-native applications. The company is focusing on optimizing inference for efficiency and cost-effectivene…
-
Cursor and Together AI optimize AI coding assistant with NVIDIA Blackwell
Cursor, an AI-powered coding platform, has partnered with Together AI to optimize its real-time inference capabilities. This collaboration focuses on achieving low-latency responses within the editor's feedback loop, wh…
-
Multi-node training enables scaling foundation models across GPU clusters
Training large foundation models requires distributing the workload across many GPUs in multiple interconnected machines, a process known as multi-node training. This approach is essential for handling mo…
-
Guide details choosing open-source AI models for production
Choosing the right open-source AI model for production requires careful consideration of factors like transparency, adaptability, and control. While proprietary models offer tiered options, open models allow for deeper …
-
Together AI launches MiniMax Speech 2.8 Turbo for natural voice agents
Together AI has released MiniMax Speech 2.8 Turbo, an enterprise text-to-speech model designed for natural-sounding voice agents. This new model offers significant improvements in prosody, includes sound tags for vocal …
-
Together AI adds Rime voice models for expressive, controlled AI conversations
Together AI has integrated Rime's enterprise-grade voice models, Arcana v2 and Mist v2, into its platform. Arcana v2 offers expressive, conversational voices trained on real customer interactions, while Mist v2 provides…
-
Together AI VP: AI not hitting hardware wall, efficiency gains untapped
Together AI's VP of Kernels, Dan Fu, argues that the pursuit of AGI is not hitting a hardware wall. He posits that current AI systems are significantly underutilizing existing hardware, with training runs often achievin…
-
NVIDIA Nemotron Diffusion models offer 6.4x faster AI inference
NVIDIA has released the Nemotron-Labs Diffusion family of language models, available in 3B, 8B, and 14B parameter sizes. These models uniquely support autoregressive (AR), diffusion, and self-speculation decoding modes …
-
Together AI releases new Python SDK v2.0 RC
Together AI has published the release candidate of its new Python SDK, version 2.0. This updated SDK is built with a modern, type-safe architecture using OpenAPI specifications and Stainless, aiming for improved perform…
-
Together AI introduces AutoJudge for faster LLM inference
Researchers at Together AI have developed AutoJudge, a novel method to accelerate large language model inference. This technique automates the curation of task-specific datasets, enabling lossy speculative decoding with…
-
Together AI Cloud enhances RL pipelines with TorchForge and tool integrations
Together AI is enhancing its cloud platform to support advanced reinforcement learning (RL) pipelines, integrating TorchForge and Monarch for distributed training. The platform now offers low-latency GPU communication a…
-
Together AI launches unified platform for real-time voice agents
Together AI has launched a unified platform for building real-time voice agents, integrating speech-to-text (STT), large language models (LLM), and text-to-speech (TTS) within a single cloud environment. This co-locatio…