Together
PulseAugur coverage of Together — every cluster mentioning Together across labs, papers, and developer communities, ranked by signal.
15 days with sentiment data
-
AI Agents to Collaborate on Open Science at World's Fair
The AI.Engineer World's Fair will feature discussions on AI agents collaborating for open science. Specifically, James Zou will present work on EinsteinArena and DSGym, focusing on multi-agent mathematical discovery and…
-
LLMs struggle to generate multi-GPU kernels, researchers find
Researchers at Together have found that while large language models can efficiently generate single-GPU kernels, they struggle significantly with multi-GPU kernel generation. These models perform poorly when asked to cr…
-
GLM-5.2 model detailed in guide for OpenAI-compatible API features
This tutorial provides a guide to using the GLM-5.2 model via an OpenAI-compatible API, focusing on its advanced features. It details how to set up various providers like Together and Hugging Face, securely manage API k…
-
Together provides GPU infrastructure for Cartesia's real-time voice AI
Together provides managed GPU infrastructure and cluster control to Cartesia, enabling them to handle demanding real-time voice inference workloads. Cartesia's system processes millions of audio minutes daily with a mod…
-
AI model providers: User seeks European options for GLM 5.2, DeepSeek V4
A user in Reddit's r/LocalLLaMA community is seeking European providers for running open-weight large language models, specifically GLM 5.2 and DeepSeek V4 Flash. The user noted that while OpenRouter lists nu…
-
Together AI offers fast GLM-5.2 inference with optimized serving
Together AI is now offering GLM-5.2, a model that is reportedly fast and capable of handling long-context coding and agent workloads. The company emphasizes its optimized serving infrastructure, which allows for high th…
-
Together's MiniMax-M3 expands agent context with enhanced inference
Together has released MiniMax-M3, an advancement in agent capabilities that significantly expands the context window. This new iteration allows agents to process and retain information from long histories, images, video…
-
OpenAI releases GPT-Image-2 and GPT-5.5 Instant upgrades, plus new cybersecurity tools
OpenAI has released GPT-Image-2, making it available on Together AI for developers to integrate into their applications. This model supports up to 16 reference images per call and offers native 1K, 2K, and 4K outputs, w…
-
Developer launches SkillForge, an AI skill manager inspired by Homebrew
A developer has created SkillForge, an open-source tool designed to streamline the creation and management of AI skills, drawing inspiration from package managers like Homebrew. SkillForge aims to address the limitation…
-
GLM-5.2 model released with speeds over 200 tps on Together platform
The GLM-5.2 model has been released and is now accessible on the Together platform. Inference speeds reportedly exceed 200 tokens per second, and the model is available for users to test and…
-
Lemonade v10.8 enhances local AI models with auto memory, cloud offload, and tool integration
Lemonade has released version 10.8, featuring significant improvements in memory and context management for local AI models. This update introduces dynamic VRAM management that automatically unloads idle models and resi…
-
Together: Open-source AI cheaper than closed-source for landing pages
Together argues that closed-source AI models are not worth their higher cost, citing a test where Kimi K2.7 Code was 16 times cheaper than Claude Fable-5 for generating landing pages, with comparable quality. The compan…
-
Together Compute Expands GPU Offerings with H100, H200, and B200
Together, an inference and open-source AI company, has significantly expanded its on-demand compute platform. The company announced the addition of a substantial number of high-end GPUs, including H100s, H200s, and the …
-
llmesh Digest Unifies Local and Cloud LLMs with Prompt Firewall and IoT Integration
The llmesh Digest introduces a Python framework designed to unify the usage of both local and cloud-based Large Language Models (LLMs). It offers a consistent interface for interacting with various LLM providers, includ…
-
Scowld app offers vendor-agnostic AI companion support
Scowld is a new AI companion application designed to avoid vendor lock-in by supporting a wide array of large language models. It offers compatibility with major providers like Gemini, OpenAI, Claude, Groq, xAI, and pla…
-
Z.ai releases GLM-5.2, setting new open-source benchmark for long-context AI
Z.ai has released GLM-5.2, an open-source language model with a 1 million token context window, positioning it as a strong contender in long-horizon tasks and coding benchmarks. The model features an improved architectu…
-
Together AI releases Cartesia Sonic 3.5 for real-time TTS
Together AI has announced the release of Cartesia Sonic 3.5, a new text-to-speech (TTS) model designed for real-time applications. The model boasts sub-90ms latency and supports 42 languages, with features for context-a…
-
Together releases Sonic-3.5 and Ink-2 for speech processing
Together has released two new models: Sonic-3.5 for text-to-speech and Ink-2 for speech-to-text. These models are designed for real-time streaming applications and are claimed to be the top-performing models in their re…
-
DeepSeek V4 Pro tops speed and latency benchmarks on Together AI
DeepSeek V4 Pro, when deployed on the Together AI platform, has achieved the top ranking on Artificial Analysis for both output speed and latency. This performance is attributed to advancements in inference systems, inc…
-
Together AI offers MiniMax M3 multimodal model with 1M context
Together AI has announced the availability of the MiniMax M3 API, an open-weight, native multimodal model. This model boasts a 1 million token context window, enhanced by MiniMax Sparse Attention, and features distinct …