How Together AI built the world’s fastest speech-to-text stack
Together AI has developed a highly efficient speech-to-text system that significantly outperforms existing models in speed. Its approach addresses the challenges specific to audio data, which is substantially larger than text and requires extensive preprocessing. By optimizing the entire data path, from CPU preprocessing to GPU execution, Together AI achieved record-low latency and high throughput for both streaming and offline transcription tasks.
IMPACT: Sets a new state of the art for speech-to-text latency and throughput, potentially lowering costs for AI applications that require audio processing.