PulseAugur / Brief

last 24h
[2/2] 224 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.
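
As a rough illustration of how four signals like these could be folded into a single 0-100 score, here is a minimal sketch. The brief does not publish its formula, so everything specific here is an assumption: the function name `score_story`, the weights, the 12-hour half-life, and the choice to apply time decay as a multiplier are all hypothetical.

```python
# Hypothetical weights: the brief does not say how the signals are combined.
WEIGHTS = {"authority": 0.4, "cluster_strength": 0.3, "headline_signal": 0.3}
HALF_LIFE_HOURS = 12.0  # assumed half-life for the time-decay factor


def score_story(authority, cluster_strength, headline_signal, age_hours):
    """Combine three signals in [0, 1] with exponential time decay into a 0-100 score."""
    signals = {
        "authority": authority,
        "cluster_strength": cluster_strength,
        "headline_signal": headline_signal,
    }
    for name, value in signals.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    if age_hours < 0:
        raise ValueError("age_hours must be non-negative")

    # Weighted average of the three signals, still in [0, 1].
    base = sum(WEIGHTS[name] * value for name, value in signals.items())
    # The score halves every HALF_LIFE_HOURS as the story ages.
    decay = 0.5 ** (age_hours / HALF_LIFE_HOURS)
    return round(100.0 * base * decay, 1)
```

Under these assumed settings, a story with perfect signals scores 100 when fresh and half that after one half-life; a real ranking system may weight or decay the signals quite differently.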

  1. Try Sonic 3.5 in voice finder: https://t.co/5L8df8yzQM

    Together AI has announced the release of Cartesia Sonic 3.5, a new text-to-speech (TTS) model designed for real-time applications. The model boasts sub-90ms latency and supports 42 languages, with features for context-aware pronunciation and accurate transcript following. Developers can now access over 150 Cartesia Sonic 3.5 voices through Together AI's voice finder tool to compare and select voices before deployment.

    IMPACT Enhances real-time TTS capabilities with low latency and broad language support, potentially improving voice agent interactions.

  2. ZONOS2: real-time TTS with 8B params, 900M active, and high-fidelity voice cloning

    Zyphra has released ZONOS2, an open-source, real-time text-to-speech model featuring 8 billion total parameters and 900 million active parameters for efficient inference. This sparse Mixture-of-Experts model excels at high-fidelity, zero-shot voice cloning and aims to overcome the typical trade-off between speech quality and speed. ZONOS2 processes raw UTF-8 bytes instead of phonemes, improving support for multiple languages and code-switching, and was trained on over 6 million hours of audio data.

    IMPACT This sparse MoE TTS model offers high-fidelity voice cloning and real-time performance, potentially setting new benchmarks for expressive speech synthesis.
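
To make the "raw UTF-8 bytes instead of phonemes" point in item 2 concrete, the sketch below shows what byte-level text input looks like in general. It is not ZONOS2's actual preprocessing, which the summary does not describe; the helper names `to_byte_ids` and `from_byte_ids` are hypothetical. The idea is only that any string, in any script or mix of scripts, maps to integers in the range 0-255 with no phoneme dictionary or language-specific front end.

```python
def to_byte_ids(text: str) -> list[int]:
    """Encode text as a sequence of UTF-8 byte values (each in 0..255)."""
    return list(text.encode("utf-8"))


def from_byte_ids(ids: list[int]) -> str:
    """Invert to_byte_ids: rebuild the original string from its byte values."""
    return bytes(ids).decode("utf-8")


# A code-switched sample: ASCII characters take one byte each,
# while each of the two CJK characters takes three bytes in UTF-8.
sample = "Hello, 世界"
ids = to_byte_ids(sample)

print(len(sample))  # number of characters
print(len(ids))     # number of bytes the model would see
print(from_byte_ids(ids) == sample)
```

One consequence of this design is that the input vocabulary is fixed at 256 values regardless of language, at the cost of longer sequences for scripts whose characters need several bytes.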