findsylls: A Language-Agnostic Toolkit for Syllable-Level Speech Tokenization and Embedding
Two new research papers address syllable-level speech tokenization with the aim of improving spoken language modeling. The first, "findsylls," is a language-agnostic toolkit that unifies several syllabification methods so they can be compared reproducibly across languages and resource levels. The second, "ZeroSyl," presents a simpler, zero-resource method that extracts syllable boundaries and embeddings directly from pre-trained speech models such as WavLM, and it outperforms prior syllabic tokenizers on multiple benchmarks.
AI IMPACT: These advances could lead to more efficient and accurate spoken language models by improving how speech is represented and processed.