Entity Whisper

Whisper

PulseAugur coverage of Whisper — every cluster mentioning Whisper across labs, papers, and developer communities, ranked by signal.

Total · 30 days

Within 90 days: 33

Releases · 30 days

Within 90 days: 0

Papers · 30 days

Within 90 days: 17

Tier distribution · 90 days

frontier release 1
significant 1
research 9
tool 19
commentary 3

Relationships

developed by OpenAI 100%
used by Ollama 90%

Timeline

2026-05-12 research_milestone A new semi-supervised framework for speech confidence detection was proposed, achieving a Macro-F1 score of 0.751. Source

Sentiment · 30 days

11 days with sentiment data

Recent · Page 2 of 2 · 33 items total

TOOL · CL_15989 · May 5 · 04:00

BaldWhisper model achieves 48% size reduction and 2.15x speedup

Researchers have developed BaldWhisper, a method to significantly compress and accelerate the Whisper speech-to-text model. By employing low-rank decomposition for embeddings and merging transformer layers, BaldWhisper …

RESEARCH · CL_14473 · May 4 · 04:00

Audio-language models struggle with dysarthric speech context, but fine-tuning shows promise

Researchers have developed a benchmark to test if current audio-language models can effectively use additional clinical context to improve automatic speech recognition for dysarthric speech. Initial findings indicate th…

RESEARCH · CL_22854 · May 3 · 23:37

Needle model distills Gemini for precise tool-calling tasks

A new 26-million parameter model named Needle has been developed, distilled from Google's Gemini to excel specifically at tool-calling tasks. The core innovation lies not in its size, but in its ability to reliably prod…

RESEARCH · CL_08610 · Apr 29 · 04:00

Researchers enhance elderly ASR with LLM paraphrasing and speech synthesis

Researchers have developed a novel data augmentation technique to improve automatic speech recognition (ASR) for elderly individuals. This method utilizes large language models to paraphrase existing transcripts, genera…

RESEARCH · CL_08266 · Apr 28 · 13:18

WhisperPipe architecture slashes ASR latency and memory use for real-time applications

Researchers have developed WhisperPipe, a new streaming architecture designed to improve real-time automatic speech recognition (ASR) performance. This architecture addresses the trade-off between accuracy and computati…

RESEARCH · CL_06729 · Apr 28 · 04:00

New FADE method improves ASR model quantization for edge devices

Researchers have developed FADE, a novel framework for improving post-training quantization of encoder-decoder Automatic Speech Recognition (ASR) models. This method addresses the issue of error accumulation across laye…

RESEARCH · CL_13934 · Apr 27 · 21:55

Talkie-1930: New 13B AI model trained on pre-1931 text explores historical knowledge

A new project called Talkie has released a 13-billion parameter language model trained exclusively on English text from before 1931. This "vintage" model aims to explore AI's ability to predict the future and generate n…

TOOL · CL_47664 · Feb 23 · 00:00

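Speech models perform poorly at recognizing street names, especially for non-native speakers

Researchers at Together AI found that current state-of-the-art speech recognition models fail at a significant rate, with an average error rate of 39% when transcribing street names. Non-native English speakers are affected most: their information is 18% more likely to be misinterpreted. This inaccuracy can have serious real-world consequences, such as longer travel times and higher costs for services like ride-hailing. The research shows that a synthetic data generation technique called "cross-lingual style transfer" can improve transcription accuracy by up to 60% with only a very small amount of training data.

TOOL · CL_00804 · Apr 22 · 10:00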

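Speak uses OpenAI's AI for personalized language learning and global expansion

The language learning app Speak is using OpenAI's advanced AI capabilities to create a personalized, highly interactive one-on-one tutoring experience. Founded in 2016, the company has grown significantly alongside advances in speech recognition and large language models, which enable features such as real-time feedback and conversational role-play. Speak's strategy was to focus first on the South Korean market to validate its AI-native model before expanding globally. The company is now investing in AI-generated curricula to enable personalized learning paths across different domains.

TOOL · CL_02402 · Dec 4 · 10:00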

Morgan Stanley leverages OpenAI's GPT-4 to enhance financial advisor services

Morgan Stanley has partnered with OpenAI to integrate GPT-4 into its financial advisory services, enhancing advisor efficiency and client engagement. The firm developed an internal chatbot, AI @ Morgan Stanley Assistant…

TOOL · CL_47802 · Dec 11 · 20:21

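Replit launches AI templates to speed up developer onboarding

Replit has launched a set of AI-powered templates designed to streamline developer onboarding and accelerate the creation of AI-driven applications. The templates support multiple programming languages and frameworks and simplify the complex setup of tools such as vector databases and large language models. Notable examples include templates for Qdrant vector search, comparing Gemini and GPT-4, building an AI support agent with OpenAI, and transcribing meetings with OpenAI Whisper.

FRONTIER RELEASE · CL_01524 · Jul 28 · 00:00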

OpenAI launches advanced audio models for API, enhancing voice agents

OpenAI has released new, advanced audio models through its API, enhancing capabilities for voice agents. The updated speech-to-text models, including gpt-4o-transcribe and gpt-4o-mini-transcribe, offer improved accuracy…

TOOL · CL_47938 · Jul 29 · 04:00

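Replit integrates OpenAI models for coding assistance and education

Replit has partnered with OpenAI to integrate its advanced AI models into the Replit coding platform. The company is launching a new course on LLMs and GPT and has introduced a beta code explanation feature powered by OpenAI's Codex model. Replit is also exploring the use of GPT-3 to generate blog content, which highlights the growing synergy between AI and software development environments.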
