[AINews] Promptable Prosody, SOTA ASR, and Semantic VAD: OpenAI revamps Voice AI

Smol AI News

OpenAI released upgraded voice AI models covering promptable prosody, improved ASR, and semantic voice activity detection.

Categories: Model Releases

Excerpt

<p><strong>OAI Voice models are all you need.</strong></p> <blockquote> <p>AI News for 3/19/2025-3/20/2025. We checked 7 subreddits, <a href="https://twitter.com/i/lists/1585430245762441216" target="_blank"><strong>433</strong> Twitters</a> and <strong>29</strong> Discords (<strong>227</strong> channels and <strong>4533</strong> messages) for you. Estimated reading time saved (at 200wpm): <strong>386 minutes</strong>. You can now tag <a href="https://x.com/smol_ai" target="_blank">@smol_ai</a> for AINews discussions!</p> </blockquote> <p>As one commenter said, the best predictor of an OpenAI launch is <a href="https://x.com/alexalbert__/status/1902765482727645667?s=46" target="_blank">a launch from another frontier lab</a>. Today's OpenAI mogging takes the cake because of how broadly it revamps OpenAI's offering - if you care about voice at all, this is as sweeping a change as the <a href="https://buttondown.com/ainews/archive/ainews-the-new-openai-agents-platform/" target="_blank">Agents platform revamp from last week</a>.</p> <p>We think <a href="https://x.com/juberti/status/1902771172615524791?s=46" target="_blank">Justin Uberti's summary is the best one</a>: <img alt="image.png" class="newsletter-image" src="https://assets.buttondown.email/images/25f1163e-d943-4aef-bf74-8f0cdc621b52.png?w=960&amp;fit=max" /></p> <p>But you should also watch the livestream:</p> <div> </div><p>The three major highlights are:</p> <p><strong>OpenAI.fm</strong>, a demo site that shows off the new promptable prosody in 4o-mini-tts:</p> <p><img alt="image.png" class="newsletter-image" src="https://assets.buttondown.email/images/4467034b-1a6d-460c-9c1f-64c810bb821a.png?w=960&amp;fit=max" /></p> <p><strong>4o-transcribe</strong>, a new (non-open-source?) ASR model that beats Whisper and commercial peers:</p> <p><img alt="image.png" class="newsletter-image" src="https://assets.buttondown.email/images/003ab624-c3c5-49ec-a0bd-54fdff9f96c3.png?w=960&amp;fit=max" /></p> <p>and finally, bli