OpenAI Launches New Voice Models With Reasoning, Translation in API
SAN FRANCISCO — OpenAI on Wednesday released new realtime voice models in its developer API, adding reasoning, translation and speech transcription capabilities aimed at enabling more natural voice-powered applications.
The new models expand OpenAI’s voice intelligence offerings, according to a blog post published by the company. The update gives developers access to enhanced voice capabilities that can reason through complex queries, translate between languages and transcribe speech in real time.
The release targets developers and enterprises building voice-enabled AI applications, giving them API-level access to models that power more intelligent conversational experiences. The models are designed to move beyond basic speech-to-text functionality toward what OpenAI describes as more “natural and intelligent voice experiences.”
Adding reasoning to voice models allows voice-based AI systems to process and respond to complex queries rather than simply converting speech to text. Translation features built into the models could streamline development for companies building multilingual voice applications.
OpenAI has progressively expanded its voice technology since launching the original voice mode in ChatGPT. The latest release makes advanced voice capabilities available through the API, the company said.
Pricing and detailed technical specifications for the new models were not immediately available. Developers can access the new models through OpenAI’s existing API platform.