AI & Video | Product Launch

Microsoft Foundry Unveils MAI-Voice-2 AI for Multilingual Speech Generation

Microsoft is offering MAI-Voice-2 in public preview via Microsoft Foundry, enabling natural speech generation across more than 10 languages. This first-party AI voice model supports voice cloning and voice prompting for developers. It allows streaming professionals to build multilingual virtual agents and audio content workflows directly with Microsoft's tools.

Key Takeaways

MAI-Voice-2 supports voice cloning from short reference samples and voice prompting for tone and style modifications.
The model, available through Microsoft Foundry, generates natural speech in more than 10 languages.
Developers can deploy MAI-Voice-2 directly via the Foundry catalog, integrating it with other speech and language models.
Microsoft offers MAI-Voice-2 under its standard billing, content safety, and responsible AI policies.

Why It Matters

Microsoft's introduction of MAI-Voice-2 could simplify the development of AI-driven voice applications for streaming. Natural, multilingual speech generation combined with voice cloning and prompting can streamline content localization and accessibility work. The move also intensifies competition in the AI voice synthesis market by giving developers a model integrated with the rest of the Foundry catalog. Watch for adoption rates among content platforms and the specific applications emerging for global audience engagement.

Read full article at azure.microsoft.com
