WellSaid Labs Invents a Game-Changing AI Voice Model for Content Creators
Latest AI Voice Model Considers Context, Not Just Sounds.
WellSaid Labs, the leading AI text-to-speech technology company, has invented the most natural speech markup language available to content creators. WellSaid Labs’ entirely new respelling system lets content creators give the AI precise instructions, providing more control over word pronunciation and emphasis. Because this more intuitive AI captures the natural human performance in a voice actor’s delivery, it can more freely predict how the actual voice actor would have read the content, saving companies and content creators significant time.
Improved pronunciation, intonation, and user controls
Until now, the text-to-speech (TTS) industry has relied only on a phonetic layer that dictates how to pronounce words. However, voice actors don’t read phonemes; they read graphemes. The WellSaid Labs model now reads graphemes too, while also having a pronunciation layer. Relying on phonetic transcription alone can limit a model’s breadth of knowledge and therefore its ability to predict the pronunciation and delivery of new and unique words. It is also difficult to give users a consistent system for guiding a voice avatar to pronounce words according to their preferences, such as with the correct vowel sounds and syllabic emphasis. WellSaid Labs has made an enormous breakthrough in overcoming these limitations.
“Customer feedback on using our new voice model is incredible,” says Rhyan Johnson, WellSaid Labs Senior Voice Data Engineer. “Using our new respelling system, content creators love the fact that words are being pronounced the way they choose, with the right intonation or regional preference to meet their brand’s voice identity. You say tomato, I say tomahto. And so do WellSaid Labs’ voice avatars.”
New model focuses on improving the voice avatar’s correctness
“More words are pronounced correctly, more often. Sentence intonation is generally more natural, including questions, which are tough for other systems. We’ve also created our own text verbalization model to empower the AI to be smarter with non-standard words such as a dollar amount, a year, or a phone number. And it also does better with specialized text and speaking URLs, acronyms, or abbreviations,” explained Johnson.
WellSaid Labs’ voice avatars all come from real voice actors. Content creators now have even greater ability to ensure pronunciation and tone are exactly what they want, whether for narration, promotional or conversational content, or a unique custom character. Users can now type $30M, or the year 2022, and the system should interpret the text correctly as “thirty million dollars” or “twenty twenty two,” instead of “dollar thirty M” or “two thousand and twenty two.”
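To make the idea of text verbalization concrete, here is a minimal rule-based sketch in Python that expands the two examples above. It is not WellSaid Labs’ model, which the company describes as its own text verbalization model and whose internals are not published here. The function names, the regular expressions, and the K/M/B suffix table are all assumptions made for illustration.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]
# Hypothetical suffix table; a production system would cover far more forms.
SCALES = {"K": "thousand", "M": "million", "B": "billion"}

# "$" followed by 1-3 digits and an optional K/M/B suffix, e.g. "$30M".
DOLLAR_RE = re.compile(r"\$(\d{1,3})([KMB])?\b")
# Four-digit years from 1000 to 2099 that are not part of a dollar amount.
YEAR_RE = re.compile(r"(?<!\$)\b(?:1\d|20)\d{2}\b")


def two_digits(n: int) -> str:
    """Spell out an integer from 0 to 99."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]} {ONES[ones]}"


def three_digits(n: int) -> str:
    """Spell out an integer from 0 to 999."""
    hundreds, rest = divmod(n, 100)
    if hundreds == 0:
        return two_digits(rest)
    if rest == 0:
        return f"{ONES[hundreds]} hundred"
    return f"{ONES[hundreds]} hundred {two_digits(rest)}"


def _dollars(match: re.Match) -> str:
    amount, scale = int(match.group(1)), match.group(2)
    words = three_digits(amount)
    if scale:
        return f"{words} {SCALES[scale]} dollars"
    return f"{words} dollar" if amount == 1 else f"{words} dollars"


def _year(match: re.Match) -> str:
    first, last = divmod(int(match.group(0)), 100)
    if last == 0:
        # 2000 reads as "two thousand", 1900 as "nineteen hundred".
        if first % 10 == 0:
            return f"{ONES[first // 10]} thousand"
        return f"{two_digits(first)} hundred"
    if last < 10:
        # One simple convention: 1905 reads as "nineteen oh five".
        return f"{two_digits(first)} oh {ONES[last]}"
    return f"{two_digits(first)} {two_digits(last)}"


def verbalize(text: str) -> str:
    """Expand dollar amounts first, then standalone years."""
    text = DOLLAR_RE.sub(_dollars, text)
    return YEAR_RE.sub(_year, text)


print(verbalize("Revenue reached $30M in 2022."))
```

A fixed rule set like this treats every matching four-digit number as a year, whether or not it is one. That is the kind of ambiguity a context-aware model is meant to resolve.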