Deepfakes, Voice Commerce and the Future of AI in Business
A deepfake is machine-generated audio (or video) of a person saying or doing things they never actually said or did. It is produced by machine learning tools that extract a subject’s usable audio data from press conferences, chat recordings, film appearances, TV interviews, and more. Deepfakes derived from a mediocre algorithm or limited sources may sound stilted, robotic, or unconvincing, while more robust deepfakes can fool listeners into thinking they are hearing the real person.
Deepfakes for Good
In Val, the new documentary by and about actor Val Kilmer, who is unable to speak after surviving throat cancer, producers decided to have his son read the narration. But Kilmer could just as easily have opted for a deepfake: just last year, he commissioned Sonantic, a British AI company, to develop an artificial voice for him. Today, he can use this program to convert his written words into a spoken statement. For Kilmer and others in similar situations, deepfake technology is a blessing.
Other uses of deepfake technology inspire ambivalence or even anger. When the documentary Roadrunner: A Film About Anthony Bourdain aired earlier this year, many friends and admirers objected to director Morgan Neville’s decision to re-create Bourdain’s voice for a few short segments of the film.
The Dark Side of Deepfakes
Of course, this technology in the wrong hands can be purely malignant, as recent fraud attempts have demonstrated. More people are recording and sharing their voices than ever before, which makes it easier for bad actors to scrape audio for deepfake construction. If someone is constantly posting TikToks, always keeping Discord open, or running a YouTube channel, that person’s voiceprint is not secure. A fraudster can use these sources to construct an artificial replica of the voice. While this might sound futuristic, exploit attempts based on this technology are happening every day.
In mid-2020, one firm was nearly compromised by a deepfake voice message purporting to come from the company’s CEO. That firm got lucky, but a corporate audio deepfake attack succeeded in 2019, costing an unnamed business hundreds of thousands of dollars. The fraudsters built their fake voice from publicly available samples of the intended victim’s real voice, including investor calls, TED Talks, and television and YouTube appearances. Scams designed to fool employees who have access to company bank accounts remain relatively uncommon, but the incidence of deepfake scam attempts rose 350% over the last five years.
A Growing Market for Voice Commerce
Voice tech is proving more valuable every year with the explosive growth in the digital hub market. By the end of last year, almost one in every five consumers had made a purchase through a voice commerce platform like Amazon Echo; next year, voice commerce is expected to account for $40 billion in sales.
Voice commerce, when working properly, is almost frictionless. It appeals to everyone from homebound senior citizens to youthful tech enthusiasts. As more and more people opt in to digital commerce, opportunities for voice fraud will grow unless steps are taken to keep these channels secure.
Finding the Fakes
Sometimes clues to fraudulence are hidden in high- or low-frequency sounds that our ears cannot detect. Deepfakes that don’t sound quite right are often layered with static or background noise in an attempt to disguise their flaws and distract a human listener. Thankfully, an AI-based verification algorithm can detect signal issues, including quantization, codec artifacts, dropped frames, and packet loss. On top of that, if the same number makes frequent dubious calls, it can be blacklisted, while data relating to legitimate callers’ voiceprints may be stored for future verification.
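The blacklisting step described above can be illustrated with a small sketch. It is not taken from any particular vendor's product: the class name `CallerScreen`, the `max_dubious` threshold, and the idea that each call arrives with a true/false "dubious" verdict from an upstream verification algorithm are all assumptions made for illustration.

```python
from collections import Counter


class CallerScreen:
    """Toy caller screen: block a number after repeated dubious calls.

    Hypothetical example. A real system would receive the `dubious`
    verdict from an audio verification model and would persist its state.
    """

    def __init__(self, max_dubious=3):
        # Assumed threshold: how many dubious calls trigger a block.
        self.max_dubious = max_dubious
        self.dubious_counts = Counter()
        self.blocked = set()

    def record_call(self, number, dubious):
        """Record one call and return True if the number is now blocked."""
        if dubious:
            self.dubious_counts[number] += 1
            if self.dubious_counts[number] >= self.max_dubious:
                self.blocked.add(number)
        return number in self.blocked

    def is_blocked(self, number):
        """Return True if the number has been blacklisted."""
        return number in self.blocked
```

A caller that trips the verification check once is only counted; it is blocked once its count reaches the threshold, so a single false alarm does not lock out a legitimate customer.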
Looking Ahead in 2022 With Deepfakes and Voice Commerce
When Sonantic first attempted to recreate Val Kilmer’s voice, the company found that many of its sources, like films and interview appearances, left a lot to be desired because of background noise. It eventually located better archival files, but a few years from now deepfake creators may not even need high-quality sources to deliver a convincing replication.
We are producing more recorded audio than any previous generation. And, of course, the algorithms and AI used for deepfake creation are getting more sophisticated by the day. If large institutions want to take part in the exciting and growing voice commerce market, it is incumbent on them to protect themselves and their customers.