AI Research Could Help Improve Alexa’s Speech Recognition Model by 15%
Researchers from Johns Hopkins University and Amazon published a new paper describing how they trained a deep learning system that can help Alexa ignore speech not intended for her, improving speech recognition performance by 15%.
“Voice-controlled household devices, like Amazon Echo or Google Home, face the problem of performing speech recognition of device-directed speech in the presence of interfering background speech,” the researchers stated in their paper.
To address this, the researchers trained a neural network to match and recognize the subsequent speech of whoever utters the “wake word” – usually ‘Alexa’ – and to ignore interfering speech from other people or media.
“The challenge of this task is to learn a speaker representation from a short segment corresponding to the anchor word,” the researchers said. “We implemented our technique using two different neural-network architectures. Both were variations of a sequence-to-sequence encoder-decoder network with an attention mechanism.”
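To make the attention mechanism concrete, here is a minimal NumPy sketch of dot-product attention in an encoder-decoder network. This is an illustrative toy, not the authors' code: the dimensions and the random features are made up for the example.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attend(decoder_state, encoder_states):
    # Dot-product attention: score each encoder frame against the
    # current decoder state, normalize the scores into weights that
    # sum to 1, and form a weighted-average context vector.
    scores = encoder_states @ decoder_state   # shape (T,)
    weights = softmax(scores)                 # shape (T,), sums to 1
    context = weights @ encoder_states        # shape (d,)
    return weights, context

# Toy example: 8 encoder frames of 4-dim acoustic features.
rng = np.random.default_rng(0)
enc = rng.normal(size=(8, 4))
dec = rng.normal(size=4)   # current decoder state
w, ctx = attend(dec, enc)
```

At each decoding step the decoder re-computes these weights, letting it focus on different parts of the input audio.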
Using the OpenSeq2Seq toolkit, which is built on TensorFlow and supports distributed, mixed-precision training of sequence-to-sequence models, the team trained their algorithm on NVIDIA V100 GPUs with 1,200 hours of live English-language data from Amazon Echo devices.
Modifying the baseline network, the team added an additional input that augments the attention mechanism by prioritizing speech similar to the anchor word. “During training, the attention mechanism automatically learns which acoustic characteristics of the wake word to look for in subsequent speech,” the team explained.
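One simple way to picture this anchor-conditioned attention is to add a similarity term between each input frame and an embedding of the wake-word segment, biasing the attention weights toward frames from the same speaker. The sketch below is a hedged illustration of that idea in NumPy, not the paper's actual architecture; the embeddings and the `bias_weight` parameter are invented for the example.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax.
    e = np.exp(x - x.max())
    return e / e.sum()

def anchor_biased_attention(decoder_state, encoder_states,
                            anchor_embedding, bias_weight=1.0):
    # Usual content-based scores between decoder state and frames...
    base = encoder_states @ decoder_state
    # ...plus a bias favoring frames similar to the anchor (wake-word)
    # embedding, i.e. speech likely from the device-directed speaker.
    sim = encoder_states @ anchor_embedding
    return softmax(base + bias_weight * sim)

# Toy example: three frames; the middle one matches the anchor direction.
enc = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
anchor = np.array([0.0, 1.0])
dec = np.zeros(2)  # neutral decoder state isolates the bias term
w_biased = anchor_biased_attention(dec, enc, anchor, bias_weight=2.0)
w_plain = anchor_biased_attention(dec, enc, anchor, bias_weight=0.0)
```

With the bias switched on, the frame resembling the anchor receives a larger attention weight than it would under plain attention; in the real system the similarity function is learned rather than a fixed dot product.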
The team also developed a mask-based model that more explicitly matches the input speech with the acoustic profile of the anchor word.
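A mask-based model can be pictured as computing a soft, per-frame gate from each frame's similarity to the anchor-word profile, then attenuating frames that don't match. The following NumPy sketch is an assumption-laden toy, not the paper's model: the cosine-similarity scoring, sigmoid gating, and `sharpness` parameter are all illustrative choices.

```python
import numpy as np

def anchor_mask(frames, anchor_embedding, sharpness=5.0):
    # Soft per-frame mask in (0, 1): high where a frame's cosine
    # similarity to the anchor-word embedding is high.
    norms = np.linalg.norm(frames, axis=1) * np.linalg.norm(anchor_embedding)
    sims = (frames @ anchor_embedding) / (norms + 1e-9)
    return 1.0 / (1.0 + np.exp(-sharpness * sims))

# Toy frames: aligned with the anchor, orthogonal to it, and opposed.
anchor = np.array([1.0, 0.0])
frames = np.array([[2.0, 0.0], [0.0, 3.0], [-1.0, 0.0]])
mask = anchor_mask(frames, anchor)
masked = frames * mask[:, None]  # attenuate non-matching frames
```

Frames matching the anchor's acoustic profile pass through nearly unchanged, while mismatched frames are suppressed before recognition, which is the sense in which the mask "explicitly" matches input speech to the anchor word.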
In the end, the attention-based approach performed better, achieving the full 15% improvement, while the mask-based model achieved a 13% improvement.
The algorithm uses NVIDIA GPUs for both training and inference, the researchers said.