Google’s SpecAugment Achieves Best-In-Breed Speech Recognition Without a Language Model

By Viraj T On Apr 23, 2019

Researchers at Google are applying computer vision techniques to images generated from sound waves to achieve best-in-breed speech recognition without a language model. The researchers say that the new SpecAugment method needs no additional data and no language model to recognize human speech accurately.

SpecAugment works by applying data augmentation techniques borrowed from computer vision to spectrograms (visual representations of speech).
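To make the idea concrete, here is a minimal sketch of the masking side of this kind of augmentation: random frequency bands and time spans of a spectrogram are blanked out, so the network cannot lean on any single region of the input. The function name, parameter names, and default values below are assumptions chosen for illustration and are not taken from Google's implementation. The sketch also leaves out time warping, which SpecAugment uses as well, and it assumes a normalized spectrogram in which zero is a sensible fill value.

```python
import numpy as np


def spec_augment(spec, num_freq_masks=2, max_freq_width=8,
                 num_time_masks=2, max_time_width=20, rng=None):
    """Return a masked copy of `spec`, shaped (freq_bins, time_steps).

    Hypothetical helper: names and defaults are illustrative only.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = spec.copy()  # leave the caller's array untouched
    n_freq, n_time = out.shape

    # Frequency masking: zero out a few random horizontal bands.
    for _ in range(num_freq_masks):
        width = min(int(rng.integers(0, max_freq_width + 1)), n_freq)
        start = int(rng.integers(0, n_freq - width + 1))
        out[start:start + width, :] = 0.0

    # Time masking: zero out a few random vertical spans.
    for _ in range(num_time_masks):
        width = min(int(rng.integers(0, max_time_width + 1)), n_time)
        start = int(rng.integers(0, n_time - width + 1))
        out[:, start:start + width] = 0.0

    return out


# Usage: augment a dummy spectrogram with 80 frequency bins and 200 frames.
dummy = np.ones((80, 200))
augmented = spec_augment(dummy, rng=np.random.default_rng(0))
```

Because the masks are drawn afresh on every call, the same recording can yield many different training inputs, which is how augmentation of this kind gets by without extra data.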

“An unexpected outcome of our research was that models trained with SpecAugment out-performed all prior methods even without the aid of a language model,” Google AI resident Daniel S. Park and research scientist William Chan said in a blog post today. “While our networks still benefit from adding a language model, our results are encouraging in that it suggests the possibility of training networks that can be used for practical purposes without the aid of a language model.”

Applied to speech recognition on LibriSpeech 960h, SpecAugment obtained a 2.6% word error rate. The method was also evaluated on Switchboard 300h. The two datasets consist of:

About 1,000 hours of spoken English (LibriSpeech 960h)
260 hours of telephone conversations in English (Switchboard 300h)
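Word error rate is conventionally the number of word substitutions, deletions, and insertions needed to turn the recognizer's output into the reference transcript, divided by the number of words in the reference. The sketch below computes it with a standard edit-distance table. The function name is an assumption for illustration, and the whitespace tokenization is a simplification; published benchmarks normalize the text more carefully before scoring.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length.

    Hypothetical helper: splits on whitespace and does no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = cur[j - 1] + 1   # extra word in hypothesis
            cur.append(min(substitution, deletion, insertion))
        prev = cur

    return prev[-1] / len(ref)


# Usage: one dropped word out of six reference words.
rate = word_error_rate("the cat sat on the mat", "the cat sat on mat")
```

On this scale, a 2.6% word error rate corresponds to roughly one wrong word in every 38 words of the reference transcript.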

Automatic speech recognition works by converting human speech into machine-readable text, which a system can then process and answer. A core part of conversational AI, the technology is used in a wide range of products, such as Amazon’s Alexa. Google says that better conversational AI capabilities will help drive adoption of the technology and the products built on it.

Advances in computing capabilities have already drastically lowered errors in speech recognition. Isolating background noise, for example, improves Alexa’s speech recognition capabilities by 15%.

We recently covered the semi-supervised training method for Alexa, which will improve voice recognition capabilities by 20%.

To repurpose or use any of the content or material on this and our sister sites, explicit written permission needs to be sought.

Copyright © 2026 AiThority. All Rights Reserved. Privacy Policy
