Atexto join us to dive under the hood of training automatic speech recognition (ASR) systems.
Training speech recognition systems
We take it for granted that, when we talk to our voice assistants, they hear us. When we dictate a message to Siri, it transcribes it. Speech recognition is used in all manner of areas beyond voice assistants, too: transcribing videos on YouTube, PowerPoint, dictation, call transcription in contact centres and beyond. But speech recognition systems don’t always perform as expected, and they require constant training to address bias and to keep up with new languages and dialects.
Why do automatic speech recognition systems need training?
There are many cases where speech recognition systems either fail or need further training. For example, training speech recognition models to remove bias is a core priority for all speech recognition providers today. Studies have shown that the major ASR players’ technologies perform worse when used by black people than when used by white people. Addressing this performance gap is of utmost importance.
Another reason for training ASR systems is expansion into other countries, which means catering to other languages and accents. When Amazon, for example, aims to launch Alexa in, say, Brazil, it needs to train its speech recognition models on Brazilian Portuguese, including its dialects and accents. Understanding languages, accents and dialects is a core part of making AI accessible to the world, as well as of commercial expansion.
Finally, companies need to continue training ASR systems to make sure that new vocabulary and phrases entering a given language are understood. Prior to 2020, the world wasn’t saying the terms ‘COVID’, ‘COVID-19’ or ‘SARS-CoV-2’ half as much. A few years prior, TikTok wasn’t a thing. Language changes, and so ASR systems need to change with it.
In this episode
In this episode, we chat to Esteban Gorupicz, CEO, and Alejandro Heredia, Technology Business Development and Sales, at Atexto about how they help ASR companies train speech recognition systems and tackle the above problems.
Atexto is one of the world’s leading providers of speech recognition training. Its platform allows ASR providers to crowdsource training data for speech recognition systems, label and tag the data, then export it to feed into their models. It also has an ASR benchmarking tool, which allows providers to compare performance across ASR systems.
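To give a feel for what comparing performance across ASR systems can involve, here is a minimal sketch of a benchmark based on word error rate (WER), a widely used ASR accuracy metric. It is an illustration only and not a description of how Atexto's tool works. The function names, the system names (`system_a`, `system_b`) and the sample transcripts are all hypothetical, and the text normalisation (lowercasing and splitting on whitespace) is a simplifying assumption.

```python
from typing import Dict, List


def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1]


def tokenize(text: str) -> List[str]:
    # Assumption: lowercase and split on whitespace. Real benchmarks usually
    # also normalise punctuation, numbers and spelling variants.
    return text.lower().split()


def benchmark(references: List[str],
              outputs_by_system: Dict[str, List[str]]) -> Dict[str, float]:
    """Corpus-level WER per system: total word errors / total reference words."""
    total_words = sum(len(tokenize(r)) for r in references)
    if total_words == 0:
        raise ValueError("references must contain at least one word")
    scores = {}
    for name, hypotheses in outputs_by_system.items():
        if len(hypotheses) != len(references):
            raise ValueError(f"{name}: expected {len(references)} transcripts")
        errors = sum(
            edit_distance(tokenize(r), tokenize(h))
            for r, h in zip(references, hypotheses)
        )
        scores[name] = errors / total_words
    return scores


if __name__ == "__main__":
    # Hypothetical human-verified transcripts and made-up ASR outputs.
    references = ["turn on the kitchen lights", "what is the weather today"]
    outputs = {
        "system_a": ["turn on the kitchen lights", "what is the whether today"],
        "system_b": ["turn on kitchen light", "what the weather to day"],
    }
    for name, wer in sorted(benchmark(references, outputs).items()):
        print(f"{name}: WER = {wer:.2f}")
```

A lower WER means fewer word-level mistakes against the human-verified transcript. Running the same comparison separately on recordings from different accents, dialects or demographic groups is one way to surface the kind of performance gaps described above.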
Check out Atexto