How to train an ASR (automatic speech recognition) engine with Esteban Gorupicz and Alejandro Heredia, Atexto

Kane Simms
April 19, 2021
in Podcast


Atexto joins us to look under the hood of training automatic speech recognition (ASR) systems.

AVAILABLE ON ALL PODCAST PLAYERS

Apple podcasts | Spotify | YouTube | Overcast | CastBox | Spreaker | TuneIn | Breaker | Stitcher | PlayerFM | iHeartRadio

Training speech recognition systems

We take it for granted that, when we talk to our voice assistants, they hear us. When we dictate a message to Siri, it transcribes it. Speech recognition is used in all manner of areas beyond voice assistants, too: transcribing videos on YouTube, PowerPoint, dictation, call transcription in contact centres and beyond. But speech recognition systems don’t always perform as expected, and they require constant training to address bias, new language and dialects.

Why do automatic speech recognition systems need training?

There are many cases where speech recognition systems either fail or need training. For example, training speech recognition models to remove bias is a core priority for speech recognition providers today. Studies have shown that the major ASR players’ technologies perform worse for Black speakers than for white speakers. Addressing this performance gap is of utmost importance.

Another reason for training ASR systems is expansion into other countries, which means catering to other languages and accents. When Amazon, for example, aims to launch Alexa in, say, Brazil, it needs to train its speech recognition models on Brazilian Portuguese, including its dialects and accents. Understanding languages, accents and dialects is a core part of making AI accessible to the world, as well as of commercial expansion.

Finally, companies need to continue training ASR systems to make sure that new vocabulary and phrases that enter a given language are understood. Prior to 2020, the world wasn’t saying ‘COVID’, ‘COVID-19’ or ‘SARS-CoV-2’ half as much. A few years earlier, TikTok wasn’t a thing. Language changes, and so ASR systems need to change with it.

In this episode

In this episode, we chat to Esteban Gorupicz, CEO, and Alejandro Heredia, Technology Business Development and Sales, at Atexto about how they help ASR companies train speech recognition systems and tackle the above problems.

About Atexto

Atexto is one of the world’s leading providers of speech recognition training. Its platform allows ASR providers to crowdsource training data for speech recognition systems, label and tag the data, then export it to feed into their models. It also has an ASR benchmarking tool, which allows providers to compare performance across ASR systems.
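
To make "comparing performance across ASR systems" a little more concrete, here is a minimal sketch of word error rate (WER), a metric commonly used for this kind of comparison. It is an illustration only: the show notes don't say which metrics Atexto's benchmarking tool uses, and the function name, engine names and sample transcripts below are all made up for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (classic Levenshtein dynamic programme).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical human transcript and hypothetical outputs from two engines.
reference = "tiktok views rose during covid"
outputs = {
    "engine_a": "tiktok views rose during covid",
    "engine_b": "tick tock views rose during covert",
}
scores = {name: word_error_rate(reference, text) for name, text in outputs.items()}
```

A lower score means fewer word-level mistakes against the human transcript. Running the same comparison separately for different accents, dialects or speaker groups is one simple way to surface the kind of performance gap described above.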

Links

Check out Atexto

Find out about the ASR World Championships

