Voice AI Lego: the importance of modularity

Ben McCulloch
March 25, 2022
in Article, Opinion


Why a modular approach to voice AI technology selection gives you ultimate flexibility, according to Shawn Edmunds, CRO, LumenVox.

Let’s say you’re planning to create your first voice AI solution in-house, with your own team.

Do you build every component yourself, from the ground up? The ASR, the NLU, the dialogue management, the integrations, the TTS? Unless you’re Deutsche Telekom, with a clear reason and the resources to do it, that’s a sure way to burn time and money.

All of this stuff has already been built for you. You just need to stitch it together. Chances are, you’ll go looking for partners to provide the capabilities you can’t or won’t make yourself.

But how do you choose? And what happens when you reach the limits of your chosen technology or you develop new requirements that render it not fit for purpose anymore? Perhaps you select Dialogflow as your NLU. Then what if you outgrow Dialogflow in 3 years – how hard will it be to swap your NLU? What if Dialogflow doesn’t perform so well for certain use cases or on certain channels?

Shawn Edmunds, CRO of LumenVox, says: “You need to pick a partner that scales with you, have the ability to select modules that meet your needs, and be able to swap them when your needs change.”

And it’s not just for companies dipping their toes into voice for the first time…

Shawn says:

“Make it simple, make it cost effective, future proof it, and most important – it needs to scale”

So how does that work? The modular approach

It’s all about modules. You build your voice solution in-house using the best modules that fit your needs; perhaps you need an ASR that can be trained for your use case, or you need to support multiple languages, or distribute into another channel, or any other specific need. You just select the best modules for the job and stitch them together.

Even if you already have voice AI capabilities, you may want to change one piece of them, such as your voice biometric tool. Shawn says it’s all about being malleable: you need to be able to swap components when your needs change.

With a modular approach, you have full control over everything, but you don’t have to build everything. You can avoid vendor lock-in and change your architecture as your needs change. You still own and control the full experience.
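To make the idea concrete, here is a minimal sketch of what "stitching modules together" could look like in code. It is an illustration only, not LumenVox's or any other vendor's API: every class and method name below is hypothetical, and the stand-in modules do trivial work where a real adapter would call a vendor SDK. The point is that the pipeline depends on small interfaces, so one module can be replaced without touching the rest.

```python
from dataclasses import dataclass
from typing import Protocol


# Hypothetical interfaces: each module type exposes one small method.
class ASR(Protocol):
    def transcribe(self, audio: bytes) -> str: ...


class NLU(Protocol):
    def parse(self, text: str) -> str: ...


class TTS(Protocol):
    def synthesize(self, text: str) -> bytes: ...


# Stand-in modules (assumptions for illustration). A real adapter would
# wrap a vendor's SDK behind the same method signature.
class EchoASR:
    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8")


class KeywordNLU:
    def parse(self, text: str) -> str:
        return "check_balance" if "balance" in text.lower() else "fallback"


class AlternativeNLU:
    """A second NLU with the same interface, standing in for a replacement vendor."""

    def parse(self, text: str) -> str:
        words = set(text.lower().split())
        return "check_balance" if words & {"balance", "funds"} else "fallback"


class BytesTTS:
    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")


REPLIES = {
    "check_balance": "Your balance is on its way.",
    "fallback": "Sorry, could you rephrase that?",
}


@dataclass
class VoicePipeline:
    """Owns the end-to-end experience; knows only the interfaces above."""

    asr: ASR
    nlu: NLU
    tts: TTS

    def handle(self, audio: bytes) -> bytes:
        text = self.asr.transcribe(audio)
        intent = self.nlu.parse(text)
        return self.tts.synthesize(REPLIES[intent])


pipeline = VoicePipeline(asr=EchoASR(), nlu=KeywordNLU(), tts=BytesTTS())
first = pipeline.handle(b"What is my balance?")

# Swap the NLU module; the ASR, the TTS and the pipeline are untouched.
pipeline.nlu = AlternativeNLU()
second = pipeline.handle(b"Do I have enough funds")
```

In this sketch, changing NLU vendor comes down to writing one new adapter class. The conversation logic and the other modules stay as they are, which is the kind of flexibility the modular approach is meant to give you.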

And nobody may ever know your solution was built with third-party components. End users don’t care how it was built. To them, it just works.

Where you will be if you marry the right voice partner

Having flexibility in your tech stack will be even more important in future, as technology becomes ever more intertwined with our lives. Shawn says that in the next two to five years:

“Everything’s tied, all your devices are connected, whether it’s your phone, or your car, or an Alexa”

To achieve this future, our tech stacks are going to need a lot of modules, and they’ll be interconnected in various ways.

The final word

You could say that going modular will solve many of your troubles.

You’ll be able to incorporate new technologies as they arise, and in this industry there’s no telling what will appear tomorrow. New technologies can, and will, drastically alter tomorrow’s voice tech landscape. Companies need to build a tech stack that allows them to evolve with the times and with the changing needs and expectations of customers.

So you should probably find partners who can help you do that.

Listen to the full interview with Shawn Edmunds here or on Apple podcasts, Spotify, YouTube or wherever you get your podcasts.

This article was written by Benjamin McCulloch. Ben is a freelance conversation designer and an expert in audio production. He has a decade of experience crafting natural sounding dialogue: recording, editing and directing voice talent in the studio. Some of his work includes dialogue editing for Philips’ ‘Breathless Choir’ series of commercials, a Cannes Pharma Grand-Prix winner; leading teams in localizing voices for Fortune 100 clients like Microsoft, as well as sound design and music composition for video games and film.

