Overcoming the shortfalls of voice AI

Ben McCulloch
September 13, 2023
in Article, Opinion


Speech is rich in detail. Everything we say can convey meaning, and we rarely talk in silent environments. For machines to parse human speech, they need to collect as much information as possible and separate the desired signal from the background noise.

While automatic speech recognition (ASR) has improved, there’s still plenty of room to grow. For example, conversational assistants still struggle with interruptions, which are a normal part of conversation.

One company that’s done incredible things with natural language understanding is Action AI. Its CEO, John Taylor, spoke about it on VUX World’s stage at The European Chatbot & Conversational AI Summit in Edinburgh in 2023.

Customers still want to call you

It’s still common for businesses to use call centres. Rather than being replaced by chatbots and voice assistants, phone lines are also being automated so that they serve users’ needs better.

“I’m sure voice is not going away from a customer service perspective. It’s still about 70% of all contacts. Here are a few of the call centre challenges: calls are expensive to handle, customer service isn’t always great via voice, long wait times, difficulties getting first call resolution when you get through, and you don’t always get through to the right person every time.” John Taylor, CEO, Action AI

There are also specific challenges when automating voice detection, transcription and understanding: machines often struggle with detecting the start and end of speech, handling interruptions, dealing with disfluencies (e.g. stutters or repetitions), and recognising background noise.
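To make the disfluency problem concrete, here is a minimal sketch of a transcript clean-up step. It is an illustration only, not Action AI’s method: the function name `clean_disfluencies` and the filler list are assumptions made for this example.

```python
import re

# Assumed filler words; a real system would need a richer, language-specific list.
FILLERS = {"um", "uh", "er", "erm"}


def clean_disfluencies(transcript: str) -> str:
    """Drop filler words and collapse immediately repeated words."""
    # Lowercase the text and keep only word-like tokens (letters, digits, apostrophes).
    words = re.findall(r"[a-z0-9']+", transcript.lower())
    cleaned = []
    for word in words:
        if word in FILLERS:
            continue  # skip fillers such as "um"
        if cleaned and cleaned[-1] == word:
            continue  # skip a stuttered repeat of the previous kept word
        cleaned.append(word)
    return " ".join(cleaned)


print(clean_disfluencies("Um, my my ID number is, uh, is 123"))
```

A rule this simple also shows why the problem is hard: a caller who reads out “one one two” as part of an ID number would have the repeated digit collapsed, so a naive clean-up can destroy exactly the information the assistant asked for.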

Every domain has bespoke needs

Action AI has aimed to solve these problems by tailoring their solution to each client’s specific domain, such as banking or utilities, as they each have different needs.

But while each domain may use specific language and have certain needs, there are commonalities among many customer service calls.

John Taylor on the stage of VUX @ The European Chatbot and Conversational Summit 2023 in Edinburgh

End of speech detection challenges

One common challenge is deciding when the customer has finished speaking. Consider this example: a customer was asked for their ID number and said, “oh, my ID number – Let me find that for you,” and then stopped talking while they went to look for it.

Often, silence is used as a marker to detect the end of an input. Here, the customer’s unfinished sentence might be taken as their entire utterance, and the assistant would attempt to act on it, which would lead to errors because they’ve not yet given the information they were asked for.

The challenge is in creating systems that are aware that the customer hasn’t stopped talking yet! Soon after the silence they’ll start talking again and say something like, “got it – my ID number is 123…”

Action AI has considered such scenarios, and its technology will wait for the customer to return and give the vital information.
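The article doesn’t describe how Action AI implements this, but the idea can be sketched in a few lines. In the hypothetical example below, the silence allowance is extended when the caller’s latest words suggest they are about to pause. The class name `Endpointer`, the phrase list and both timeout values are assumptions chosen for illustration.

```python
from dataclasses import dataclass

# Assumed phrases that signal "I'm pausing, not finished".
HOLD_PHRASES = ("let me find", "let me check", "hang on", "one moment", "bear with me")


@dataclass
class Endpointer:
    normal_timeout: float = 1.0   # seconds of silence that normally ends a turn (assumed)
    hold_timeout: float = 20.0    # longer allowance after a hold phrase (assumed)

    def silence_limit(self, last_segment: str) -> float:
        """Return how long to tolerate silence after the latest speech segment."""
        text = last_segment.lower()
        if any(phrase in text for phrase in HOLD_PHRASES):
            return self.hold_timeout
        return self.normal_timeout

    def turn_is_over(self, last_segment: str, silence_seconds: float) -> bool:
        """Decide whether the caller has finished their turn."""
        return silence_seconds >= self.silence_limit(last_segment)


endpointer = Endpointer()
# Three seconds of silence after a hold phrase: keep waiting.
print(endpointer.turn_is_over("Oh, my ID number, let me find that for you", 3.0))
# Three seconds of silence after the actual answer: the turn is over.
print(endpointer.turn_is_over("Got it, my ID number is 123", 3.0))
```

Keyword matching like this is brittle, since callers can signal a pause in countless ways, which is one reason a production system would likely rely on a trained model rather than a fixed phrase list.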

To enhance the system further, they incorporated GPT models, which allow them to parse complex user utterances. Traditional NLU systems are designed to respond to pre-decided user inputs, whereas LLMs can parse unexpected inputs to glean the user’s intent.

A voice UI should remove friction

Put simply, people want to talk to machines as they would talk to another person. They expect it to work.

For machines, this means they need to be able to glean meaning from diverse and complex utterances, as expressed by diverse and nuanced people!

Describing his own speech, John said, “I’ll be verbose. I’ll go off subject, I’ll change my mind. I’ll stutter. I’ll be human. That’s okay. It’s valid for me to be human.”

That applies to every single person, and every single customer. Our goal is for machines to be able to understand everyone better.

According to Action AI, that needs domain-specific training and advanced tools to ensure a seamless and effective customer experience.

Watch John’s entire presentation for live demos of Action AI.

