Why should you measure bots like humans?

Ben McCulloch
November 16, 2022
in Article, Opinion


You’ll get a variety of answers if you ask anyone with a call centre bot how they measure its success. They might measure containment rate (the share of callers the bot served without any help from a human agent). They might also measure CSAT (customer satisfaction score) and NPS (net promoter score).
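For readers who haven’t worked with these numbers, here is a minimal sketch of how they are commonly calculated. The `Call` record and its field names are hypothetical, invented for this illustration. The CSAT convention used here (the share of 4s and 5s on a five-point survey) is one common choice, and your own reporting may define it differently. NPS follows the standard definition: the percentage of promoters (9 or 10) minus the percentage of detractors (0 to 6).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Call:
    """Hypothetical record of one call; field names are illustrative only."""
    handled_by_bot_only: bool      # True if no human agent was involved
    csat: Optional[int] = None     # 1-5 survey answer, if the caller gave one
    nps: Optional[int] = None      # 0-10 "would you recommend us?" answer


def containment_rate(calls: List[Call]) -> float:
    """Share of calls the bot handled without a human agent."""
    return sum(c.handled_by_bot_only for c in calls) / len(calls)


def csat_score(calls: List[Call]) -> float:
    """Share of survey answers that were 4 or 5 (an assumed convention)."""
    answers = [c.csat for c in calls if c.csat is not None]
    return sum(a >= 4 for a in answers) / len(answers)


def nps_score(calls: List[Call]) -> float:
    """Percentage of promoters (9-10) minus percentage of detractors (0-6)."""
    answers = [c.nps for c in calls if c.nps is not None]
    promoters = sum(a >= 9 for a in answers)
    detractors = sum(a <= 6 for a in answers)
    return 100 * (promoters - detractors) / len(answers)


# Made-up sample data, purely for illustration.
calls = [
    Call(handled_by_bot_only=True, csat=5, nps=10),
    Call(handled_by_bot_only=True, csat=2, nps=3),
    Call(handled_by_bot_only=False, csat=5, nps=9),
    Call(handled_by_bot_only=False, csat=4, nps=8),
]

print(containment_rate(calls), csat_score(calls), nps_score(calls))
```

Notice that in this made-up sample one of the two “contained” calls left the caller unhappy, which the containment figure alone would never tell you.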

Those are flawed success metrics, as Frank Schneider explains to Kane Simms in his VUX World interview.

How do you measure any conversation?

We’re too focused on the numbers these days. You could focus on the accuracy of your NLU (Natural Language Understanding) for example – it’s a very important component of your conversational AI system – but what about resolving customer problems?

Bear with me here. I want to look at this from a different angle. Let’s turn this around and take the focus off AI for a moment. How would you measure the success of a human-to-human conversation between customer and brand in any context?

We’ll boil it down to a simple everyday scenario.

Let’s say you go into your neighbourhood bakery to buy a loaf of bread. You don’t know the staff personally, but you’ve been there before. The first baker says “what would you like?”, so you say you want that sesame loaf they bake so well. Another baker overhears you and places your loaf of bread on the counter. They say nothing and return to what they were doing. As you pay, the first baker says “I know you love this sesame seed loaf – we’ve been trying different recipes so you should come back next Monday to try our improved version.”

You leave the shop with a smile on your face, and then someone appears with a clipboard and asks you to rate each baker. How would you do it? Although the total experience was just fine, if you put each interaction under a microscope you could say the bakers were underperforming.

Baker one asked what you wanted, but didn’t give it to you. Baker two said nothing but gave you what you wanted. Finally, baker one teased you with a new product which you might never buy. So, baker one was pretty ineffective really? They only asked you a question and floated an advert to you but delivered nothing. How high would you rate them? Probably quite low. Baker two could be seen as antisocial because they said nothing, but in fact they gave you exactly what you wanted.

The combined result was that everything went smoothly and you got what you came for with minimal friction! Measuring each baker’s effectiveness gives a slightly different perspective though, doesn’t it?

That’s the challenge

It’s not so easy to measure any customer experience, whether it involves humans or bots. In the above scenario, baker one didn’t need to get the bread because baker two did it. Baker two didn’t need to say anything – would “here’s the loaf of bread you just asked for” or “there you are” have helped the situation at all?

You could say some call centre bots are like baker one – asking questions and then escalating to an expert. But bots can also be baker two – delivering what you need so seamlessly you barely notice, in the background, without any communication.

And when someone talks to a call centre, they don’t always stay on the same subject. They might suddenly remember something else they wanted, or get distracted and ask to leave so they can return to the conversation later. Can you measure the success of conversations in those scenarios?


Speakeasy are thinking differently

As you can see, we’re not dealing with situations that can be assessed on simple criteria. You really should check out Speakeasy AI’s whitepaper. Rather than focusing on metrics such as containment (a poor metric, because a proper bot integration lets live agent and bot work as a team rather than as competitors), they propose measuring the ‘correctness’ (how closely the response aligns with the truth) and ‘fluency’ (how smooth and effortless the flow of conversation is) of conversational AI.

It’s all about measuring your bots as you would measure your live agents, rather than focusing on out-of-date metrics:

“If you introduce measures around business KPIs, contextual awareness, courtesy, and generosity, then you’ll train your AI to be so much more than just error-proof. When done right, conversational AI can significantly improve the quality of your customer service — we just need to assess and train virtual assistants with as much enthusiasm as we invest in human agents.”

You can download the whitepaper – it’s baked perfectly and full of goodness 😎

You can hear Frank’s full interview here where he gives some great insights into conversational AI.

This article was written by Benjamin McCulloch. Ben is a freelance conversation designer and an expert in audio production. He has a decade of experience crafting natural-sounding dialogue: recording, editing and directing voice talent in the studio. His work includes dialogue editing for Philips’ ‘Breathless Choir’ series of commercials, a Cannes Pharma Grand-Prix winner; leading teams in localizing voices for Fortune 100 clients like Microsoft; and sound design and music composition for video games and film.

