Bots take many shapes and forms, and each has its own unique challenges. How about this one though – a digital human installed in a busy train station concourse in England?
It responds to voice, which can include all sorts of distinctive regional UK accents (the station’s in Newcastle, where accents are strong) as well as people speaking other languages entirely. And people might ask it many more things besides “where’s platform three?” Now that’s an interesting use case! It’s well worth a closer look.
Luckily, Robert Cunningham, Innovation Manager at LNER (London North Eastern Railway, a British train operating company), spoke all about it with Kane Simms in a VUX World interview. The assistant is called Ella and she’s a great example of where voicebots could be heading.
Freeing staff to help customers
Train operating companies in Britain don’t just manage trains – they also take care of the stations their trains use. LNER manages 11 stations and has station staff who provide advice to travellers, but that’s not their primary role – safety is their main concern.
After LNER’s CIO learned about a European rail operator’s experiments with automated customer service, they saw an opportunity to try it for themselves. They wanted to see if a conversational avatar on the station floor could answer the routine questions their staff field – FAQs such as “where is platform three?” or “where can I buy coffee?”
This would free up staff to help customers with acute problems – if a customer is ill, or if there’s a safety issue, for example.
So many questions
While LNER have a chatbot on their website for navigation and FAQs, it’s available to anyone between Tunbridge Wells and Tokyo. On the other hand, Ella was going to be positioned on a station floor – their pilot project was based in Newcastle station in the North East of England. This meant that the context of the assistant was going to play a significant role.
Robert’s team researched the project so that Ella could answer simple questions like an employee who worked every day in that station. They interviewed Newcastle station staff and found that “they’re asked a much wider range of questions than are rail related.” For example, customers would ask about products they could buy in the station, or even about places to visit outside the station.
In order to refine their approach, Robert’s team asked “what do we think the needs would be of a customer walking through the station?” They discovered a great deal, for example there are many people walking through the station who aren’t travelling by train – they may be there to collect someone or just visit the shops.
So while Ella needed to answer questions about rail information, the assistant also needed to answer a diverse range of questions related to other things. They split question types into eight categories altogether. Ella is a branded LNER assistant with an accompanying avatar, but she’s trained to answer questions about other rail operators who also use the station.
One area where LNER’s staff struggle is talking to foreign customers who don’t speak English – their aim for Ella was to have her answer questions in a variety of languages too.
That awkward moment before the conversation begins
Robert’s team had done their research and had clear expectations of what customers would ask. That informed the assistant’s intents, and they knew there would be specific keywords they needed to detect. Speech detection turned out to be a big challenge though.
Training speech recognition systems for accents and noisy environments
Accents in the North East of England can be very strong. Robert’s team had to train the assistant to understand accents from that region, so they enlisted volunteers from LNER to say UK station names. This gave them training data to prepare the Natural Language Understanding (NLU) component of the bot.
But that’s only part of the challenge. While the system could be trained to understand regional accents in an office environment, stations are noisy! Tannoy announcements, the squeal of trains braking, and the hubbub of people passing through all add up to a constant din that affects the microphone’s ability to pick up the user’s utterance. Therefore, they had to train their voice AI to work in noisy environments.
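One common way to prepare a recogniser for an environment like this is noise augmentation: mixing recordings of the target environment into clean training audio at controlled signal-to-noise ratios. The interview doesn’t describe LNER’s actual pipeline, so this is just a minimal sketch of the idea, with toy signals standing in for real recordings of utterances and station din:

```python
import math
import random

def mix_at_snr(speech, noise, snr_db):
    """Mix noise into a speech signal at a target signal-to-noise
    ratio (in dB), producing a noisy training sample."""
    # Power of each signal (mean square of the samples).
    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(n * n for n in noise) / len(noise)
    # Scale the noise so p_speech / p_scaled_noise == 10^(snr_db / 10).
    target_ratio = 10 ** (snr_db / 10)
    scale = math.sqrt(p_speech / (p_noise * target_ratio))
    return [s + scale * n for s, n in zip(speech, noise)]

# Toy example: a 440 Hz tone standing in for an utterance,
# uniform random samples standing in for concourse noise.
rate = 16000
speech = [math.sin(2 * math.pi * 440 * t / rate) for t in range(rate)]
noise = [random.uniform(-1.0, 1.0) for _ in range(rate)]
noisy = mix_at_snr(speech, noise, snr_db=5)
```

Repeating this across a range of SNRs (and with varied noise recordings – announcements, braking trains, crowds) gives the model examples of the same utterance under many listening conditions.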
The value of transcriptions
The team took away much more besides. Due to the loud environment, customers were never sure the virtual human had heard them. Their solution was to display what the customer had said before the assistant replied to it, transcribing the user’s utterance on screen in real time. This is a great idea – not only does it give feedback that the system is listening, it also swiftly lets the customer know if they’ve been heard but misunderstood.
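Streaming recognisers typically emit a series of partial hypotheses before marking a result final, and that’s what makes this kind of live feedback possible. Here’s a minimal sketch of the loop – the class and method names are illustrative, not taken from LNER’s actual stack:

```python
class TranscriptDisplay:
    """Show each partial ASR hypothesis as it arrives, and only hand
    the utterance to the dialogue engine once the recogniser marks
    the result final."""

    def __init__(self):
        self.lines = []        # what the customer sees on screen
        self.final_text = None  # set once a result is final

    def on_result(self, text, is_final):
        self.lines.append(text)     # a real screen would redraw in place
        if is_final:
            self.final_text = text  # now the assistant can reply
        return self.final_text

# Simulated stream of partial results from a streaming recogniser.
display = TranscriptDisplay()
for text, final in [("where", False),
                    ("where is platform", False),
                    ("where is platform three", True)]:
    display.on_result(text, final)
```

The key design point is that the customer sees the transcript grow word by word, so a misrecognition is visible before the assistant commits to a wrong answer.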
And they also found that starting a conversation with Ella wasn’t always fluid. People would approach Ella and then stand and wait. There’s no touch screen – the interaction is all vocal. As people aren’t used to this kind of experience yet, Robert’s team had to tweak the system to make it feel natural. For example, once the interaction started people interrupted Ella often. Although that’s common in human conversation, it’s a problem when Ella’s Natural Language Processing (NLP) isn’t ready to detect an input.
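Voice systems usually handle this with a turn-taking policy: either drop speech that arrives while the assistant is talking, or support “barge-in”, where an interruption cuts the assistant off and the listener takes over. The interview doesn’t detail Ella’s implementation, so this is just a hypothetical sketch of the two behaviours:

```python
class TurnManager:
    """Toy turn-taking policy: ignore speech while the avatar is
    talking unless barge-in is enabled, so interruptions never reach
    intent detection before the system is ready to listen."""

    def __init__(self, barge_in=False):
        self.barge_in = barge_in
        self.speaking = False  # is the avatar currently talking?

    def start_reply(self):
        self.speaking = True

    def finish_reply(self):
        self.speaking = False

    def on_user_speech(self, utterance):
        if self.speaking and not self.barge_in:
            return None            # dropped: the system isn't listening yet
        if self.speaking:
            self.finish_reply()    # barge-in: cut the avatar off and listen
        return utterance           # forward to intent detection
```

With barge-in off, an interruption is silently dropped; with it on, the avatar stops talking and processes the new utterance – the behaviour that feels natural to people who interrupt as they would in human conversation.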
As Robert knew, the bot had to fulfil its purpose otherwise the public would give up. “If it’s not resolving an answer, and they actually then walk off and talk to a member of staff to get resolution, basically, the avatar has failed to do what it was supposed to do.”
Ella’s hella good
Ella was in Newcastle station for a 10-week trial. So, how successful was she? For Robert’s team, the focus of this pilot was only to find out if Ella could resolve customer queries, and if she reduced the number of questions their staff got asked.
He says the data validated the pilot. When people wanted to ask about train times, platforms and FAQs, the assistant was extremely successful.
To develop the concept further, though, there are technical hurdles related to deploying a voice-first avatar in a busy station that need resolving. The main challenge is cost. Digital avatars are currently expensive to create, and the hardware required to put them on every station platform and in every carriage is cost-prohibitive at present.
It’s well worth checking out the full interview as Robert gives enlightening feedback on their experiences with avatar design, multimodality, dealing with station noise and more.
This article was written by Benjamin McCulloch. Ben is a freelance conversation designer and an expert in audio production. He has a decade of experience crafting natural-sounding dialogue: recording, editing and directing voice talent in the studio. Some of his work includes dialogue editing for Philips’ ‘Breathless Choir’ series of commercials, a Cannes Pharma Grand Prix winner; leading teams in localizing voices for Fortune 100 clients like Microsoft; as well as sound design and music composition for video games and film.