
Gen AI could revolutionise voice too


How has generative AI impacted the voice space?

We’ve heard a great deal about LLMs being used in chatbots. While the most famous product is undoubtedly ChatGPT, LLM-based chatbots have been springing up everywhere, with varying degrees of success.

But what about generative AI for voice? There hasn’t been much fanfare about specific deployments.

That’s why it was great for Cooper Johnson of Gridspace to share his thoughts with Kane Simms on the VUX World podcast. Cooper appeared in the CAI hot seat after being voted in by attendees at this year’s Unparsed conference, as his presentation there had inspired a lot of folks.

Where we’re coming from

In the early days of voice AI, systems were primarily rule-based. These systems relied heavily on pre-programmed responses and strict dialogue flows, making them rigid and prone to errors when users deviated from expected inputs. Designers had to anticipate every possible user response and program the system accordingly. This approach was not only labour-intensive but also limited the flexibility and naturalness of conversations.
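
To make that rigidity concrete, here is a minimal, hypothetical sketch in Python of the kind of keyword-matched flow these systems relied on. The triggers and canned replies are invented for illustration; anything the designer didn’t anticipate drops straight into a generic fallback.

    # Hypothetical sketch of a rule-based voice flow: every user input must
    # match a pre-programmed pattern, or the dialogue falls over.

    RULES = {
        "check balance": "Your balance is one hundred pounds.",
        "opening hours": "We are open nine to five, Monday to Friday.",
    }

    def respond(user_utterance: str) -> str:
        # Exact keyword matching: any phrasing the designer did not
        # anticipate falls through to a generic fallback prompt.
        for trigger, reply in RULES.items():
            if trigger in user_utterance.lower():
                return reply
        return "Sorry, I didn't understand. Please say 'check balance' or 'opening hours'."

    print(respond("erm, how much money have I got?"))  # lands in the fallback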

For instance, error recovery in these systems required designers to write extensive fallback dialogues, which could result in a stilted experience, since designers had to consider every possible deviation a user might take and design a way to get them back on track. They ended up in a constant feedback loop: discovering where users got stuck, designing a fix, then testing and reading transcripts to see whether it worked, all while new dead ends kept surfacing. It could take weeks or months to fix a single dead end, yet users needed help at the exact moment they went off track.

Moreover, these traditional systems struggled with understanding the diverse ways in which users express themselves. Regional dialects, colloquialisms, and nuanced speech often confused these systems, leading to frequent breakdowns in communication. As a result, users were often left frustrated, unable to complete their tasks effectively.

Putting Gen AI in the mix

With the advent of generative AI, the landscape of voice AI has dramatically changed. Gridspace’s voice agent, Grace, is powered by large language models (LLMs) that allow for more fluid, natural conversations. One of the key advantages of this technology is its ability to handle a wide range of user inputs without extensive pre-programming.

Generative AI enables Grace to understand and respond to user queries in real-time, even when those queries are unexpected or off-script. This flexibility is a significant improvement over traditional systems, where any deviation from the expected dialogue path could result in a conversational dead-end. Grace’s ability to insert disfluencies or filler words, such as “hold on a second,” mimics human conversation, making interactions feel more natural and engaging.
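
As an illustration only (not Gridspace’s actual implementation), behaviour like this is often steered through a system prompt handed to the model on every turn. In the hypothetical Python sketch below, call_llm() is a placeholder for whichever model provider you use.

    # Hypothetical sketch of steering an LLM-backed voice agent with a
    # system prompt. call_llm() stands in for your model API of choice.

    def call_llm(messages: list[dict]) -> str:
        raise NotImplementedError("plug in your model provider here")

    SYSTEM_PROMPT = (
        "You are a voice agent for a utilities company. Keep replies short and "
        "speakable. If you need a moment to look something up, say a brief "
        "filler such as 'hold on a second' before answering. If the caller "
        "goes off-script, answer helpfully rather than forcing them back to a menu."
    )

    def reply(conversation_history: list[dict], user_turn: str) -> str:
        # Rebuild the message list each turn so the model always sees the
        # behavioural instructions plus the conversation so far.
        messages = [{"role": "system", "content": SYSTEM_PROMPT}]
        messages += conversation_history
        messages.append({"role": "user", "content": user_turn})
        return call_llm(messages)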

Moreover, the integration of a robust knowledge base allows Grace to provide detailed, contextually relevant information without overwhelming the user with unnecessary details. This approach not only improves the user experience but also reduces the cognitive load on the user, making interactions more efficient and satisfying.
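
One common way of grounding an agent in a knowledge base, offered here as a possible reading rather than a confirmed account of Gridspace’s architecture, is to retrieve only the passages relevant to the caller’s question and fold them into the prompt. In this sketch, retrieve() and call_llm() are placeholders, not a specific vendor’s API.

    # Hypothetical retrieval-grounded sketch: fetch only the knowledge-base
    # passages relevant to the caller's question and answer from them alone.

    def retrieve(query: str, k: int = 3) -> list[str]:
        raise NotImplementedError("plug in your search or vector store here")

    def call_llm(prompt: str) -> str:
        raise NotImplementedError("plug in your model provider here")

    def grounded_answer(question: str) -> str:
        passages = retrieve(question)
        prompt = (
            "Answer the caller's question using only the passages below. "
            "Keep it to one or two spoken sentences; leave out details they "
            "did not ask for.\n\n"
            + "\n".join(f"- {p}" for p in passages)
            + f"\n\nCaller: {question}"
        )
        return call_llm(prompt)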

Another critical development is the use of personality sliders within the Gridspace platform. These sliders allow designers to adjust the tone, empathy, and verbosity of the voice agent to match the brand’s personality or the specific needs of the interaction. For instance, in a customer service scenario where empathy is crucial, Grace can be adjusted to provide more emotionally responsive dialogue, enhancing the overall user experience.
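
The slider idea can be pictured as configuration that gets translated into prompt instructions. The sketch below is purely illustrative: the slider names, thresholds and wording are invented, and Gridspace’s platform will differ in the details.

    # Hypothetical illustration of "personality sliders": numeric settings
    # translated into behavioural instructions for the voice agent.

    PERSONALITY = {"empathy": 0.9, "verbosity": 0.3, "formality": 0.5}  # 0.0 to 1.0

    def personality_instructions(sliders: dict[str, float]) -> str:
        lines = []
        if sliders["empathy"] > 0.7:
            lines.append("Acknowledge the caller's feelings before giving facts.")
        if sliders["verbosity"] < 0.5:
            lines.append("Keep every reply under two sentences.")
        if sliders["formality"] < 0.5:
            lines.append("Use relaxed, conversational wording.")
        else:
            lines.append("Use polite, formal wording.")
        return " ".join(lines)

    print(personality_instructions(PERSONALITY))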

It’s time to talk about voice again

The advancements in gen AI have led to a significant improvement in the quality and reliability of voice interactions. Users are now able to engage with voice agents like Grace in a way that feels more natural and human. This shift has not only improved user satisfaction but also expanded the potential use cases for voice AI.

From customer service to healthcare, voice agents are now capable of handling a broader range of tasks with greater efficiency and accuracy.

Additionally, the ability of voice agents to recall and use contextual information throughout a conversation has opened up new possibilities for personalised interactions.

So, you see, while LLM-based chatbots have had a lot of headlines, we may soon see more attention on LLMs in voice experiences. As these systems continue to evolve, there’s great potential for personalised, natural interactions that can meet a wide range of user needs.

As practitioners, we need more modalities available than chat alone. Some use cases are simply better suited to voice. Thankfully, it looks like the purposeful application of LLMs in the voice space can help us to help more users.
