When designing or developing a conversational assistant, like a chatbot or a voice assistant, there are three fundamental things the AI assistant should be able to do.
This might sound basic, but with all the renewed interest in conversational AI, fuelled by large language models and ChatGPT, it’s worth laying this out, because these fundamentals don’t change.
The three primary capabilities your AI assistant requires are:
1. Understanding language
This relates to the system’s ability to understand a user. This sounds silly and obvious, but it’s the first stumbling block for conversational AI systems.
We have mature tools available for natural language processing, speech recognition, and understanding. Intent-based NLU systems have formed the foundations of most chatbots over the last 5 years and are generally sound at predicting a user’s intent (provided they’re trained appropriately and optimised frequently).
Therefore, broadly, the language understanding aspect of conversation design has been well-addressed.
The role of LLMs in understanding
We’re in exploratory stages with large language models, but this technology has the potential to take an AI assistant’s ability to understand to human-level accuracy.
Our early experiments indicate that there’s value in having LLMs complement intent-based NLU systems, both by classifying the longer utterances that intent-based systems have always struggled with, and always will, and by helping to create intent training data. (LLM capabilities are broader than this, of course, but our clients operate high-value, high-consequence use cases, which aren’t fit for LLM-centric approaches today.)
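To make that pairing concrete, here is a minimal sketch of one way to route between the two. It is an illustration under assumptions, not a description of our production setup: the function names, the 15-word cut-off and the 0.7 confidence threshold are all hypothetical, and fake_nlu and fake_llm are stand-ins for a real intent model and a real LLM call.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Prediction:
    intent: str
    confidence: Optional[float]  # None when the LLM gives no score
    source: str  # "nlu" or "llm"


def route_utterance(
    utterance: str,
    nlu_classify: Callable[[str], Tuple[str, float]],
    llm_classify: Callable[[str], str],
    max_nlu_words: int = 15,      # assumed cut-off for a "long" utterance
    min_confidence: float = 0.7,  # assumed confidence threshold
) -> Prediction:
    """Use the intent model for short, confident cases; otherwise ask the LLM."""
    if len(utterance.split()) <= max_nlu_words:
        intent, confidence = nlu_classify(utterance)
        if confidence >= min_confidence:
            return Prediction(intent, confidence, "nlu")
    # Long or low-confidence utterances fall through to the LLM classifier.
    return Prediction(llm_classify(utterance), None, "llm")


# Hypothetical stand-ins so the sketch runs without any external service.
def fake_nlu(text: str) -> Tuple[str, float]:
    if "balance" in text.lower():
        return ("check_balance", 0.92)
    return ("unknown", 0.2)


def fake_llm(text: str) -> str:
    return "dispute_transaction"
```

The design choice here is that the intent model stays the default and the LLM only sees what the intent model can’t handle, which keeps the predictable path predictable.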
Although it’s early days with LLMs, and there are many gaps in terms of safety and quality assurance, it’s still safe to say that the broader technology’s ability to understand an input isn’t the limiting factor of conversational AI success.
2. Access to knowledge or capability
Obviously, for an AI assistant to be useful, once it’s understood someone, it needs to be able to provide an accurate response.
Various methodologies have been developed for accessing and reasoning with information. The fundamental point of any kind of chatbot or voice assistant is that it has content with which it can answer questions, the ability to retrieve data from business systems, or the ability to write to those systems. Without such knowledge or capability, the agent would have nothing to offer, rendering it redundant.
There are many ways to structure knowledge for chatbots, from external knowledge bases, to databases, to hard-coded responses. Providing knowledge through an AI assistant isn’t an issue today.
From a data retrieval or data-posting perspective, this is typically a solved problem, too. Many businesses have been through some degree of digital transformation and have their business data available via APIs.
They also have the means and security to provide access to those APIs, so they can be utilised in a conversational interaction. And even if they don’t, it’s not rocket science. The path to follow is well-trodden and clear.
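As a rough illustration of the three kinds of capability described above (static content, reading from a business system, and writing to one), here is a minimal sketch. Everything in it is hypothetical: the ORDERS dictionary stands in for a business system you would normally reach through an API, and the intent names and replies are made up for the example.

```python
from typing import Callable, Dict

# Hypothetical stand-in for a business system normally reached via an API.
ORDERS: Dict[str, str] = {"A100": "shipped"}

# Hard-coded content the assistant can answer with directly.
FAQ_ANSWERS: Dict[str, str] = {
    "opening_hours": "We're open 9am to 5pm, Monday to Friday.",
}

FALLBACK = "Sorry, I can't help with that yet."


def get_order_status(slots: Dict[str, str]) -> str:
    """Read capability: retrieve data from the business system."""
    order_id = slots.get("order_id", "")
    status = ORDERS.get(order_id)
    if status is None:
        return f"I couldn't find order {order_id}."
    return f"Order {order_id} is {status}."


def cancel_order(slots: Dict[str, str]) -> str:
    """Write capability: change data in the business system."""
    order_id = slots.get("order_id", "")
    if order_id not in ORDERS:
        return f"I couldn't find order {order_id}."
    ORDERS[order_id] = "cancelled"
    return f"Order {order_id} has been cancelled."


# Map each understood intent to the knowledge or capability behind it.
HANDLERS: Dict[str, Callable[[Dict[str, str]], str]] = {
    "opening_hours": lambda slots: FAQ_ANSWERS["opening_hours"],
    "order_status": get_order_status,
    "cancel_order": cancel_order,
}


def fulfil(intent: str, slots: Dict[str, str]) -> str:
    """Return a response for the intent, or a fallback if nothing backs it."""
    handler = HANDLERS.get(intent)
    if handler is None:
        return FALLBACK
    return handler(slots)
```

An intent with no handler behind it gets the fallback, which is the redundancy problem in miniature: understanding without capability has nothing to offer.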
This, therefore, is also not a limiting factor in conversational AI success.
3. Conversation management
An AI assistant’s ability to manage a conversation is arguably the single biggest determinant of success. Yet it’s the most underserved, misunderstood and neglected part of most conversational AI (CAI) initiatives.
I previously wrote about adjacency pairs and expandable sequences, the building blocks of conversation design. These are two critical elements that form part of conversation management, but there’s more to it than that.
To manage a conversation well, you need to design the end-to-end interaction in such a way as to give users the best possible chance of meeting their needs.
That means you need to have a good handle on conversation context and state. You need memory, reasoning, learning.
You need an understanding of business rules and logic. You need to cater for core primary conversational skills, and secondary conversational skills.
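To show what context, state and business rules look like at their very simplest, here is a minimal sketch of a slot-filling turn handler. It is an assumption-laden toy, not a recommended architecture: the money transfer task, the slot names, the prompts and the 1000 limit are all hypothetical, and real conversation management needs far more than this (repair, interruptions, confirmations and so on).

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical task: a money transfer that needs two pieces of information.
REQUIRED_SLOTS = ["amount", "recipient"]
PROMPTS = {
    "amount": "How much would you like to send?",
    "recipient": "Who would you like to send it to?",
}
TRANSFER_LIMIT = 1000  # hypothetical business rule
OVER_LIMIT = "That's over the transfer limit. Please enter a smaller amount."


@dataclass
class ConversationState:
    slots: Dict[str, str] = field(default_factory=dict)  # what we know so far
    history: List[str] = field(default_factory=list)     # assistant turns so far


def next_action(state: ConversationState, new_slots: Dict[str, str]) -> str:
    """Update state with what the user just said, then pick the next reply."""
    state.slots.update(new_slots)
    for name in REQUIRED_SLOTS:
        if name not in state.slots:
            reply = PROMPTS[name]  # ask only for what is still missing
            break
    else:
        # All slots are filled, so apply the business rule before confirming.
        if float(state.slots["amount"]) > TRANSFER_LIMIT:
            del state.slots["amount"]  # forget the invalid value
            reply = OVER_LIMIT
        else:
            reply = (
                f"Sending {state.slots['amount']} to "
                f"{state.slots['recipient']}. Shall I go ahead?"
            )
    state.history.append(reply)
    return reply
```

Because the state remembers the recipient from an earlier turn, the assistant never asks for it twice, which is the smallest possible example of why memory matters.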
The challenges of conversation management
The challenges in getting this right today abound. No tech platform I’ve seen to date provides out-of-the-box capabilities to handle the complexities of a conversation particularly well. At least, not in any consistent or standardised way.
For example, IBM Watson uses actions and steps, whereas Google Dialogflow CX has Flows and Pages. Neither comes with primary conversation skills out of the box; both leave it up to developers to craft the scaffolding for each new conversation.
For now, regardless of the technology you’re using and the NLP that sits underneath it (LLMs or otherwise), to have a meaningful and successful conversation with customers, you’re not going to be able to avoid these three fundamentals. Get them right, and you’re well on your way to delighted customers and great CX.
Stay tuned for the next few articles on the best way to structure your dialogue management systems to make sure you can handle conversations successfully and consistently.