When creating AI voice assistants and chatbots, we are witnessing a phase of uncertainty and exploration similar to the one the web went through in the late ’90s. By understanding the basic structure of dialogue, including its patterns and sequences, we can start to design more predictable and robust conversations.
The need for conversation design norms
Just as web design did, the field of conversation design needs to evolve and establish norms and practices that guide how we structure AI-human dialogues.
In the late ’90s, when the Internet was blooming, web design was a relatively uncharted field. The creation tools were accessible to everyone, but there was no definitive standard or ‘best practice’ in designing websites.
The early web was filled with experimental and often amateurish designs. Fast forward to today, and web design has matured, drawing on disciplines like graphic design, typography, and photography.
When designing a conversational agent, one of those norms ought to be the understanding of basic conversation mechanics, including the structure of dialogue.
Understanding the ‘Structure of Dialogue’
At the heart of every conversation are patterns and sequences. One primary type of sequence is known as an adjacency pair. These pairs work on the principle of action and reaction. For example, an invitation prompts either acceptance or rejection.
If I were to say “Hi”, it’s an invitation to a conversation. You might say “Hi” back (acceptance), or smile and continue walking (rejection).
This two-turn conversation forms the basis of many of our daily interactions. These pairs can be expanded through preliminary questions or subsequent inquiries, crafting a more extensive and interactive dialogue.
Adjacency pairs
Adjacency pairs form the basis of any conversation, constituting an invitation and its acceptance or rejection. They are basic two-turn constructs like:
“Do you want to go out for a pint?”
Followed by a response:
“Yes, sure.”
Under ideal conditions, a conversation could run from beginning to end in a single pair like the one above. However, conversations are often more complex, and these pairs can expand depending on several contingencies.
Expandable sequences
The concept of ‘expandable sequences’ becomes essential here. A conversation isn’t merely an exchange of words but is based on a sequence of actions, which can expand or contract based on the context.
The base sequence can be expanded with preliminary questions, additional sequences inserted between the two pieces, and sequence closers at the end.
Preliminary questions
A preliminary question might be the speaker checking the listener’s availability before making the invitation, which expands the sequence. For example:
“What you doing this Thursday?”
“Nothing, why?”
“Do you want to go out for a pint?”
“Yes, sure.”
Additional sequences
Or, the listener might insert an additional sequence in between the pair by asking clarifying questions or gathering more information, such as:
“What you doing this Thursday?”
“Nothing, why?”
“Do you want to go out for a pint?”
“Where you thinking?”
“The Dog and Duck.”
“Yes, sure.”
Sequences can expand further as the pair negotiate the venue, meeting time and so on.
Sequence closers
Finally, a sequence closer is something that wraps up the conversation, signalling to both parties that the sequence is over:
“What you doing this Thursday?”
“Nothing, why?”
“Do you want to go out for a pint?”
“Where you thinking?”
“The Dog and Duck.”
“Yes, sure.”
“Cool, I’ll see you later.”
“See you soon.”
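To make the structure concrete, here is a minimal sketch of how the expanded pub conversation could be represented as data. The class names (`Turn`, `AdjacencyPair`, `ExpandableSequence`) and the speaker labels "A" and "B" are assumptions made for illustration, not part of any particular framework.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Turn:
    speaker: str
    text: str


@dataclass
class AdjacencyPair:
    first: Turn                    # the initiating action, e.g. an invitation
    second: Optional[Turn] = None  # the response, e.g. acceptance or rejection


@dataclass
class ExpandableSequence:
    """Hypothetical model: a base pair plus its optional expansions."""
    base: AdjacencyPair
    pre: List[AdjacencyPair] = field(default_factory=list)      # preliminary questions
    inserts: List[AdjacencyPair] = field(default_factory=list)  # between the base pair's two parts
    closers: List[AdjacencyPair] = field(default_factory=list)  # wrap-up at the end

    def turns(self) -> List[Turn]:
        """Flatten the sequence into the order the turns are spoken."""
        ordered: List[Optional[Turn]] = []
        for pair in self.pre:
            ordered += [pair.first, pair.second]
        ordered.append(self.base.first)
        for pair in self.inserts:
            ordered += [pair.first, pair.second]
        ordered.append(self.base.second)
        for pair in self.closers:
            ordered += [pair.first, pair.second]
        return [turn for turn in ordered if turn is not None]


pub = ExpandableSequence(
    base=AdjacencyPair(Turn("A", "Do you want to go out for a pint?"),
                       Turn("B", "Yes, sure.")),
    pre=[AdjacencyPair(Turn("A", "What you doing this Thursday?"),
                       Turn("B", "Nothing, why?"))],
    inserts=[AdjacencyPair(Turn("B", "Where you thinking?"),
                           Turn("A", "The Dog and Duck."))],
    closers=[AdjacencyPair(Turn("A", "Cool, I'll see you later."),
                           Turn("B", "See you soon."))],
)

for turn in pub.turns():
    print(f"{turn.speaker}: {turn.text}")
```

Notice that the base pair’s response (“Yes, sure.”) only arrives after the inserted pair has been resolved, which is exactly what makes the sequence “expandable”.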
Relevance for conversation design
The concepts of adjacency pairs and expandable sequences are crucial in designing an adaptable chatbot or voice assistant. Your adjacency pairs are essentially the constructs from which you’ll gather the required information to fulfil your use case.
For example, let’s say you’re creating an assistant for booking an appointment at a hairdresser’s. You’d need to gather an appointment day and time. That’s two adjacency pairs.
“What day would you like to come?”
“Tuesday”
“What time?”
“10am”
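As a rough sketch, those two adjacency pairs can be thought of as two pieces of required information, each with a question that opens the pair. The names below (`REQUIRED_SLOTS`, `next_question`, `run_booking`) are hypothetical, and the customer’s replies are taken at face value with no language understanding at all.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Each required piece of information is gathered by one adjacency pair:
# the assistant's question (first part) and the customer's answer (second part).
REQUIRED_SLOTS: Dict[str, str] = {
    "day": "What day would you like to come?",
    "time": "What time?",
}


def next_question(filled: Dict[str, str]) -> Optional[Tuple[str, str]]:
    """Return the next unanswered (slot, question), or None when all are filled."""
    for slot, question in REQUIRED_SLOTS.items():
        if slot not in filled:
            return slot, question
    return None


def run_booking(customer_replies: Iterable[str]) -> Tuple[Dict[str, str], List[Tuple[str, str]]]:
    """Walk through the required pairs, treating every reply as a direct answer."""
    filled: Dict[str, str] = {}
    transcript: List[Tuple[str, str]] = []
    replies = iter(customer_replies)
    step = next_question(filled)
    while step is not None:
        slot, question = step
        transcript.append(("assistant", question))
        answer = next(replies)
        transcript.append(("customer", answer))
        filled[slot] = answer
        step = next_question(filled)
    return filled, transcript


filled, transcript = run_booking(["Tuesday", "10am"])
for speaker, text in transcript:
    print(f"{speaker}: {text}")
```

This only covers the ideal path, where every question gets a direct answer.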
That sequence might expand if your customer asks “Do you have anything on Wednesday?” and further still if you end up negotiating a date and time:
“What day would you like to come?”
“Do you have anything on Wednesday?”
“We could do 2pm?”
“No, that won’t work. Anything earlier?”
“I’m afraid not. We have 10 am Tuesday?”
“Sure, that’ll work.”
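One way to picture this in code is to keep the base pair open until the customer accepts an offer, treating anything else as an expansion. The sketch below is deliberately naive: `classify` is a keyword-matching stand-in for real intent detection, and the `Booking` class and its list of free slots are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Slot = Tuple[str, str]  # (day, time)


def classify(reply: str) -> str:
    """Naive stand-in for intent detection (an assumption, not a real NLU model)."""
    text = reply.strip().lower()
    if text.startswith(("sure", "yes", "ok")):
        return "accept"
    # A counter-question or a rejection both keep the sequence open.
    return "expand"


@dataclass
class Booking:
    offers: List[Slot]                # hypothetical free slots, in the order to offer them
    pending: Optional[Slot] = None    # the offer currently awaiting a response
    confirmed: Optional[Slot] = None  # set once the customer accepts an offer

    def respond(self, reply: str) -> str:
        if classify(reply) == "accept" and self.pending is not None:
            self.confirmed = self.pending
            day, time = self.confirmed
            return f"Great, see you at {time} on {day}."
        if self.offers:
            # The customer expanded the sequence, so make the next offer.
            self.pending = self.offers.pop(0)
            day, time = self.pending
            return f"We could do {time} on {day}?"
        self.pending = None
        return "I'm afraid we have nothing else available."


booking = Booking(offers=[("Wednesday", "2pm"), ("Tuesday", "10am")])
replies = [
    booking.respond("Do you have anything on Wednesday?"),
    booking.respond("No, that won't work. Anything earlier?"),
    booking.respond("Sure, that'll work."),
]
for line in replies:
    print(line)
```

A real assistant would need to understand which day the customer asked about rather than walking a fixed list, but the shape is the same: the booking is only confirmed once the expansions have been resolved.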
With every question or response, you should anticipate and design for how a user might expand the sequence. Listening to and reading real conversation transcripts can inform your expandable sequences, too.
Focusing first on the required adjacency pairs, and then on their expansion and closing, will help you craft assistants that handle the twists and turns of a real conversation.
The next stage is implementing this in your tool of choice, a challenge that we’ll address in our next article.
—
This article summarises the VUX World podcast episode with Bob Moore of IBM. Listen to the full episode on YouTube, Apple Podcasts or Spotify.