Three smart ways to use LLMs alongside your NLU

Rebecca Christie
October 19, 2023
in Article

Three smart ways to use LLMs alongside your NLU https://vux.world/wp-content/uploads/Modern-Minimalist-Simple-Technology-Youtube-Channel-Art.png 2560 1440 Rebecca Christie Rebecca Christie https://secure.gravatar.com/avatar/61a62614f405a15d4978524b5df65a86?s=96&d=blank&r=g October 19, 2023 October 19, 2023

LLMs have made quite a mark haven’t they? They were just a whisper within conversational AI in the past few years. Now they’re getting an unbelievable amount of exposure. It’s been the year of ChatGPT.

Due to the fact that LLMs can convincingly talk about pretty much anything, most of the time, people have started to wonder if we need NLUs anymore. Why bother to spend time and money refining your intents, training data and entities, when an LLM can happily chat away for hours without it? And weren’t NLU-based bots too restrictive anyway? They could only take users down the paths you had predefined, so they couldn’t help anyone who came with a need you’d never considered.

In reality, though, it’s not either/or. And you should consider what happens when you take the best of both worlds. Each has their strengths and weaknesses – when used together, you can solve many of the issues the conversational AI industry has struggled with for years.

That was the focus of a recent ServisBOT webinar with Cathal McGloin, CEO, ServisBOT, and Kane Simms.

Here’s three smart ways to use LLMs alongside your NLU.

Using an LLM-fronted bot for improved semantic understanding

NLUs need to be trained so that the various things a user can say to it will be classified to specific intents. While training gives you control over the result – for example you can train the NLU so that a user who says ‘what’s my balance’ will be matched to the balance_check intent – it also means that you have to define, in advance, the various things users might say and need to find examples of those utterances. You define the rules that will be used to filter language.

LLMs are different. The likes of ChatGPT have been trained on a vast dataset which means that it should be able to predict how language behaves. That means that users can ask ‘what’s my balance’ as well as ‘how much money have i got’ or even colloquial phrases such as ‘have I any dough’ and the LLM should be able to match those words to the most likely intent (without giving tips on bread-making).

This means there’s great potential for using LLMs to front-end conversational AI assistants, where they can parse the user’s input to glean their need from it, and then route it to the correct intent.

Due to the fact that LLMs have been trained on vast datasets, they should be better suited at picking up the various ways users can say things.

Cathal presented a demo for an embassy where a user was asking about visa applications. Semantically similar phrases such as ‘how much does it cost for a visa’ and ‘how much does it cost to apply for a visa’ were correctly identified by the LLM as the user’s need to find out the price. The NLU treated each differently however, and while it provided pricing as a response to the first utterance, it treated the second utterance as a request to find out about the application process. That’s because the word ‘apply’ had higher weighting over ‘cost’ in the NLU model, which led to the NLU misinterpreting the question.

While an NLU could be updated to include this new utterance, and correctly match it to the right intent, as Cathal says, the benefits of using an LLM instead are: “twofold – one is it understands meaning and I don’t have to tell the bot the difference between applying for a visa and the cost of the visa anymore… and the second big one is, I actually don’t have to give my bot any training data. I just have to give it very clear language to say ‘this is the intent called ‘cost of a visa’ and I use the LLM to [know] when to trigger it.”

You still want to create intents, to ensure that users are directed down suitable paths, but your need for training data could be reduced if you use an LLM in this way, according to Cathal.

Creating guardrails around an LLM with a pre-designed flow

Here’s an innovative hybrid.

Flowcharts are common artefacts used when designing conversational AI assistants. They essentially allow you to pre-design the start, middle and end of a conversation. You start by outlining the boundaries of the experience (who the bot is, and what it can and can’t do), then the middle is where important information is shared or collected by the bot, and the various endings are the resolutions of different user needs.

In the past, the flowchart would define the paths a conversation could go down, and then the NLU would be used to ensure that it actually works in a live conversation. The NLU would capture the things users say and route them to the path that seems most correct, based on how the NLU was trained.

Cathal presented an alternative design. A flowchart was used to define the experience, but there was no NLU. Instead, the user’s inputs were being sent to ChatGPT which generated the response.

Guardrails within the design, meaning that the LLM isn’t free to respond in any way it sees fit. For example, jailbreaking LLMs is an unfortunate issue that needs to be considered when utilising them. Jailbreaking is basically hacking. Bad actors try to find a way to get the LLM to share something that they’re not supposed to, such as getting a bank’s bot to reveal how to break into the bank.

Cathal had predefined the experience within his flow, by telling ChatGPT to play the role of a security analyst, and telling it that it shouldn’t respond to users who are trying to elicit information that the bot shouldn’t provide.

This highlights how LLMs require us to change our thinking when creating conversational AI. Rather than designing everything we want to include in the bot, instead we give the bot a vast information resource and tell it everything we want it to exclude from its responses.

There are benefits to working this way which in the past would have required an enormous amount of work. For example, it’s possible to ask the bot a question in German, have it formulate a response for you that was sourced from an English document, and then the bot replies to you in German. In the past that would have required the coordination of the design, development and localisation teams to achieve, but with an LLM it’s apparently simple and easy, according to ServisBOT.

With this approach, you forego the challenge of training an NLU, and instead need to define how the LLM is constrained. It’s questionable how much time you save, as you likely need to regularly update the guardrails around the LLM as new issues are discovered.

Using an LLM to test and train a bot

NLUs are never ‘done’. They don’t work well when they have a ‘small’ amount of data (generally speaking, less than 50 utterances per intent, but ideally you want 100s or 1000s of utterances per intent). The challenge is that the more you add into them to try and ensure that the bot will interpret a user’s utterances correctly, the more likely you are to have confusion within the model, where it has false positives and false negatives.

It’s common practice to keep refining an NLU’s training to try and improve this, but it’s time-consuming work. You need to identify the confusion, find and add data to improve it, then train it and test it, and then analyse whether you’ve made it more robust or possibly made it worse. It’s not uncommon to discover that a change you made to the data with the best intentions made the model worse.

LLMs could help here. As they’re vast stores of data that contain various ways people say different things, an LLM could be used to first test the NLU with semantically similar utterances to check how well it identifies them, and secondly add additional data if it finds a weak spot.

Automating the testing of an NLU and the generation of new training data to strengthen it could potentially make the management of the NLU much easier. On any project the NLU’s training data should grow as the bot has more interactions. That’s good practice – you improve the training data as you observe how users talk with your bot. The challenge is that over time it becomes harder to manage. Using an LLM in this way can help a great deal to stay on top of the complex relationships between your intents and training data.

Summary

There’s years of hard-earned knowledge around the design and maintenance of NLUs. They work well when you know what your user wants to do and how they’re likely to ask for it, and the process you need to go through to service their needs. There’s no reason to bin something that works just yet – a well trained NLU is robust enough to service most user’s needs.

While there are some who have been working with LLMs for years, they’re still a black box. As Cathal shows, there’s plenty of inventive ways to utilise an LLM alongside your NLU to glean the benefits of both. They can help users who have an unusual need, or express themself in an unexpected way. That stuff happens everyday with most bots, so LLMs are an asset if they help more users achieve their goals.

Why pick one or the other? When they’re combined you can help more users. Isn’t that what this is all about? Before we go all-in on one piece of technology, we should consider how the tech serves the people who use it, in all their various ways. With that in mind, you can see the benefits of both an NLU and an LLM.

Thanks to Cathal McGloin and the ServisBOT team for joining us for the webinar. If you’d like to take ServisBOT up on their offer of a Complimentary Bot Accuracy Analysis, you can sign up here.

Complete a short anonymous survey from ServisBot to share how your business uses NLU and LLM in building conversational AI experiences for a chance to win an iPad or a Google Pixel tablet!

Cookie	Duration	Description
__cf_bm	30 minutes	This cookie, set by Cloudflare, is used to support Cloudflare Bot Management.
bcookie	2 years	LinkedIn sets this cookie from LinkedIn share buttons and ad tags to recognize browser ID.
lang	session	This cookie is used to store the language preferences of a user to serve up content in that stored language the next time user visit the website.
lidc	1 day	LinkedIn sets the lidc cookie to facilitate data center selection.
resolution	session	This is a functionality cookie used to collect the horizontal value of the visitor screen resolution. It helps in optimizing the website view to the user.

Cookie	Duration	Description
_ga	2 years	The _ga cookie, installed by Google Analytics, calculates visitor, session and campaign data and also keeps track of site usage for the site's analytics report. The cookie stores information anonymously and assigns a randomly generated number to recognize unique visitors.
_gat_gtag_UA_111445333_1	1 minute	Set by Google to distinguish users.
_gid	1 day	Installed by Google Analytics, _gid cookie stores information on how visitors use a website, while also creating an analytics report of the website's performance. Some of the data that are collected include the number of visitors, their source, and the pages they visit anonymously.
ajs_anonymous_id	never	This cookie is set by Segment.io to check the number of ew and returning visitors to the website.
CONSENT	16 years 2 months 25 days 18 hours	YouTube sets this cookie via embedded youtube-videos and registers anonymous statistical data.

Cookie	Duration	Description
bscookie	2 years	This cookie is a browser ID cookie set by Linked share Buttons and ad tags.
IDE	1 year 24 days	Google DoubleClick IDE cookies are used to store information about how the user uses the website to present them with relevant ads and according to the user profile.
test_cookie	15 minutes	The test_cookie is set by doubleclick.net and is used to determine if the user's browser supports cookies.
VISITOR_INFO1_LIVE	5 months 27 days	A cookie set by YouTube to measure bandwidth that determines whether the user gets the new or old player interface.
YSC	session	YSC cookie is set by Youtube and is used to track the views of embedded videos on Youtube pages.
yt-remote-connected-devices	never	YouTube sets this cookie to store the video preferences of the user using embedded YouTube video.
yt-remote-device-id	never	YouTube sets this cookie to store the video preferences of the user using embedded YouTube video.
yt.innertube::nextId	never	This cookie, set by YouTube, registers a unique ID to store data on what videos from YouTube the user has seen.
yt.innertube::requests	never	This cookie, set by YouTube, registers a unique ID to store data on what videos from YouTube the user has seen.

Cookie	Duration	Description
__smSessionId	9 hours	No description available.
__smToken	1 year	This cookie is set by the Sumo. This cookie is used for verifying whether the user is logged in or not.
__smVID	1 month	This cookie is set by Sumo. The purpose of the cookie is not yet known.
_mailmunch_visitor_id	never	This cookie is set by MailMunch which is email collection and email marketing platform. We do not know the exact purpose of the cookie.
AnalyticsSyncHistory	1 month	No description
attribution_user_id	1 year	This cookie is set by the provider Typeform. This cookie is used for Typeform usage statistics. It is used in context with the website's pop-up questionnaires and messengering.
cookielawinfo-checkbox-functional	1 year	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
debug	never	No description available.
intercom-id-or0x2acp	8 months 26 days 1 hour	No description
intercom-session-or0x2acp	7 days	No description
li_gc	2 years	No description
li_sugr	3 months	No description available.
mailmunch_second_pageview	never	This cookie is set by MailMunch which is email collection and email marketing platform. We do not know the exact purpose of the cookie.
UserMatchHistory	1 month	Linkedin - Used to track visitors on multiple websites, in order to present relevant advertisement based on the visitor's preferences.

Three smart ways to use LLMs alongside your NLU

Using an LLM-fronted bot for improved semantic understanding

Creating guardrails around an LLM with a pre-designed flow

Using an LLM to test and train a bot

Summary

Yes, ChatGPT has a persona and you should be careful

New research: customers and agents tell CX leaders where they should focus