The 3 types of NLU systems in conversational AI

Kane Simms
February 13, 2023


We recently spoke to Raj Koneru and Prasanna Arikala of Kore AI on the VUX World podcast, discussing Large Language Models (LLMs) and the forecasted impact they’ll have on the creation of enterprise AI assistants.

Raj shared his thoughts on the types of NLU systems that exist today and the benefits of each. This should help creators understand a little more about how LLMs work and how you can tune them, compared with the industry-standard intent-based NLU models.

[Image: the 3 types of NLU from VUX World: intent-based, zero-shot and few-shot LLMs]

Three types of NLU

1. Curated, intent-based model

This approach involves using an intent-based NLU with customised intents and training data, and it’s the approach most businesses use today. Here, you gather your own training data to form your own intents based on your business needs.

User utterances are then classified against your intents based on the model’s ability to find a pattern shared by the utterance and the sample training data it was trained on.

This works well for simple utterances, but it struggles with long-form sentences and with utterances that are distinctly different from your sample training data.
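To make the matching idea concrete, here’s a minimal sketch in Python. It isn’t how any particular vendor’s NLU works. It simply compares word counts between the utterance and each sample phrase, and the intents, sample phrases and 0.3 threshold are all made up for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical intents and sample utterances, invented for this example.
TRAINING_DATA = {
    "check_balance": ["what is my balance", "how much money is in my account"],
    "reset_password": ["i forgot my password", "reset my password please"],
}

def bag_of_words(text):
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def classify(utterance, threshold=0.3):
    """Return (intent, score) for the closest sample, or (None, score) if no sample is close enough."""
    query = bag_of_words(utterance)
    best_intent, best_score = None, 0.0
    for intent, samples in TRAINING_DATA.items():
        for sample in samples:
            score = cosine(query, bag_of_words(sample))
            if score > best_score:
                best_intent, best_score = intent, score
    if best_score < threshold:
        return None, best_score
    return best_intent, best_score

# A short utterance close to the samples is matched.
print(classify("what is my balance today"))
# A long utterance unlike any sample falls below the threshold and gets no match.
print(classify("my card was swallowed by the cash machine while I was abroad last week"))
```

The second call shows the weakness described above: the longer and less familiar the utterance, the less it overlaps with any sample, so the score drops and nothing is matched.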

Most NLU systems have used this approach so far, but the emergence of Large Language Models over the last 3 or so years is changing this.

2. Zero-shot model

This approach involves using a transformer-based Large Language Model (LLM) to generate understanding of a customer utterance without the need to provide training data.

Large Language Models are trained on billions of data points and huge corpuses of data from readily available text online. They use sources such as Reddit, Wikipedia and others to train models on how to identify and reproduce patterns in language.

These advanced pattern matching systems perform great feats and can be used out-of-the-box to do things like intent classification and entity extraction.
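As a rough illustration of what “out-of-the-box” intent classification can look like, the sketch below builds a prompt that names the intents but gives no examples, then maps the model’s reply back to a known intent. The intent names are invented, and the call to the LLM itself is left out because it depends on whichever provider you use.

```python
def build_zero_shot_prompt(utterance, intent_names):
    """Build a classification prompt that lists intent names but no example phrases."""
    options = "\n".join(f"- {name}" for name in intent_names)
    return (
        "Classify the customer message into exactly one of these intents:\n"
        f"{options}\n"
        "Reply with the intent name only.\n\n"
        f"Customer message: {utterance}\n"
        "Intent:"
    )

def parse_intent(model_reply, intent_names):
    """Map the model's free-text reply to a known intent, or None if it matches nothing."""
    reply = model_reply.strip().lower()
    for name in intent_names:
        if name.lower() == reply:
            return name
    return None

# Hypothetical intent names for illustration.
INTENTS = ["check_balance", "reset_password", "report_lost_card"]

prompt = build_zero_shot_prompt("I can't get into my account", INTENTS)
print(prompt)
# The prompt would be sent to an LLM here; that call is provider-specific and not shown.
```

Checking the reply against your list of intents matters, because the model can answer with something that isn’t one of your intents at all.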

Because most of the LLMs available today are trained on general text data from the web, they’re not honed for specific business purposes. This means that out-of-the-box performance might only get you so far.

Also, because of the inherent limitations of pattern recognition, they’re prone to making a few mistakes here and there. This can result in some utterances being misclassified. However, I haven’t seen an assistant built on an intent-based system to date that doesn’t trip up and misclassify (or not match) on some utterances, either.

3. Few-shot model (hybrid)

This approach takes the best of both worlds and uses word embeddings to tune LLMs according to a few example phrases of the types of utterances you’d expect for a given intent.

This is how you can tune a Large Language Model to a specific use case or set of intents. By feeding it a few examples of different training phrases, you can provide it with additional context and influence how it classifies something.

This means you can tune it for your specific business use cases, and providing some sample data also reduces the likelihood of it misclassifying.
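One simple way to feed a model a few examples is to put them straight into the prompt. Note that this is a different technique from the embeddings-based tuning described above: it’s an in-prompt sketch, chosen because it needs no extra libraries. The intents and phrases are hypothetical, and the LLM call is again left out.

```python
def build_few_shot_prompt(utterance, examples):
    """Build a prompt that shows labelled example phrases before the new message.

    `examples` maps each intent name to a list of sample phrases.
    """
    lines = ["Classify the customer message using the labelled examples below.", ""]
    for intent, phrases in examples.items():
        for phrase in phrases:
            lines.append(f"Message: {phrase}")
            lines.append(f"Intent: {intent}")
            lines.append("")
    lines.append(f"Message: {utterance}")
    lines.append("Intent:")
    return "\n".join(lines)

# Hypothetical example phrases, a few per intent.
EXAMPLES = {
    "report_lost_card": ["I can't find my debit card", "my card was stolen"],
    "check_balance": ["how much is in my account"],
}

prompt = build_few_shot_prompt("I think I left my card in a taxi", EXAMPLES)
print(prompt)
# The prompt would be sent to an LLM here; that call is provider-specific and not shown.
```

The examples give the model context about what each intent looks like in your business, which is the extra influence over classification that a zero-shot prompt doesn’t have.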

According to Raj, you could even use an LLM to generate sample training data, which you’d then use to train your few-shot model. This can give you the efficiency of a zero-shot model, whilst ensuring that the model is tuned to your business needs. This gives you even more control, as you’re able to both influence the training and tuning of the model, as well as validate the output from it.

Multi NLU approach

“LLMs are highly accurate at classifying an intent, except when they get it wrong.”

Raj Koneru, CEO, Kore AI

As mentioned, an LLM misclassifying an intent can happen because LLMs are trained on world data from across the internet. They’re not highly tuned for your business use cases.

If an end user asks ChatGPT a question, for example, and ChatGPT gets it wrong, it’s not consequential. If a user asks a business a question and the business gets it wrong, that is more consequential, especially for high-emotion or important use cases.

Therefore, the best approach is to utilise all three models above where relevant. We’ll be diving into how you can architect this arrangement with Kore AI in an upcoming post.
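Until then, here’s one possible way to picture a multi-NLU arrangement. This is purely an assumption on our part, not Kore AI’s architecture. It’s a simple cascade that tries each model in turn and accepts the first answer that clears a confidence threshold. The three classifiers are stand-in stubs and the thresholds are invented.

```python
from typing import Callable, Optional, Sequence, Tuple

# A classifier takes an utterance and returns (intent or None, confidence).
Classifier = Callable[[str], Tuple[Optional[str], float]]

def route(utterance: str, pipeline: Sequence[Tuple[str, Classifier, float]]):
    """Try each (name, classifier, min_confidence) in order; return the first confident result."""
    for name, classifier, min_confidence in pipeline:
        intent, confidence = classifier(utterance)
        if intent is not None and confidence >= min_confidence:
            return name, intent
    return "fallback", None

# Stub classifiers standing in for real models (hypothetical behaviour).
def intent_model(text):
    return ("reset_password", 0.9) if "password" in text.lower() else (None, 0.0)

def few_shot_llm(text):
    return ("report_lost_card", 0.7) if "card" in text.lower() else (None, 0.0)

def zero_shot_llm(text):
    return ("general_enquiry", 0.5)

PIPELINE = [
    ("intent_based", intent_model, 0.8),
    ("few_shot", few_shot_llm, 0.6),
    ("zero_shot", zero_shot_llm, 0.4),
]

for message in ["I forgot my password", "my card is gone", "hello there"]:
    print(message, "->", route(message, PIPELINE))
```

Putting the curated model first means your high-stakes intents are handled by the system you control most tightly, with the LLMs picking up whatever it can’t match.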

Stay tuned.

For more information on Kore.ai, you can book a demo with the team or book a free consultation.
