In a recent episode of VUX World, I had the opportunity to discuss current trends in enterprise adoption of generative AI and large language models with Matt Taylor, Chief Product Officer and Co-Founder of Knowbl. Matt shared his perspectives on how businesses are adopting LLMs, the challenges and limitations of Retrieval-Augmented Generation (RAG), and some of the ways Knowbl is approaching reliable AI service delivery.
Focusing on internal use cases: minimising risk
A significant trend Matt highlighted, which I’ve also observed from the marketing and positioning of vendors in the CCaaS space, is the current focus on internal, staff-facing use cases for generative AI.
Most enterprises that are exploring generative AI aren’t putting it in front of their customers. They’re using it for things like internal knowledge search, or contact centre agent-facing use cases, such as call summarisation.
This approach is taken to reduce the risk of generating inaccurate or irrelevant responses to customer queries. In theory, these companies are working things out in a lower-risk environment until they have more ‘control’ over the AI models they’re using. Presumably, once they’re comfortable with that ‘control’, they’ll consider releasing something publicly.
This is a fair enough stance, with one exception: generative AI models will always have the risk of hallucination, and you can never fully ‘control’ them.
The quest for control: removing hallucinations
This is one of the primary trends of 2023: the effort to put guardrails around AI models so that you can remove or mitigate the hallucination problem. Everybody is trying, and a few claim to have solved it, but as with all of this stuff, the proof is in the pudding.
The challenges with RAG and semantic embeddings
Two trends in attempting to remove hallucinations are Retrieval-Augmented Generation (RAG) and semantic embeddings.
RAG is used to ground LLMs in external, company-specific data. However, Matt pointed out a critical limitation: RAG can’t guarantee consistent outputs, so on its own it doesn’t solve the hallucination problem. For enterprises, this unpredictability is a deal-breaker.
Many vendors layer a chain of additional prompts on top of RAG to check for things like profanity and accuracy, and to turn retrieved data into sufficiently conversational responses. However, all of this behind-the-scenes prompt chaining isn’t guaranteed to fix the hallucination problem, and businesses can’t vet the responses for accuracy at scale.
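To make that concrete, here’s a minimal sketch of a RAG pipeline with chained guardrail checks. The retriever, the LLM call, and the check logic are all hypothetical stand-ins (real systems use vector search and model-based checks), but the shape is the same: retrieve, generate, run post-hoc checks, and fall back if anything fails.

```python
# Minimal RAG-with-guardrails sketch. All names and logic here are
# illustrative assumptions, not any vendor's actual implementation.

def retrieve(query, knowledge_base):
    """Naive keyword retriever: return passages sharing words with the query."""
    query_words = set(query.lower().split())
    return [p for p in knowledge_base if query_words & set(p.lower().split())]

def call_llm(prompt):
    """Stand-in for a real LLM call; here it just echoes the retrieved context."""
    return prompt.split("Context:\n", 1)[-1].strip()

def passes_guardrails(response, context):
    """Chained post-hoc checks: profanity, then grounding. Neither guarantees accuracy."""
    banned = {"damn"}
    if any(word in response.lower().split() for word in banned):
        return False
    # Crude 'grounding' check: every response word must appear in the retrieved context.
    return set(response.lower().split()) <= set(context.lower().split())

def answer(query, knowledge_base):
    passages = retrieve(query, knowledge_base)
    context = "\n".join(passages)
    response = call_llm(f"Answer using only this context.\nContext:\n{context}")
    if not passes_guardrails(response, context):
        return "Sorry, I can't help with that."  # fall back rather than risk a hallucination
    return response

kb = ["Returns are accepted within 30 days.", "Shipping takes 5 days."]
print(answer("what is your returns policy", kb))  # → Returns are accepted within 30 days.
```

Notice that the checks run after generation: they can catch some bad outputs, but they can’t make the generation step itself deterministic, which is exactly the consistency problem described above.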
This is enough to make the majority of enterprises shy away from customer-facing generative AI implementations.
Knowbl explored semantic embeddings, which use vector representations to match the user’s query with relevant content. While effective in search applications, Matt explained that this method lacked conversational elements like contextual follow-ups, which are essential for a natural interaction.
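The matching step can be sketched in a few lines. Real systems use learned embedding models; a bag-of-words vector stands in here purely to illustrate the mechanics, and the content snippets are invented. The second example shows the limitation Matt described: a context-dependent follow-up carries no standalone meaning, so it matches nothing.

```python
# Toy semantic-matching sketch: count vectors + cosine similarity.
# Bag-of-words is a stand-in for a real embedding model.
import math
from collections import Counter

def embed(text):
    """Toy 'embedding': a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

content = [
    "our opening hours are 9am to 5pm on weekdays",
    "refunds are processed within 14 days of a return",
]

def best_match(query):
    q = embed(query)
    return max(content, key=lambda c: cosine(q, embed(c)))

# A self-contained query matches well:
print(best_match("when are your opening hours"))

# A conversational follow-up ("what about at the weekend?") has no overlap
# with any content on its own, so pure matching scores it at zero:
print(cosine(embed("what about at the weekend"), embed(content[0])))  # → 0.0
```

This is why search-style matching alone can’t handle contextual follow-ups: without carrying the earlier turns of the conversation, the follow-up query is effectively meaningless to the matcher.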
Mitigating risk and leveraging LLMs for enterprise use cases
To manage some of these limitations, Matt explained Knowbl’s approach to using large language models for enterprise AI applications.
1. Contextual Awareness: Patented technology apparently enables the AI to comprehend previous queries, enabling a more coherent and context-rich conversation.
This isn’t the same as what many are trying today. Most people using LLMs to manage context simply keep the conversation transcript and feed it back through the model with each prompt, at every turn of the conversation. This creates ever-lengthening prompts and actually increases the risk of hallucinations and errors the longer the conversation becomes.
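The naive transcript-stuffing approach looks something like the sketch below (names are hypothetical). The point is the growth pattern: the prompt is rebuilt from the full history every turn, so it grows linearly with conversation length.

```python
# Sketch of the naive 'feed the whole transcript back each turn' pattern.
# Illustrative only; real chat APIs take a message list, but the growth
# behaviour is the same.

def build_prompt(transcript, new_user_turn):
    """Rebuild the full prompt from scratch on every turn."""
    history = "\n".join(f"{role}: {text}" for role, text in transcript)
    return f"{history}\nuser: {new_user_turn}\nassistant:"

transcript = []
for turn in range(1, 6):
    user_msg = f"question number {turn}"
    prompt = build_prompt(transcript, user_msg)
    transcript.append(("user", user_msg))
    transcript.append(("assistant", f"answer number {turn}"))
    print(turn, len(prompt))  # prompt length keeps climbing, turn after turn
```

Longer prompts cost more, hit context limits sooner, and give the model more surface area to latch onto the wrong detail, which is the failure mode described above.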
2. Entity Extraction: Traditionally a challenging task that, without LLMs, can require bucketloads of training data. Even with LLMs, I’ve personally had mixed results with the raw models out of the box. Knowbl leverages LLMs to streamline this process, which Matt claims enhances the accuracy and efficiency of extracting relevant information from conversations.
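A common pattern for LLM-based entity extraction is to ask the model for structured JSON rather than training a dedicated model on labelled data. The sketch below mocks the model call, and the prompt, schema, and field names are illustrative assumptions on my part, not Knowbl’s actual approach.

```python
# Sketch of LLM-based entity extraction via a JSON-output prompt.
# The model call is mocked and the schema is invented for illustration.
import json

def mock_llm(prompt):
    """Stand-in for a real model call; returns a canned JSON extraction."""
    return '{"account_number": "12345", "issue": "billing"}'

def extract_entities(utterance):
    prompt = (
        "Extract the account_number and issue from the utterance below.\n"
        'Reply with JSON only, e.g. {"account_number": "...", "issue": "..."}.\n'
        f"Utterance: {utterance}"
    )
    raw = mock_llm(prompt)
    try:
        return json.loads(raw)        # models sometimes return malformed JSON,
    except json.JSONDecodeError:      # so always parse defensively
        return {}

print(extract_entities("Hi, I have a billing problem on account 12345"))
```

The defensive parse is the important bit: raw models don’t always honour the output format, which is one source of the mixed results mentioned above.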
3. Expedited Workflow Development: By utilising transcript inference, Knowbl accelerates the process of building complex workflows. It takes transcripts from real conversations with agents, summarises the most commonly traversed conversation pathways, and builds workflows automatically from them.
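At its simplest, pathway mining of this kind amounts to counting which sequences of conversation steps recur across transcripts. The transcripts and step labels below are invented, and this is a bare-bones sketch of the idea, not how Knowbl’s transcript inference actually works.

```python
# Sketch of mining agent transcripts for common conversation pathways.
# Transcripts and step labels are made up for illustration.
from collections import Counter

transcripts = [
    ["greet", "verify_identity", "check_balance", "close"],
    ["greet", "verify_identity", "check_balance", "close"],
    ["greet", "verify_identity", "dispute_charge", "escalate"],
    ["greet", "check_balance", "close"],
]

# Count each full pathway; the top ones become candidate workflow skeletons.
pathways = Counter(tuple(t) for t in transcripts)
for path, count in pathways.most_common(2):
    print(count, " -> ".join(path))
```

The most frequent pathways become draft workflows for a human to refine, which is faster than designing every flow from a blank page.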
There are other vendors I’ve seen which have the ability to describe a use case, then have the platform generate the conversation flows. Whether they’re all based on real transcripts though, I’m not so sure.
4. Content Transformation: Most businesses’ content and knowledge isn’t in great shape, yet most folks that attempt RAG simply feed the machine whatever data a brand has. That data is not only usually riddled with inaccuracies and out-of-date content, it’s certainly not written in a conversational style.
You might think this is where LLMs would shine. However, even if the data is in good shape in the first place, you’ve still got the risk of hallucination and an unpredictable user experience.
Matt told me how Knowbl first spends time making sure content and knowledge is clean and up to date, then uses summarisation and rephrasing techniques to transform it into a more conversational style. A human in the loop then checks and approves the content before it’s fed into the knowledge base, and that approved content is used verbatim in agent responses.
So rather than using an LLM to rephrase and summarise content after retrieving it from a knowledge source, Knowbl uses the technology to clean up the data and get it ready for consumption before the content enters the knowledge base in the first place. That’s pretty novel.
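The pipeline described above can be sketched as follows. The rewrite step and function names are my own illustrative assumptions; the key structural point from the conversation is that generation happens offline before ingestion, a human approves the result, and the runtime path is verbatim retrieval with no generation step left to hallucinate.

```python
# Sketch of a 'clean before ingestion' content pipeline: LLM rewriting
# happens offline, a human approves, and runtime serves approved text
# verbatim. All names and the rewrite logic are illustrative stand-ins.

def rewrite_conversationally(text):
    """Stand-in for an LLM summarise/rephrase pass run before ingestion."""
    return "Sure! " + text

def human_approves(text):
    """Stand-in for a human-in-the-loop review gate."""
    return True

knowledge_base = {}

def ingest(topic, raw_text):
    candidate = rewrite_conversationally(raw_text)
    if human_approves(candidate):       # only approved text enters the KB
        knowledge_base[topic] = candidate

def respond(topic):
    # Runtime path: verbatim retrieval of approved content, no generation.
    return knowledge_base.get(topic, "Sorry, I don't have that information.")

ingest("returns", "Items can be returned within 30 days.")
print(respond("returns"))  # → Sure! Items can be returned within 30 days.
```

Because the unpredictable step is moved behind a human review gate, whatever the user sees at runtime has already been vetted, which trades some freshness and effort for predictability.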
We discussed a host of other techniques during the podcast that businesses and developers can use to increase the effectiveness of their AI agents and to leverage LLMs and generative AI for their strengths.