Voice AI for customer experience: why it’s back and how to make it work

By Kane Simms


After years of investment in apps and digital self-service, most businesses assumed customers would call less. Yet according to Henry Vaage Iversen, Co-Founder and CCO at boost.ai, phone calls remain the biggest customer service channel, even in the Nordics, where digital adoption is high. A 2025 McKinsey analysis points the same way: 57% of customer care leaders expect call volumes to increase over the next one to two years.

Henry laid this out in a recent conversation on the VUX World podcast. If voice AI isn’t on your roadmap yet, now is a good time to take a closer look.

The technology has caught up

Voice AI struggled five years ago, mostly because the technology wasn’t meeting consumer expectations. Speech recognition wrestled with accents and dialects, and conversations felt robotic because of the NLU models behind the scenes. Generative AI changed that, making conversations significantly more natural, and more businesses are now prioritising voice AI as a result.

A couple of years ago, around 20% of boost.ai’s customers were deploying voice. Now, most new projects include voice as a core component. 2026 is when the significant uptick occurred.

Automation rates that actually move the needle

According to Henry, early voice AI projects typically contained around 20-30% of incoming calls. boost.ai is now seeing customers reach 60-70%, with some hitting 75% containment on inbound calls. That means three-quarters of calls are handled end-to-end by AI, without a human agent involved.

Henry frames this as a capacity story. When volumes keep rising and your team stays the same size, AI keeps the operation from breaking.
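To make the capacity argument concrete, here is a minimal sketch of the arithmetic behind containment. The call volumes are illustrative assumptions, not boost.ai figures, and the function names are hypothetical.

```python
def containment_rate(contained_calls: int, total_calls: int) -> float:
    """Share of inbound calls handled end-to-end by the AI."""
    if total_calls <= 0:
        raise ValueError("total_calls must be positive")
    return contained_calls / total_calls


def calls_reaching_agents(total_calls: int, rate: float) -> int:
    """Calls left for human agents at a given containment rate."""
    return round(total_calls * (1 - rate))


def containment_needed(old_volume: int, old_rate: float, new_volume: int) -> float:
    """Containment rate required to keep agent workload flat as volume grows."""
    agent_calls_today = old_volume * (1 - old_rate)
    return 1 - agent_calls_today / new_volume


# Illustrative monthly volume only (an assumption for this sketch).
monthly_calls = 10_000
for rate in (0.25, 0.65, 0.75):
    remaining = calls_reaching_agents(monthly_calls, rate)
    print(f"{rate:.0%} containment leaves {remaining} calls for agents")
```

Under these assumptions, a team that copes with 10,000 calls at 25% containment would need roughly 62.5% containment to absorb a doubling of volume without growing, which is the kind of jump the figures above describe.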

Don’t just add voice to your existing chatbot

CX teams often learn this the hard way. You can’t lift a chatbot onto a phone line and call it a voice bot. Voice users behave differently. They’re more sensitive to latency. They notice immediately if the AI mispronounces something or hesitates in the wrong place. The conversational flow is more fluid than chat, with more turns, more small talk and more variation. Voice also removes everything chat gives you visually: no links, no forms, no images, no tables.

Some things that work perfectly in a chat interaction fall apart on voice. Henry gives the example of asking a customer to provide an address in Finland, where street names can run to 30 characters. Voice doesn’t suit every step of a journey, so the smarter play is sometimes multimodal: handle the conversation on voice, then drop into a messaging interface for the parts that need structured input.

Guardrails matter more than you think

For anyone working in a regulated sector such as banking, insurance or government, hallucinations are still a real concern. boost.ai uses separate LLMs to monitor conversations in real time. Multiple AI systems watch the conversation as it happens and step in if something goes wrong. Guardrails live in these monitoring layers, not in the instructions given to a single model. The hallucination rate won’t get to zero, Henry says, but they’re getting close.
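The idea of guardrails living in a separate monitoring layer can be sketched roughly as follows. This is not boost.ai’s implementation: the monitors here are simple rule-based stand-ins for what would be separate LLM checks, and every name (`Verdict`, `supervise`, `FALLBACK`) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Verdict:
    ok: bool
    reason: str = ""


# A monitor inspects the draft reply and the source text it should be grounded in.
Monitor = Callable[[str, str], Verdict]


def grounded_in_source(reply: str, source: str) -> Verdict:
    """Crude stand-in for an LLM check: figures in the reply must appear in the source."""
    figures = [tok.strip(".,%") for tok in reply.split() if any(c.isdigit() for c in tok)]
    missing = [f for f in figures if f not in source]
    return Verdict(not missing, f"unsupported figures: {missing}" if missing else "")


def no_financial_advice(reply: str, source: str) -> Verdict:
    """Crude stand-in for a policy check: block a few example phrases."""
    banned = ("you should invest", "guaranteed return")
    hit = next((p for p in banned if p in reply.lower()), None)
    return Verdict(hit is None, f"blocked phrase: {hit}" if hit else "")


FALLBACK = "Let me connect you with a colleague who can help with that."


def supervise(reply: str, source: str, monitors: List[Monitor]) -> Tuple[str, List[str]]:
    """Run every monitor on the draft reply; speak it only if all of them pass."""
    reasons = [v.reason for v in (m(reply, source) for m in monitors) if not v.ok]
    return (FALLBACK if reasons else reply), reasons


monitors = [grounded_in_source, no_financial_advice]
spoken, reasons = supervise("Your rate is 4.5% today.", "Current mortgage rate: 3.9%", monitors)
print(spoken, reasons)
```

The design point is that the primary model never decides on its own whether its reply is safe. The checks sit outside it, and the recorded reasons give an audit trail of why a reply was blocked.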

Transparency matters just as much. You need to see how the AI reached a conclusion, where it fetched information and what happened at each step. A consumer can accept most hiccups when chatting with a tool like Claude, because the value outweighs the occasional wrong answer. A bank handing out a quote, processing a claim or offering financial guidance operates under different stakes. One wrong answer can make headlines.

Test it like it’s live

Before going to production, boost.ai creates voice agents that call the customer-facing voice agent directly, simulating real conversations across different customer profiles, ages, sentiments and dialects.

Voice is probabilistic, so simulated testing at scale surfaces the edge cases that matter.
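A simulated test matrix of this kind might be sketched as below. The profile attributes mirror the ones mentioned above, but the specific values, the `simulate_call` placeholder and its success odds are assumptions for illustration. A real harness would place an actual call to the voice agent where the placeholder sits.

```python
import itertools
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class CallerProfile:
    age_group: str
    sentiment: str
    dialect: str


# Example values only; a real matrix would reflect the actual customer base.
AGE_GROUPS = ("18-30", "31-60", "60+")
SENTIMENTS = ("calm", "frustrated")
DIALECTS = ("standard", "regional")


def build_profiles() -> list:
    """Every combination of age group, sentiment and dialect."""
    return [CallerProfile(a, s, d)
            for a, s, d in itertools.product(AGE_GROUPS, SENTIMENTS, DIALECTS)]


def simulate_call(profile: CallerProfile, rng: random.Random) -> bool:
    """Placeholder for one simulated call; True means the task completed.

    The odds below are invented so the sketch runs without a real agent.
    """
    success_chance = 0.95 if profile.dialect == "standard" else 0.80
    return rng.random() < success_chance


def pass_rates(profiles, runs_per_profile: int = 50, seed: int = 0) -> dict:
    """Repeat each scenario many times, since a single run proves little."""
    rng = random.Random(seed)
    return {
        p: sum(simulate_call(p, rng) for _ in range(runs_per_profile)) / runs_per_profile
        for p in profiles
    }


rates = pass_rates(build_profiles())
weakest = min(rates, key=rates.get)
print(f"{len(rates)} scenarios tested; weakest: {weakest}")
```

Because outcomes vary from run to run, the useful output is a pass rate per profile, not a single pass or fail. The weakest profiles show where to look for edge cases.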

Rebuild the journey from the ground up

Most organisations approach voice AI the same way they’ve approached every other AI initiative: find something that exists, add AI to it, and make it slightly better. Henry argues this is the wrong frame entirely.

Organisations that see the best results rethink how customers interact with the business from the ground up. AI becomes the interface across every channel, holding context throughout. A customer shouldn’t have to start a conversation again because they switched from chat to voice. The AI agent should know.
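One way to picture context that survives a channel switch is a conversation record keyed by customer, not by channel. The sketch below is a simplified in-memory illustration with hypothetical names (`Conversation`, `ContextStore`), not a description of any vendor’s architecture.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Conversation:
    customer_id: str
    # Each turn is stored as (channel, speaker, text).
    turns: List[Tuple[str, str, str]] = field(default_factory=list)

    def add_turn(self, channel: str, speaker: str, text: str) -> None:
        self.turns.append((channel, speaker, text))

    def channels_used(self) -> List[str]:
        """Channels in the order they first appeared."""
        seen: List[str] = []
        for channel, _, _ in self.turns:
            if channel not in seen:
                seen.append(channel)
        return seen


class ContextStore:
    """Keeps one conversation per customer, whichever channel they use."""

    def __init__(self) -> None:
        self._by_customer: Dict[str, Conversation] = {}

    def get(self, customer_id: str) -> Conversation:
        return self._by_customer.setdefault(customer_id, Conversation(customer_id))


store = ContextStore()
store.get("cust-42").add_turn("chat", "customer", "I lost my card")

# Later the same customer phones in; the voice agent sees the earlier chat turn.
on_voice = store.get("cust-42")
on_voice.add_turn("voice", "agent", "I can see you reported a lost card. Shall I block it?")
print(on_voice.channels_used())
```

A production system would need identity verification and persistent storage, but the principle is the same: the channel is an attribute of a turn, not the boundary of the conversation.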

That kind of thinking takes more than a technical team. The people who understand the customer, the CX professionals and the contact centre leads, need to be central to how these systems are built and improved over time. The technology is more accessible than ever. The experience design is where a lot of work is required.

Two AI agents talking to each other

At a recent VUX World event, the team called all attendees using a voice AI agent trained on my voice, checking dietary requirements and confirming attendance. Some of those calls hit automated call screening on the recipient’s phone. The screening agent answered and asked who was calling and why. The VUX agent explained itself, and the screening agent relayed that to the human, got a response and passed it back. This is a good example of two AI agents completing a full exchange. No humans involved until the very end.

Businesses need to be ready for this. When customers have their own AI agents making calls on their behalf, volume will increase significantly. The organisations that build solid voice AI infrastructure now will be the ones positioned to handle it.

Listen to the full episode with Henry Vaage Iversen.
