Why you should take data privacy more seriously


Sometimes I tell people I work in conversational AI and they say “I don’t have an Alexa or Google Assistant because I don’t want tech companies to record everything I say.”

Have you experienced the same?

We could break down their statement. We could point out that their mobile phone can also record every conversation they have (and it's with them constantly, while Alexa is wired to the wall and can't move), or that Alexa requires a wake word before it starts listening (you must say a specific word or phrase to activate the device). However, we must accept that people take their personal information seriously, and they're afraid that their voice-first devices don't. That affects the entire industry.

And at the same time, regulations and standards are coming into play – GDPR, California's CCPA, PCI DSS – that are intended to govern how companies handle users' data.

It's simply too risky not to take data security seriously. That's according to Patricia Thaine, CEO of Private AI, who elaborated in her VUX World interview with Kane Simms.

What’s the problem?

Image of a shop's 'click and collect' area

I bought online and came to collect it, but what did they collect about me?


In order to improve a conversational assistant, you need data. Data that reveals what users want, or shows how customers really talk, or where they struggle in your app, or many other possibilities. Data is the stuff we need to ensure we’re making the best product possible.

But it's not a one-way street: everything you store can expose you to risk. Much of that risk concerns users' personally identifiable information (known as 'PII'), and it needs to be taken seriously.

You may think you’ve got nothing to hide – you’re not up to any ‘dodgy’ stuff when you deal with customer data. As Patricia says, “there’s starting to be an understanding that you never know what it is that you might need to hide”. When collecting the data you want, you might collect a whole lot more besides.

For example, our speech reveals a lot about us. “You can tell pathologies from voice. You could tell socioeconomic backgrounds, educational background… There’s a lot of data that voice carries, in addition to the ability to really identify you since it is used as an authentication method. So there’s a lot of space for misuse.”

Think about what you collect

Consider every registration form on every website you ever filled in. You likely had to give your name and email address, and possibly your age, address, phone number, and some more personal data too. That’s routine stuff for brands and users now, but from the perspective of the law those details are becoming very hot potatoes indeed.

And that’s just the start. While it’s a conscious decision to type personal details into a keyboard, you may reveal them in spoken conversation without being aware that they’re being recorded by a device. That’s the danger of voice-first.

Microphones don't discriminate either. They collect everything that happens within earshot. For example, if Bill calls his bank from his workplace, there's the potential that Bill's co-workers are discussing sensitive customer data in the background. While the bank only aims to collect Bill's data to process his request, it could unintentionally save audio from his co-workers' conversation to its servers as well.

What can you do?

You need to know your regulations for a start. While they may relate to one geographical location (GDPR is enforced by the EU) their reach can be much wider. You need to consider whose data you store, where you store it, who else processes it (such as third-party services) and more.

One data-cleaning process Private AI performs for its clients is replacing PII with fake details. The language models you've carefully fine-tuned remain stable and effective, but you no longer hold sensitive info on your servers. Even sharing data internally matters – under GDPR you must be able to find a specific piece of information and delete it when requested, and if it's been copied to multiple locations, you need to delete every copy.
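To make the idea concrete, here is a minimal sketch of that kind of PII replacement. This is not Private AI's actual method (they use far more sophisticated detection); it's a toy illustration using regexes for two hypothetical PII types, swapping each match for a typed placeholder so the text still reads naturally for downstream models:

```python
import re

# Toy patterns for two common PII types. A production system would use
# trained NER models rather than regexes, which miss many real-world cases.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}

def pseudonymize(text: str) -> str:
    """Replace each detected PII match with a typed placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(pseudonymize("Call me on +44 7700 900123 or mail bill@example.com"))
# -> Call me on [PHONE] or mail [EMAIL]
```

Because the placeholders are consistent and typed, the cleaned transcripts can still be used to train or evaluate models, while the sensitive values themselves never sit on your servers.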

You can see why it's so important to take this seriously: customers already do; laws are evolving so rapidly that businesses may need to treat PII with the same caution as a dangerously toxic substance; and children are now being taught about data protection in school. Public awareness of data rights is only going to increase.

Want to know more? Watch Patricia's VUX World interview, check out her company at Private-ai.com, or reach out to her on LinkedIn – she's happy to field specific questions.

This article was written by Benjamin McCulloch. Ben is a freelance conversation designer and an expert in audio production. He has a decade of experience crafting natural-sounding dialogue: recording, editing and directing voice talent in the studio. Some of his work includes dialogue editing for Philips' 'Breathless Choir' series of commercials, a Cannes Pharma Grand Prix winner; leading teams in localizing voices for Fortune 100 clients like Microsoft; as well as sound design and music composition for video games and film.




