Google Assistant

The future is multi modal

A good digital assistant will take context into consideration when providing a user experience.

Now, that context could be related to the device that you’re using, the environment that you’re in, or how much time and attention you have available at any given time.

So, for example, if I’m in the kitchen washing up, I might have a bit of time, but you might not have my attention. That experience might need to be different to the one I get if I’m sitting in the front room watching the TV, where I do have time and I do have attention, or if I’m out for a run wearing headphones, where I don’t have either. In the headphone example, maybe your interactions need to be really short, sharp and transient. In the living room example, maybe you lean on visuals a little bit more. And in the kitchen, maybe you use audio first and try to emphasize earcons and things like that to make more of an audible experience.
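To make those three situations concrete, here is a minimal sketch of how a designer might map time and attention to a presentation style. The `Context` class, the `choose_presentation` function and the style names are all hypothetical, made up for illustration. This isn’t how Google Assistant actually decides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Hypothetical description of the user's situation."""
    has_time: bool
    has_attention: bool


def choose_presentation(context: Context) -> str:
    """Pick a presentation style from the time and attention available."""
    if context.has_time and context.has_attention:
        # Living room: the user can look at a screen, so lean on visuals.
        return "visual-first"
    if context.has_time:
        # Kitchen: time but no attention, so lead with audio and earcons.
        return "audio-first"
    # Everything else, including out for a run: keep it short and transient.
    return "short-audio"


kitchen = Context(has_time=True, has_attention=False)
living_room = Context(has_time=True, has_attention=True)
running = Context(has_time=False, has_attention=False)
```

A real assistant would have far more signals to weigh (device type, screen size, noise, motion), but the principle is the same: the context picks the modality, not the other way around.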

Now, those are just real high-level examples and it’s difficult enough to create one conversation that’s intuitive. That’s natural. That’s easy to use.

Now think about doing that for all of these different devices. And think about doing that not just for one third-party app that you create, but as the designers behind Google Assistant, which exists on over a billion devices, in over 90 countries and 30 different languages.

How do you create conversations that, yes, adapt to the different devices that you create as Google, but also to any number of devices that could be created by third-party manufacturers putting Google Assistant in their own hardware?

That is a very complex, very big task but it has to be the task for someone, and that someone is Daniel Padgett, Head of Conversation Design at Google.

He and his team work on creating consistent conversations across modalities for Google Assistant. We had the opportunity to interview Daniel and chat about multi modal design for Google Assistant on the VUX World podcast this week.

We talked to Daniel about just how you go about creating genuine multimodal conversations that change depending on the device and context the user is in and where the future of multimodality is going from Google’s perspective.

Multi modal design with Google’s Daniel Padgett

Google’s Head of Conversation Design, Daniel Padgett, shares how his team approach multi modal design across all Google Assistant-enabled devices.

Conversation design and grounding strategies with Jon Bloom

Jon Bloom, Senior Conversation Designer at Google, joins us to share what a conversation designer does at Google, as well as some conversation design techniques used at Google, such as ‘grounding strategies’.

Voice search is real and Google is concerned

Voice search is happening. And Google is under threat. Not short term threat but long term threat.

There’s a new search provider in town and its name is Alexa.

Google Assistant: 500m active users, popular Alexa skill: 300,000 daily users

At Project Voice in Chattanooga, TN, we saw some light shed on just how many people are using the Google Assistant, and some updated numbers on the kind of traffic one of the most popular Alexa skills handles.

Towards the end of her presentation, Cathy Pearl, Head of Conversation Design Outreach at Google, shared some numbers on how many monthly active users Google Assistant has.

500 million per month.

That’s across all surfaces and devices, but it’s a lot.

Google is the first of the big players to share numbers like this. Maybe it was waiting for the number to be high enough to warrant talking about.

I’d like to see the same numbers from Amazon. Now I’m wondering whether Amazon is also waiting to reach a noteworthy milestone before releasing them.

Nick Schwab, founder of Invoked Apps and creator of some of the most popular skills on Alexa, including Sleep Sounds and a host of ambient sounds, shared how much traffic his skills handle per day.

300,000 per day.

2% of users try the premium service and 90% of those convert into paying customers.

That’s doubled since we spoke to Nick on the VUX World podcast last year.

Of those 300,000, 15% of them were using a smart display of some kind, such as the Echo Show 10, 8 or 5.
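If you want a feel for what those percentages mean in absolute terms, here’s a quick back-of-the-envelope sketch. It assumes the 2% trial rate applies to the 300,000 daily figure, which the talk didn’t spell out, so treat the results as rough illustrations rather than reported numbers. The function and constant names are my own.

```python
DAILY_USERS = 300_000        # daily traffic shared in the talk
TRIAL_RATE = 0.02            # 2% try the premium service
CONVERSION_RATE = 0.90       # 90% of those convert into paying customers
SMART_DISPLAY_SHARE = 0.15   # 15% use a smart display of some kind


def funnel(daily_users: int) -> dict:
    """Apply the quoted percentages to a daily user count.

    Assumption: the trial rate is measured against the daily figure.
    """
    trial_users = round(daily_users * TRIAL_RATE)
    paying_users = round(trial_users * CONVERSION_RATE)
    smart_display_users = round(daily_users * SMART_DISPLAY_SHARE)
    return {
        "trial": trial_users,
        "paying": paying_users,
        "smart_display": smart_display_users,
    }


estimate = funnel(DAILY_USERS)
```

Under that assumption, roughly 6,000 people try premium, around 5,400 of them pay, and about 45,000 of the daily users are on a device with a screen.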

So now that we’re starting to see some solid numbers coming from the platforms and from some of the successful developers on those platforms, it would be good to see others follow suit and share the metrics behind their successes in order to validate the opportunity that these platforms present for organisations and individuals.

If you’ve been uncertain about the kind of opportunities that exist on these platforms for your company, or cautious about jumping in and investing, maybe these numbers will start to help you paint a clearer picture about whether you should, and the size of the market and opportunity that you could take if you do.

Google Assistant will replace Google Search in its entirety within 5 years

Here’s a prediction for you: Google Assistant will replace Google Search entirely within 5 years.

Sooner, even.

Think about it. What is Google’s ultimate aim? To find THE right result for the user.

That’s why it’s taken on the burden itself of finding hotel rooms, flights, deals and just about anything else it can right there in the search engine results pages.

It’s consumed all facts (more or less) within its knowledge base and it’s even pulling website content into the search results pages in featured snippets to save users the trouble of visiting the site.

Now Google Assistant will replace the voice search on Google Search.

And what does a voice interface mean for search results?

Result Zero wins. The holy grail. One result. The best result for your needs based on your context.


Google Assistant is also text-based. So why not replace the search bar entirely with a conversational AI? One where search refinements aren’t refinements at all, but clarifications:

“What year was Rob Williams born?”

“Did you mean actor Robin Williams or singer Robbie Williams?”

“The actor.”

That’s not a search refinement, it’s a clarification statement.

Google is able to take your first search term and turn the interaction into a turn in a conversation.

When you have a follow up query, there’s no need to head back to the search bar and delete what you’ve just typed, you can just continue:

“And when did he die?”
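Here’s a rough sketch of what that exchange looks like as state carried between turns. Everything in it is hypothetical: the `Conversation` class, the tiny hand-written `PEOPLE` lookup and the structured inputs stand in for the language understanding and knowledge graph a real assistant would use.

```python
# Hypothetical, hand-written knowledge base for illustration only.
PEOPLE = {
    "Robin Williams": {"role": "actor", "born": 1951, "died": 2014},
    "Robbie Williams": {"role": "singer", "born": 1974, "died": None},
}


class Conversation:
    """Keeps enough state to treat each query as a turn in a conversation."""

    def __init__(self):
        self.candidates = []       # people awaiting clarification
        self.pending_field = None  # the question asked before clarification
        self.subject = None        # who the conversation is currently about

    def ask(self, field, name):
        parts = name.split()
        first, last = parts[0], parts[-1]
        matches = [p for p in PEOPLE if p.startswith(first) and p.endswith(last)]
        if not matches:
            return "Sorry, I don't know who that is."
        if len(matches) == 1:
            self.subject = matches[0]
            return self._answer(field)
        # More than one match: hold the question and ask for clarification.
        self.candidates, self.pending_field = matches, field
        options = " or ".join(f"{PEOPLE[p]['role']} {p}" for p in matches)
        return f"Did you mean {options}?"

    def clarify(self, role):
        for person in self.candidates:
            if PEOPLE[person]["role"] == role:
                self.subject, self.candidates = person, []
                return self._answer(self.pending_field)
        return "Sorry, that wasn't one of the options."

    def follow_up(self, field):
        # "And when did he die?" relies on the remembered subject.
        if self.subject is None:
            return "Who do you mean?"
        return self._answer(field)

    def _answer(self, field):
        return f"{self.subject}: {field} {PEOPLE[self.subject][field]}"


chat = Conversation()
question = chat.ask("born", "Rob Williams")
answer = chat.clarify("actor")
later = chat.follow_up("died")
```

The point is the remembered `subject`: the follow-up never names anyone, yet it can still be answered because the earlier turns are kept.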

Google has been working on integrating the web into Google Assistant for a while, starting with the beta launch of speakable markup. This allows web editors to mark up specific sections of content to be read by Google Assistant.
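As a rough idea of what that markup looks like, the sketch below builds a JSON-LD snippet using the schema.org `speakable` property with a `SpeakableSpecification`, as I understand it. The page name, URL and CSS selectors are placeholders I’ve made up, and the `speakable_jsonld` helper is hypothetical, so check the current documentation before relying on the exact shape.

```python
import json


def speakable_jsonld(page_name, page_url, css_selectors):
    """Build JSON-LD marking sections of a page as suitable for reading aloud.

    All argument values are supplied by the caller; nothing here is
    specific to a real site.
    """
    return json.dumps(
        {
            "@context": "https://schema.org/",
            "@type": "WebPage",
            "name": page_name,
            "speakable": {
                "@type": "SpeakableSpecification",
                # Selectors pointing at the parts of the page to be spoken.
                "cssSelector": list(css_selectors),
            },
            "url": page_url,
        },
        indent=2,
    )


# Placeholder values for illustration only.
snippet = speakable_jsonld(
    "Example article",
    "https://example.com/article",
    ["#summary", ".headline"],
)
```

The resulting string would sit inside a `<script type="application/ld+json">` tag on the page.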

Then look at I/O ’19. Google released the ability to turn YouTube videos into tutorials on the Assistant, and web-based ‘How to’ tutorials, too. And featured snippets appear in Google Assistant all the time.

It doesn’t stop there

It goes deeper than that. In-app actions mean that not only can Google Assistant serve website content, but it can also pre-populate Android apps with spoken or typed phrases from Google Assistant.

If you say:

“Hey Google, book me a small hire car for Tuesday”

In-app actions will allow Google Assistant to pre-populate the Enterprise Car Hire app with ‘small car’ and pre-select ‘Tuesday, all day’ as the date.

Once this has been rolled out, it’s obvious that the next step is to work on an integration that goes the other way: from the app, into Google Assistant.

This way, you’ll be able to book your car, hotel, flight, cinema ticket or do anything else you use apps for, right there in Google Assistant, without needing to open the app at all. It’ll just use the app’s APIs and a conversational layer that you’ll be able to include in your apps through a more robust in-app actions feature.

At I/O ’19, Google changed its mission from ‘organising the world’s information’ to ‘helping people get jobs done’, and in-app actions are part of that.

So, once you have in-app action integrations working both ways, the APIs and language models can be broken out of the apps and surfaced through Google Assistant on the web, which will add reliable capability to ‘get jobs done’, right from within the most popular website on the planet.

So my prediction, again: Google Assistant will be the front end to the entirety of Google Search within 5 years.

Imagine that. The world’s information condensed into a simple conversation. All of the Android app functionality available instantly over the web using the same APIs.

So if you have a website or an app, regardless of what your business does, you should consider whether to brace or embrace.

Brace yourself to be forgotten or embrace voice first and jump into Google Assistant this year.

Why aren’t more people making Google Assistant actions?

When thinking about launching an app for a voice assistant, most people, most of the time, will create an Alexa skill instead of a Google Assistant action. But why?

There are over 100,000 Alexa skills and only a few thousand Google Assistant actions.

Now, Google do claim that Assistant has over 1 million actions, and that’s because Google defines an action slightly differently to how Amazon defines a skill. But in terms of third-party crafted experiences, comparable to Alexa skills, the Alexa skill store dwarfs Google Assistant’s.

Why aren’t people building more Google Assistant actions?

People aren’t creating actions because they don’t think that there are enough users to warrant the investment…yet.

This is largely because of the misunderstanding among some people that voice assistants and smart speakers are interchangeable words.

People think that voice is smart speakers and smart speakers are voice.

Perhaps that’s where it started in 2016. But voice is used more often on phones than anything else. Siri and Google Assistant are the two most used voice assistants, despite Amazon’s smart speaker market share dominance.

History repeats itself

What’s playing out right now is the same story that played out when Apple launched the App Store.

When the App Store launched in 2008, everyone started building apps for iOS.

Then Android released the Android Marketplace and… Nothing.

No one was building apps for Android. But why?

Simple. All the users who were using apps were iPhone users. Yes, the Android Marketplace was established in 2008, too, but Android didn’t overtake iOS market share until 2012: four years after the launch of the app store.

Source: StatCounter Global Stats – OS Market Share

Even then, not everyone turned straight to Android. By that point, people had invested thousands of pounds into their iOS apps. They were sitting tight for a while until it became unavoidable.

Google is in the same position again

Through no fault of its own, Google has managed to find itself in the same position with Google Assistant as it was with Android in 2009. It was second to market with the smart speaker, second to launch a developer toolkit and second to launch its actions ‘app store’.

And although Google Assistant is available on over 1 billion devices and is running on the no.1 phone operating system in the world, as long as people associate voice assistants with smart speakers, it’ll continue to struggle.

Google’s task at hand

Google’s task is the same as that of anyone else trying to compete with Amazon, or anyone else working in the voice industry: helping people understand that voice is more than smart speakers. Voice assistants can exist on any device: mobile, ear buds, car stereos, washing machines, computers, laptops, TVs and, yes, smart speakers.

That’s why it’s leading with the ‘1 billion users’ slogan. It’s trying to give people confidence that Google Assistant is a safe place to invest your time and energy.

And this is a message that they have to hit home because they’re going to struggle to claw back conceded market share on smart speakers. Baidu and Xiaomi are already gaining traction, and don’t be surprised if we see more entrants into the market in 2020 from the likes of Sonos, who now have Snips’ technology.

So people aren’t creating actions because of a lack of understanding about what Google Assistant actually is and where it exists, which should give Google a clear focus for its marketing and developer/brand relations with Google Assistant in 2020.

This post was triggered by a discussion on the Voicebot 2019 year in review podcast. It’s the answer I wish someone gave 🙂

Not so bold predictions for voice in 2020

So, here we are. The end of 2019. And it’s time for some click bait regarding 2020 voice-first industry predictions.

I told you to prepare for this a few weeks back.

While forecasting can be inspiring and encouraging, it could also be perceived as a holding message.

It’d be easy to hear about how advertising will come to smart speakers in 2020 and think “I’ll wait for that, then I’ll do something”.

Or read about 5G rolling out in the next 12 months and think “I’ll hang on until then”.

Or, maybe you’ll read how the voice mega-trend will cause a back-to-basics review of existing content, worry about how much work that’ll be and kick the can down the street.

Perhaps you’ll hear buzzwords like the importance of ‘compatibility and integration‘ and how ‘environment and context data‘ is key to enabling ‘transaction-oriented consumer intents’ in 2020, glaze over and think “this is just too complex”.

I have no problem with these kinds of articles per se. I’ve enjoyed reading some of them. Especially this one from RAIN, this one from Vixen Labs and this one from Rabbit and Pork.

I just think they can often contain hopes and wishes or general, broad trends, rather than sober predictions of what might actually happen in the next 12 months.

And much of this stuff was predicted last year as well. Siri to open up third party development, anyone?

“There isn’t much in those predictions that wouldn’t have been suggested this time last year.”
David Low, Executive Product Manager, Voice and AI, BBC, incoming CEO, The List

While it’s nice, critical even, to look forward, hope, predict or wish, it’s also good to understand what you should realistically do in 2020 if we continue to have another iterative year.

2020: another iterative year for voice

Perhaps a not so bold prediction (or a bolder prediction), which I subscribe to, is that 2020 will be an iterative year for voice assistants.

That might not be what you want to hear. But think about it. Think about some of the main challenges, like discoverability.

If it was easy to solve, it would have been solved already.

The big players could do something radical to address it, but what are the chances of that? It’s too risky.

In reality, they’ll iterate towards solving these problems over time.

That’s because to address something like discoverability, you need to address the current set-up of the platforms, challenge our app-centric mental model and really consider whether skills are the right solution.

Amazon have far too much invested in skills and too much smart speaker penetration to risk confusing the message or pivoting in a big way. At least not over the next 12 months.

The only platform I could ever see pivoting from the app-centric mental model is Google which, to be honest, already has. Actions aren’t just ‘apps’; anything Google Assistant can do is an action, including performing a web search. That’s how Google can claim to have over a million actions.

Further iteration isn’t really a bad thing

Maybe that’s not such a great prediction for a rapidly evolving space. But so what?

I don’t know why everyone’s obsessed with ‘rapidly evolving’ things anyway. The pace of technological advancement is obviously quick (and quickening), but user behaviour doesn’t change at anywhere near the same pace.

People will change their habits from screens to speaking, for the right kind of things, over time. People won’t break out of 15 years’ worth of mobile conditioning or 20+ years of screen-based, keyboard conditioning in the next 12 months.

The reality is, the platforms are already capable of more than people are using them for. It’s the advancement of user behaviour that we should concentrate on.

What the voice industry should do in the next 12 months  

What should happen in the next 12 months is that all of us in the industry should be doing the best work we possibly can, within the areas we can affect and influence, to give users the best possible experiences that increase their confidence and trust in voice assistants and voice interfaces.

Whether that’s putting voice search into your app, building an Alexa skill or putting a voice bot on your website, do it wherever it makes sense for your users. The main thing should be to provide quality, reliable experiences that do the job they need to, well and consistently, and give users confidence in the medium.

Confidence that’ll build over time.

Confidence that’ll turn into repeat usage, over time.

Confidence that will lead to bigger behavioural changes and unlock this door we’ve been banging on for the last few years.

And for brands, just start. Move the needle. Get off the starting blocks. Just. Move.

With smart speaker penetration being over 20% in the UK and usage rising, reaching your target market with something they value is a real proposition. And that won’t happen on its own.

Maybe no one in your industry has done it yet. Maybe there aren’t any case studies for you to compare.

But, you don’t need to start big, you just need to start. Don’t get caught short like you did with mobile and social.

We already have the tools. We just need to actually use them.

Merry Christmas and all the best for a VUXing epic New Year.

How to create Alexa skills using agile

What’s the best way to approach the development of a #VoiceFirst experience for Alexa, Google Assistant and other voice assistants?

The cool kids would have you use agile and the ‘proper’ project managers would be more comfortable using something called waterfall. But what if there’s another way?

Here I break down how agile development with a waterfall implementation might be the best way for you to a) learn and iterate during development and b) go live with a well rounded, higher quality experience.

Voice strategy: short and long term planning

When it comes to creating a voice strategy, you need a short term plan and a long term plan.

These are the words of Amazon Alexa’s Chief Evangelist, Dave Isbitski, who joined us on the VUX World podcast to discuss how Amazon are advising brands on voice strategy.

In the short term, you’ll be looking to establish a presence on the common voice assistant platforms like Alexa and Google Assistant. You should look to get started by finding a single use case that adds value to both your end user and your business.

That in and of itself can take time. You’ll go through the same motions as you’d go through with any other project. From discovery and feasibility to design and prototyping to production and implementation.

The difference with a voice strategy is that there are a number of fundamentals you’ll need to have in place first, such as figuring out what your brand sounds like.

In the long term, there’ll be things you’d like to do that Alexa or Google Assistant doesn’t support today. Also, the more you embrace voice, the more you’ll start moving towards having voice as an interface to your business across every touchpoint, rather than a presence on a platform. And to get there, it’ll take a shift in strategy, skills, resources, tools and priorities. That takes time.

So start small, crawl before you walk, but have your eyes on the bigger picture.