Dustin and I sit down for a long-awaited Rundown and chat about the recent Google Assistant announcements from I/O, voice research from Microsoft, and much more.
Google announced a whole host of new features for Google Assistant on day 1 of I/O ’19. Also on day 1, quietly tucked away on the 9th slide of Niomi Makofsky’s presentation, ‘Intro to Google Assistant: Build your first action’, was the casual statement: “1M+ Actions”.
Google Assistant has 1 million actions
That’s right. Google Assistant has over 1 million actions.
AND it’s only been going for three years.
AND that smashes the number of Amazon Alexa skills.
AND Alexa had a two year head start.
Wait, that doesn’t sound right
You’d be right to question how Google has managed to blow the Alexa skill numbers out of the water in such a short period of time. If you’re heavily involved in the voice assistant community, you’ll know how popular Alexa is with developers and how much drive Amazon has behind growing and educating the community.
Although Google Assistant has started to grab its fair share of headline space in the last 12 months, press coverage of Alexa is at worst comparable and, at best, more widespread.
(This is by no means a scientific study, but through looking at my Google Alerts which track articles featuring Alexa and Google Assistant, the number of articles for Alexa typically outweighs the number for Google Assistant.)
Amazon, having constantly engaged with the community, spoken at countless events, hosted and inspired any number of hackathons, sponsored and keynoted conferences and developed a decent online presence, has managed to grow the Alexa skill base to 143,145 skills as of Jan ’19. That’s across all locales globally.
(That number came from adding together all of the skills in all of the locales from this Voicebot report. So it doesn’t account for the inevitable and potentially substantial duplication you’ll find across English-speaking locales, i.e. many US skills are available in the UK and most UK skills are available in the US.)
So five years of hard work has gotten Alexa almost 15% of the number of actions that Google has amassed in three years.
I’m not an Alexa fanboy. I actually think that the announcements Google made at I/O ’19 have put Assistant above Alexa in my estimation.
That said, something doesn’t feel quite right about those numbers.
What does Google class as an action?
I think the ultimate confusion stems from what Google classes as an action vs what Amazon classes as a skill.
It’s helpful to look at Alexa skills first for context.
What is an Alexa skill?
For Alexa, a skill is the capability for the voice assistant to do something such as play music, set reminders or book a train ticket. It’s the equivalent of an app on a smartphone: a purpose-built, custom-made app that runs on Alexa. It could be first party, built by Amazon, such as setting a timer, or third party, built by a developer, such as an interactive story.
What is a Google Assistant action?
Google explain that an action is the ability to extend the capability of Google Assistant. That sounds the same as an Alexa skill. An action is the equivalent of an app, and third party developers can create these apps to add conversational experiences to Google Assistant.
However, if you take a look at the list of available Actions, and scroll through each category, you’ll see that there certainly aren’t 1 million of them there.
There’s more to actions than you think
At I/O’19, in the same talk, Niomi Makofsky, Global Product Partnerships, explained the four ways that content creators and developers can craft a presence on Google Assistant.
In the image above, you’ll see that creating bespoke actions, hand-coded from the ground up (the equivalent of an Alexa skill), is just one of four ways that you can create for Google Assistant.
As well as that, you can:
- Add content using templates (the equivalent of Alexa Blueprints)
- Create app actions and slices (the ability to dip into an app to perform a task or dig data out of an app to surface through Assistant)
- Create smart home features, such as lights and lamps as well as custom smart home objects
On top of that, by using schema.org markup on your site, you can have your web content surfaced on Google Assistant.
For web content, Google Assistant currently supports news, recipes and podcasts. And it just announced at I/O ’19 support for:

- How to
- FAQ
Now we’re getting somewhere.
Actions are not the same as skills, they’re completely different
Think about that. All of the websites in the world that use schema.org markup are indexed and surfaced on Google Assistant.
Every news site that uses it, every recipe site, every podcast site and, now, every how to and FAQ page across the internet that uses this markup is available for Google Assistant.
And every website that uses it is an action in and of itself
Every time you ask Google to play a podcast, that one podcast is one action.
Every time you ask for a recipe, if that recipe is served from a website using schema.org markup, then that individual recipe (or website) is also one individual action.
Every news article (or website) surfaced and, now, every how to and FAQ are all individual actions.
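To make that concrete, here’s a minimal sketch of the kind of schema.org markup involved. This is a hypothetical FAQPage example (the question and answer text are invented for illustration); in practice the JSON-LD would be embedded in a page inside a `<script type="application/ld+json">` tag, which is what lets Google surface the content on Assistant.

```python
import json

# A minimal, hypothetical schema.org FAQPage structure.
# Each Question/Answer pair becomes content that Assistant can surface.
faq_markup = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [
        {
            "@type": "Question",
            "name": "What is a Google Assistant action?",
            "acceptedAnswer": {
                "@type": "Answer",
                "text": "An action extends the capability of Google Assistant.",
            },
        }
    ],
}

# Serialise to the JSON-LD string that would sit in the page's markup.
json_ld = json.dumps(faq_markup, indent=2)
print(json_ld)
```

The point for publishers is that no bespoke conversational code is needed here: the markup alone is what turns the page into Assistant-surfaceable content.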
Google has incorporated its search capability into actions
So, from now on, with Google Assistant, you don’t have to build a specific and bespoke conversational experience to have an action. If your website appears correctly in a search result for the above categories, and that search result is what Google Assistant uses, that’s your action.
And then you’ve got Gmail and Google Calendar and YouTube and the other Google-owned products that will all be counted as actions, too.
Google haven’t confirmed this, but how can this not be the case?
Maybe I’m wrong. This is just my interpretation. I haven’t had this confirmed or denied by Google. But it’s hard to see how Assistant has hit those numbers in that timeframe without this being the case.
And, if that is the case, then Google have not only woven the web and search into the core Assistant capability, but they’ve also provided plenty of new ways for creators to build for Google Assistant without having to code a bespoke action.
Now that’s smart. Incredibly smart, indeed.
Google didn’t disappoint on the voice front at this year’s Google I/O. Here’s a run through of all the Google Assistant related announcements.
Google Duplex comes to the web
Last year, Google announced Google Duplex, an automated assistant that makes phone calls to complete tasks on your behalf. Things like booking restaurant reservations and hair salon appointments. It’s since been rolled out across select parts of America.
This year, at I/O, Google announced that it’s bringing Google Duplex to the web and extending it beyond voice.
In the same way that Duplex on the phone would handle phone calls on your behalf, Duplex for the web will handle web browsing and form-filling.
Here’s an example:
Essentially, it acts as an overlay to a website and uses your Google profile, Gmail, calendar and other sources of information to complete online transactions for you, taking the pain out of the whole form-filling business.
Also, Duplex for the web will run on top of any site and doesn’t require any action from your devs or site admins.
Announcing the ‘next generation assistant’
Google Assistant is stepping up its game, getting quicker and embedding itself deeper into the Android operating system to become even more useful and woven into the whole Google experience.
Google Assistant gets rapid response
Google is known for speed, as we found out when we spoke to ex-Googlers Adam Banks and Konstantin Samoylov on the podcast. Everything Google does is all about increasing speed and reducing latency. So this next announcement is in line with the Google culture.
Essentially, Google has dramatically increased the speed of the AI that powers the Assistant and put it on device, rather than in the cloud. Models that used to take up 100GB of space in the cloud have been shrunk down to 0.5GB, fitting onto a phone and reducing network latency to the point where the response to a query is 10x faster.
Sundar Pichai, CEO, said that the new AI is:
“So fast that tapping to use your phone will seem slow”
One of the frustrating things about most voice assistants is the constant need to say the wake word before you speak.
This is fine for kicking off an interaction, but gets a bit tiresome when you’re interacting constantly over time.
Google is addressing this with ‘continued conversation’. That’s the ability to make several requests at a time or in succession, without having to say ‘Hey Google’ each time.
In the demo video below, continued conversation on mobile rapidly increases the speed at which users can multitask.
Multitask across apps
Being able to shift across tasks and between apps is something that I’ve wanted from Siri for over a year. Being able to hop from your calendar to photos to Evernote and pull information from one source to another would remove all of the messing around I tend to do when trying to work on my phone.
This is essentially what the Assistant will be able to do. It’s the first example since Bixby of a voice assistant being able to control the apps on your phone and move data between them. Now you’ll be able to hop from your messages into your photos, filter your photos and send one to a friend, all with three voice commands.
In the below example, you’ll see a demo of how you can look up flight times without even leaving the app you’re in.
This lays the groundwork for some pretty compelling workflows and could greatly increase productivity.
Being able to compile and send a complete email without touching the screen is a challenge. I’ve struggled with Siri for a while and it’s not great.
With complex tasks, Google Assistant will be able to do things like understand that when you say ‘set subject line to xyz’ that you mean change the actual subject line of the email and not that you want the words ‘set subject line…’ to appear in the message.
When will these be released?
All of these features are coming to new Pixel phones later this year.
Google has also done further work to make the Assistant more personal and understand you better.
Picks for you
With picks for you, it’ll use previous behaviour to tailor recommendations on recipes, podcasts and events. That means that when you ask it to recommend a meal for tonight, you’ll get a different response than your partner or friend would.
This is fairly simple stuff, but it might be the groundwork for a truly personal and proactive assistant that acts on what it understands of you, rather than generic or mass user trend-based data.
This’ll be launching on smart displays for recipes, podcasts and events this summer.
With personal references, you’ll be able to ask for personal things like the ‘directions to mum’s house’ and the Assistant will know that you’re talking about your actual mother, as opposed to the cafe down the street with the same name.
It uses something called ‘reference resolution’ to understand the difference between the two.
The settings for your personal references can all be amended and added in the You tab in Assistant settings.
Driving mode lets you do the kind of things you’d expect, such as get directions and play music, but it has some great quirks such as allowing you to continue playing podcasts from where you left off before you got into the car.
For things like incoming calls, rather than taking up the entire screen, notifications appear at the bottom of the interface in driving mode, keeping the focus and driver attention on the map, rather than the distraction.
The new interface also displays shortcuts on launch to things like food reservations, podcasts, contacts and suggested music.
It’ll be available this summer on Android.
Stopping alarms, no ‘Hey Google’ required
You can also now stop your alarms and timers by saying ‘stop’, no ‘Hey Google’ needed.
That’ll be rolling out in all English-speaking locales from today.
Pushing it forward
These are all great announcements and a clear sign that Google is taking Assistant extremely seriously. Having the Android operating system and a network of services like Gmail, calendars, YouTube, podcasts etc. means that Google is able to thread all of these together to create a seamless, quick and convenient experience.
If it’s all as good in practice as it looks in the demos, I’ll honestly be ditching the iPhone for the new Pixel this summer.
This week, Dustin Coates and Kane Simms are joined by Nick Carey of Potato to discuss the concept of creating an assistant on an assistant.
This week, Dustin and I are joined by journalist and author, James Vlahos, to discuss the details of his book Talk to Me: How voice computing will transform the way we live, work and think.
How we made Hidden Cities Berlin with Nicky Birch, Michelle Feuerlicht and Nigel James Brown
In this episode, we take a deep dive into the creation of the world’s first voice-first interactive documentary: Hidden Cities Berlin for Google Assistant.
All about voice content management and localisation with Milkana Brace and Jonathan Burstein
Today, we’re discussing why you should separate voice app content from your code and logic with Jargon founders, Milkana Brace (CEO) and Jonathan Burstein (CTO).
This week, we’re discussing the latest insights and research in voice SEO and showing you how you can get discovered on Google Assistant, with the MD of Rabbit and Pork, John Campbell.
The Rundown 002: Alexa’s new hardware and dev tools, Google Home Mini becomes top selling smart speaker and more
It’s been a busy few weeks with both of the top two voice assistant platforms announcing new devices and software improvements, but what does it all mean for brands, designers and developers?
Google Home Mini becomes top selling smart speaker
That’s right, the Google Home Mini smart speaker outsold all other smart speakers in Q2.
Google’s intense advertising over the summer months looks like it could be starting to pay off. It still isn’t the market leader. Amazon still holds that spot, for now.
At the beginning of this year, Google Assistant was a nice-to-have feature in your voice strategy. Google’s progress over the summer and the recent sales of the Google Home Mini now mean that obtaining a presence on Google Assistant is unavoidable for brands looking to make a serious play in this space.
We discuss whether you should use a tool like Jovo for developing cross-platform voice experiences or whether you should build natively.
Dustin’s pro tip:
If you need access to new feature updates as and when they’re released, you should build natively. If you’re happy to wait, use something like Jovo.
Google rumoured to be launching the Google Home Hub
It’s rumoured that Google will be releasing a smart display to rival the Amazon Echo Show.
In the podcast, we said that this will go on sale in October. That’s not the case. The actual sale date hasn’t been announced yet.
With more voice assistants bringing screens into the equation, designing and developing multi-modal experiences is going to be an increasing area of opportunity over the next year.
Google becomes multi-lingual
Google announced multi-lingual support for Google Assistant. That means that you can speak to the Assistant in a different language and have it respond back to you in that language without having to change the language settings. This is a great feature for households that speak more than one language.
Although this might not be widely used initially, this is a great step forward in providing a frictionless user experience for those who speak more than one language. For brands, this brings the necessity to internationalise your voice experiences closer to home.
Check out the podcast we did with Maaike Dufour to learn more about how to transcreate and internationalise your voice experience.
Amazon announces about a million Alexa devices
Amazon announced a whole host of Alexa enabled devices last week, including:
- Echo Dot V2 and Echo Plus V2
- A new Echo Show (with a 10-inch screen)
- Echo Auto (for the car)
- Echo Sub (a subwoofer)
- Fire TV Recast (a TV set top box)
- An Alexa-injected microwave
- A clock, with Alexa built in
- Echo Input (turns any speaker into a smart speaker)
- A Ring security camera
- A smart plug
- An amp
These new devices, whether they succeed or fail, present opportunities for brands, designers and developers in that they provide an insight into a user’s context. That can help you shape an experience based around that context.
For example, you can now target commuters with long form audio through Alexa while they’re driving. You can provide micro engagement through Alexa while your customer is cooking their rice.
This could be the beginnings of the ‘Alexa Everywhere’ movement, which will be laden with opportunities for those who seek to understand where users are and what they’re seeking to achieve at that time.
Alexa Presentation Language
The Alexa Presentation Language allows you to design and develop custom visuals to enhance your user’s screen-accompanying Alexa experience.
Until now, if you wanted to serve visuals on an Echo Spot or Echo Show, you’d have to use one of seven design templates. This announcement means that you can create your own designs and even do things like sync visual transitions with audio and, in future, there’ll be support for video and HTML5.
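For a feel of what that looks like in practice, here’s a minimal sketch of an APL document and the directive a skill returns to render it. The layout, text and token values are invented for illustration; real documents can be far richer, but the basic shape (a top-level `mainTemplate` describing the visuals, delivered via a `RenderDocument` directive) is as shown.

```python
import json

# A minimal, hypothetical APL document: one centred Text component.
apl_document = {
    "type": "APL",
    "version": "1.0",
    "mainTemplate": {
        "parameters": ["payload"],
        "items": [
            {
                "type": "Text",
                "text": "Hello from a custom APL layout",
                "fontSize": "50dp",
                "textAlign": "center",
            }
        ],
    },
}

# The skill returns the document inside a RenderDocument directive
# in its response, which displays it on an Echo Show or Echo Spot.
directive = {
    "type": "Alexa.Presentation.APL.RenderDocument",
    "token": "exampleToken",
    "document": apl_document,
}

print(json.dumps(directive, indent=2))
```

Compared with the old fixed templates, the design work shifts to you: the document is yours to structure, which is what makes custom visuals and audio-synced transitions possible.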
As with many of the items in this week’s Rundown, there’s an increasing emphasis on multi-modal experiences. Over the next year or so, expect more voice + screen devices. This will mean that you’ll need to start thinking about how you can add value through visuals as part of your offering.
Kane’s pro tip:
Even though there are more options for voice + screen, still focus on creating voice-first experiences. Don’t let the screen take over. Lead with voice and supplement or enhance with visuals.
Alexa Smart Screen and TV Device SDK
This announcement enables device manufacturers to create hardware with a screen that runs Alexa. For example, Sony has used the SDK to add Alexa capability to its TVs.
For hardware brands, you can now add Alexa to your products. For the rest of us, watch this space. This is yet further evidence to suggest that voice + screen experiences are going to be something users come to expect in future.
Introducing the Alexa Connect Kit (ACK)
ACK allows device manufacturers to add Alexa to their hardware without having to worry about creating a skill or managing cloud services or security.
Essentially, you can add an ACK module to your device, connect it to your microcontroller and hey presto, you have an Alexa-enabled device.
It’s the same thing Amazon used to build their new microwave.
This is another opportunity for hardware brands to add value to their product lines and another signal that Alexa will potentially be spreading further and wider. If you haven’t thought about how this might impact your business and the opportunities you might find in future, this is a good time to start that thought process.
Two final Alexa announcements:
Whisper mode, which enables a user to whisper to Alexa and have it whisper back.
Hunches, Alexa’s first move toward proactively suggesting things you might want to do based on previous behaviour.
It’s unclear whether either of these features requires developers to mark up their skills in any way or whether Alexa will take care of everything for you.
Bixby will be opening up for public beta in November after a few months in private beta.
There was a webinar this week, exclusive to the private beta members, which included a host of announcements. I’m still trying to get hold of the webinar or someone who can shed some light on it and we’ll try and bring you further news on this on the next Rundown.