This week, we’re discussing the latest insights and research in voice SEO and showing you how you can get discovered on Google Assistant, with the MD of Rabbit and Pork, John Campbell.
The Rundown 002: Alexa’s new hardware and dev tools, Google Home Mini becomes top selling smart speaker and more
It’s been a busy few weeks with both of the top two voice assistant platforms announcing new devices and software improvements, but what does it all mean for brands, designers and developers?
Google Home Mini becomes top selling smart speaker
That’s right, the Google Home Mini smart speaker outsold all other smart speakers in Q2.
Google’s intense advertising over the summer months looks like it could be starting to pay off. It still isn’t the market leader, though: Amazon still holds that spot, for now.
At the beginning of this year, Google Assistant was a nice-to-have feature in your voice strategy. Google’s progress over the summer and the recent sales of the Google Home Mini now mean that obtaining a presence on Google Assistant is unavoidable for brands looking to make a serious play in this space.
We discuss whether you should use a tool like Jovo for developing cross-platform voice experiences or whether you should build natively.
Dustin’s pro tip:
If you need access to new feature updates as and when they’re released, you should build natively. If you’re happy to wait, use something like Jovo.
Google rumoured to be launching the Google Home Hub
It’s rumoured that Google will be releasing a smart display to rival the Amazon Echo Show.
In the podcast, we said that this will go on sale in October. That’s not the case. The actual sale date hasn’t been announced yet.
With more voice assistants bringing screens into the equation, designing and developing multi-modal experiences is going to be an increasing area of opportunity over the next year.
Google becomes multi-lingual
Google announced multi-lingual support for Google Assistant. That means that you can speak to the Assistant in a different language and have it respond back to you in that language without having to change the language settings. This is a great feature for households that speak more than one language.
Although this might not be widely used initially, this is a great step forward in providing a frictionless user experience for those who speak more than one language. For brands, this brings the necessity to internationalise your voice experiences closer to home.
Check out the podcast we did with Maaike Dufour to learn more about how to transcreate and internationalise your voice experience.
Amazon announces about a million Alexa devices
Amazon announced a whole host of Alexa enabled devices last week, including:
- Echo Dot V2 and Echo Plus V2
- A new Echo Show (with a 10 inch screen)
- Echo Auto (for the car)
- Echo Sub (a subwoofer)
- Fire TV Recast (a TV set top box)
- An Alexa-injected microwave
- A clock, with Alexa built in
- Echo Input (turns any speaker into a smart speaker)
- A Ring security camera
- A smart plug
- An amp
These new devices, whether they succeed or fail, present opportunities for brands, designers and developers in that they provide an insight into a user’s context. That can help you shape an experience based around that context.
For example, you can now target commuters with long form audio through Alexa while they’re driving. You can provide micro engagement through Alexa while your customer is cooking their rice.
This could be the beginnings of the ‘Alexa Everywhere’ movement, which will be laden with opportunities for those who seek to understand where users are and what they’re seeking to achieve at that time.
Alexa Presentation Language
The Alexa Presentation Language (APL) allows you to design and develop custom visuals to enhance your users’ Alexa experience on screen-equipped devices.
Until now, if you wanted to serve visuals on an Echo Spot or Echo Show, you had to use one of seven design templates. This announcement means that you can create your own designs and even do things like sync visual transitions with audio. In future, there’ll also be support for video and HTML5.
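For a flavour of what APL looks like, here’s a minimal, illustrative document that renders a single line of text (real documents add layouts, styling and data binding, and are returned to the device via a RenderDocument directive):

```json
{
  "type": "APL",
  "version": "1.0",
  "mainTemplate": {
    "parameters": ["payload"],
    "items": [
      {
        "type": "Text",
        "text": "Hello from your skill",
        "textAlign": "center"
      }
    ]
  }
}
```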
As with many of the items in this week’s Rundown, there’s an increasing emphasis on multi-modal experiences. Over the next year or so, expect more voice + screen devices. This will mean that you’ll need to start thinking about how you can add value through visuals as part of your offering.
Kane’s pro tip:
Even though there are more options for voice + screen, still focus on creating voice-first experiences. Don’t let the screen take over. Lead with voice and supplement or enhance with visuals.
Alexa smart screen and TV device SDK
This announcement enables device manufacturers to create hardware with a screen that runs Alexa. For example, Amazon will be announcing details of how Sony has used the SDK to add Alexa capability to its TVs.
For hardware brands, you can now add Alexa to your products. For the rest of us, watch this space. This is yet further evidence to suggest that voice + screen experiences are going to be something users come to expect in future.
Introducing the Alexa Connect Kit (ACK)
ACK allows device manufacturers to add Alexa to their hardware without having to worry about creating a skill or managing cloud services or security.
Essentially, you can add an ACK module to your device, connect it to your microcontroller and, hey presto, you have an Alexa-enabled device.
It’s the same thing Amazon used to build their new microwave.
Another opportunity for hardware brands to add value to your product line and another signal that Alexa will potentially be spreading further and wider. If you haven’t thought about how this might impact your business and the opportunities you might find in future, this is a good time to start that thought process.
Two final Alexa announcements:
Whisper mode, which enables a user to whisper at Alexa and it’ll whisper back.
Hunch, which is Alexa’s first move to become proactive in suggesting things you might want to do based on previous behaviour.
It’s unclear whether either of these features requires developers to mark up their skills in any way or whether Alexa will take care of everything for you.
Bixby opens for public beta in November
Bixby will be opening up for public beta in November after a few months in private beta.
There was a webinar this week, exclusive to the private beta members, which included a host of announcements. I’m still trying to get hold of the webinar or someone who can shed some light on it and we’ll try and bring you further news on this on the next Rundown.
We’re starting a new feature on VUX World: The Rundown. Dustin Coates and I are getting together each week (or bi-weekly) to discuss the recent happenings in the voice space and how that’ll impact designers, developers and brands.
Alexa settings API
We’re starting off by discussing the Amazon Alexa feature that developers have been clamouring for since 2016: the settings API.
With the settings API, you can access the user’s timezone (among other things) and use that within your skill to personalise the voice experience for your users. You can send them targeted push notifications at the appropriate time and use their preferred weather measurement (Celsius or Fahrenheit).
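The timezone itself is fetched through the ASK SDK’s service clients (the exact call varies by SDK version), so here’s just a hedged Python sketch of what you might do with it once you have it: gate push notifications to sociable local hours. The 9am–9pm window is an illustrative assumption.

```python
from datetime import datetime, time, timezone
from zoneinfo import ZoneInfo

QUIET_UNTIL = time(9, 0)   # don't notify before 9am local (assumed window)
QUIET_FROM = time(21, 0)   # ...or after 9pm local

def good_time_to_notify(tz_name, now=None):
    """tz_name is the IANA timezone string the settings API returns,
    e.g. 'Europe/London'. Returns True if it's currently a sociable
    local hour to send a push notification."""
    now = now or datetime.now(timezone.utc)
    local = now.astimezone(ZoneInfo(tz_name)).time()
    return QUIET_UNTIL <= local <= QUIET_FROM
```

The same pattern applies to the other settings: read the user’s preference once per request and branch on it, rather than hard-coding assumptions about locale or units.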
We discuss Eric Olsen’s (3PO Labs) in-depth review of the settings API and how it could be the beginning of something bigger.
Scott Huffman’s 5 insights on voice tech
We also discuss Scott Huffman’s post (VP Engineering, Google Assistant) on the five insights on voice technology and how they should impact your approach. For example, focusing on utilities and understanding what kind of things people use Assistant for at different times of day.
Voysis and Voicebot vCommerce study
We delve into the Voysis and Voicebot study on vCommerce and discuss how voice on mobile is so important, yet how it’s bubbling away under the surface, not grabbing many headlines.
Alexa skills challenge, Storyline and icon creation
Finally, we discuss the latest Alexa Skills Challenge: Games, in-skill purchases on Storyline (check out VUX World with Vasili Shynkarenka, CEO, Storyline) and the new Alexa feature that allows anyone to create icons for their skills.
Today, we’re discussing the Cognilytica Voice Assistant Benchmark 1.0 and its findings on the usefulness and capability of smart speakers.
The folks at Cognilytica conducted a study where they asked Google Assistant, Alexa, Siri and Cortana 100 different questions in 10 categories in an effort to understand the AI capability of the top voice assistants in the market.
What they found, broadly speaking, was a tad underwhelming.
None of the assistants fared too well
Alexa came out on top, successfully answering 25 out of 100 questions and Google Assistant came second with 19. Siri answered 13 and Cortana 10.
The real question is, what does this mean?
Well, if you take a closer look at the kind of questions that were asked, it’s difficult to say that they were helpful. They weren’t typically the kind of questions you’d ask a voice assistant and expect a response to.
Things like: “Does frustrating people make them happy?” and “If I break something into two parts, how many parts are there?” aren’t necessarily common questions that you’d expect a voice assistant to answer.
Granted, they would test whether assistants can grasp the concept of the question. If they can grasp the concept, then perhaps they have the potential to handle more sophisticated queries.
What the study did well was start out with simple questions on Understanding Concepts, then work through more complex questions in areas like Common Sense and Emotional IQ.
The trend, broadly speaking, was that most of the voice assistants were OK with the basic stuff, but flagged when they came up against the more complex questions.
Cortana actually failed to answer one of the Calibration questions: “what’s 10 + 10?”
Slightly worrying for an enterprise assistant!
Google gave the most rambling answers and didn’t answer many questions directly. This is probably due to Google using featured snippets and answer boxes from search engine results pages to answer most queries. Its answers are only as good as the text it scrapes from the top-ranked website for that search.
It’s not a comparison
This benchmark wasn’t intended to be a comparison between the top voice assistants on the market, though it’s hard not to do that when shown the data.
Whether the questions that were asked are the right set of questions to really qualify the capability of a voice assistant is debatable, but it’s an interesting study nonetheless and it’s worth checking out the podcast episode where they run through it in a bit more detail.
This week, we speak to conversation design master, Oren Jacob, about what it takes to create successful conversations with technology.
There are so many complexities in human conversation. When creating an Alexa Skill or Google Assistant Action, most designers try to mimic human conversation. Google itself has taken steps in this direction with the fabricated ‘mm hmm’ moments with Google Duplex.
But does all of this have an actual impact on the user experience? Does it make it better or worse? How natural is natural enough and does it matter?
What other factors contribute to conversation design that works?
PullString CEO and co-founder, Oren Jacob answers all in this week’s episode.
In this episode on conversation design
We get deep into conversation design this week and discuss things like:
- How natural should conversations with voice assistants be?
- Why you shouldn’t just try to mimic human conversation
- The power of voice and what tools designers need to create compelling personas
- Whether you should use the built in text-to-speech (TTS) synthetic voice or record your own dialogue
- How and why writing dialogue is entirely different from writing to be read
- The similarities and differences between making a film and creating a conversational experience on a voice first device
- The limitations and opportunities for improved audio capability and sound design
- The importance of having an equal balance of creative and technical talent in teams
- What it all means for brands and why you should start figuring that out now
Oren Jacob is the co-founder and CEO of PullString. Oren has worked in the space between creativity and technology for two decades.
After spending 20 years working at Pixar on some of the company’s classic films such as Toy Story and Finding Nemo, Oren created ToyTalk.
ToyTalk was a company that allowed kids to interact with their toys through voice.
As voice technology progressed and voice assistants and smart speakers were shaping up to take the world by storm, ToyTalk morphed into PullString, the leading solution to collaboratively design, prototype, and publish voice applications for Amazon Alexa, Google Assistant, and IoT devices.
For over seven years, PullString’s platform, software, and tools have been used to build some of the biggest and best computer conversations in the market, with use cases and verticals as diverse as entertainment, media and healthcare, for brands such as Mattel’s Hello Barbie and Activision’s Destiny 2. It was also used to create the latest in big-ticket skills, HBO’s Westworld: The Maze.
Where to listen
- iTunes/Apple podcasts
- Any other podcast player you use or ask Any Pod to play VUX World on Alexa
Visit the PullString website
Follow PullString on Twitter
Read more about how the Westworld skill was created
Check out the details of the talk Oren will be giving at the VOICE Summit 18
Check out the details of Daniel Sinto’s demo of PullString Converse happening at the VOICE Summit 18
Check out the VOICE Summit website
Translating your Alexa Skill or Google Assistant Action is about more than translating the words in your script. It’s about translating the user experience. Maaike Dufour calls this ‘transcreating’ and she joins us this week to show us how it’s done.
Why should you translate your Alexa Skill or Google Assistant Action?
The world is getting smaller. Technology has enabled us to reach and connect with people from every corner of the earth with ease.
Take this podcast for example. It’s listened to in over 40 different countries, most of which don’t speak English as a first language.
In fact, the vast majority of the world don’t speak English and certainly not as a first language.
Amazon Alexa is global
Amazon Alexa is localised for 11 countries at the time of writing. 5 of them don’t speak English as a first language (France, Germany, Austria, Japan, India).
For global brands, having your Alexa Skill or Google Assistant Action available in every country you do business is a no-brainer. But even for hobbyists and smaller scale developers, think about the population of those countries and the potential impact you could have if your Skill were to do well in those locales.
In this episode
We’re being guided through the importance of making your Alexa Skill or Google Action available in other languages and what steps you should take to make that happen.
We discuss why simply translating your Alexa Skill script won’t work and why you need to recreate the user experience in your desired language.
We cover some of the cultural differences between countries and give some examples of why that makes literal translations difficult. For example, the X-Factor in the UK is a nationally recognised TV show. Whereas, in France, it aired for one season and wasn’t well received. Therefore, referencing the X-Factor in a French Skill is pointless.
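That cultural swap can be made concrete with a minimal Python sketch (the strings are invented for illustration): transcreated responses keyed by locale, where the French welcome replaces the X-Factor reference with Nouvelle Star, its closest French equivalent, rather than translating the English word-for-word.

```python
# Transcreated responses keyed by Alexa locale. The fr-FR line swaps
# the UK cultural reference for one that lands locally, rather than
# translating the English word-for-word.
RESPONSES = {
    "en-GB": {"welcome": "Welcome back! Think you've got the X-Factor?"},
    "fr-FR": {"welcome": "Content de te revoir ! Prêt pour la Nouvelle Star ?"},
}

def get_response(locale, key, default_locale="en-GB"):
    """Look up a transcreated response, falling back to the default locale."""
    return RESPONSES.get(locale, RESPONSES[default_locale])[key]
```

Structuring responses this way keeps every locale’s copy independent, so a transcreator can rewrite a whole locale’s voice without touching the skill’s logic.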
Maaike tells us about how, when transcreating your Alexa Skill, you might even need to change your entire persona due to the differences in how other cultures perceive different personas. For example, in the UK, a postman is simply someone who delivers mail. Whereas, in France, the postman is a close family friend who stops to chat and knows everybody in the street personally. In the UK, the postman is a distant stranger. In France, the postman is a close acquaintance. That makes for two entirely different personas.
We discuss examples of words and phrases that exist in one language but don’t in another and how that can both open up opportunities and sometimes present challenges.
We’re joined by Maaike Dufour, Freelance Conversation UX Designer, co-founder of UX My Bot and supreme transcreator of voice first applications. Maaike, quite rightly, prefers to use the term ‘transcreate’ instead of ‘translate’ because simply translating the words that make up your Alexa Skill or Google Assistant Action won’t work, as you’ll find out in this episode.
Maaike has worked on voice first UX for a number of years. Having worked with the Smartly.ai team, Maaike now works with Labworks.io and is helping the team break into international markets through the transcreation of popular Alexa Skills such as Would You Rather into other languages.
Where to listen
- iTunes/Apple podcasts
- Any other podcast player you use or ask Any Pod to play VUX World on Alexa
This week, we’re finding out how brands can get started and enter the voice first world of smart speakers and digital assistants.
Dr. Pete, Marketing Scientist at Moz, and world-leading SEO oracle, tells all about the voice search landscape, and how you can rank for searches on digital assistants like Google Assistant and Amazon Alexa.
This is a jam-packed episode with deep, deep insights, advice and guidance on all things voice search related. We’ll give you practical ways to compete to be the answer that’s read out in voice first searches, as well as some notions on the current and potential future benefit that could bring.
There are all kinds of stats around voice search, which we’ve touched upon before.
- Gartner predicts that 50% of searches will be voice-based by 2020
- There are already over 1bn voice searches performed per month
With more people using their voice to search, how will that affect search marketers, content creators and brands?
What’s the difference between a voice search and a typed search?
Is there anything you can do to appear in voice search results?
We speak to one of the search industry’s top sources of SEO knowledge, Dr. Pete, to find out.
Getting deep into voice search
In this episode, we’re discussing the differences between voice search on mobile, voice first search on smart speakers and typed search.
We discuss the absence of search engine results pages (SERPs) in a voice first environment and increased competition for the singularity: the top spot in voice search.
We chat about the search landscape, the effect voice is having on search, changing user behaviour and expectations, new search use cases and multi-modal implications, challenges and opportunities.
We get into detail about how voice search works on devices such as Google Assistant and Google Home. This includes debating Google’s knowledge graph and its advantages and disadvantages in a voice first context.
We look at the practicalities of serving search results via voice. This touches on the different types of search results, such as featured snippets, and how voice handles different data formats such as tables. We get into detail about the different types of featured snippets available and how each translate to work (or not work) on voice.
We discuss Dr. Pete’s work and studies in the voice first space including his piece ‘What I learned from 1,000 voice searches‘ and what he found.
We wrap up with some practical tips that you can use right now to start preparing for the influx of voice searches that’ll be hitting the air waves soon and help you start to rank in a voice first environment.
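One concrete step you can experiment with today is schema.org’s speakable structured data, which Google has been piloting as a way for publishers to flag the sections of a page best suited to being read aloud by the Assistant. A minimal JSON-LD sketch (the URL and CSS selectors here are hypothetical):

```json
{
  "@context": "https://schema.org",
  "@type": "WebPage",
  "name": "What is voice search?",
  "url": "https://example.com/what-is-voice-search",
  "speakable": {
    "@type": "SpeakableSpecification",
    "cssSelector": [".article-headline", ".article-summary"]
  }
}
```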
Dr. Pete Myers (a.k.a Dr. Pete a.k.a. the Oracle) is the Marketing Scientist at Moz, the SEO giant and search industry leader.
Dr. Pete has been an influential search marketer since 2012 and has spent years studying Google’s search algorithm, advising clients and the SEO industry on best practice and guiding the industry into the future.
His research and writing on the topic have been helping brands keep on top of the search space and improve their rankings and business performance, and have helped keep Moz at the top of the industry.
Moz has been at the top of the SEO chain since 2004 and is trusted by the whole SEO industry as the place to go for SEO tooling, insights and practical guidance.
- Follow Dr. Pete on Twitter
- Follow Moz on Twitter
- Read Dr. Pete’s ‘What I learned from 1,000 voice searches on Google Home‘
- Read Dr. Pete’s work at Moz
- Read Why you need to prepare for a voice search revolution on Forbes
We’re talking to ex-Googlers, Konstantin Samoylov and Adam Banks, about their findings from conducting research on voice assistants at Google and their approach to building world-leading UX labs.
This episode is a whirlwind of insights, practical advice and engaging anecdotes that cover the width and breadth of user research and user behaviour in the voice first and voice assistant space. It’s littered with examples of user behaviour found when researching voice at Google and peppered with guidance on how to create world-class user research spaces.
Some of the things we discuss include:
- Findings from countless voice assistant studies at Google
- Real user behaviour in the on-boarding process
- User trust of voice assistants
- What people expect from voice assistants
- User mental models when using voice assistants
- The difference between replicating your app and designing for voice
- The difference between a voice assistant and a voice interface
- The difference between user expectations and reality
- How voice assistant responses can shape people’s expectations of the assistant’s full functionality
- What makes a good UX lab
- How to design a user research space
- How voice will disrupt and challenge organisational structure
- Is there a place for advertising on voice assistants?
- Mistakes people make when seeking a voice presence (Hint: starting with ‘let’s create an Alexa Skill’ rather than ‘how will people interact with our brand via voice?’)
- The importance (or lack thereof) of speed in voice user interfaces
- How to fit voice user research into a design sprint
Plus, for those of you watching on YouTube, we have a tour of the UX Lab in a Box!
Konstantin Samoylov and Adam Banks are world-leading user researchers and research lab creators, and founders of user research consultancy firm, UX Study.
The duo left Google in 2016 after pioneering studies in virtual assistants and voice, as well as designing and creating over 50 user research labs across the globe, and managing the entirety of Google’s global user research spaces.
While working as researchers and lab builders at Google and showing companies their research spaces, Konstantin and Adam were often asked whether they could recommend a company to build a similar lab. Upon realising that no such company existed, they set about creating it!
UX Study designs and builds research and design spaces for companies, provides research consultancy services and training, and hires out and sells its signature product, the UX Lab in a Box.
UX Lab in a Box
The Lab in a Box (http://ux-study.com/products/lab-in-a-box/) is an audio and video recording, mixing and broadcasting unit designed specifically to help user researchers conduct reliable, consistent and speedy studies.
It converts any space into a user research lab in minutes and helps researchers focus on the most important aspect of their role – research!
It was born after the duo, in true researcher style, conducted user research on user researchers and found that 30% of a researcher’s time is spent fiddling with cables, setting up studies, editing video and generally faffing around doing things that aren’t research!
Konstantin Samoylov is an award-winning user researcher. He has nearly 20 years’ experience in the field and has conducted over 1000 user research studies.
He was part of the team that pioneered voice at Google and was the first researcher to focus on voice dialogues and actions. By the time he left, just 2 years ago, most of the studies into user behaviour on voice assistants at Google were conducted by him.
It’s likely that Adam Banks has more experience in creating user research spaces than anyone else on the planet. He designed, built and managed all of Google’s user research labs globally including the newly-opened ‘Userplex’ in San Francisco.
He’s created over 50 research and design spaces across the globe for Google, and also has vast experience in conducting user research himself.