Voice with X-Ray vision?

Ben McCulloch
March 24, 2022
in Article, Opinion

Learn how Disruptel is disrupting TV content consumption with voice assistant capabilities that can see what you’re watching.

Hang up your pipe, Sherlock

Imagine you’re watching Elementary on your TV and you love the jacket Lucy Liu is wearing. You’d buy it if you could… but how would you find that jacket?

Your phone’s probably already in your hand, so you become Sherlock. You search ‘Lucy Liu jacket Elementary TV’. Where would that data even be? How would you refine that search? You’d add the season or episode number if you knew it – which you probably don’t.

It’s going to take a lot of time to search, and you’ll possibly never find it. Plus, your attention will be away from the TV.

So, what if you could just ask your TV “what jacket is Watson wearing?” and see the answer appear on screen? (Lucy Liu plays Dr. Joan Watson in the show.)

You get to keep watching – your attention never leaves the TV – and you get what you need. Friction is reduced.

That’s what Disruptel aims to deliver.

X-Ray vision

So what’s the big idea?

Disruptel has trained machine learning models to see people and objects within TV content. This means you can ask specific questions about what you see. A voice assistant that can see!

It’s like their system has x-ray vision that can see the metadata attached to what’s on the screen, in real time. If you ask a question, Disruptel uses a combination of metadata, knowledge graphs, computer vision and more, to find the answer – who’s the actor, what’s the show, what’s the jacket and so on. It can even recognise animals and skylines so you could ask “what breed of dog is that?” or “what city is that?”

Giving context to voice tech

But hold on, what’s the bigger idea?

Voice assistants don’t do context well. If you ask a voice assistant “tell me who that is”, chances are the voice assistant will be clueless.

Disruptel has added eyes to the voice assistant. Suddenly it has a much better chance of understanding context. Asking “tell me who that is” can now receive the correct answer “that’s Lucy Liu”. Even more compelling: “who’s that guy on the left?”
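To make that idea concrete, here is a minimal sketch of how a vague question might be matched against what is on screen. It is purely illustrative and is not Disruptel’s implementation: the `Detection` structure, the `resolve_reference` function, the position values and the placeholder name “Actor B” are all assumptions invented for this example, and a real system would presumably rely on far richer language understanding than keyword matching.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Detection:
    """One recognised entity in the current frame (hypothetical structure)."""
    kind: str        # e.g. "person" or "dog"
    name: str        # label assigned by an assumed vision model
    x_center: float  # horizontal position: 0.0 = left edge, 1.0 = right edge


def resolve_reference(question: str, detections: List[Detection]) -> Optional[str]:
    """Pick the on-screen person a vague question most plausibly refers to."""
    people = [d for d in detections if d.kind == "person"]
    if not people:
        return None
    q = question.lower()
    if "left" in q:
        # "on the left" -> the person closest to the left edge
        return min(people, key=lambda d: d.x_center).name
    if "right" in q:
        # "on the right" -> the person closest to the right edge
        return max(people, key=lambda d: d.x_center).name
    # No positional cue: assume the person nearest the centre of the frame
    return min(people, key=lambda d: abs(d.x_center - 0.5)).name


# Made-up detections for a single frame
frame = [
    Detection("person", "Actor B", 0.25),
    Detection("person", "Lucy Liu", 0.55),
    Detection("dog", "unknown", 0.90),
]

print(resolve_reference("who's that guy on the left?", frame))
print(resolve_reference("tell me who that is", frame))
```

The point of the sketch is simply that once the assistant knows who is on screen and where, a question like “on the left” stops being ambiguous.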

Alex Quinn, CEO of Disruptel, says, “We call this the world’s first voice assistant that can see.”

Do you think this is only useful for searching Watson’s jacket on TV?

Alex says, “this is really helpful for a lot of things. But starting out, I think that TV and entertainment content is a great domain and stepping stone for us.”

Enter the metaworld (no, not that one)

The Disruptel system looks at video to recognise specific people, places, brands, and even dog breeds. Why limit that tech to TV?

Imagine being able to look at anything in the world and ask about it. THAT’s the potential of Disruptel’s technology.

Alex, when speaking on the VUX World podcast: “At the end of the day, our computer vision systems have been trained on people, objects, products… [whether it’s] recognizing on the screen or in the real world – really makes no difference”.

Imagine you’re walking in busy city streets. You see something that interests you, so you ask about it. You listen to the response while you walk. That’s an incredible and frictionless future, one that even Amazon and Google have tried to bring about without any great success so far.

Finding the balance between trivia and sales

Disruptel will monetise through inbound advertising: matching what you ask for with a brand that can sell it to you. They’ll serve up specific ads based on what’s in the show and what the user asks.

How carefully the advertising model will be incorporated remains to be seen. When we watch TV, we’re used to adverts, but having one pop up right in the middle of your favourite show might be a bit too much. Disruptel aims to have the advertising work more like pay-per-click (PPC), with ads served when the viewer asks about something.

There’s a challenge here: communicating to users specifically what they can ask about in a given scene. Raising awareness of what voice assistants can do is a long-standing challenge.

Thinking further ahead, to when the full promise of this technology could be realised in the real world: can you imagine being directed to a pet shop just because you saw a cute pup in the park and asked what breed it was? Striking a balance between inbound and interruption will be key.

Want to learn more about the future of voice assistants that can see? Check out the full episode with Alex Quinn on the VUX World podcast.

This article was written by Benjamin McCulloch. Ben is a freelance conversation designer and an expert in audio production. He has a decade of experience crafting natural sounding dialogue: recording, editing and directing voice talent in the studio. Some of his work includes dialogue editing for Philips’ ‘Breathless Choir’ series of commercials, a Cannes Pharma Grand-Prix winner; leading teams in localizing voices for Fortune 100 clients like Microsoft, as well as sound design and music composition for video games and film.
