Voice Assistants Are Ready To Name Names

Graham Lovelace, asi’s Media Technologies Director, looks at a major breakthrough for voice assistants and assesses the opportunities this presents

We technophiles are forgiving folk, particularly when it comes to emerging technologies. There’s a tendency to overlook flaws and talk up the potential. We brush off negative reviews and seize on optimistic forecasts. We look at a new device with excitement and see disruption, rather than a dud.

Right now the technology that’s exciting many is artificial intelligence (AI) and machine learning, the wizardry behind advanced search (understanding natural language, even when misspelt) and speech recognition (processing spoken words, even in heavy accents). The technology learns and improves as it goes along, just like a child learning to talk, or read, or ride a bike.

IT research firm Gartner puts AI and machine learning at the top of its strategic technology trends for 2017. Machine learning (that part of AI that makes machines smarter) is also running high in another Gartner chart. It’s at the summit of the Peak of Inflated Expectations, the perilous pinnacle of the Gartner Hype Cycle reserved for technologies that have enjoyed early success but now need to move beyond easily forgiving early adopters.

There’s a special place reserved for companies that fail to do this, a dark, inhospitable place that Gartner calls the Trough of Disillusionment. ‘Some companies take action; many do not,’ says Gartner. This week we saw such an action in a rapidly emerging new consumer electronics device category, a move that I believe will propel AI and machine learning firmly towards mass consumer take-up and usage. To explain what it was, and its significance, we need to take a few steps back.

OK Google, what’s a voice assistant?

Voice-activated assistants are smart speakers made by Amazon (Amazon Echo, powered by its Alexa voice agent) and Google (Google Home, powered by Google Assistant). They answer queries (using speech recognition software and machine learning), play music, buy items and control security, heating and lighting settings (so they are part of a wider ‘internet of things’, or IoT, trend).

The television industry – everyone from TV manufacturers to broadcasters, pay-TV providers, content producers, advertising agencies and brands – is watching this development closely. Voice could become the new interface, replacing the grid-like electronic programme guide (EPG) for programme discovery and navigation, and the TV remote for control.

There’s the prospect of new interactive content formats: Amazon Echo offers ‘skills’, voice-activated apps produced by third-party developers. Amazon and Google are reportedly looking to add phone functionality to their devices, so users can connect with other users (or groups of like-minded users) as well as media owners and content producers, creating a form of ‘responsive media’. There is also the prospect of new advertising formats, possibly linked to television advertising. But an early example of this recently exposed a major weakness with current devices, and threatened an early slide into the dreaded Trough.

Telling whoppers

TV programmes and ads have inadvertently triggered voice assistants. In January a US TV news report about a girl who ordered a doll’s house via Amazon Echo prompted complaints after the news presenter said ‘I love the little girl saying “Alexa order me a doll house”,’ activating devices which then accidentally made purchases. In February a Google Home ad during the Super Bowl woke devices when characters in it repeatedly used the trigger words ‘Okay Google’.

But the prize for the first ad to trigger voice assistants intentionally went to Burger King last week. In the 15-second ad a crew member said he didn’t have time to explain what went into Burger King’s signature sandwich, before leaning into the camera and asking: ‘Okay Google, what is the Whopper burger?’ Activated devices across America then started to read out a Wikipedia page, which pranksters soon edited to say it was ‘made with 100% rat and toenail clippings’. Nice. Google – which hadn’t been in on the hijack – rushed to programme Google Home to ignore the TV ad before irritated consumers turned off their devices for good.

The ad highlighted a weakness: voice assistants cannot recognise specific voices, such as those of their owners, so they are constantly open to being hijacked by anyone or anything near them uttering the trigger words. Until now.

I recognise you …

Google has announced a major tweak to Google Home that will revolutionise the category, and create new opportunities for media research. As of this week in the US, and soon in the UK, up to six people using the same Google Home can create individual accounts linked to their calendars, and Google’s vast memory that stores their media tastes and preferences. Six is sufficient for most families.

It means the device will not only recognise what is being said, but also who is saying it, and respond with the name of the person it is talking to. The set-up simply requires each family member to say ‘Okay Google’ and ‘Hey Google’ two times each – enough for Google’s neural network to detect specific characteristics of a person’s voice, and then match it in milliseconds. The result? Cue the ad.
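For readers who like to see the idea in code, here is a minimal sketch of how enrol-then-match voice recognition could work in principle. It is not Google's implementation, which has not been published in this detail. It assumes each spoken phrase has already been turned into a list of numbers (a 'voiceprint'), and the names VoiceProfiles, enrol, identify and the 0.8 threshold are all hypothetical, chosen for illustration only.

```python
import math

MAX_PROFILES = 6        # the per-device limit mentioned in the article
MATCH_THRESHOLD = 0.8   # assumed cut-off for illustration, not a published value


def cosine_similarity(a, b):
    """Return how closely two voiceprints point in the same direction (1.0 = identical)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class VoiceProfiles:
    """Hypothetical store of enrolled voiceprints, one per household member."""

    def __init__(self):
        self._profiles = {}

    def enrol(self, name, samples):
        """Average a person's enrolment samples into a single stored voiceprint."""
        if name not in self._profiles and len(self._profiles) >= MAX_PROFILES:
            raise ValueError("device already holds the maximum number of profiles")
        if not samples:
            raise ValueError("at least one enrolment sample is required")
        length = len(samples[0])
        self._profiles[name] = [
            sum(sample[i] for sample in samples) / len(samples)
            for i in range(length)
        ]

    def identify(self, sample):
        """Return the best-matching enrolled name, or None if nobody is close enough."""
        best_name, best_score = None, MATCH_THRESHOLD
        for name, profile in self._profiles.items():
            score = cosine_similarity(sample, profile)
            if score >= best_score:
                best_name, best_score = name, score
        return best_name
```

In this sketch, a voice that resembles no enrolled profile (a TV ad, say) comes back as None, so the device could decline to act on personal requests rather than treat the speaker as a family member.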

This is an important breakthrough for Google, one that gives it an edge in the race to catch up with Amazon – though we can expect Echo devices will soon offer a similar level of individual voice recognition.

It opens the potential for voice-activated services to become far more personalised: recommendations of what to watch on TV will be tailored to the person asking.

New insights could be gained into how families decide what to watch; each member’s individual viewing requests could, in theory, be analysed, as well as the link between TV advertising and voice-activated search for advertised brands and, ultimately, purchases.

And there’s also the prospect of passive analysis of what is being said while people watch TV shows and ads: these devices potentially hear everything being said near them.

So new insights and new analytics galore. Now that’s something worth talking about, hey Google?

Originally posted by Graham Lovelace at asi
24th April 2017

Follow Graham on Twitter for daily tweets on the future of TV: @glovelace