I’m not much of a cook, but the few times I’ve asked Google Assistant on my Nest Mini to start a timer in the kitchen, the results have been hit or miss. All too often, the timer disappears into a void and Google can’t tell me how many minutes are left. Other times, it takes multiple attempts to set it properly because Assistant struggles to understand context.
Those problems (and a few others) are about to be resolved. Google’s latest update to its voice assistant, which begins rolling out today, greatly improves its contextual understanding when you’re asking it to perform a task like setting an alarm or a timer. Included in this update is another fix sure to be welcome for anyone who uses voice commands to manage calls and texts: You can finally teach Assistant how to properly pronounce your friend or family member’s name.
Context Is Key
If you’ve interacted with a voice assistant, there’s a good chance you’ve changed the specifics of your command mid-sentence. “Hey Google, set a timer for 20—no, 10 minutes.” Until now, Assistant would’ve likely named your 10-minute timer “20, no.” With the latest update, it understands you made a mistake, and that you just want 10 minutes on the clock.
You’ve also been able to control multiple timers at once with Google Assistant for some time, but if you wanted to cancel one of them, that required some annoying back and forth. Assistant is now much faster at identifying which timer you want to cancel. And if you gave a timer a name, like “eggs boiling,” and then said, “Cancel my egg timer,” the old Assistant wouldn’t understand what you were talking about, because the names didn’t match. The new update corrects that.
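To see why an exact-match lookup fails on “egg timer” versus “eggs boiling,” here is a minimal sketch of a more forgiving match. It is a toy word-overlap heuristic written for illustration, not Google’s method (the article attributes the real fix to new language-understanding models), and the names `normalize` and `find_timer` are hypothetical.

```python
from typing import List, Optional, Set

# Words that carry no information about which timer is meant (an assumption
# made for this toy example).
STOP_WORDS = {"cancel", "my", "the", "timer"}


def normalize(text: str) -> Set[str]:
    """Lowercase the text, drop filler words, and strip a trailing 's'."""
    words = text.lower().split()
    return {w[:-1] if w.endswith("s") else w for w in words if w not in STOP_WORDS}


def find_timer(request: str, timer_names: List[str]) -> Optional[str]:
    """Return the timer name sharing the most words with the request, if any."""
    wanted = normalize(request)
    best = max(timer_names, key=lambda name: len(wanted & normalize(name)), default=None)
    if best is None or not wanted & normalize(best):
        return None  # nothing overlaps, so ask the user to clarify instead
    return best


timers = ["eggs boiling", "pasta"]
match = find_timer("Cancel my egg timer", timers)
no_match = find_timer("Cancel my rice timer", timers)
```

An exact string comparison would reject “egg” against “eggs boiling”; comparing normalized words lets the request land on the intended timer.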
With alarms, if you asked Google Assistant to move a previously scheduled alarm an hour later, it sometimes misconstrued the request and set a new alarm for one hour from the time you asked instead. Now it understands you were referencing a scheduled alarm, and it will make the adjustment properly.
The updated timer and alarm functions are available on screenless Assistant devices today (like Nest speakers) and will be coming to phones and smart displays at a later date.
These improvements come from a ground-up redesign of the system Assistant uses for natural language understanding. Amarnag Subramanya, a distinguished engineer at Google who leads the NLU and Conversational AI teams on Google Assistant, says it allows for far more natural conversations between us humans and our nonhuman helpers.
“Today, when people want to talk to any digital assistant, they’re thinking about two things: what do I want to get done, and how should I phrase my command in order to get that done,” Subramanya says. “I think that’s very unnatural. There’s a huge cognitive burden when people are talking to digital assistants; natural conversation is one way that cognitive burden goes away.”
Making conversations with Assistant more natural means improving its reference resolution—its ability to link a phrase to a specific entity. For example, if you say, “Set a timer for 10 minutes,” and then say, “Change it to 12 minutes,” a voice assistant needs to understand and resolve what you’re referencing when you say “it.”
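The timer example above can be sketched in a few lines. This is a deliberately simple illustration that resolves “it” to whatever entity was mentioned most recently; it is an assumption for teaching purposes, not a description of how Assistant’s models work, and the `DialogueState` class and its methods are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Timer:
    minutes: int


class DialogueState:
    """Toy conversation context that remembers the last entity mentioned."""

    def __init__(self) -> None:
        self.timers: List[Timer] = []
        self.last_mentioned: Optional[Timer] = None

    def set_timer(self, minutes: int) -> Timer:
        """Handle 'Set a timer for N minutes' and remember the new timer."""
        timer = Timer(minutes)
        self.timers.append(timer)
        self.last_mentioned = timer
        return timer

    def change_it(self, minutes: int) -> Timer:
        """Handle 'Change it to N minutes' by resolving 'it' to the last entity."""
        if self.last_mentioned is None:
            raise ValueError("Nothing has been mentioned yet, so 'it' is ambiguous")
        self.last_mentioned.minutes = minutes
        return self.last_mentioned


state = DialogueState()
state.set_timer(10)   # "Set a timer for 10 minutes"
state.change_it(12)   # "Change it to 12 minutes"
```

Without the remembered context, the second command would have nothing to attach to; with it, the existing timer is updated rather than a second one being created.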
The new NLU models are powered by machine-learning technology, specifically bidirectional encoder representations from transformers, or BERT. Google unveiled this technique in 2018 and applied it first to Google Search. Early language understanding technology used to deconstruct each word in a sentence on its own, but BERT processes the relationship between all the words in the phrase, greatly improving the ability to identify context.
One example of how BERT improved Search is the query “Parking on hill with no curb.” Before, the results still covered parking on hills with curbs. After BERT was enabled, Google searches offered up a website that advised drivers to point wheels to the side of the road. BERT hasn’t been problem-free though. Studies by Google researchers have shown that the model has associated phrases referring to disabilities with negative language, prompting calls for the company to be more careful with natural language processing projects.
With BERT models now employed for timers and alarms, Subramanya says Assistant is able to respond to related queries, like the aforementioned adjustments, with almost 100 percent accuracy. This superior contextual understanding doesn’t work everywhere just yet, though. Google says it’s slowly working on bringing the updated models to more tasks like reminders and controlling smart home devices.
William Wang, director of UC Santa Barbara’s Natural Language Processing group, says Google’s improvements are radical, especially since applying the BERT model to spoken language understanding is “not a very easy thing to do.”