Assuming Direct Control. Voice Assistants and Connected Devices.


Today, voice assistants have nailed down information retrieval quite well: what’s the weather today, how many days till Christmas, and so on. But when it comes to controlling devices, we start to run into limitations centered around time, safety, and security. A Wi-Fi router typically needs to restart in order to apply configuration changes. Allowing voice to change or remove passwords can lead to abuse and household confusion.

A connected faucet may need time before the water reaches the right temperature, and allowing voice to turn it on does not seem like a good idea. Tackling these constraints is an exercise in understanding user intent and its implications. Solutions vary across applications, but I’ve rounded up some notes and guidelines in this post.

Reflect Time

Not every intent can finish in an instant. It’s important to have asynchronous speech responses and support for follow-up responses. An intent that changes an oven’s preheat temperature may follow up later with a notice that the temperature has been reached. Even though the oven itself may also signal this, the user may only be within reach of the voice assistant.
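As a rough illustration of the oven example, here is a minimal Python sketch of an intent handler that answers immediately and follows up later. The function name, the `read_temp_f` and `speak` callbacks, and the polling approach are all assumptions made for this sketch and do not come from any particular assistant SDK.

```python
import asyncio
from typing import Callable


async def handle_preheat_intent(
    target_f: int,
    read_temp_f: Callable[[], int],  # hypothetical: returns the oven's current temperature
    speak: Callable[[str], None],    # hypothetical: sends speech to the voice assistant
    poll_seconds: float = 5.0,
) -> asyncio.Task:
    """Acknowledge right away, then follow up once the target is reached."""
    # Immediate response, so the user is not left waiting in silence.
    speak(f"Preheating the oven to {target_f} degrees.")

    async def follow_up() -> None:
        # Poll until the oven reports the target temperature.
        while read_temp_f() < target_f:
            await asyncio.sleep(poll_seconds)
        speak(f"The oven has reached {target_f} degrees.")

    # Run the follow-up in the background and hand the task back to the caller.
    return asyncio.create_task(follow_up())
```

A real handler would likely also need a timeout, so the follow-up does not wait forever if the oven never reports the target temperature.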

Manage Downtime

If the device needs to go offline for an extended period, what are the implications? If the Wi-Fi router has to reboot, the voice assistant most likely does not have a connection. Essentially the entire connected home goes offline for a minute or two. This adds complexity to managing the state of the user’s experience. On the other hand, a security camera rebooting may not be as big of a deal.

Security Adds Friction

Some apps require logging in to a third-party service, using OAuth for example. Beyond this, will any specific intents require another layer of authorization? With a smart lock, you don’t want to expose the unlock function, out of concern that someone outside the premises could use it.

There are some options to secure control. A PIN or passphrase may help but is vulnerable to eavesdropping. Emulating two-factor-style authentication adds a lot of friction. It is usually better to just situate such controls within a mobile app. Voice recognition is another promising solution, but it is not entirely foolproof either.
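One way to picture the hand-off to a mobile app is a handler that still recognizes a sensitive intent but declines to run it by voice. The sketch below is only an assumption of how that might look; the intent names, the `APP_ONLY_INTENTS` set, and the response strings are hypothetical.

```python
from typing import Callable, Dict, Tuple

# Hypothetical intent names; which intents count as sensitive is a product decision.
APP_ONLY_INTENTS = {"unlock_door"}


def handle_intent(
    intent: str, actions: Dict[str, Callable[[], None]]
) -> Tuple[bool, str]:
    """Return (executed, speech) for a recognized voice intent."""
    # Sensitive intents are never executed by voice; point the user to the app.
    if intent in APP_ONLY_INTENTS:
        return False, "For security, please use the mobile app to do that."

    action = actions.get(intent)
    if action is None:
        return False, "Sorry, I can't do that."

    action()
    return True, "Okay, done."
```

With this shape, "lock the door" runs by voice while "unlock the door" gets a spoken redirect instead of a silent failure.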

Safeguards

Never underestimate how users will find ways to utilize controls. Probing how voice controls and a device can be used, appropriately or not, helps plan out safeguards. It doesn’t have to be 100% foolproof, but it should have enough protections against misuse and abuse. For example, a connected faucet might only run for five minutes unless a timer is specified.
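The faucet safeguard could be sketched roughly as follows. The five-minute default comes from the example above; the upper cap on user-specified timers and all names are assumptions added for illustration.

```python
from typing import Optional

DEFAULT_RUN_SECONDS = 5 * 60   # from the example: five minutes when no timer is given
MAX_RUN_SECONDS = 15 * 60      # assumption: a hard cap even when a timer is specified


def resolve_run_seconds(requested_seconds: Optional[int] = None) -> int:
    """Decide how long the faucet may run for a voice request."""
    if requested_seconds is None:
        # No timer specified: fall back to the safe default.
        return DEFAULT_RUN_SECONDS
    if requested_seconds <= 0:
        raise ValueError("timer must be a positive number of seconds")
    # Honor the user's timer, but never beyond the hard cap.
    return min(requested_seconds, MAX_RUN_SECONDS)
```

The point of the default is that a forgotten or accidental "turn on the faucet" shuts itself off without any further user action.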

Some Things… Just Don’t

Not every API method available to the device needs to be exposed to the voice assistant app. We need to carefully consider whether each one adds to the user experience and what tradeoffs it brings. Use cases and user feedback can further validate needs. Ultimately, a user should have no business changing a Wi-Fi password using voice.
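In practice this can mean a default-deny allowlist: a device method only becomes a voice intent if someone deliberately adds it. The method names below are hypothetical and stand in for whatever a router API might offer.

```python
from typing import Iterable, List, Set

# Hypothetical methods a router API might offer.
ROUTER_API_METHODS = [
    "get_connected_devices",
    "pause_guest_network",
    "reboot",
    "set_wifi_password",
]

# Only methods listed here are ever registered as voice intents.
VOICE_ALLOWLIST: Set[str] = {"get_connected_devices", "pause_guest_network"}


def voice_exposed_methods(
    api_methods: Iterable[str], allowlist: Set[str] = VOICE_ALLOWLIST
) -> List[str]:
    """Keep only the API methods explicitly approved for voice control."""
    return [method for method in api_methods if method in allowlist]
```

New API methods added later stay unavailable to voice until they are reviewed and added to the allowlist.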

Voice control is a fantastic option in many cases. For media playback, I love that there’s less panic about a misplaced remote. I would certainly like to see more smart home devices expand control options when paired with voice assistants. However, there are many implications in safety and security that can shift the balance between convenience and friction. Hopefully, as we progress, more techniques and methods will be developed to enable more of these control possibilities.