Voice and Multimodal Product Design

Designing Discoverable Voice User Interfaces

What makes a digital product easy to use?

When a digital product is easy to use, you ‘just know’ how it works, and the user interface guides you through it seamlessly. Using it feels like a breeze. Conversely, we’ve all sat through long and arduous onboarding tutorials, found ourselves buried in FAQs, or resorted to Googling how to get something done. In the worst cases, tutorials explaining what an app’s arcane buttons do make headlines, or the interface is invisible to the user altogether.

Voice User Interfaces (VUIs) come with the promise of easier, more efficient experiences. Too often, however, we are met with discoverability challenges that result in frustration and unmet expectations. Why?

In his seminal 1988 work, The Design of Everyday Things, Donald A. Norman explains how the very design of physical objects can inform their use. The work is a staple of the UX and CX design fields. One of its most crucial passages is Norman’s discussion of affordance and constraint. Affordance is how something conveys its possible uses, or how it signals what someone can do with it. Constraint is just that: the limitations of the product or experience at hand.

A favorite example from the book is the humble door. Consider doors that have handles versus push plates: a handle encourages a pull, while a plate encourages a push. The critique of door design born from this book has even given a name to a poorly designed door that either doesn’t signal what to do with it or signals the wrong thing: a Norman Door.

[Image: a Norman door]

When we design digital products and experiences for mobile or web, we lean on user expectations and platform conventions so that people don’t have to learn from the ground up each time. These mental models are useful to consider when approaching any new experience. In User Friendly, Cliff Kuang and Robert Fabricant write that “mental models are nothing more and nothing less than the intuitions we have about how something works…” We often draw on Apple’s Human Interface Guidelines (HIG) or Google’s Material Design guidelines, because they are the canon of mental models for iOS and Android. If we apply the HIG and Material correctly, we can be far more confident that we are designing a highly usable experience for their respective platforms.

Effective UI design contains moments of affordance to help users along. For example, a large green call-to-action button at the bottom of a check-out page is asking to be pressed: it affords pressing (at least in cultures in which green means ‘go’). Netflix and other OTT streaming services deliberately truncate the cover art in a horizontal list of titles, so the row affords scrolling.

Designing for affordance and constraint is far more complex for voice user interfaces. A graphical UI can demonstrate its constraints through the features and interface elements it makes available, but voice begins with a blank slate. The mental model we must design for approaches that of a human conversation, yet in many cases the input must be incredibly precise. It’s a return to the command line of MS-DOS, set against sky-high consumer expectations established by the largest tech companies in the world.

Rather than the user seeing a greenfield of opportunity, the VUI becomes a black hole of friction and confusion.

We’re invested in this problem space at WillowTree. We think it can be solved, or at least mitigated, through thoughtful user experience design, cleverly practiced engineering, and an embrace of a multimodal approach. To set a Voice User Interface up for success, we’ve outlined the following steps and provided a corresponding free, open-source Figma file, which we’re calling the Voice User Interface (VUI) Kit.

The framework for designing VUIs is as follows:

1. Prompt

How is the voice UI invoked? How should the UI provide constraints for what is within the scope of the UI?

2. Listen

How will the voice UI let the user know that it’s working and that they are being heard?

3. Acknowledge

How will the voice UI let the user know that they were heard and that they can expect a result, answer, or follow-up question? How will the user know that they are being understood?

4. Process

How will it let the user know that the result is currently being considered?

5. Answer

How will the voice UI show the result, answer, error, or follow-up question? It’s critical to consider a visual response versus an audible response here, depending on the use case.
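
For teams handing this framework to engineering, the five stages above can be read as a simple state machine. What follows is only a minimal sketch in Python, not part of the VUI Kit: the names `Stage`, `NEXT_STAGE`, `FEEDBACK`, and `run_turn` are hypothetical, the feedback strings are illustrative placeholders, and the assumption that Answer loops back to Prompt for a follow-up is ours.

```python
from enum import Enum


class Stage(Enum):
    """The five stages of the framework, in the order listed above."""
    PROMPT = "prompt"
    LISTEN = "listen"
    ACKNOWLEDGE = "acknowledge"
    PROCESS = "process"
    ANSWER = "answer"


# Assumed ordering: each stage hands off to the next. Returning from
# ANSWER to PROMPT (to allow a follow-up) is an illustrative assumption.
NEXT_STAGE = {
    Stage.PROMPT: Stage.LISTEN,
    Stage.LISTEN: Stage.ACKNOWLEDGE,
    Stage.ACKNOWLEDGE: Stage.PROCESS,
    Stage.PROCESS: Stage.ANSWER,
    Stage.ANSWER: Stage.PROMPT,
}

# Placeholder feedback for each stage; a real product would choose its
# own visual and audible cues.
FEEDBACK = {
    Stage.PROMPT: "Show what is in scope for the voice UI",
    Stage.LISTEN: "Signal that the user is being heard",
    Stage.ACKNOWLEDGE: "Confirm what was heard",
    Stage.PROCESS: "Indicate that a result is on its way",
    Stage.ANSWER: "Present the result, error, or follow-up question",
}


def run_turn(start=Stage.PROMPT):
    """Walk from `start` through ANSWER; return (stage, feedback) pairs."""
    steps = []
    stage = start
    while True:
        steps.append((stage, FEEDBACK[stage]))
        if stage is Stage.ANSWER:
            break
        stage = NEXT_STAGE[stage]
    return steps


if __name__ == "__main__":
    for stage, feedback in run_turn():
        print(f"{stage.value}: {feedback}")
```

Keeping the feedback for every stage in one table makes it easy to spot a stage with no affordance at all, which is where users tend to get lost.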

There are a multitude of decisions to make within this framework, but we think the Voice User Interface Kit provides a great place to start for most VUIs. Let us know how you’ve used the kit, or get in touch if you want to work with us to design your next Voice User Interface!

Get the free, open-sourced Voice User Interface (VUI) Kit

Access the Figma Community File
