Design for spatial input
Learn how to design great interactions for eyes and hands. We'll share the design principles for spatial input, explore best practices around input methods, and help you create spatial experiences that are comfortable, intuitive, and satisfying.
Chapters
- 0:00 - Introduction
- 2:22 - Eyes
- 12:21 - Hands
- 18:36 - Conclusion
♪ Mellow instrumental hip-hop ♪ Israel Pastrana Vicente: Hello and welcome to "Design for Spatial Input." My name is Israel, and I'm here with Eugene.
We are designers on the Apple Design team.
Eugene Krivoruchko: Today, we'll talk about designing interactions for eyes and hands.
We'll cover what's special about these new input methods, and how to make the best use of them on our platform.
Israel: Let's take a quick look at all the available input modalities.
With spatial input, you can simply look at a button and tap your fingers together to select it, keeping your arm relaxed on your lap.
Our system is designed to interact with UI comfortably at a distance.
In some cases, you can also interact with elements directly.
For example, typing on a virtual keyboard using your fingertips.
Holding your hands in the air can cause fatigue, but as we'll see, some tasks are better suited to direct interaction.
Now, eyes and hands are the new spatial inputs, but you can also use other familiar inputs like voice to search without needing to type.
Or keyboard and trackpad, which are great for getting things done.
Lastly, you can also connect a game controller to play your favorite games.
We are going to focus on the newest and most exciting spatial inputs: eyes and hands.
Using your eyes and hands to interact is distinct in a few ways.
First, it's personal.
Your eye movements and hand gestures are unique to you.
An array of cameras inside and outside the device capture all the details of your natural movements in a privacy-respectful way.
Next, it's comfortable.
You can keep your hands resting beside you because the device sees a wide area around you.
Lastly, it makes spatial interactions precise.
The device filters all the data and translates it into accurate interactions that you can use in your apps.
So spatial input, it's a personal input that feels incredibly comfortable while providing you great precision to control your interactions.
Today, we'll be going over how to use your eyes and hands to interact naturally with your apps.
Let's start with eyes.
Eyes are the primary targeting mechanism for spatial experiences.
All the interfaces in the system react to where you look.
And we can effortlessly target any element just by looking at it, no matter how far away it is.
Now, I'll talk about how to make apps that are comfortable to interact with; how to make them easy to target with your eyes; how to make interfaces that respond to where you look while respecting privacy; and finally, how eye intent simplifies our layouts and offers unique assistive options.
In order to build apps that are comfortable for the eyes, we need to consider how your content shows up in the device.
Here's the first thing to consider.
Even though you have an infinite canvas for your apps, you only see the content inside the field of view.
Within the field of view, it's most comfortable to look in the center, and it's less comfortable to look at the edges.
So, design apps that fit inside the field of view, minimizing neck and body movement.
Try to keep the main content of the app in the center of the field of view, the most comfortable area for your eyes.
Looking at the edges of the field of view can be tiring for your eyes, so use these areas for content that you don't need all the time, like secondary actions, which remain accessible and don't interfere with the main content.
Always try to maximize eye and neck comfort in your apps by placing the content inside the field of view.
Now, we should also consider depth when thinking about eye comfort.
Depth is a unique feature of spatial experiences.
Placing your content near or far away creates different feelings in your projects.
But our eyes focus on one distance at a time, and changing the focus depth frequently can create eye strain.
Try to keep interactive content at the same depth so that switching between UI elements feels effortless.
For example, presenting a modal view pushes the main view back along the z-axis, and the modal is placed at the original distance.
By maintaining the same Z position, your eyes don't need to adapt to the new distance.
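Here's a minimal SwiftUI sketch of that pattern (the view and property names are illustrative, not from the session): using the standard sheet presentation lets the system handle the depth change for you.

```swift
import SwiftUI

struct LibraryView: View {
    @State private var showingDetail = false

    var body: some View {
        Button("Show Detail") {
            showingDetail = true
        }
        // The system handles the depth change: the presenting view is pushed
        // back along the z-axis and the sheet appears at the original
        // distance, so eyes don't need to refocus.
        .sheet(isPresented: $showingDetail) {
            Text("Detail")
                .padding()
        }
    }
}
```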
Now, you can use subtle changes in depth to communicate hierarchy, like in this example, with a tab bar on the left and a segmented control at the bottom.
This way, you're using depth meaningfully, while avoiding eye discomfort.
Now that we've seen how to make an app comfortable, we also need to make it easy to use with your eyes.
Eyes are very precise, but there are certain qualities that help them target UI elements successfully.
Our eyes naturally focus on shapes that guide our attention to the middle of an object.
To help our eyes, use round shapes like circles, pills, and rounded rectangles.
Avoid using shapes with sharp edges.
When you use sharp edges, your eyes tend to focus on the outside, decreasing eye precision.
Also, keep the shapes flat and avoid thick outlines or effects that call attention to the edges.
And lastly, make sure to center the text and the glyphs in your elements using generous padding.
So, always make sure that your UI is designed to guide the eyes to the center of the element.
Now that our attention is in the middle of our elements, let's look at the right size for your controls.
The minimum area that an element needs for eye targeting is 60 points.
But the element itself can be smaller than 60 points.
You can achieve the minimum target area by combining size and spacing.
Use generous spacing between elements in your layouts.
This will help you target quickly and accurately with your eyes.
Again, it's very important to respect the minimum target area of 60 points, combining size and spacing so your UI looks great and is easy to use with your eyes.
Out of the box, standard components have sizes that are easy to target.
Use these components as much as you can.
And if you use your own, make sure to follow our guidelines on sizing.
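As a rough sketch of how size and spacing can combine to reach the 60-point minimum in a custom control (the type and property names here are hypothetical):

```swift
import SwiftUI

struct EyeFriendlyButton: View {
    let symbolName: String
    let action: () -> Void

    var body: some View {
        Button(action: action) {
            Image(systemName: symbolName)
                .frame(width: 44, height: 44) // the visible element can be smaller than 60 points...
                .padding(8)                   // ...as long as size plus spacing add up to 60
                .contentShape(.circle)        // a round shape guides the eyes to the center
        }
        .buttonStyle(.plain)
    }
}
```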
To learn more about points and layout, check out the session "Design for spatial user interfaces." Now that we've learned how important the target area is for your eyes, we have to make sure that it's maintained at any position in space.
For that, we need to understand how to scale your UI.
Let's look at two different scale mechanisms.
The system provides dynamic scale for app windows.
You can see how the window scales larger as it moves away, and smaller as it moves closer.
Dynamic scale makes your UI fill the same field of view and preserve the size of the target areas, no matter where the window is positioned.
If you use fixed scale instead, your UI becomes smaller as it moves away.
Fixed scale changes the size of the interface and makes your app difficult to use with your eyes.
Let's look at this side by side.
Dynamic scale keeps your UI and the target areas at the same size, while fixed scale changes the size and makes the target areas too small.
When you create custom UI, use dynamic scale to ensure that your eyes can always target all the controls.
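As a rough sketch of how this maps to scene types on this platform (the scene identifiers and views are placeholders): regular windows get dynamic scale from the system, while volumetric windows use fixed scale and are sized in physical units.

```swift
import SwiftUI

struct ContentView: View {
    var body: some View { Text("Main UI") }
}

struct ModelView: View {
    var body: some View { Text("3D content goes here") }
}

@main
struct SpatialApp: App {
    var body: some Scene {
        // A regular window: the system applies dynamic scale, so target
        // areas stay the same size at any distance from the viewer.
        WindowGroup(id: "browser") {
            ContentView()
        }

        // A volumetric window: fixed scale, sized in real-world units.
        WindowGroup(id: "viewer") {
            ModelView()
        }
        .windowStyle(.volumetric)
        .defaultSize(width: 0.5, height: 0.5, depth: 0.5, in: .meters)
    }
}
```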
Besides scale, orientation also affects the usability of your app.
If the interface is at an angle, it's difficult to read and hard to use.
That's why system windows are always oriented to face people.
But if you create custom windows in your app, always make sure to keep your UI facing the viewer.
As we've just seen, correct scale and orientation of windows and UI are fundamental to ensure accuracy with your eyes.
To learn more about how windows on this platform behave, check out the session "Principles of spatial design." Eyes are a very novel input, and it's really important to make your interfaces respond to them.
When interactive elements highlight, you understand that your eyes are driving the interaction.
Let's see what happens when you look at a group of buttons.
See how they highlight as you look at each one.
All interactive elements should be highlighted, and we do this with a hover effect.
But because your eyes move quickly, the effect needs to be subtle and work on top of any content, like when looking at your favorite photos, reinforcing intention without being prominent.
Thanks to the hover effect, all the system-provided controls highlight when you look at them.
If you create custom elements for your apps, use hover effects to add eye feedback and make your elements feel responsive.
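As a small SwiftUI sketch of what that can look like (the card content is just a placeholder):

```swift
import SwiftUI

struct CustomCard: View {
    var body: some View {
        VStack {
            Image(systemName: "photo")
            Text("Custom element")
        }
        .padding()
        .background(.regularMaterial, in: .rect(cornerRadius: 20))
        // Match the effect region to the visible shape, then let the system
        // highlight it when the person looks at it. The highlight is rendered
        // outside the app's process, so the app never sees where they look.
        .contentShape(.hoverEffect, .rect(cornerRadius: 20))
        .hoverEffect()
    }
}
```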
Now, eye intention is very sensitive information.
Privacy is our top priority when dealing with eye data.
The hover effect is rendered outside of your app's process, so your app only learns which element was focused when a gesture triggers an interaction on that element.
Hovering on an element with your eyes is a signal for intention.
When you look at something for a long time, we know that you are interested in it.
It is a great opportunity to show you more information about it.
For example, buttons can have tooltips that reveal as you look at them.
Also, tab bars expand when you focus on them, showing a label for each tab.
Lastly, focusing on the microphone glyph inside a system-provided search field will trigger Speak to Search, revealing this layer and allowing you to perform a search using just eyes and voice.
All these system elements give extra information when you need it, while keeping a clean UI when not in focus.
Take advantage of them when creating your apps.
They are also built with privacy in mind to ensure that no focus information is being sent to the app.
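As a minimal sketch of leaning on those system components (tab names and actions are placeholders, and the help modifier is assumed here to supply the button's tooltip): a standard TabView gets the expanding tab bar for free.

```swift
import SwiftUI

struct MainView: View {
    var body: some View {
        TabView {
            // The tab bar expands and shows labels when you look at it.
            NavigationStack {
                Text("Photos")
                    .toolbar {
                        Button(action: { /* share action placeholder */ }) {
                            Image(systemName: "square.and.arrow.up")
                        }
                        // Tooltip text revealed when someone keeps looking at the button.
                        .help("Share")
                    }
            }
            .tabItem { Label("Photos", systemImage: "photo.on.rectangle") }

            Text("Albums")
                .tabItem { Label("Albums", systemImage: "square.stack") }
        }
    }
}
```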
Eye intent also provides great opportunities for assistive technology.
For example, using the Dwell Control feature, you can select content just with your eyes.
In this example, focusing on a button for a short time will show the Dwell Control UI and will select the button without needing to perform a tap gesture with your hand.
So, what did we just learn about designing interactions for eyes? We learned how to make your apps comfortable for the eyes by placing content in front of the viewer, inside the field of view, and by using depth responsibly.
Then we looked at how to design interfaces that are easy to use and how to guide your eyes to the center of the elements.
We also reinforced how important it is to respect the minimum target area of 60 points in your controls.
How we should always communicate interactivity and reinforce targeting by adding hover effects to your elements.
And lastly, how we can take advantage of UI elements that reveal extra information on eye intent.
I think this was great, and there's even more.
We've seen that eyes are a fantastic target mechanism, and they become much more powerful when you combine them with hands.
Now, I'll hand it over to Eugene to talk about it.
Eugene: Thanks, Israel.
Let's talk about hands.
Combined with eyes for targeting, hand gestures are the primary way to interact across the system.
Pinching your fingers together is the equivalent of pressing on the screen of your phone.
The system supports other familiar gestures.
For example, you can pinch and drag to scroll, and perform two-handed gestures like zoom and rotate.
Notice how in all these cases, UI feedback continues the motion of the hand, which really helps it feel connected to the gesture.
The gestures work the same way across the system and follow logic similar to Multi-Touch gestures.
This means that people can really focus on the experience, instead of having to think about how to perform the interaction.
This is why you should lean on these familiar patterns when designing your experience, and make sure to respond to gestures in a way that matches people's expectations.
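As a minimal SwiftUI sketch of leaning on that familiar gesture set (the state names are placeholders):

```swift
import SwiftUI

struct GestureResponsiveView: View {
    @State private var offset: CGSize = .zero
    @State private var scale: CGFloat = 1
    @State private var angle: Angle = .zero

    var body: some View {
        Image(systemName: "globe")
            .font(.system(size: 120))
            .scaleEffect(scale)
            .rotationEffect(angle)
            .offset(offset)
            // Pinch and drag to move: the UI continues the motion of the hand.
            .gesture(DragGesture().onChanged { value in
                offset = value.translation
            })
            // Two-handed zoom.
            .gesture(MagnifyGesture().onChanged { value in
                scale = value.magnification
            })
            // Two-handed rotate.
            .gesture(RotateGesture().onChanged { value in
                angle = value.rotation
            })
    }
}
```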
In some cases, part of your experience might be a unique behavior that can't easily be expressed with standard gestures.
In this case, you may want to define a custom one.
Here are some tips on how to make a successful custom gesture.
First, make sure that the gesture is easy to explain and perform so that people can learn how to use it quickly.
It is also important to avoid gesture conflicts.
Your custom gesture needs to be distinctly different from the standard system set and from common hand movements people might use in conversation.
This has to be a gesture that people can consistently repeat without strain or fatigue, and that has a low rate of false activations.
Be mindful of people who are using assistive technologies to interact across the system and consider how your gesture will work in those cases.
To learn more about accessibility, check out the session "Create accessible spatial experiences." Gestures can also mean different things to different people, so make sure your custom gesture doesn't send messages you didn't intend.
All of this may be a tricky balance to hit, so it's always worth considering a fallback in the form of a UI affordance.
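If you do go custom, here's a rough sketch of the plumbing involved, assuming hand tracking via ARKit in an immersive space (the detection logic is left as a placeholder, and hand tracking requires the person's permission):

```swift
import ARKit

final class CustomGestureDetector {
    private let session = ARKitSession()
    private let handTracking = HandTrackingProvider()

    // Runs hand tracking and calls the handler when the custom pose is found.
    func start(onGesture: @escaping () -> Void) async throws {
        try await session.run([handTracking])
        for await update in handTracking.anchorUpdates {
            let anchor = update.anchor
            guard anchor.isTracked, let skeleton = anchor.handSkeleton else { continue }
            // Placeholder: compare joint transforms (e.g. thumb and index tips)
            // against the pose you're looking for, and make sure it's clearly
            // different from the standard pinch and from conversational movements.
            let thumbTip = skeleton.joint(.thumbTip).anchorFromJointTransform
            let indexTip = skeleton.joint(.indexFingerTip).anchorFromJointTransform
            _ = (thumbTip, indexTip)
            // onGesture() would be called here once the pose is recognized.
        }
    }
}
```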
One of the most exciting aspects of our input model is the opportunity to use eyes as a signal of intent.
Using eye direction combined with hand gestures, we can create precise and satisfying interactions that are not possible on other platforms.
Let's look at the zoom gesture again to see what I mean.
At the start of the gesture, the origin point of the zoom is determined by where within the image your eyes are focused at that moment.
This causes that particular area to be magnified and centered as you zoom in.
As a result, you can navigate the image easily just by looking around and performing this simple gesture.
This feels really magical and 100 percent expected at the same time.
The point you are looking at naturally indicates the intent of that interaction.
Another example of this behavior is pointer movement in Markup.
To draw, you control the brush cursor with your hand, similar to a mouse pointer, but if you look to the other side of the canvas and tap, the cursor jumps there, landing right where you're looking.
This creates a sense of accuracy and helps to cover the large canvas quickly.
These are examples of interactions that use eye direction to make simple behaviors more precise and satisfying.
Eyes are used not only to target elements, but also to implicitly provide a more granular location for that interaction.
This is a really powerful aspect of our input model that allows us to respond to interaction in a much more intelligent way.
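A minimal SwiftUI sketch of responding to that implicit location (the view and state names are placeholders): on this platform, the location reported by a spatial tap reflects where the person was looking when they pinched.

```swift
import SwiftUI

struct LookAndTapCanvas: View {
    @State private var marker: CGPoint?

    var body: some View {
        Canvas { context, _ in
            if let point = marker {
                context.fill(
                    Path(ellipseIn: CGRect(x: point.x - 10, y: point.y - 10, width: 20, height: 20)),
                    with: .color(.blue)
                )
            }
        }
        .contentShape(.rect)
        // The tap location comes from where the eyes were focused at the
        // moment of the pinch, so the marker lands right where you look.
        .gesture(SpatialTapGesture().onEnded { value in
            marker = value.location
        })
    }
}
```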
Now let's talk about direct touch.
Across the system, we support being able to reach out and use your fingertips to interact.
For example, you can bring Safari close to you and scroll the page directly.
You can also use both your hands to type on the virtual keyboard, and even have richer spatial experiences, manipulating 3D content within arm's reach.
Interactions at a distance stay comfortable for a long time because it's easy to target controls with your eyes, and your hands can stay rested while performing minimal gestures.
When designing for direct interaction, we have to keep in mind that holding hands in the air will cause fatigue after a while.
Still, certain apps will benefit from placing content within arm's reach for direct touch, like experiences that invite up close inspection or object manipulation; or any interactive mechanic that builds on top of the muscle memory from real-world experiences; and generally, whenever physical activity is at the center of the experience.
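When you do place content within arm's reach, here's a small RealityKit sketch (a hypothetical helper) that limits an entity to direct touch and gives it a collision shape so fingertips can interact with it:

```swift
import RealityKit

// Hypothetical helper: configure an entity for direct, up-close interaction.
func configureForDirectTouch(_ entity: ModelEntity) {
    // Restrict input to direct touch with the fingertips.
    entity.components.set(InputTargetComponent(allowedInputTypes: .direct))
    // Collision shapes are required so hands can hit-test the entity.
    entity.generateCollisionShapes(recursive: true)
}
```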
Lack of tactile response is another thing to consider when designing for direct interaction.
Every time we touch something in the physical world, our hands receive lots of multisensory feedback, which is essential to our perception.
None of this is happening when we reach out and touch virtual content.
And to make that interaction work, we need to compensate for the missing sensory information with other types of feedback.
Let's look at how we approached this challenge with keyboard buttons.
The buttons are actually raised above the platter to invite pushing them directly.
While the finger is above the keyboard, buttons display a hover state and a highlight that gets brighter as you approach the button surface.
It provides a proximity cue and helps guide the finger to target.
At the moment of contact, the state change is quick and responsive, and is accompanied by a matching spatial sound effect.
These additional layers of feedback are really important to compensate for missing tactile information, and to make direct interactions feel reliable and satisfying.
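As a rough RealityKit sketch of that last layer (the helper and the audio file name are placeholders): play a spatial sound from the entity at the moment of contact to stand in for the missing tactile feedback.

```swift
import RealityKit

// Hypothetical helper: audible confirmation at the moment of contact.
func playContactFeedback(on entity: Entity) async {
    // Sound is emitted from the entity's position in space.
    entity.components.set(SpatialAudioComponent(gain: -10))
    if let resource = try? await AudioFileResource(named: "tap.wav") {
        entity.playAudio(resource)
    }
}
```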
Audio plays a special role in connecting input with virtual content across the system.
To learn more about it, check out the session called "Explore immersive sound design." To recap, here are the takeaways for designing interactions with hands.
Use gesture language consistent with the system so people can focus on the content instead of the interaction.
Be careful to only introduce custom gestures when the desired behavior can't be achieved with the standard set.
Look for ways to improve your interaction using eyes as a signal of intent.
Only use direct interaction when it's at the core of your experience.
And if you do, provide extensive feedback to compensate for missing sensory information.
Today, we have talked about some of the design principles for spatial interactions with eyes and hands.
We talked a lot about comfort and ergonomics.
With so many ways software can look, behave, and react to input on this platform, designers and developers have more responsibility to make sure these experiences are comfortable and accessible.
When people run your app on the device, they welcome your work into their space and give it their full attention.
Software is no longer contained within a screen.
Instead, it's allowed to occupy a more significant portion of people's physical surroundings and react to their natural body movements.
Using hands to interact with virtual content is also something very new for most people.
That's why it's so important to guide them by providing clear feedback and to rely on familiar interaction patterns where possible.
As we know from designing for other platforms, a great input experience is one that you don't have to think about.
Software response becomes a natural continuation of your body movement and perfectly matches the interaction intent.
Our ability to use eyes as the foundation of the input model opens up opportunities to respond to interaction with magical precision.
We think it's really powerful, and we hope you will use it to create delightful and novel interactions for the spatial medium.
Please make sure to check out the sessions that we have referenced throughout the talk.
Thanks for watching! Israel: Adios! ♪