Create immersive media experiences for visionOS - Day 1
Learn how to create compelling interactive experiences for visionOS and capture immersive video in this multi-day live streamed activity.
On Day 1, we'll show how visionOS 26 can help you tell impactful immersive and interactive stories. You'll learn how to frame your creative ideas for formats like Apple Immersive Video and explore real-world examples from past productions. You'll discover how to design for spatial interaction and tell stories that make the audience part of your experience. And you'll find out how to use SharePlay and spatial Personas to help people connect over your ideas.
On Day 2, we'll dive deep into Apple Immersive Video with Apple Spatial Audio. You'll discover how to create entirely new media experiences, get started with Apple Immersive Video, explore new production workflows, and dig into previous Apple Immersive productions.
Welcome to the Apple Developer Center in Cupertino. How many of you folks in the room, how many is this your first time here at the center? Awesome. This is like half the room. Hi, everybody. Online, welcome. We're so excited to have you here today. My name is Serenity Caldwell, and I'm a visionOS design evangelist with worldwide developer relations. On behalf of many, many people behind the scenes, and the people you'll hear from over the next couple of days, we're really, really excited to have you here at the Developer Center. Our goal is to connect you with content about visionOS, connect with your ideas, and help you create incredible experiences for Apple platforms, whether you're immersing somebody in a film, building the next great game, or exploring all new ways to tell stories. This is one of four developer centers where we regularly host content and workshops, sessions and labs, and we also stream content worldwide, so no matter where you live, you can always learn something new. We're excited to have so many of you joining us online as well. Thank you for tuning in. And speaking of streaming, we do ask that you please refrain from doing your own video recording or live streaming while you're in this theater. You are welcome to take photographs, or screenshots if you're online, but please leave the video to us, at least until you're back on your own sets. You are welcome to use your iPhone, iPad, computer, Vision Pro, or other device to take notes. If you're in the room with us, you can also get online, if you haven't already, with the Apple Wi-Fi network. And if you need to recharge at any point, everybody has power at their seats here in the room. If you're online, I hope you know where your power is. So with that housekeeping out of the way, let's get into the fun stuff. This week, we've taken over the Developer Center to help you learn how to make great immersive and interactive experiences for visionOS. We've got not just one, but two full days of sessions for everybody, both in person and online, and three days of activities for our in-person audience, including hands-on time with the Blackmagic URSA Cine Immersive camera, community mixers, and seminars on Apple Immersive Video production. You'll hear from Apple designers, engineers, and production teams, as well as a couple of special guests (TBD). On day one, we're going to set the scene for you. We'll start with an overview of what you can do as a media creator and a developer on this platform. Then Elliot will take us through the core foundations to keep in mind when creating immersive video. We'll break for a nice little lunch, and then I'll share how you can design impactful immersive and interactive narrative experiences for visionOS. Nathaniel will explore how you can bring those designs to life using visionOS frameworks like SwiftUI and RealityKit. And Ethan and Alex will help us learn how to create experiences for visionOS that can be shared with people nearby or around the world.
And if you have a question for any of our presenters throughout the day, you can drop us your question online. At the end of the day, we'll be answering the most upvoted questions in a live Q&A panel. And after our Q&A, we will say goodbye to our friends on the live stream. Sorry, folks. For the folks in the room, we'll have you go out to the lobby for a community mixer, and we'll have a special hands-on experience with Blackmagic's URSA Cine Immersive camera, the world's first camera designed for capturing Apple Immersive Video. For those of you online, don't fret. We do have plenty about Apple Immersive Video for you starting tomorrow at 10 a.m. Pacific, when we'll be back online and in person to dive deep into the format and what you need to consider as a creator. We'll have a full day of sessions on the technology behind Apple Immersive Video and Apple Spatial Audio Format, as well as production and post-production workflows, and we'll hear from some of the crew of past Apple Immersive productions as they share takeaways from working with this new storytelling medium. It's going to be an incredible couple of days. You folks ready to get started? Ready online? All right, let's get started. So, quick show of hands in the theater: who here has already started developing a project or content for Apple Vision Pro? All right, half of you. Awesome. Anybody new to the platform in the theater? Just a couple of you. Okay. Well, for those of you who are new and those of you online, let me tell you: if you make media experiences, visionOS offers so many ways to pull your audience into the story. That's because experiences on this platform give people an unparalleled sense of presence.
Whether your audience is immersed in your video content, meeting characters face to face, or exploring new places, the stories that you tell can impact people long after the experience ends.
We think Apple Vision Pro has the potential to be the ultimate storytelling device, and there are a few reasons for that. First, media content looks incredible on this device. Vision Pro features high resolution displays that are packed with 23 million pixels, which means that stories are crisper, more vivid, and more lifelike than ever before. And just for some context, this screen behind me has 22 million pixels. Shrink it, put it in front of your eyes, and that's what you're looking at. These high quality visuals pair with high fidelity spatial audio, so the sound from an experience, a place, or a piece of media feels like it's coming from all around your audience. This carries over to content outside of a frame, too. Vision Pro is also a spatial computer, so it has depth and dimension, and that means you can unlock some incredible educational moments with 2D and 3D content. And you can use immersion to bring your audience even further into your story, so they feel like they're really there. And they can even experience this with their friends. Now, all of these are incredible reasons to create content for Vision Pro, but add interaction and you unlock the power to create stories that give your audiences agency inside your work, although sometimes at their own peril. Now there are a lot of storytelling possibilities, clearly. So what we want to do this morning is lay out in more detail the kinds of work that you can make with Vision Pro and the platform that powers it, visionOS. We're going to start with some of the basics: media on this platform. I'd like to invite Tim Dashwood up on stage. He's going to tell you a bit more about the tools you have available to create incredible 2D, 3D, and immersive media experiences. Tim.
Good morning.
Today we'll explore all of the different types of media available on visionOS.
These include regular 2D video, 3D stereoscopic movies, spatial videos, and, for the ultimate immersive media experience, Apple Immersive Video.
I'm very happy to say that with visionOS 26, we've added support for three additional immersive media projection types: 180 degree, 360 degree, and wide field of view.
Let's start by exploring 2D and stereoscopic 3D video and how it can be presented on visionOS, including some amazing new features now available in visionOS 26. Vision Pro is a great device for watching movies and TV shows, and as a stereoscopic device, it's the perfect way to deliver 3D movies.
2D and 3D videos can be played inline anywhere within an app when using an embedded playback experience, where the video appears alongside other UI elements. Here's an example of an inline video playing in Freeform as part of a board full of content.
Note that if you embed a 3D video inline, it gracefully falls back to playing in 2D.
Your 2D and 3D video can also be played in an expanded experience that fills the entirety of an app's window. Here, 3D content can be played stereoscopically with full dimension and depth.
You can also create immersive video experiences for your 2D or 3D content inside an app. Here's an example from Destination Video, a sample code project available from developer.apple.com. When video plays in the expanded view, passthrough video automatically dims, but you as a developer can go further. You can enable custom docking regions to allow your video to play back in a system environment like Mount Hood, or create your own custom environment where you can add features like audio reverb and dynamic light spill to make your content feel like an integral part of that environment. Need to show multiple pieces of content in your app at once? visionOS also supports multiview video, so you can deliver multiple camera angles of a single event, or multiple sources of video, simultaneously on Vision Pro.
And new in visionOS 26, 2D and 3D videos can specify a per-frame dynamic mask to change or animate their frame size and aspect ratio, without needing to show black bars for letterboxing or pillarboxing.
This is really useful if you need to accentuate a story point, or to combine archival and modern day footage in a single scene, and these kinds of seamless transitions are only possible on visionOS.
But what if you want to create stereoscopic 3D content without the complex production pipelines of a big studio? Well, in visionOS 1 we introduced a new kind of stereo media known as spatial video. Spatial video is the easiest way to shoot comfortable, compelling stereo 3D content on devices like iPhone or Vision Pro without needing to be an expert in the rules of 3D filmmaking.
Spatial video is just stereo 3D video, with some additional metadata that enables both windowed and immersive treatments on visionOS to mitigate common causes of stereo discomfort.
By default, spatial video renders through a portal-like window with a faint glow around the edges. It can be expanded into its immersive presentation, where the content is scaled to match the real size of objects.
The edge of the frame just disappears and the content blends seamlessly into the environment around you.
And if you have a media app designed for multiple Apple platforms, spatial videos automatically fall back to a 2D presentation. This enables you to deliver spatial content for Vision Pro, while also providing great 2D content for people on other platforms.
Spatial videos can be captured today on many models of iPhone with the Camera app, or in your own app via the AVCapture device APIs. Spatial videos can also be captured directly on Apple Vision Pro, and with Canon's R7 and R50 cameras with the Canon dual lens.
And if you're looking to edit and combine spatial videos to create a longer narrative, the format is now supported in post tools such as Compressor, DaVinci Resolve Studio, and Final Cut Pro for Mac.
Creators are already capturing and sharing some incredible spatial videos and photos. Check out the Spatial Gallery app for visionOS to experience how people are already using this format to tell stories from new perspectives.
We've been talking a lot about spatial video, but visionOS 26 can now present photos and other 2D images in a whole new way. We call these spatial scenes.
Spatial scenes are 3D images with real depth, generated from a 2D image like this one. It's like a diorama version of a photo, with the sense of being able to look around objects as the viewer moves their head relative to the scene.
People are already using spatial scenes in their experiences, like the classic car showcase Paradise.
In addition to some spectacular 3D models, people can browse spatial scenes for each car to see them out in the wild.
So we've seen how 2D, 3D, spatial videos, and spatial scenes are all presented on a virtual flat surface in visionOS. This is because those videos typically use what's known as a rectilinear projection, as seen in this photo of the Apple Park Visitor Center. Rectilinear just means that straight lines are straight. There's no lens curvature or warping in the video. And because of this, these kinds of videos feel correct when viewed on a flat surface that also doesn't have any curvature or warping.
But on a spatial computing device, we're not restricted to a flat surface in front of the viewer. Sometimes you want to tell stories that break beyond the frame.
For that, we can use non-rectilinear projection types that curve around the viewer. visionOS 26 adds support for three of these non-rectilinear media types: 180 degree, 360 degree, and wide field of view video. These immersive video types are supported natively on visionOS via a new QuickTime movie profile called Apple Projected Media Profile, or APMP.
A wide range of cameras are already available to shoot these types of videos with APMP. Let's take a look.
180 degree video is presented on a half sphere or a hemisphere directly in front of the viewer.
The video completely fills the viewer's forward field of view. In this example, it's a 180 degree video of the pond here at Apple Park from the viewer's point of view. It's like being there. This is a great way for content creators to transport their viewers to amazing locations.
360 degree video takes things a step further, filling the entire world with content. With 360 degree video, the content literally surrounds the viewer, giving them the freedom to look wherever they like.
Here's an example of a 360 video captured underneath the rainbow across the street at Apple Park. The viewer can look around at any angle and feel like they are right there beneath the rainbow. Everything looks just as it would if they were there in person.
To achieve this, the 360 video is projected onto the inside of a sphere completely surrounding the viewer, centered on their eyes and filling their field of view whichever way they look.
A rectangular video frame twice as wide as it is high is used to achieve this. The video is mapped onto a sphere around the viewer with an equirectangular projection, or equirect for short.
180 degree video also uses equirectangular projection, but only for half of a sphere, so it's known as half-equirect projection. Half-equirect videos have a square aspect ratio and map their video onto a hemisphere in the same way 360 does for a full sphere.
For stereoscopic 180 video, we simply have two squares of video, one for each eye.
Many existing stereo 180 videos encode these two squares side by side in a single pixel buffer that's twice as wide as the resolution per eye. This is known as side-by-side, or frame-packed, encoding. But there's an awful lot of redundancy here, because there are two views of the same scene: the left and right eye images are very similar. visionOS takes advantage of that similarity to use a different approach for encoding stereo 3D video.
We use multiview encoding. You're probably already very familiar with High Efficiency Video Coding, or HEVC. For stereoscopic videos, APMP uses MV-HEVC, or Multiview HEVC.
MV-HEVC encodes each eye into its own pixel buffer, and it writes those two pixel buffers together in a single video track. It takes advantage of the image similarity to compress one eye's pixels relative to the other, encoding only the parallax differences for the second eye.
This results in a smaller encoded size for each frame, making MV-HEVC videos smaller and more efficient than typical frame-packed video. This is really important when streaming stereo video.
And in fact, MV-HEVC is the preferred way to encode all stereoscopic media for streaming to visionOS, from spatial video to Apple Immersive Video.
So let's turn our attention to one of the most unique types of APMP video now supported in visionOS 26: wide field of view video from action cams such as the GoPro HERO 13 and Insta360 Ace Pro 2. These action cams capture highly stabilized footage of whatever adventures you take them on. They capture a wide horizontal field of view, typically between 120 and 180 degrees, and often use fisheye-like lenses that show visible curvature of straight lines in the real world. This enables them to capture as much of the view as possible.
Traditionally, these kinds of videos have been enjoyed on flat screen devices like iPhone and iPad, and this is a fun way to relive the adventure. But in visionOS 26, we're introducing a new form of immersive playback for these kinds of action cams, recreating the unique wide-angle lens profile of each camera as a curved surface in 3D space and placing the viewer at the center of the action.
Because that curved surface matches the camera lens's profile, it effectively undoes the fisheye effect, and the viewer sees straight lines as straight, even at the edges of the image. This recreates the feeling of the real world as captured by the wide-angle lens.
Action cams all have different lenses with different shapes and profiles. To model these different lens profiles, APMP defines a set of lens parameters. Camera and lens manufacturers can tailor these parameters to describe a wide variety of lenses, and how those lenses map the real world onto pixels in an image. Because it's defined by parameters, we call this projection parametric immersive projection, and it's how the projection of wide field of view video is understood by visionOS.
Playing APMP video immersively puts the viewer's head right where the camera was during capture, even if that camera was strapped to the end of a surfboard. This means that immersive playback is especially sensitive to camera motion.
And excessive motion makes viewers uncomfortable. To help mitigate this, in visionOS 26, playback will automatically reduce the immersion level when high motion is detected, which can be more comfortable for the viewer during high motion scenes. The settings app also offers options so viewers can customize high motion detection to their personal level of motion sensitivity.
Now, if you're anything like me, you probably have a bunch of old VR180 and 360 footage, but haven't had an easy way to watch or share those videos on visionOS.
So to make converting your 180 and 360 videos from the past a little easier, we've also updated the avconvert command-line tool in macOS 26 to convert legacy 180 and 360 content to APMP.
And if you're not comfortable using the command line, we've also added new presets to the avconvert functionality in Finder on macOS, which allows you to simply use the contextual menu on most legacy stereo 180 or 360 videos and then choose one of the new MV-HEVC presets.
And for developers, this all leads us to playback in your own apps. APMP can be played on visionOS 26 by all media playback frameworks like RealityKit, AVKit, Quick Look, and WebKit, so you can integrate it into whatever type of experience you build.
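As an illustration of how little code that playback can take, here's a minimal sketch using AVKit's SwiftUI player. The URL is a placeholder; point it at your own 180, 360, or wide field of view movie file or HLS stream.

```swift
import SwiftUI
import AVKit

// A minimal sketch of APMP playback with AVKit's SwiftUI player.
struct ImmersivePlayerView: View {
    // Placeholder URL; replace with your own APMP movie or HLS stream.
    @State private var player = AVPlayer(
        url: URL(string: "https://example.com/my-180-video.mov")!
    )

    var body: some View {
        // visionOS reads the projection metadata from the file and presents
        // the video with the appropriate immersive treatment.
        VideoPlayer(player: player)
            .onAppear { player.play() }
    }
}
```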
We're very excited to announce that Vimeo now supports upload and playback of 180 degree and 360 degree APMP content, in addition to 2D and spatial video.
So even if you don't have an app of your own yet, you can share your APMP videos to audiences across all platforms.
So that's Apple Projected Media Profile. Now, for the ultimate experience, there's Apple Immersive Video, which we're making available to developers and content creators for the first time this year. Here's an example from the Apple TV series Wild Life, which transports viewers to meet the elephants at Kenya's Sheldrick Wildlife Trust. This scene would be almost impossible to experience in reality, but with Apple Immersive Video, it feels like you're truly there.
Companies like Canal+, Red Bull, CNN, the BBC, and Rogue Labs are currently creating Apple Immersive content, with Canal+'s MotoGP Tour de Force already available to experience through the TV app.
And we're also very excited to be partnering with Spectrum to stream a selection of live LA Lakers regional games early next year in Apple Immersive via the Spectrum SportsNet app.
This is Apple's very first live Apple immersive experience, and we can't wait for Vision Pro owners to feel the intensity of a live NBA game as if they were courtside.
As a creator, this is the perfect time to explore making content with Apple Immersive Video, because you now have an end to end pipeline to help you create, edit, mix, and distribute this content.
You can capture Apple Immersive Video using Blackmagic Design's URSA Cine Immersive camera.
Then do your editing, VFX, Spatial Audio mix and color correction on the Mac in DaVinci Resolve Studio.
You can preview and validate using the new Apple Immersive Video utility app, available on the App Store for both macOS and visionOS. And then, if you're preparing your content for streaming, you can create segments in the upcoming version of Apple's Compressor app for distribution via HTTP live streaming or HLS.
Many other companies are already working on support for the Apple Immersive workflow, and this includes Colorfront for dailies and mastering, and SpatialGen for encoding and distribution.
And we're incredibly excited to announce that Vimeo plans to support hosting Apple Immersive video content.
For developers of pro apps, or anyone who wants to create their own tools to work with Apple Immersive Video, there's a framework in macOS and visionOS 26 called Immersive Media Support that lets you read, write, and stream Apple Immersive content. And there's a code sample called Authoring Apple Immersive Video to get you started.
Apple Immersive Video can be played on visionOS 26 by all of our media playback frameworks like RealityKit, AVKit, Quick Look, and WebKit, with support for HLS streaming.
But at this point, you might be asking what makes Apple Immersive Video so special. Well, several of you may have already started shooting with the new URSA Cine Immersive camera from Blackmagic Design. If not, those of you here in person will be able to check out the camera later this afternoon during our hands-on sessions.
The specs for the URSA Cine Immersive camera are astounding.
Every lens and every camera is individually calibrated at the factory using a parametric approach similar to what we saw earlier, but tuned for each individual lens and sensor pair.
The URSA Cine Immersive captures stereo video with 8160 by 7200 pixels per eye. That's 118 megapixels total, at 90 frames a second.
That's over 10 billion pixels per second.
The URSA Cine Immersive captures up to a 210 degree field of view horizontally and 180 degrees vertically, with a perceptual sharpness approaching that of the human eye. But resolution and frame rate aren't the whole story.
There are so many more capabilities of Apple Immersive Video unique to the format. I'd like to quickly tell you about a few of my favorites.
The first is Dynamic Edge blends.
Every shot in an Apple Immersive Video can define a custom edge blend curve that best suits its content and framing. This isn't a baked in mask, it's a dynamic alpha blend curve that feathers the edges of the shot at playback time to transition seamlessly into the background.
Another feature unique to Apple Immersive Video is support for static foveation of the final video encode.
It would be impractical to stream the full 8K-per-eye resolution of Apple Immersive Video over the internet at 90 frames per second, but it's also undesirable to just scale the image down to 4K, because we'd lose too much pixel density in the image.
So instead, a static foveation distribution function can be applied to the image, which dedicates most of the area of the smaller frame size to the pixels of primary importance to the image. Typically the central area of the fisheye lens.
This is one of the reasons Apple Immersive Video can have such high perceptual acuity with easily streamable file sizes, and Colorfront and SpatialGen have already committed to supporting static foveation on their platforms.
So we've spoken a lot about video, but let's not forget audio. Apple Immersive Video offers such a rich storytelling opportunity that we needed an all-new audio experience to match: the Apple Spatial Audio Format, or ASAF, along with a new codec, the Apple Positional Audio Codec, also known as APAC.
We've been using ASAF on all of our Apple Immersive productions, and you'll hear more about it from Dr. Deep Sen tomorrow, as well as how you can create your own immersive sound mixes using Fairlight in DaVinci Resolve Studio.
And there's Apple's brand-new, free AAX plug-in for Pro Tools. It's called the ASAF Production Suite, and it's now available for download on developer.apple.com.
I sadly don't have time to cover all of the other incredible capabilities in Apple Immersive Video, so I encourage you to check out some of the many sessions scheduled over the next two days to dive in deep. There really is so much you can create for this platform: 2D and 3D video, spatial video, spatial photos and spatial scenes, 180 degree and 360 degree APMP video, wide field of view APMP video, and Apple Immersive Video.
But that's just the beginning. Your media experiences really shine when you pair them in apps with Apple frameworks, like sharing your immersive video experiences with people so viewers can react in real time with each other, even if they're FaceTiming in from the other side of the world. To tell you more about how you can take advantage of visionOS frameworks, I'd like to welcome Adarsh to the stage.
Thank you Tim.
Hi everyone! My name is Adarsh. I'm a technology evangelist for Apple Vision Pro. You just learned how you can deliver an incredible cinematic experience using immersive media.
But what if I told you that you can go beyond the screen and add a deeper level of immersion, where your audience can even interact with the characters in your story? Well, you don't have to take my word for it. Here's Encounter Dinosaurs.
That's a visionOS app, and it starts out with a magical interaction, where a butterfly flies to you and lands on your finger as you extend your hand out.
Then a portal to a prehistoric world opens up and the butterfly flies right into that world.
But the real thrill of the experience begins when lightning strikes.
And there's an earth shattering thunder that fills up the space.
The story then introduces Izzy, a curious yet friendly little baby dinosaur.
And as you continue with the experience, you come across Raja, the Rajasaurus who comes out of the portal into your room and looks you in the eye.
But be careful if you fiddle with him, he may even try to bite you.
Oh, the last one especially was bone chilling.
Well, luckily I have a full-blown presentation to warm you up to the idea of bringing similar capabilities into your storytelling apps on Apple Vision Pro. This morning, I'll take you through some of the capabilities in our native tools and frameworks that act as fundamental building blocks for making a great interactive experience.
In particular, I'll cover some of the latest features in SwiftUI, RealityKit, and ARKit. Broadly speaking, SwiftUI helps you create great 2D and 3D spatial interfaces.
RealityKit gives you the ability to add fully interactive 3D scenes that can blend with the real-world surroundings, and ARKit helps you connect that content to the physical world.
Together, these frameworks offer the best and the easiest ways to add immersion and interactivity to your experiences. Hey, but again, don't take my word for it.
Just go to developer.apple.com and check out these amazing sample code projects. These samples show you how to utilize the various capabilities on visionOS, like how to play back immersive media, create custom environments, utilize custom hand gestures to interact with 3D content, build interactive games, and much more.
In fact, there are samples that even demonstrate the capabilities in the latest release of visionOS, visionOS 26.
I want to talk about some of the newest and the coolest features, alongside showing some inspiring examples that will help you create amazing storytelling experiences with rich 3D content and engaging interactions.
I'll start with a quick overview of immersion on visionOS.
Then I'll go through some new capabilities in visionOS that can help you create more immersive spatial experiences and interfaces.
I'll also show you how you can bring 3D content to life so that it blends seamlessly with the real world surroundings. I'll take you through different ways in which people can interact with 3D content. I'll cover how multiple people can enjoy an immersive experience together using SharePlay.
And lastly, I'll go through some cool capabilities for creating spatial experiences for the web.
Let's start with a quick overview of immersive scene types. visionOS experiences are made possible using windows, volumes, and spaces, and visionOS 26 adds new capabilities to each of these scene types.
I want to focus on windows and volumes first.
When an app is launched on Vision Pro, it opens into a shared space where it exists side by side with other apps, and apps typically start out with a window or a volume, which help people stay grounded to their surroundings while maintaining consistent and familiar interaction patterns. Like these are all windowed apps.
With visionOS 26, people can lock these windows and volumes to particular locations in their physical surroundings, and the device will remember these positions even after a reboot.
Also, windows can be snapped to vertical surfaces, while volumes can be snapped to horizontal surfaces.
One cool characteristic of volumes is that they can be looked at from any side, and you can even customize the content based on the direction from which a person is looking at your content. Like in this example, the sidewall fades away when the volume is looked at from the side.
So imagine a 3D scene coming to life on someone's coffee table, where your audience can explore and interact with your story from multiple angles while your content adapts to their point of view.
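To make the window and volume scene types concrete, here's a minimal sketch of an app declaring one of each. The view contents and model name are placeholders standing in for your own content.

```swift
import SwiftUI
import RealityKit

// A minimal sketch of the window and volume scene types described above.
@main
struct StoryApp: App {
    var body: some Scene {
        // A standard window for flat UI, such as a content browser.
        WindowGroup(id: "gallery") {
            Text("Gallery")
        }

        // A volume: bounded 3D content people can view from any side and,
        // in visionOS 26, snap to a horizontal surface.
        WindowGroup(id: "diorama") {
            Model3D(named: "Diorama")   // Hypothetical model in the app bundle.
        }
        .windowStyle(.volumetric)
        .defaultSize(width: 0.6, height: 0.4, depth: 0.6, in: .meters)
    }
}
```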
So that was windows and volumes. Next, I want to talk about Spaces.
To give you an idea of this scene type, I want to show you the Mindfulness app. It's a meditative experience on Apple Vision Pro. When you use a space scene type like this, your app is the only one running. It adds a deeper level of immersion by expanding into three dimensions and filling up the entirety of the room.
And you have a lot of different ways to take advantage of this kind of immersion.
You can use the mixed style immersion, where your app can blend 3D content with physical space.
That way, your app can continue to keep people grounded to their surroundings while offering a deeper level of immersion. Or, you could use progressive immersion that lets people choose an immersion level that fits their experience the best, and they can change that immersion level by turning the Digital Crown on Apple Vision Pro.
And there is also an option to completely immerse them using the full immersion style.
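Here's a minimal sketch of an immersive space that offers the mixed, progressive, and full immersion styles just described. The sphere is placeholder content standing in for your own scene.

```swift
import SwiftUI
import RealityKit

// A minimal sketch of an immersive space supporting all three immersion styles.
@main
struct ImmersiveStoryApp: App {
    @State private var style: any ImmersionStyle = MixedImmersionStyle()

    var body: some Scene {
        ImmersiveSpace(id: "story") {
            RealityView { content in
                // Placeholder content: a simple sphere where your scene would go.
                let sphere = ModelEntity(mesh: .generateSphere(radius: 0.2))
                sphere.position = [0, 1.2, -1.5]   // Roughly eye height, 1.5 m away.
                content.add(sphere)
            }
        }
        // With the progressive style, people adjust immersion with the Digital Crown.
        .immersionStyle(selection: $style, in: .mixed, .progressive, .full)
    }
}
```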
Now there are a couple of interesting additions to immersive spaces this year. Previously, the progressive immersion style was only supported in the landscape aspect ratio. Now, in addition to landscape, the progressive immersion style also supports a portrait aspect ratio.
This can work really well with your vertical screen content. For example, if you have an existing iPhone experience that showcases the characters from your stories, you can recompile it for visionOS while maintaining its original aspect ratio.
The example here on screen is Petite Asteroids. It's a sample project that utilizes the portrait style aspect ratio to take advantage of its vertical design.
Later this afternoon, my colleague Nathaniel will show an end to end workflow of how some of the features of this experience were built.
Another great addition to visionOS 26 is that your immersive space content can blend in with system environments, so now your audience can choose to interact with your story in any of the system environments. For example, people can watch immersive content on the Moon, and yes, even on Jupiter's moon Amalthea, which is a new system environment in visionOS 26. Isn't that cool? Together, these scene types, windows, volumes, and spaces, offer a spectrum of immersion that you can utilize for telling great stories on visionOS.
Here's one more example. This is a museums app that takes advantage of all three scene types to show off historic art and sculpture, while it tells the story behind each of those artifacts through an audio tour.
Next, I want to talk about how you can create spatial interfaces that work with these immersive scene types.
A great starting point for creating immersive interfaces on visionOS is using SwiftUI.
And if you're familiar with building 2D apps on SwiftUI, you can use the same APIs and create rich 3D layouts the same way you're used to.
visionOS 26 gives you some new ways to build 3D experiences and make them even more immersive. The SwiftUI layout tools and view modifiers that you may already be familiar with have added first-class 3D analogs to give your views depth and a z position, along with some functionality to act on those.
A couple that I'd like to highlight today are depth alignment and the rotation 3D layout modifier, starting with depth alignment. It's an easy way to handle composition for common 3D layouts.
For example, a front depth alignment is applied to the name card here to automatically place it in front of the volume containing the 3D model, so any text callouts in your experience can really remain visible. Very simple.
Another addition is the rotation 3D layout modifier. Notice how the top airplane model makes way for the middle one to rotate comfortably. Another really smart feature in SwiftUI. These kinds of features can come in really handy when you take your audience through an interactive, non-linear story. For example, in a non-linear story that has multiple endings, you may want to present text panels alongside a call to action, and based on the viewer's choice, you can decide which of the predetermined climaxes to show them. To build a feature like this, you need to use presentations in SwiftUI, or the presentation component if you are using RealityKit. With presentations, you can enable transient content like this content card about the trail that shows a call to action at the bottom.
You can also display menus, tooltips, popovers, alerts, and even confirmation dialogs alongside your 3D content.
Normally, windows and volumes act as containers for your app's UI and its content in shared space.
And now a new feature in visionOS 26 called dynamic bounds restrictions can allow your app's content, like the clouds here, to peek outside the bounds of the window or the volume, which can help your content appear more immersive.
One great aspect of visionOS 26 is that many of these new features were designed to create tight alignment between SwiftUI, RealityKit, and ARKit, which can make your 3D content and UI elements work in lockstep with each other.
For instance, you can have observable entities in RealityKit, and what that means is that your 3D entities in RealityKit can send updates to SwiftUI elements. For example, the temperature indicator here can be updated as the 3D hiker's position changes over time.
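Here's a minimal sketch of that observable-entity idea, relying on the visionOS 26 behavior described above where RealityKit entities participate in SwiftUI observation. The hiker entity is a placeholder that a real app would load and animate elsewhere.

```swift
import SwiftUI
import RealityKit

// A SwiftUI readout that updates as an observable entity moves.
struct AltitudeReadout: View {
    let hiker: Entity   // Placeholder entity, driven elsewhere in the app.

    var body: some View {
        // Because the entity is observable, this text refreshes whenever
        // the hiker's position changes.
        Text(String(format: "Altitude: %.1f m", hiker.position.y))
            .padding()
            .glassBackgroundEffect()
    }
}
```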
Another example of tight integration between SwiftUI and RealityKit is the new coordinate system that allows you to seamlessly move objects between different scene types.
For example, notice how the robot on the right that's within the SwiftUI window in a 2D coordinate space can be brought over to the one on the left that belongs to an immersive space within a 3D coordinate system, all very seamlessly.
Switching gears, visionOS 26 also makes interacting with your 3D objects very easy. So if your storytelling experience requires people to pick up and move 3D objects, they can do so using interactions that feel natural and mimic the real world, like picking them up, changing their position and orientation, scaling them by pinching and dragging with both hands, or even passing an object from one hand to the other. There's no need to implement a complicated set of gestures.
You can apply this behavior to objects in your experience in literally one step.
You can use the manipulable view modifier if you're building with SwiftUI, or if you're using RealityKit, you can add the manipulation component to your 3D entity.
It's really that simple. The goal of making these features so simple to adopt was really so that you can keep your focus on the core content of your immersive experience.
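Here's a minimal sketch of that one-step adoption, assuming the SwiftUI modifier and RealityKit component are spelled as named in the talk; check the documentation for the exact API. "Prop" is a hypothetical model in the app bundle.

```swift
import SwiftUI
import RealityKit

// SwiftUI route: the system handles moving, rotating, scaling, and
// hand-to-hand passing of the model.
struct PropView: View {
    var body: some View {
        Model3D(named: "Prop")   // Hypothetical model name.
            .manipulable()
    }
}

// RealityKit route: opt an entity into the same behavior. Depending on your
// setup, the entity may also need collision and input-target components so
// the system can hit-test it.
func makeManipulable(_ entity: Entity) {
    entity.components.set(ManipulationComponent())
}
```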
That brings me to the next topic. 3D content.
Let me start with an inspiring example.
This is CarAdvice, a special car museum that takes you back in time with some history of vintage cars rendered in beautiful detail. Notice how it starts out with a window and some images, a description of the car alongside a 3D model. You can drag that 3D model out of the window and into your space to render it at life size, and check out its interior in great detail.
This experience was built with RealityKit, which is a very powerful 3D engine.
It gives you complete control of your content, and it seamlessly blends your digital content with the real world surroundings.
RealityKit provides a ton of functionality to make your 3D content engaging, like defining the look of your objects, lighting, particles, and even adding physics to the objects in your scene. I want to highlight three new RealityKit features in visionOS that will be very useful for creating immersive experiences, in particular anchoring, environment blending, and mesh instancing.
Let's start with anchoring. The example that I just showed you uses anchoring in RealityKit to attach the 3D model of a car to a table or a floor plane.
To attach your 3D models to real-world surfaces, you would need to use a feature called AnchorEntity in RealityKit. You can think of anchor entities as positions in 3D space that can tether your 3D models to real-world surfaces like a table.
You can even use AnchorEntity to filter for specific properties of your anchor. For example, if your app experience is looking for a table with specific dimensions, you can use AnchorEntity to do that. And once the system finds an anchor matching your criteria, it will dock the content right onto it.
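Here's a minimal sketch of that kind of filtered anchoring: ask for a horizontal, table-classified plane at least half a meter on each side, and the system docks the content onto the first match it finds. The car entity and function name are placeholders.

```swift
import SwiftUI
import RealityKit

// Attach a placeholder entity to a table-classified horizontal plane.
func placeOnTable(_ carEntity: Entity, in content: RealityViewContent) {
    let tableAnchor = AnchorEntity(
        .plane(
            .horizontal,
            classification: .table,
            minimumBounds: SIMD2<Float>(0.5, 0.5)   // At least 0.5 m x 0.5 m.
        )
    )
    tableAnchor.addChild(carEntity)   // The car docks onto the anchor.
    content.add(tableAnchor)
}
```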
Next, I want to talk about how you can blend your virtual content with real-world surroundings using the environment blending component.
Think of 3D characters flying out of the screen and blending in seamlessly with your physical space while the objects in the room occlude them very realistically. You can achieve that kind of effect using the environment blending component. Just to give you an example, check out this virtual tambourine that's getting occluded by the real-world vase in front of it. To enable this kind of an effect, all I had to do was add the environment blending component to the virtual tambourine and set it to blend with the surroundings. Very easy.
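As a rough sketch of that setup, assuming an initializer along these lines for the environment blending component (check the RealityKit documentation for the exact spelling), it could look like this. The tambourine entity is a placeholder.

```swift
import RealityKit

// Ask the system to let real-world objects occlude this entity.
// The exact initializer is an assumption based on the component named above.
func blendWithSurroundings(_ tambourine: Entity) {
    tambourine.components.set(
        EnvironmentBlendingComponent(preferredBlendingMode: .occluded(by: .surroundings))
    )
}
```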
Next up, mesh instancing.
In your 3D experience, there are oftentimes scenarios where you need to render hundreds of copies of a 3D object, for example when you have pebbles on a beach, birds flocking, a school of fish swimming, or some magical rocks swirling. In such scenarios, creating a large number of clones will result in a large memory and processing footprint, especially since the system will have to draw each entity individually while maintaining its position and orientation in space.
The mesh instances component solves that problem by using low-level APIs to draw multiple copies of the 3D object while maintaining each one's position, rotation, and scale. You can even customize the look of each of those instances. It's super easy to use, yet an extremely powerful feature in RealityKit. So those were some of the newest and extremely useful features in RealityKit that can help you add interactive content to your stories.
Before I move on to the next section, there is one thing I wanted to call out. RealityKit is just one of the ways in which you can bring 3D experiences to visionOS. There are in fact several other choices. For example, you can bring your own proprietary rendering system to visionOS using the Compositor Services framework and draw directly to the displays using Metal.
Or you could use one of the game engines like Unity, Unreal, or Godot to bring your existing content to Vision Pro.
Here's an experience built by Mercedes-Benz to show the various key features of their GLC EV.
It's an experience that was built with Unity, and it renders the car in beautiful detail.
Next, I'll move on to various ways in which you can interact with content on visionOS. Eyes and hands are the primary input methods on Vision Pro. You can navigate entire interfaces based on where you're looking and using intuitive hand motions.
And now on visionOS 26, hand tracking is as much as three times faster than before.
This can make your apps and games feel very responsive, and there's no additional code needed. It's all just built in.
There's also a new way to navigate content using just your eyes called Look to Scroll. It's a nice improvement, a very lightweight interaction that works right alongside scrolling with your hands.
And you can adopt this into your apps with APIs, both in SwiftUI and UIKit.
Along with hands and eyes, there is a new way to interact with 3D content in visionOS 26: using spatial accessories.
Spatial accessories give people finer control over input, and visionOS 26 supports two such spatial accessories via the Game Controller framework. First is the PlayStation VR2 Sense controller from Sony, which is great for high performance gaming and other fully immersive experiences. It has buttons, joysticks, and a trigger, but most importantly, you can track its position and orientation in six degrees of freedom.
The other new accessory is Logitech Muse, and it's great for precision tasks like drawing or sculpting. It has a pressure sensitive tip and two side buttons.
One thing I wanted to call out is that you can anchor virtual content to these accessories using an AnchorEntity in RealityKit. I spoke about AnchorEntity a second ago, and you can do so using the RealityKit AnchorEntity class. And I have an example to demonstrate that. This is Pickle Pro from Resolution Games, where you can play a game of pickleball. They've anchored a virtual paddle onto the grip of one of the controllers and the ball onto the other one. So even in a fast-paced interaction, such as swinging your paddle to play pickleball, the system can track the controller very precisely.
Using these accessories, you can introduce precise interactions into your stories, like using the Logitech Muse as a magic wand or drumming with the PS VR controller.
Since both these accessories support haptic feedback, your audience can really feel the interactions as you take them through an action or adventure scene, and it's all supported using the Game Controller framework alongside RealityKit and ARKit.
All right, next let's talk about SharePlay. Using SharePlay, multiple people can go through your immersive storytelling experience together, and the way people see each other in a shared experience in visionOS is using spatial Personas.
Spatial personas are your authentic spatial representation when you're wearing the Vision Pro, so other people can see your facial expressions and hand movements in real time.
In visionOS 26, spatial Personas are out of beta and have a number of improvements to hair, complexion, expressions, representation, and more.
Also in visionOS 26, SharePlay and FaceTime have a new capability called Nearby Window Sharing. Nearby window sharing lets people located in the same physical space share an app and interact with each other.
This one is by Rock Paper Reality, by the way. It's a multiplayer tower defense game, and it comes to life right there in your space.
And all of this starts with a new way to share apps. Every window now has a share button next to the window bar, and giving it a tap shows the people nearby so you can easily start sharing with them.
The shared context isn't just for those in the same room, though. You can also invite remote participants via FaceTime, and if they are using Vision Pro, they'll appear as spatial Personas, but they can also join using an iPhone, an iPad, or a Mac.
SharePlay is designed to really make it easy for you to build collaborative experiences across devices. Let me show you an example.
This is Demeo, a cooperative dungeon crawler from Resolution Games that's really fun to play with a group.
One of the players here is having an immersive experience using Vision Pro, and their friends have joined in on the fun on an iPad and from a Mac. So that's SharePlay.
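As a quick aside before moving on, here's a minimal sketch of how an app might define and start a SharePlay activity with the GroupActivities framework. The activity name and identifier are placeholders for your own experience.

```swift
import GroupActivities

// A SharePlay activity representing your shared story or viewing session.
struct WatchStoryTogether: GroupActivity {
    static let activityIdentifier = "com.example.watch-story-together"   // Placeholder.

    var metadata: GroupActivityMetadata {
        var metadata = GroupActivityMetadata()
        metadata.title = "Watch the Story Together"
        metadata.type = .watchTogether
        return metadata
    }
}

// Kick off the shared activity, for example from a share button.
func startSharedStory() async {
    let activity = WatchStoryTogether()
    switch await activity.prepareForActivation() {
    case .activationPreferred:
        _ = try? await activity.activate()
    default:
        break
    }
}

// Join incoming sessions so everyone stays in sync.
func observeSessions() async {
    for await session in WatchStoryTogether.sessions() {
        session.join()
        // Coordinate playback or scene state with a GroupSessionMessenger here.
    }
}
```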
Next, I'll show you some cool features with spatial web that can enable you to take your storytelling experiences online.
Safari on visionOS 26 brings in support for a variety of spatial media formats. So in addition to displaying images and 2D videos in Safari, you can also add support for spatial videos and all of the supported immersive formats, including Apple Immersive Video, and you can do that by simply using the HTMLMediaElement.
Alongside spatial media, you can also immerse your audience inside a custom environment using another feature in Safari called Web Backdrop.
The example shown here is an immersive environment from the Apple TV show Severance.
Another cool feature is that you can embed 3D models within a web page using the HTML model element, and you can even drag that model out and bring it into your space. These features are really useful when you deploy your immersive stories to the web.
I covered a lot today. I showed you how you could make your experiences more immersive with 3D content. I showed you how to make them more engaging using spatial interfaces and enhanced interaction, how multiple people can experience your story simultaneously using SharePlay, and even take those stories online with the latest additions to Spatial Web.
Together, these features make it very easy for you to tell compelling stories on Apple Vision Pro.
But visionOS 26 offers many more features that I couldn't get to today. Again, one more time: don't take my word for it. Just go to developer.apple.com and, if you haven't done so already, please check out the WWDC sessions on visionOS. They cover everything from enhancements to using Metal with Compositor Services, to updates to SwiftUI, enhancements to RealityKit, support for third-party accessories, the new video technologies, and so much more.
And with that, I'll hand it back to Serenity. Thank you.
All right. So we've got a lot of information to digest there. So we're going to take a quick break so you can digest this, digest some croissants, have a little coffee, and we'll be back in about ten minutes. See you back here soon.
All right. Welcome back from break, everybody.
Hope folks got some coffee, some croissants, some conversation, all the important C's. As folks start to come back, I want to tell you a little bit about our next presenter. So, Elliot works with creators to help them realize their Apple Immersive Video dreams, hopes, and all of that fun stuff. He's going to come out here and talk to you a little bit about the possibilities of creating content with immersive formats like spatial video, Apple Immersive Video, and APMP. Elliot, take it away.
Thanks, Serenity.
Good morning folks. It's great to be here. My name is Elliot and I help creators bring to life their spatial and immersive projects for Apple Vision Pro.
I picked up my first 360 camera many moons ago, but my journey with spatial video and Apple Immersive Video began a little over four years ago, and since then, I've directed and produced some of our exciting Apple Immersive titles.
More recently, though, I've been supporting some incredibly talented independent creators as they bring their own stories to Apple Vision Pro, whether that's through Apple Immersive Video or Spatial Gallery.
And in fact, I think I can see a few familiar faces, so maybe it's a good time just to say hello.
You know, pioneering new storytelling methods, new formats, new technologies is always an industry-wide effort. And so it's awesome to see so many of the immersive creative community here with us today, and not forgetting those of you joining us online all over the world.
I can't wait to meet more of you, hopefully later today. Hear more about your projects and how your productions are going. And speaking of those productions, I wanted to share a little thought.
Whilst supporting creators as they bring their stories to Apple Vision Pro, there's always one question that comes up right at the start of every project. And it didn't matter if it was here on Submerged, a complex studio shoot with underwater filming, or maybe a simple spatial capture with just one subject, like here with RTX's Spatial Gallery posts. It was always the same question, and it usually went something like this: okay, this is really cool, but where do I actually start? And so today I want to do as much as I can to help you get started. Now, we have a packed event for these types of technologies, full of lots of deep dives tomorrow, for example, so you'll get to know everything you need to know. But in this session, I want to focus on the foundations that Apple uses when we create compelling and powerful linear media stories for Vision Pro.
First, I want to touch on how you can design your projects for difference. We'll look at how you choose formats and create differentiation in your story.
Then I want to introduce you to what I like to think of as the creative superpowers of spatial and immersive mediums: presence, authenticity, proximity, and connection.
And who knows, I may even introduce you to a mysterious little side project that I have in the works. But more on that later.
Let's dive in with how we can design our immersive stories for difference. When setting out designing Apple Immersive media for Vision Pro, one of the most important questions you can ask is: why choose immersive for this project? Because today there are so many incredible storytelling formats to choose from. From the simple but effective, like books, to traditional flat-screen film and TV, perhaps where most of us get our favorite content today. There's podcasts, where audio alone can take us to new worlds and immerse us. And there's even theater, with its power to create a shared human experience that's live.
The most important part of choosing an immersive media presentation is to make sure that your story is the right fit for it; that, compared with all these other formats, when told on Apple Vision Pro it's the best and most compelling version of it. If you feel this, so will your audience.
And so when it comes to immersive formats, there's really been no better time to start creating. So whether you're just getting started in your career or perhaps you're a seasoned professional working in a studio, we're in this really exciting moment where there's now an immersive tool for every type of story, creator, and ambition. Just like Tim showed us, we've now got an incredible choice of formats covering different types of immersion, presentation, and application.
But tools alone can't bring our stories to life. So it's worth considering each of these formats and what their creative potential could be for you, your stories, and your project's ambition.
Maybe you want to capture something genuine, a real moment, but with a sense of depth. But that said, you don't want to dive in at the deep end of 3D filmmaking to make it work. That's where spatial really shines. It's a great way to explore how dimension can pull people into your stories, whether that's for a short film or you just need a simple and approachable capture setup.
Working with 180 and 360 can be fantastic, especially if you're creating immersive experiences from footage that perhaps you already have.
It's perfect for when full immersion is essential, like you're producing an educational experience. Or perhaps you want to bring someone inside a travel story.
Or maybe, just like it's crossed everyone's minds at some point, you want to capture an extreme sport from a truly intense point-of-view perspective, with immersive presentation. Well, then something like wide field of view using APMP would be a great choice for that.
And if you're ready to tell a truly cinematic story, one where presence and incredible detail become part of the language of storytelling, then Apple Immersive could be the perfect choice.
It's built for creators who want professional quality, advanced industry workflows, and a way to tell stories that simply didn't exist before.
Of course, your format choice is also going to depend on where your audience find your stories. So if you can, it's important to choose your distribution as early on in the project as possible.
That could be here on the App Store, building an app. It's an approachable way to create a bespoke experience, perhaps one that brings your immersive videos together with interactive 3D elements. It's a powerful way to design for branded experiences or enterprise, or even just custom experiences that feel totally unique.
Or if your goal is to reach as many people as possible as quickly as possible, then you could stream your film directly through Safari on visionOS 26. It's a fast and effective way to share immersive stories with a wide audience, no app build required. And one I'm really excited about: you could upload and share your work directly on Vimeo. It's a great option for when self-hosting might not be practical within your scope, such as those working on educational, cultural, or perhaps public projects.
And if you are a professional network or studio, we're continuing to support the distribution of the highest quality Apple Immersive Video through the Apple TV app and federation. It's the best way to deliver premium Apple Immersive stories to audiences all around the world.
Whatever your distribution plans are, it's important to remember that the audience expectations are very high to truly capture their attention. Your stories need to stand apart and offer something that feels different, something that makes people want to explore new forms of entertainment.
So be it for scripted entertainment, like here with Submerged, or non-scripted, like Canal+'s documentary Tour de Force, or even enterprise experiences like the training app Flight Site by Rogue Labs.
Make sure your story feels different in immersive formats. Audiences expect more than just a new view. They expect a new feeling. And to deliver that, you're going to have to ensure that your story has differentiation.
For example, I'm a little bit of a nature buff, and I've seen lots of natural history documentaries on TV. And I don't know about you, but diving with sharks in Wild Life episode two felt completely different. It wasn't just another documentary. It gave me a chance to genuinely feel what it would be like to swim alongside sharks, to share a space with them.
And that's not something we can get on TV or even at an aquarium. It's only possible through immersive video experiences.
Now, these new formats have introduced storytelling tools that have changed how we can create and how audiences can connect.
But discovering how you can use them isn't always obvious.
When the teams at Apple set out several years ago, they expected the true power of the medium to be in teletransportation: the idea that you could take anyone, anywhere in the world, instantly.
They could take you, for example, to the northernmost reaches of Norway, the Lofoten Islands. The landscapes were breathtaking. You could see the wind sweeping over the summits, the waves breaking gently on the ocean.
And with Apple Immersive Video, you could even see the horizon in remarkable detail.
But in trying to tell a story, it was clear that there had to be something deeper.
After a wee while of looking around, it felt a little bit like looking at the front of a postcard. The feeling of immersion alone wasn't quite enough.
That's why in Boundless you don't just travel to the Lofoten Islands, you spend time there with those that make it their playground, meeting and surfing alongside what I think are some pretty brave souls. In Wild Life, for example, you don't just see the rainforests of Borneo, you meet those that call them home. You get to play, share glances that feel unmistakably human, and if you're lucky, you might even get a little scratch on the cheek.
If you look closely at the heart of these experiences and what sets them apart, you'll find some key foundations that repeat. And these are the same foundations we're going to be talking about today: presence, authenticity, proximity, and connection.
As you bring your immersive or spatial stories to life, these elements can become the building blocks of what makes your experience delightful and meaningfully different for your audience. You can think about them as your creative superpowers. But as I'm sure all the immersive heroes amongst you know, with great power comes great responsibility. So let's keep that in mind as we look at each one a little bit more carefully.
Rather than just show you the final result, though, I'd love to bring you along on the journey and share what it takes to bring these foundations to life in your own projects, and maybe some of the new considerations and responsibilities that come with leveraging them. So with that, I want to introduce you to a little side project of mine. I'm exploring what it would take to bring to life a sci-fi short film I've been wanting to make using Apple Immersive Video, set on a mysterious planet. It's about a friendly astronaut's wacky adventures through a shimmering desert landscape, and who knows, maybe even some mysterious creatures too.
Now, alien planets and spaceships are pretty expensive, right? So we're going to be using previsualization, or previs, and here's how it's shaping up so far. We've got our astronaut friend, what looks to be their spaceship, and it looks like we've landed in a mysterious desert. Perfect. Now, whilst this is just my pet project, it is sadly one of the best ways I can show you what I and the team have learned over the past few years. So we'll be seeing more of our friend in just a moment. But for now, let's dive into the first foundation: presence.
With spatial and immersive formats on Apple Vision Pro, your audiences can experience your story as if they were right there at the moment it was captured.
And that changes everything, because the viewer can now feel present and a new set of creative considerations come into play.
No longer are you directing stories for a flat, rectilinear screen. In that world, every frame had edges and filmmakers had complete control over what the audience could see. Every shot, every cut designed to guide the attention of the audience inside a fixed window.
But in immersive storytelling, that frame disappears.
You're now presenting within a boundless space, a volume, where what was once just a camera has now become the audience: the eyes of all of your viewers. With the power of presence, you now have access to some fundamental differences from traditional flat-screen video production. Human peripheral vision, for example, and accurate scale and depth. And together, these allow your audience to experience a place or a story as if they were really there.
Presence can enable you to create a deeper connection with your audience. They can sense the pressure, feel the atmosphere. But most importantly, they also feel like they're truly part of it. The NBA All Star Weekend film is a great example of this. It doesn't just show you what happened. It lets you feel what it was like to be there.
Now, this new type of experience is one that's felt both physically and emotionally. And unlike film, TV, or even theater, you can now take your audiences anywhere.
And we call this high fidelity of presence. It's fundamentally new to both audiences and creators.
But how do we create presence and how do we craft it? Well, a great place to start with that is composition, crafting the basics of the scene that our audience will visit and be a part of.
Let's see what that looks like in practical terms though. It's time to bring in our sci fi project, and I'm going to begin by establishing an opening shot.
This will look pretty similar to anyone who spent time in traditional filmmaking, but remember, in immersive formats, that familiar frame is now unbounded. There is no single point of view.
The first place we'll see that difference is on the camera monitor itself, because it won't look like this. Instead, it will look like this.
And in practice, this is the same image presentation you'd see when using an immersive camera, like the Blackmagic URSA Cine Immersive, for example. That's the one I'm hoping to use, and it's the exact view you'd see on the onboard camera monitor.
You'll notice a few key things here. The image is now a one by one aspect ratio. Around the edges you'll notice a little bit of compression, right? Objects that should be straight are now distorted, and across the entire frame you can see a full 180 degree scene.
But this new perspective is only half of the puzzle, because just as important as how we see this on set is how our audience will see it.
And for exactly the same composition, this is how our audience would see it in Vision Pro.
Here you'll notice a few differences. The edges of our view are softly feathered, which creates a natural sense of focus. The image is now perfectly rectilinear, with no distortion or curvature. And most importantly, the viewer's field of view stays focused until they move their head and explore the scene.
When we view them side by side, you can really see how big that difference is and what we need to think about when we're on location. With immersive content, we're no longer just framing a shot. We're responsible for managing a viewer's agency inside a 360 degree world.
So what's the best way to compose with these considerations? Well, let's take a look at what we have.
Here's the scene I'm working with. I've framed up on our slightly confused astronaut, perhaps wondering what brought them to crash on this planet and why their ship failed on them. But I've composed this scene using traditional techniques, right? The rule of thirds balances out the frame. I've lowered and tilted the camera to make the shot feel more dynamic and interesting, perhaps even cinematic.
But I can already see a few of you seasoned, immersive filmmakers wincing in the audience because whilst this looks great here on our camera monitor, things are about to look very different when we step inside the audience's view. And in Apple Vision Pro, the audience would experience this pretty differently. That exact same composition now leaves them staring at, well, nothing really.
They're going to be unsure of where they're looking, or even what they're looking at, and it could take several seconds for them to explore the scene and finally discover the focal point of the story. Those are seconds they're not going to spend looking at our astronaut, the ship, or even thinking about the story I'm trying to tell.
Combined with the unnatural camera height and tilt, this could make for a pretty disorienting scene. So how can we improve it? Well, first we're going to want to be careful about relying on rectilinear framing techniques. That might be approaches like balancing the frame with weighting, or using overlay grids like the rule of thirds.
Yes, even Fibonacci's golden ratio might not help you here.
And inconsistent camera angles can feel pretty disorienting and disturb that sense of presence. With immersive cameras, we need to get comfortable with a different approach, one that's built around space, distance, and how people actually experience the world.
So instead I'm going to compose this shot using depth and dimension. By using distance and parallax, we can guide the audience's eye through the space instead of across a rectangle, and we can still end up with results that feel cinematic. Let's check it out. For this composition, I started by backing the camera away to create a stronger sense of depth. In the foreground, we've got this large rock, and it leads our eyes into the mid-ground, where you'll find our astronaut. In the background, we have a few cacti to anchor the environment.
You'll also notice the horizon is now level, and that matches our audience's natural expectations. And the camera height aligns with the eyeline of our astronaut. So we feel like we're present with them, like we're together in the same space.
Now, this looks pretty flat on this screen here, but if I move the camera a little bit, you'll see that depth come to life.
This setup lets us use stereoscopic depth to guide the viewer's eye right into the scene. And in Vision Pro, the audience's perspective finally feels right. We're immediately drawn into the space and our attention goes exactly where it should. It's clear what's happening, and we can instantly feel present in the moment, with our astronaut just looking a little sorry for themselves.
And if we add a little simulated parallax to mimic the 3D effect of being in the device, you can see how the proximity of the rocks and the distant cacti naturally guide our eyes right towards the middle of the scene.
Here we're using depth itself as a compositional tool, but techniques like leading lines or concentric circles would have worked equally well.
Nice. So there's our first shot in the bag. At least when we do this for real, I hope I'll know exactly what I'm doing, and hopefully that will make some producers happy. But for your own spatial and immersive projects, think about how strong composition can help build a true high fidelity of presence, one that draws your audience in and keeps them there.
And as you do, remember to watch out for traditional rectilinear techniques like the rule of thirds and camera tilt. Instead, adapt your approach for immersive capture and prioritize depth and layers. And lastly, always consider the perspective of the audience.
The only people who will see your film as a fisheye image on a flat review monitor are you and your team.
And if you're looking for some great examples of this high fidelity of presence, I'd recommend checking out Wild Life, especially the elephants episode. It's a beautiful example of presence that creates empathy and emotion. And for spatial video, the Arc'teryx winter climbing clips are very powerful. They have a strong sense of presence, and they also play with vérité capture techniques, so they're pretty authentic.
And speaking of authenticity, that's our next foundation, so let's look at it. With spatial video and Apple Immersive Video, we now have the opportunity to present stories with unprecedented authenticity.
We can achieve that because as we've just explored, we're now capturing depth and scale.
And in the case of Apple Immersive Video, we're also capturing lifelike acuity.
So whether your story is scripted, a live performance, or perhaps a documentary moment, in Vision Pro you're going to sense a subject as if you're meeting them in person, surrounded by a full 180-degree scene, complete with all the tiny details that make it real.
And this opens up exciting new opportunities, like presenting non-scripted moments, just like here in the ice diving episode of Adventure. Through Ant's body language alone, we can sense the genuine pressure of the moments before his world record dive. There's no editorializing needed.
And when he begins that dive, there is no question about the scale of what he's undertaking. As we swim with him, we can perceive the scale and distance of that world record being set before our own eyes.
Authenticity in immersive formats gives us the opportunity to tell powerful stories that feel true to life. But to make sure we don't trip up when using this power, there are some things we'll want to consider carefully. One thing in particular, and that's creative motivation.
So let's look at what that means in practical terms. It appears our astronaut is getting to work on some repairs to their spacecraft, so let's plan the next shot.
And this is what I'm thinking. With this shot we can see the nose cone of the spaceship open, and our astronaut is busy with their computer console. We're going to be able to follow the entire repair process perfectly, and I think the audience are going to love it. But there's an important consideration, right? We can see everything.
When we look at the audience's perspective, we can see everything inside the spacecraft, even the buttons, the wiring, every tiny detail. And if my motivation is to make this a realistic moment, then our astronaut can't fake it. They're going to have to actually perform a real repair.
They're going to need the right tools. They're going to need to move those tools correctly, perhaps remove some wiring, reconnect intergalactic capacitors. Suddenly it's all becoming very real, and creative license is a little harder to find. With the power of authenticity, it's important to recognize how artifice can sometimes undermine both your story and the immersive experience. Now, considering artifice isn't new in filmmaking, but the degree to which we have to consider it in immersive absolutely is. Like we've just discussed, it's now possible to see even the tiniest details, like the subtle twitch of a contributor's body language that perhaps tells the audience they're nervous in front of the camera, or how close someone really is to danger. Important, perhaps, if you're attempting to up the stakes and produce something that feels more dangerous than it is. Audiences can now tell how large or small a character really is. And if you want them to sense something as fast, then you're going to have to move the camera just as fast as you want it to feel.
These new realities challenge us as filmmakers to think carefully about how we editorialize and sensationalize our stories. To keep authenticity, we might not be able to use smoke and mirrors in the same way we have before. We can't just rely on Foley, zooms, or fast cuts in the edit to cover us.
So how can we ensure that we keep this scene authentic and avoid undermining our story? Well, I've found it easiest to get started in these new mediums by carefully selecting your subject, and making sure they're showing authentic skills and performances.
So in the case of our repair, I'm going to keep it simple and focus on the computer console.
And I'm also going to make sure that our astronaut has enough time to rehearse it properly. That way when we roll, they'll know exactly what to do.
Now, while what's in front of our camera is essential to establishing authenticity, preserving that authenticity means thinking carefully about something else: audience agency.
Specifically, we need to consider their agency. With spatial and immersive content, our audience can now look anywhere, whenever they choose to.
As a result, as filmmakers, we're now sharing that storytelling process with our audience. We're meeting them in the middle. And in pursuing authenticity, that becomes a new responsibility. And it's one I'm going to try to keep front of mind as we shape our next scene.
I want to surprise both our astronaut and our audience with the arrival of this slightly curious looking creature. But in practical terms, how will this affect our audience's agency and my ability to keep that sense of surprise? I want our alien to pop up from the rock behind there on the right hand side, just enough to give our astronaut a little fright. Now, in traditional filmmaking, it might make sense for me to exaggerate this in the moment with something like a quick pan like this.
That way we can show the audience what's really happening and maybe give them a little fright as well. But in spatial and immersive storytelling, I need to remember that the camera is the audience. And I don't know about you, but the idea of someone grabbing my head and forcing me to look at something doesn't sound very pleasant. It's a sure way to break both agency and authenticity. So it's something I'll probably want to avoid. So how do we fix it? Well, one approach is by giving some of that storytelling agency back to the audience.
To do that, we can let the audience discover the creature for themselves. That way, they'll feel like they've noticed something that our astronaut hasn't. And that moment will feel genuine and entirely of its own.
We'll keep the camera positioned pretty much exactly where it was, but we'll meet the audience in the middle by using Spatial Audio to guide their attention to the event. Just like this.
Now, here on the screen, that might not seem all that exciting, but in Apple Vision Pro, the audience gets a chance to look over their shoulder and discover something delightful, if a little strange. Like this.
With this approach, we're preserving the audience's agency, and in doing so, making the story feel more authentic and more immersive.
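As an aside, if you end up previsualizing a beat like this interactively, RealityKit's spatial audio can stand in for the final mix while you block out the moment. Here's a minimal sketch, assuming a placeholder creature entity and an audio file named "creature_chirp" bundled with the project; all it does is localize a sound at the creature so the audience turns toward it on their own.

```swift
import RealityKit

/// A sketch of guiding attention with sound in an interactive previs.
/// `creature` and "creature_chirp" are placeholders, not from any shipping project.
func playCreatureCue(from creature: Entity) async throws {
    // Keep the sound clearly localized in space so it reads as coming from the creature.
    creature.components.set(SpatialAudioComponent(gain: -6))

    // Load the bundled audio file and play it from the creature's position.
    let chirp = try await AudioFileResource(named: "creature_chirp")
    let playback = creature.playAudio(chirp)
    _ = playback // keep the controller around if you want to stop or fade the cue later
}
```

The point of the sketch is the placement: because the sound is attached to an entity behind and to the right of the viewer, the turn of the head is the audience's own choice, which is exactly the agency we're trying to preserve.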
So as you're pursuing your own authenticity, consider your motivation as a creator and the agency of the audience.
Watch out when introducing artifice or heavily producing a story. And when working with talent, choose them carefully, prioritizing authentic skills and behaviors, and always think through your creative motivation.
Do what you can to preserve the agency of the audience.
There are moments in Canal+'s latest experience, Tour de Force, that I think do this perfectly, especially when we get to celebrate with Johan as he sprays you in the face with champagne. It feels so honest and so real, and it's exactly like being there in person. So go check it out if you haven't.
Okay, so we've talked about presence and authenticity, but underpinning both of these is the concept of proximity.
One of the most powerful foundations of spatial and immersive media is the ability to make our audiences feel physically close to your stories and your characters.
With that, filmmakers can now give audiences the chance to experience moments that they might never get to otherwise, just like here in Open Hearts from The Weeknd, where audiences sit opposite their favorite artist to experience an intimate music performance and watch a story unfold.
Using proximity allows you to craft stories that are felt as personal experiences and could even be remembered as memories, not just a video.
But how can we establish proximity? Well, that's going to start with camera distance.
With the introduction of this perceived physical space between your audience and the story, you now need to think about proximity very differently than filmmakers have before. In traditional flat-screen filmmaking, we zoom in and enlarge the image to bring the viewer a greater sense of proximity, perhaps a moment of intensity.
But with immersive and spatial formats, we often have to work with fixed lenses. So we'll want to think about using relative physical space instead, and how it corresponds to the scene we're trying to create.
But what do I mean by relative space? Well, to show you, I'm going to ask Sarah to join me on stage. Sarah is one of our incredible directors.
In our day to day lives, we're all used to different kinds of space and the proximity that comes with them.
Right now, Sarah and I are about 25 feet apart, and that's a familiar kind of public space, like an airport. Here, Sarah and I could walk past each other and might not even notice one another.
If Sarah and I step in a little closer together, let's say to around 12 feet, we'd be in what many people would call professional space. Here, we'd likely know each other's names, but we might not be ready to invite each other to our birthday parties. Sorry, Sarah. All right then, if we come even closer together, perhaps to around six feet, this is likely to be a private space. Let's say it's a friend's house, where Sarah and I can be comfortable in each other's spaces.
And finally, if Sarah and I step even closer together, let's say less than an arm's reach, this is what we would call personal space. And this is the type of proximity that we usually reserve for our close friends and our loved ones.
Now, Sarah and I, we've been on a few intense shoots together, so we're pretty comfortable in this space. But if we hadn't met before, this would be pretty uncomfortable, to say the least. Thanks, Sarah. You're welcome. Thank you.
Now, while Sarah and I could move across the stage here today, these types of relative spaces are still important to consider when you're crafting your stories. Because in immersive and spatial experiences, your audiences can't simply step back if they feel uncomfortable. Cutting to a new shot where someone they've never seen before is inside their personal space, staring at them, is likely going to feel uncomfortable.
They may even end up leaving the experience altogether if that happens.
So let's take a look at how we can use these considerations.
Right, it looks like our alien is becoming a little bit more confident. So in this next shot I want to introduce them to the audience properly. It would be rude not to. I also want to make sure that we really get a sense of who our alien is. So I'm going to say something that I'm sure everyone here has heard before, probably from a director or maybe an excited client: let's get a close-up.
And in traditional flat screen filmmaking, this is probably what we'd want. We can see the alien nice and big in frame, and with a little bit of depth separating them from the background. It's a great shot for introducing a character, and it also feels cinematic.
But with an immersive camera, I'd have to think a little bit differently. If I tried capturing the exact same perspective as this setup, I'd need to place the camera less than three feet from our alien.
Now that's well within an arm's reach, and definitely inside our audience's personal space. And while that might look okay here on the screen, in Apple Vision Pro it's a very different story. We are now way too close, and that's going to cause some technical problems, with things like closed captions or even viewer comfort. But it would also cause issues from a story perspective. Presented like this, what was meant to be a cute and playful introduction to a character has turned into something that looks more like a horror film. And whilst that might be exciting for the horror producers out there, that wasn't my motivation. So let's solve it. To establish a more suitable sense of proximity, we'll place the camera at about six feet. It's still plenty close, but now it's not in our personal space. And when I add a little camera movement, you can get a sense of that distance.
In Apple Vision Pro, we can now get a sense of the creature's real size, and their cuteness too. By keeping a little distance, I can save that feeling of intense proximity for later on, perhaps once the audience has got to know our creature a little bit better.
Great. So there's no doubt that using proximity as a foundation in our stories can help surprise and delight audiences. But as you do, rethink what a close-up really means, question your motivation, and consider how it will make your audience feel.
Apply proximity through camera distance, remembering the four P's of space: public, professional, private, and personal.
And finally, use moments of extreme proximity, closer than an arm's reach to the camera, thoughtfully and perhaps a little more sparingly.
If you'd like to check out some of the incredible moments of proximity we have on the platform, then Submerged on the Apple TV app would be a great place to start. And for spatial video, the Red Bull Air Force wingsuit clips are a powerful example.
Now, proximity might seem like the most visceral foundation you can use in immersive storytelling, but to close out today, I'd like to discuss connection. Get your capes ready, because it comes with the most significant responsibilities. Spatial and immersive formats are often described as being able to transport people to new places, and while just showing someone a world can be impressive, you can also make the audience feel like they're more than just a fly on the wall. You can now make them feel like they're sharing a space and sharing a moment.
The orangutans episode of Wild Life does this beautifully. You get to experience the world with these playful young orangutans as they grow up, preparing for a life in the wild.
But at the same time, there are these incredible moments where just a single stare can create a connection that feels almost human.
In these moments, the power of connection can give your audience an experience that feels deeply intimate and personal, something no other medium can quite match.
But as creators, how can we establish connection? Well, like we've just seen, a great place to start experimenting is with eye contact.
But there's another technique I've seen creators get very excited to try almost right away.
Any guesses? It's movement.
Okay, out of curiosity, pop up your hand if you've ever tried moving an immersive or spatial camera.
Pretty much everyone. That's awesome. I've often seen teams at Apple and independent creators I've worked with use movement not just to increase production value, but also to deepen the audience's connection to the story.
As long as both you and your story are grounded in clear creative communication, connection through movement is incredibly powerful.
But as you've probably guessed, it also comes with a responsibility, and that's motion comfort.
Now, it's worth noting that motion comfort has long been a consideration for folks working in giant screen and IMAX. But with spatial, and especially immersive 180 and 360 formats, you are now responsible for the audience's entire field of view. And while that presents a unique frame for storytelling, it's also a huge responsibility.
That means if your camera is slung out over a 3000 foot cliff and it happens to wobble a little bit, your audience are going to feel that exact wobble, too.
Now there's someone I'd love you to meet. I'm going to invite Igor to join us on stage and share a little bit more about the research he and his team have been leading into motion comfort for immersive media experiences. Igor.
Thanks, Elliot. Hi, I'm Igor. I'm a research scientist at Apple. And as we know, Apple Vision Pro gives you an opportunity to share all new kinds of immersive stories with your audience. And as Elliot mentioned, with an immense field of view comes an immense responsibility.
According to the US National Library of Medicine, at least 1 in 3 people are susceptible to some form of motion discomfort in their everyday lives.
So when we are building experiences with motion, we want to prioritize people's comfort.
Today, I want to talk about how we find the right balance and make immersive content that is both enjoyable and accessible to more people. First, it's important to understand why people experience motion discomfort.
The leading cause is a conflict between the motion we see with our eyes and the vestibular sensation that we sense with our inner ears. When we move through the real world, there is typically no conflict. The visual and vestibular senses are perfectly aligned. They both communicate motion.
However, when you move through a virtual world, you're typically stationary, as when you're sitting in a theater like this, and the only motion you sense is with your eyes, like so.
So let's consider some of the factors that you can control to minimize motion discomfort, starting with camera motion. Because immersive content can offer viewers such an expansive field of view, we have to be very aware of camera motion. Unlike in traditional media, anything that happens to the camera makes it feel like the viewers themselves are moving within the scene.
Even if your camera remains stationary, your scene might still communicate motion. If the content within it is moving. For example, if your story contains moving objects that take up the majority of the audience's field of view, that can cause a similar sensation to a moving camera.
This means that it matters how much visual space objects occupy.
This varies with proximity from the camera as well as object size.
In other words, camera or object motion is less of a factor if the objects are far away from the camera and are smaller, as you see on the left. This means that they occupy less visual space compared to the scene on the right.
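To put a rough number on how much visual space something occupies, you can use simple geometry: the angle an object subtends grows with its size and shrinks with its distance from the camera. Here's a quick sketch with made-up example values; it's just the standard subtended-angle formula, not an Apple metric.

```swift
import Foundation

/// Approximate angular size (in degrees) an object occupies in the viewer's field of view.
/// Full angle subtended by an object of a given width at a given distance, both in meters.
func angularSize(objectWidth: Double, distance: Double) -> Double {
    2 * atan(objectWidth / (2 * distance)) * 180 / .pi
}

// A quick comparison, like the scenes on the left and right of the slide.
print(angularSize(objectWidth: 2, distance: 10)) // ≈ 11.4°: far away, a small share of the view
print(angularSize(objectWidth: 2, distance: 2))  // ≈ 53.1°: close up, it dominates the view
```

The same 2-meter object goes from filling roughly a tenth of a typical field of view to dominating it just by moving from 10 meters away to 2 meters away, which is why proximity and size matter so much here.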
Texture and contrast within an immersive scene are also important. Let's see this in motion.
Notice how simple textures with low luminance contrast, like the scene on the left, make potential camera motion far less provocative than the busier, high-contrast textures on the right.
We call the combination of these four factors motion intensity.
And there are other factors that also warrant mention, like camera orientation: it's best if the camera is aligned with the horizon, as we know. Motion predictability also matters; motions that users can anticipate are generally more comfortable.
And finally, scene cuts: it's best when cuts don't have sudden changes in object proximity or camera angle, to avoid disorienting the viewer.
Back to motion intensity. One effective way to reduce it is, of course, to decrease the amount of visual space that motion occupies. This can be done by the viewer, but it comes at a cost.
One of the features motion-sensitive people can use is the detect-high-motion setting. This visionOS feature is enabled by default for consumer-created APMP content, and it can automatically pause video or reduce immersion when large motion is detected.
And people who have extreme sensitivity to motion can always choose to watch any immersive content in a windowed mode, including Apple Immersive Video.
But of course, the trade-off is that these people then miss out on the immersive nature of the content. And that is why the best place to deal with motion is during production. Of course, just because motion intensity can lead to motion discomfort doesn't mean that you need to eliminate it from your content.
Your goal as creators is to balance motion intensity throughout your story.
Let's use an analogy to help visualize this balance.
You can think of motion discomfort as the water level in a bucket. The higher the water level, the greater the discomfort.
We also know that motion discomfort symptoms subside over time, and we'll represent this with a hole in the bucket.
For example, when there is no camera motion, the faucet is closed and there is no water in the bucket. So no discomfort.
Importantly, even if there is motion, as long as the motion intensity is within certain limits, the leak should be able to keep up with the flow, preventing a build up of discomfort.
But when the motion intensity is high, the faucet is wide open and the bucket is filled rapidly.
Keep in mind that the size of the bucket and the hole also vary from person to person.
So folks with a small bucket may experience discomfort from even minor motion, while those with a larger bucket can tolerate a lot more.
And although you can't control the size of the bucket, you can control the input. As creators, the point is not to avoid motion altogether, it is to manage this flow over time.
In other words, when you plan moments of motion within your scene the same way you might map out emotional highs and lows in your narrative, you can potentially improve the experience for everyone watching.
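If it helps to make that analogy concrete, here's a toy leaky-bucket model you could play with while planning shots. To be clear, this is not Apple's research model: the capacity, drain rate, and intensity numbers are made up, and it's only meant to show how motion beats accumulate and subside over a sequence.

```swift
/// A toy "leaky bucket" model of motion discomfort, purely for planning purposes.
/// `capacity` is the bucket size (smaller = a more sensitive viewer),
/// `drainRate` is how much the hole leaks per story beat,
/// and `intensity` is how far open the faucet is for that beat (0...1).
struct DiscomfortBucket {
    var level = 0.0
    let capacity: Double
    let drainRate: Double

    /// Adds one beat of motion and reports whether this viewer is likely still comfortable.
    mutating func addBeat(intensity: Double) -> Bool {
        level = max(0, level + intensity - drainRate)
        return level < capacity
    }
}

// Plan a sequence: a calm opener, a fast fly-through, then static moments to let things settle.
var sensitiveViewer = DiscomfortBucket(capacity: 1.0, drainRate: 0.2)
for intensity in [0.1, 0.8, 0.7, 0.0, 0.0] {
    let comfortable = sensitiveViewer.addBeat(intensity: intensity)
    print(String(format: "level %.1f, comfortable: %@", sensitiveViewer.level, String(comfortable)))
}
```

Running the example, two intense beats back to back overflow this small bucket, while the same beats spaced out with static moments would not, which is exactly the pacing point Igor is making.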
Now back to Elliot to show you how you can manage this in your own projects.
Thanks, Igor. It's pretty exciting, I think, to see some tangible research being applied to this topic. We all know how much it can spark plenty of subjective debate amongst us filmmakers as we make our projects.
So let's see how we can apply some of these learnings to my short film by adapting our capture for better motion comfort.
It looks like our astronaut is making some speedy repairs and is already off to another planet, so hopefully it wasn't our creepy alien that scared him off. But for the final shot of this scene, let's take a look at how I can plan camera movement to follow the ship. I want the audience to feel connected, like they're right there with the astronaut taking this next step on the journey.
First, I'll want to make sure that I'm not moving on too many axes. If I do a simple dolly push, for example, moving towards the ship, that's one axis, like this.
We could move on two axes, right, if we did a dolly in and a jib up, like this.
Or if we had a really strong creative motivation, we could dolly in, jib up, and tilt the camera all at the same time, like this. But in the case of this scene, I'm going to keep it simple: just two axes.
Next, I'm going to want to think about the speed of the movement. As Igor explained, broadly speaking, the faster the camera moves, the more discomfort our audience will feel. So moving the camera quickly towards the ship might feel a little less comfortable and less accessible than a slower, more deliberate move like this.
However, as we just heard, speed is also closely tied to the camera's proximity to objects in the environment.
For example, if we move the camera very slowly but keep it close to the ground like this, we get a huge amount of detail and a strong sense of motion intensity.
Moving slowly this close to the ground might actually feel more uncomfortable than a faster camera move from an increased height. So for now, I'm going to keep the camera higher up and at the same altitude as our ship.
Right, so let's pull that all together and see what we have. We lift up, we gradually start our two-axis move, and our ship flies off into the distance.
And I really hope there's going to be some budget for sound design.
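For what it's worth, a move like this is easy to block out in a RealityKit previs before committing to it on set. Here's a minimal sketch, assuming a hypothetical cameraRig entity standing in for wherever your previs camera lives; the numbers are placeholders, not measurements from any real production.

```swift
import RealityKit

/// A rough previs of the two-axis move: dolly in and jib up, easing in and out
/// so the motion has a clear, predictable start and end.
func playLiftOffMove(on cameraRig: Entity) {
    var end = cameraRig.transform
    end.translation += SIMD3<Float>(0, 1.5, -3)   // jib up 1.5 m, dolly forward 3 m: two axes only

    // One slow, deliberate move; the ease-in/out gives the viewer a cue that motion
    // is beginning and ending rather than snapping to full speed.
    cameraRig.move(
        to: end,
        relativeTo: cameraRig.parent,
        duration: 8,
        timingFunction: .easeInOut
    )
}
```

Keeping the whole move in one function like this also makes it easy to try out different durations and heights with the team before anyone builds a real rig.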
Okay, so I've got my considerations for when we shoot. But what else can I do to make sure this is going to be comfortable for everyone watching in Vision Pro? I want them to love it. And as with all shoots, things don't always go to plan, and that's totally okay, because that's what post-production is for. I'm only joking.
But in post we can fine tune the motion that was captured. We can amend it, and we can make decisions that can improve the comfort.
First, we can fine tune the duration of the movement that we've created. Let's say I get a little bit of feedback that the shot feels uncomfortable. Well, then I can just trim it down with the editor. I get to keep the movement, but simply show less of it.
And secondly, I can work with the editor to increase the predictability of the movement. Just like looking forward in a car on a windy road, the more predictable a motion feels, the more comfortable it will be.
In the shot we've just created, I've noticed that it runs a little bit long and it starts without a clear cue for the viewer. So let's fix it.
I've started the shot with a gradual fade up, and then accelerated the camera gradually as the ship lifts off.
I've also shortened the shot, and I'll make sure to follow it with a static moment in the edit later on, just to give our audience a chance to reset. There's going to be no overflowing buckets here, Igor.
So as you consider adding motion to your immersive and spatial projects, remember that you can draw your audience in with a deeper sense of connection as you do. But if you add motion, lead by establishing your creative motivation. Move only when it serves the story, without undermining the other foundations that we've covered today. Be precise and deliberate with your camera moves: choose your axes, set an appropriate speed, and mind the proximity to objects.
And refine it in post-production: fine-tune movement by adjusting its duration and making it as predictable as possible. With these considerations, your audience can stay immersed in your story and enjoy every second of it.
And if you're looking for some great examples of how motion can be used for connection, then check out the series Elevated. The team there have used movement as the primary storytelling device, offering a unique perspective on a place, beautifully combined with narration.
After all, who wouldn't rather fly down a valley like a bird than just look at it from a static viewpoint? And for those of you seeking a thrill, the hill climb episode of Adventure shows what's possible when you push motion to an extreme. Both are available today in the Apple TV app.
As I said earlier, there really hasn't been a better time or a more exciting time to create immersive experiences with spatial and Apple Immersive Video.
So as you venture out to tell your own stories, remember to design for difference and lean in to those creative superpowers of these incredible formats, considering presence, authenticity, proximity, and connection.
By embracing these new considerations and the responsibilities that come with them, you can craft experiences that are not just immersive, but that drive forward this frontier of immersive storytelling.
The whole team and I can't wait to see what you create and to meet many of you here later today. But for now, I'm going to hand it back to Serenity. Thank you.
Oh man. I know that was just previs, but I really hope you make that short film. It sounds really fun, and I want to meet that little creature. So that was awesome. And we are going to take a break for food and lunch and more digestion of delicious ideas, after which we'll all come back here and I'll tell you a little bit about how you can use design to make more interactive experiences. Have a nice lunch, everyone. We'll see you back here at one.
Hello, hello, hello.
Welcome back from lunch, everybody. Hope folks got themselves fed and chatted with some nice humans that you may or may not have known before. And for you online, no matter what time it is, I'm very glad you're joining us. So we're back, and we're back to talk about one of my personal favorite topics, which is designing immersive and interactive experiences. The exciting thing about this space: it's a new medium, right? And that means you've got new opportunities to impact your audience. Now, speaking of new mediums and impacts, one of my favorite and kind of apocryphal stories comes from the early days of filmmaking. You might have heard of these folks, the Lumière brothers, right? They were some of the first to experiment with making movies. And one of their first films was this one, Arrival of a Train at La Ciotat.
It's about 50 seconds long. And so the story goes, when it premiered, audiences were so frightened by the train coming towards them that they ran out of the theater in droves.
If you're in the media world, you probably know this story, and you also probably know it's not quite accurate. While it's a fun idea that a piece so moved people that they fled their seats for fear of a runaway train, I actually like the real reports from those early viewers even more. Listen to this.
The locomotive appears small at first, then immense, as if it were going to crush the audience. One has the impression of depth and relief, even though it is a single image that unfolds before our eyes. You'd think you were there.
Now, this is a quote from Phyllis Reynolds, one of the first viewers of this piece, and it's clear he wasn't scared. Some word play about crushing the audience aside, he was captivated. And I think that's what we're all chasing as storytellers, that moment where the audience dives in.
Now, earlier, Elliott spoke about how you can create immersive video that captivates your audience, but keeping them connected if your story goes beyond this frame. Well, that's what we're here to talk about today. So I'm going to share some best practices and some design fundamentals for when you're making media apps, games and experiences for visionOS. We'll start with talking about immersion, then explore how you can add interaction to your stories. And I'm excited to share that we'll have a special guest speaking about their studio and how they're using immersion and interaction, so stay tuned for that. Then we'll take a look at how to craft great spatial interfaces that enhance your content, rather than distracting from it, and find out how to make our environments and experiences come alive with sound. For all of these, I'm going to have some examples of some of the great apps already available and on this platform to help get you started. All right, let's dive in. So immersion, as I mentioned this morning, creators can make experiences for this platform that start at any level of immersion. And they can also change that immersion for their experiences throughout. And this means that immersion is not just a setting. Immersion is a tool.
Now, we're no stranger to storytelling tools if we're filmmakers or artists or designers, if I'm shooting content on iPhone with an app like Keynote, for example, I can frame my subjects, zoom in for emphasis, and I can even change focus if I want to help tell a story. But these tools aren't designed for making apps, especially on a spatial platform, because when you're designing for visionOS, your content's frame is potentially your viewers entire field of view. They see the world in front of them by default, and also they get to control what they look at. That's fundamentally different because it would feel wrong to a viewer if a visionOS suddenly zoomed in on something, right? Or worse, change the viewer's pass through to emphasize something in your app. We take away our audience's agency if we did this and alienate them. And that's the opposite of what we want to do on a platform like this. Instead, we want to keep people in control of their experience because we want to bring them into the story. And for that, we can turn to new tools like immersion, How we choose to immerse people in our content is a core part of any visionOS experience, and we can change immersion at any time. Like I said, to help tell our stories. So I want to share three of the most common patterns I've seen for using and changing immersion. Let's start with a media app like Vimeo and visionOS. This opens in a large window, and it's pretty familiar and comfortable to most of us who've used computers for the last 20 or 30 odd years. And in this window, people can browse content like filmmaker Jake Olson's Currents while staying connected to their surroundings. It's pretty standard stuff. But let's watch what happens once someone selects a piece of content. Immersion changes, the room goes dim, and our eyes immediately go to the video. This is a small change. All the apps really done is dim the pass through feed, but it immediately draws the audience's eye and helps them focus. Focus is a great reason to change immersion in an app. Them. It brings attention to a specific piece of content. Or if you have content that's more immersive, like a 180 video, you can even hide the outside world completely to bring people into your story. And this pattern can also apply really well to immersive and interactive work. Let's take a look at Tiago's immersive World War Two documentary, D-Day. During the experience, if you haven't tried it already, we follow a woman learning about her father's time as a camera soldier in World War Two. We'll check out a clip from the experience. The documentary starts with stereoscopic video played back in a window. But when the woman discovers a box of her father's World War Two artifacts, the audience gets to explore these, too.
The scene transitions into a fully immersive darkroom with the objects in front of the audience.
And here people can examine the artifacts of the past without the present coming back in, and they get to learn more about the man behind them. And when this moment finishes, the app seamlessly returns the audience to the documentary footage and their surroundings. This is a great example of using immersion to help someone focus on a moment and further bring them into your story at the same time. Now, you might also want to change immersion to transport somebody. This can bring your world to the audience and really immerse them in your content. Rewild, an interactive documentary from Phoria, does this really well. Like D-Day, Rewild starts with simple video playback, here above a model of Earth, to help people learn about our changing ecology. But as the documentary progresses and the viewer dives into this underwater world.
The world starts to come in to us.
What I love about this example is how well it brings the world to the audience. And it's subtle. The audience never leaves their own room, and Phoria doesn't try to turn the entire room into an ocean, right? Instead, they bring in immersive content only for key elements in the story, providing those wow moments that really pull someone in. You can use any level of immersion to provide this sort of transportation, but I particularly love that Phoria chose to blend their content with the audience's surroundings. And that's because the message of this documentary is about our stewardship of the planet. So it makes sense that the content lives in the audience's room. It's making them part of the bigger picture.
Now, experiences can also use higher levels of immersion to fully transport people where appropriate. For instance, apps like Explore POV use 180-degree video and Apple Immersive Video to bring people to entirely new places. And you've also got apps like Disney+, which use full immersion to put audiences in the worlds of their stories while they watch content like the Containment Room from FX's Alien: Earth. Focus and transportation are the most common reasons for media apps to change immersion levels. But there's one more. It's the trickiest, but it can be incredibly effective when it's used appropriately. And that's narrative.
Now, immersing someone for narrative reasons is going to look very different depending on the experience and the app you're making. But there are a few patterns I've seen in apps that do this well. Let's look at a few examples, starting with Encounter Dinosaurs. Here the audience comes face to face with prehistoric creatures. Once we press start, they encounter their first one: a small butterfly.
This is a creature that doesn't feel terribly out of place in a real room, so it eases people into the interaction and the experience. And when the butterfly leaves, it creates a portal, a doorway, for people to meet some of the other creatures of that period. And the audience is now ready for that next prehistoric encounter.
The pacing here is really great, and it helps bring people into the story. It starts in someone's surroundings to set the stage, then slowly builds up to bigger moments. And while you can change immersion more abruptly if you want to provide shock value, an app that starts with you meeting a giant carnivore is going to provide a very different and visceral opening experience. You might lose people right off the bat if your Rajasaurus isn't their idea of a warm welcome.
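A quick aside for the developers in the room: if you're curious how a portal moment like that butterfly's doorway is typically assembled on visionOS, RealityKit has components for exactly this pattern. Here's a minimal sketch, with a placeholder prehistoricScene entity; you'd add both returned entities to your RealityView content and position the portal wherever you want the opening.

```swift
import RealityKit

/// A minimal sketch of the portal pattern: content that lives in its own world,
/// revealed through an opening placed in the viewer's room.
/// `prehistoricScene` is a placeholder for whatever lives on the other side.
func makePortal(to prehistoricScene: Entity) -> (world: Entity, portal: Entity) {
    // Everything under this entity renders only when seen through the portal.
    let world = Entity()
    world.components.set(WorldComponent())
    world.addChild(prehistoricScene)

    // The portal itself: a flat opening whose surface acts as a window into that world.
    let portal = Entity()
    portal.components.set(ModelComponent(
        mesh: .generatePlane(width: 1.5, height: 1.5, cornerRadius: 0.75),
        materials: [PortalMaterial()]
    ))
    portal.components.set(PortalComponent(target: world))

    return (world, portal)
}
```

The nice thing about this structure is that it matches the pacing advice above: the room stays the room, and you decide exactly when and where the more immersive world gets revealed.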
So the more you can build trust with your audience early on in the experience and build up to big moments, the more you can do with immersion in your story. To show you what I mean, let's return to D-Day, the World War Two documentary I mentioned earlier. I love that this experience increases immersion for the audience whenever the protagonist immerses herself in her father's history. We already saw this at work earlier when exploring those artifacts, but it happens at other moments in the story, too. When she travels to Normandy, the film frame changes to an immersive 180-degree video. And when the audience sees her father's footage from Normandy for the first time, they're placed behind the lens. But it's this next moment that really gets me. The audience sees photographs from the battle site, the frame briefly blinks, and then the scene is suddenly real, in 3D, around them.
Now, on paper, this is a huge jump in immersion, right? But because the documentary has been building to this, it earns that crucial moment. We have the context. The audience understands where we are, where we're going, and what we're doing. And it's a perfect example of what you can do with the power of immersion and narrative.
Now, as you start to think about how you can use immersion in your experiences, whether it's to help audiences focus on content, transport them, or bring people into your narrative, there are a few best practices that you should keep in mind.
No matter where you plan to go in your media experience, it's really helpful to start with the familiar. In media apps, people like to navigate content libraries in windows, and they also like to stay connected to their surroundings whenever possible. It's worth starting there, even if you plan to move to a more immersive viewing area later.
In interactive experiences, this is even more important, because this is still a newer medium for storytelling. Familiar UI helps ground people before you take them into the unknown. Now, with someone's entire field of view at your disposal, it is tempting to try and design the entire frame, but try to keep your experiences focused. Too many bells and whistles can distract people from the core of your content: the story. Here are a few examples of how to do that well. Media apps, when switching between browsing and playback, will hide the browsing window when playback starts to help people focus. Experiences with more immersive content, like Rewild here, place things in front of the audience so they're not searching for the story. And environments that sit alongside content, like FX's Containment Room from Alien: Earth, have different animations, sounds, and lighting depending on whether you're browsing media or watching it. When you decide to change immersion, smooth transitions make all the difference in keeping people connected to your content.
In a media experience, you might use a slow fade when you're dimming or tinting passthrough to match content, or combine movement and a portal opening when you want to take someone into a fully immersive viewing environment. And you can get a little bit more experimental with interactive experiences, like the butterfly in Encounter Dinosaurs flying back to open a portal. You can even add animations like this sample we made a few years ago, BOT-anist. This little robot companion literally leaps from the tabletop into someone's surroundings to tend their virtual garden. Now, I do want to close this section with a few words of advice on bigger-looking animations like this. If you're going to use them, make sure you ground your virtual content in reality in some way so it feels real. Let's watch this robot again as he lands on the real floor. See how he reacts from the landing: we see physics at work, and he instantly feels more real to us.
It's also helpful to keep the majority of the room or environment visible and unmoving during the animation, for motion comfort. To show you what I mean, let's isolate the animation in this scene. So here's our little robot. Despite how big this looked in the previous clip, as we play it, you can actually see that it takes up a very small part of someone's field of view. While the robot can move and have big actions that carry through the scene, the animation itself is small and compact, and everything else in the virtual scene remains stationary, so people can stay comfortable and connected to their surroundings. You can do bigger immersion changes, like we saw with D-Day, and this is where prototyping, testing, and testing again becomes crucial. Let's look at an example from the spatial puzzle game Blackbox. There are lots of moments where Blackbox plays with immersion, but my favorite one asks players to physically lift their hands up and rip the existing world apart. If you haven't tried this game yet, I can confirm this works and it feels wild. But it works, honestly, because it's driven by a gesture. Even though this changes someone's entire field of view, because they're driving the action, they feel in control.
Let's imagine this same moment without that gesture. You're sitting in your surroundings. You're playing a game when suddenly a rip in space sucks you into a new environment without warning.
This action is technically immersing you fully in content, yet it actually breaks the feeling of immersion entirely. Just like we talked about at the beginning of this section, if you take away your audience's agency, they get yanked out of your story just when you're trying to draw them in. And as cool as something like this might sound in your head, you have to think about how you can take care of your audience. Immersion is a powerful tool for media experiences, but we have a responsibility as storytellers to consider the audience experience whenever we use it. And remember, this is still a new platform. Your app might be the first thing someone sees on visionOS and the first thing they try. So it's important to make sure that that experience is a great one.
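Before we move on from immersion, here's roughly what those patterns can look like in code: a minimal SwiftUI sketch of an app that starts in a familiar window, dims passthrough only while something is playing to help people focus, and offers an immersive space whose style can move smoothly between progressive and full. The PlayerView and StoryView types and the isPlaying flag are placeholders standing in for your real UI, not code from any of the apps we've looked at.

```swift
import SwiftUI

// Placeholder views standing in for your real browsing/player UI and immersive scene.
struct PlayerView: View {
    @Binding var isPlaying: Bool
    var body: some View {
        Button(isPlaying ? "Stop" : "Play") { isPlaying.toggle() }
    }
}

struct StoryView: View {
    var body: some View { Text("Immersive story content goes here") }
}

@main
struct ImmersiveStoriesApp: App {
    @State private var isPlaying = false
    @State private var immersion: ImmersionStyle = .progressive

    var body: some Scene {
        WindowGroup {
            // Start with the familiar: browse and play in a window, in the viewer's room.
            PlayerView(isPlaying: $isPlaying)
                // Dim passthrough only during playback to draw the eye to the content.
                .preferredSurroundingsEffect(isPlaying ? .systemDark : nil)
        }

        // A separate immersive space you open only when the story calls for it.
        ImmersiveSpace(id: "story") {
            StoryView()
        }
        // Allow smooth movement between partial and full immersion.
        .immersionStyle(selection: $immersion, in: .progressive, .full)
    }
}
```

The progressive style in particular keeps people in control, since they can turn the Digital Crown to decide how much of their surroundings they give up, which is exactly the kind of agency we've been talking about.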
Of course, immersion is not the only tool at our disposal.
Your visionOS experiences can add interaction and give audiences the chance to be part of the story in a way that they just can't with traditional media. And that's because even minor interactions on a spatial platform, they matter.
Let's take Haleakala. I want you all to go ahead and look at this. This is an environment built into visionOS that people can view on its own or pair with background content. Now, it's a beautiful vista. It's really nice, good sound, good to just, like, sit and listen to. But I've got a question for folks in the room: would you call this interactive? No? Well, I would. Now, it's not a big interaction. It's passive, for sure, and it's more of what I'd call a lean-back experience. There's not a lot to do. But because a spatial platform like visionOS can target the audience's entire field of view, it's always interactive for them in some way. The viewer has agency to look around, listen to the world, and enter or exit the experience. These are all verbs that the audience is doing. We're interacting here. Now, this kind of interaction, it's not going to change the world of the story, right? If I stare at this rock, it's not going to suddenly sprout eyes and run away from me. But this world still might affect me. It might make me calmer or transport me. Even a passive interaction like this can be a compelling experience for people, especially if you already have great content like this virtual environment.
That said, there's so much more you can do with visionOS. We already saw how D-Day offers limited but meaningful pieces of interaction, taking place at key moments in the story. And while these interactions only temporarily affect the world of the story, they definitely impact the audience. There are lots of different visionOS experiences that take advantage of limited interaction, like meditation experiences and guided stories like D-Day. You'll even find limited interaction in another one of our system environments, Jupiter, which offers controls to change the time of day. Essentially, almost any experience where the audience can be momentarily part of your world would fall into this category. Now, there is a special type of limited interaction that I personally love, and that's hidden interactions. These tend not to be crucial to an audience or an experience, but they add so much delight. Now, to show you what I mean, let's return to the mountains and Haleakala. I lied a little bit. I told you this was a passive experience. And it is. But if you're willing to explore and, say, shout to the mountaintop. Echo. Echo.
Cool, right? The environment can use sound from my mic and play it back with a custom reverb, making it feel like my voice is really echoing back from those distant cliffs. This is such a tiny little thing, but it's the kind of interaction that visionOS does so well. It feels natural, and it makes such a big difference to how present I feel in this environment. And I really encourage all of you with a Vision Pro to go try this. It's very fun. There are a couple of environments where you'll get different reverbs depending on how loud you yell. And when you're trying to make spatial experiences feel like you can live in them, interactions like this make all the difference, and I'll share a rough sketch of how you might prototype one in just a moment. This matters especially if you're building experiences that offer more active, in-depth moments of interaction. Of all the apps you can build and the stories you can tell, these kinds are the most game-like. Stories like Encounter Dinosaurs invite the audience to talk and interact at multiple points, and these interactions often affect the world, the story, and the characters, as well as how the audience experiences it. I mean, for those of you who haven't played this, how the audience interacts in Encounter Dinosaurs literally changes the ending of the story. And while this is really cool, you also have to take a step back and say: not every experience for visionOS needs to be this active in its interaction, right? It may not even need limited interaction. So when you're thinking about your stories, the first question you should always ask yourself is when and why you should add interactions. When I meet with people about their visionOS projects and we talk design, I have a few questions about their interactions I like to ask to help dig into this. First, is this interactive moment meaningful, or is it more of a distraction to your audience? Is it helping make them part of the experience? Does interacting reveal a hidden element of the story, something I wouldn't be able to learn just from looking around, watching a video, or reading a page?
And lastly, does this moment progress your story, or does it distract from it? I want to look back one more time at that artifact moment from D-Day, because I think it has some really, really great answers to all of these questions. First, is this moment meaningful? 3D objects on their own aren't incredibly meaningful, but here it's all about context. In that moment, the narrator has just described the feeling of opening these boxes, and when the audience transitions from that video to interaction, they take the narrator's place. They're now physically handling the objects she's just described, and they become more connected to the content as a result. When they get to examine these objects, they can turn them around or read an inscription, and that gives them information they wouldn't otherwise get from the video. And it adds to the story by giving the audience space to process the emotional weight of what it feels like to go through a parent's belongings, and perhaps see them in a new light.
All in all, it's a really impactful moment of interaction. Now, while D-Day has great answers for all of these, your interactive moments don't have to check off every box here. But if you're thinking about something and you can't answer a single one of these questions, it might be a good gut check to go back to the drawing board.
Now, if you are planning for more interaction in your experience, how can you do it really well on this platform? The first thing you need to do is identify the actions that you want to help your audience do in your story. These will help you define the inputs and interactions you'll need to design. For example, if you're making a media experience, you'll likely want your audience to select or watch content, and these verbs tie really well to indirect actions like looking and tapping. If you're building an experience where people should examine objects up close, you might want to use direct manipulation. Should people be able to speak in your experience, like the Haleakala environment, or use their bodies in some way? In that case, you may want to use voice or take custom head and hand motions into consideration. And if you're building a game where your audience needs to do a lot of things in quick succession, like run, jump, bounce, fall, you might want to design custom gestures for that, or even think about building in controller support. Because you're asking more from your audience when they interact, it's important to make these moments comfortable for people. Most people are going to use Vision Pro while seated, and they expect content to be positioned at the right height in front of them, centered in their field of view. You also have to think about the proximity of this content. If you place content beyond your viewer's reach, their initial instinct will be to interact indirectly by looking and tapping their fingers with their arms at their side or in their lap. This is great, and it's a super comfortable way to interact for most window-based experiences. But if you're building a moment where you want people to physically interact, you should consider placing that content within arm's reach so they can pick it up without strain, and they're not trying to, like, reach over to grab your thing. This is also really important if you're doing repetitive actions, right? If you're asking someone to press a button over and over again, you don't want to put that button all the way out in front of them 50 times, because I can guarantee you, after about the fifth time, you're like, nope. Goodbye. Let's try something else.
So you can see some direct manipulation right here.
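If you want a feel for what a direct-manipulation moment like this can look like in code, here's a minimal RealityKit and SwiftUI sketch: a drag gesture targeted at an entity so people can pick an object up and move it. The "Artifact" asset name is a placeholder, and this is only one way to wire it up, not how any particular experience shown today was built.

```swift
import SwiftUI
import RealityKit

struct ExaminableObjectView: View {
    var body: some View {
        RealityView { content in
            // "Artifact" is a placeholder asset name in the app bundle.
            if let artifact = try? await Entity(named: "Artifact") {
                // Entities need collision shapes and an input target
                // before they can receive gestures.
                artifact.generateCollisionShapes(recursive: true)
                artifact.components.set(InputTargetComponent())
                content.add(artifact)
            }
        }
        // A drag gesture (direct or indirect) that moves the entity it targets.
        .gesture(
            DragGesture()
                .targetedToAnyEntity()
                .onChanged { value in
                    guard let parent = value.entity.parent else { return }
                    value.entity.position = value.convert(value.location3D,
                                                          from: .local,
                                                          to: parent)
                }
        )
    }
}
```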
And sometimes experiences like this aren't so straightforward. Maybe you need somebody to stand so that they can comfortably interact, or you're switching between immersive lean-back moments and interactive lean-forward moments. This is where inviting interaction with cues for your audience is incredibly important. They need to know when something's interactive and when it's not. And if they're not sure what to do, this can break their connection to your story. You can do this in a few ways, like with clear signposting at the beginning of your experience. Here, Encounter Dinosaurs tells people that they can interact naturally with any creature, so they're not constantly looking for UI. And you can also bring up hints during interactive moments, like D-Day does here. But you can also do this in-world, directing the audience's attention with patterns like sound, light, and motion. One great example I want to show is from the interactive experience Otto's Planet. When the audience needs to interact, the experience cues those moments with subtle glows, similar to system hover effects, to indicate that the audience should look or tap on items to help its tiny protagonist. And when the interaction is complete, the glow will disappear. Lastly, if you offer the audience interactions in your experience, it's important to remember that you're giving your audience agency, and that means you kind of have to respect the choices they make. Take Encounter Dinosaurs. If you tell the audience that they can interact naturally, I'm really sorry to say that some people are going to try and fight a dinosaur. I mean, I wouldn't, but as designers, we have to step outside our own desires and our own intentions and expect the unexpected. Because if someone tries to smack Raja over here and he doesn't react, that immediately breaks immersion for that person and their belief that their choices matter in your story. So instead, you can find creative ways to incorporate their choices. For instance, the first time that the audience does something that Raja doesn't like in Encounter Dinosaurs, he shakes his head, like a horse. And if the viewer decides to keep tormenting him? Well, sometimes you have to design characters who have firm personal boundaries.
One last thing on audience agency.
No choice is still a choice. If the audience decides to lean back and not interact, your experience should acknowledge that. Let's go back to Encounter Dinosaurs one last time, and I'll show you what they did. This is a high-interaction experience, right? Creatures want to meet you. But what happens if people aren't so thrilled to meet them? If the audience stays wary and doesn't interact with the butterfly or the first small dinosaur, the app actually recognizes this lack of interaction, and the story falls back to a more traditional cinematic experience. The app still delivers an end-to-end story, just one that doesn't involve the audience. Now, you don't have to make it a 0% interaction or 100% interaction thing. You can always split the difference. Kung Fu Panda: School of Chi does this really, really well. In the experience, Po the panda helps the audience learn a series of tai chi moves, for inner peace, of course.
And when people follow along and the experience senses their hand motions, Po will make encouraging facial expressions. His eyes light up. He smiles. He nods. He's really excited.
But if they do the wrong thing, or if they stand back and do nothing at all, he starts to become more and more exasperated. And if people repeatedly dismiss or skip the move, then the experience offers on-screen UI with hints to emphasize how to get people back to the story. And this progression is great. People can choose not to interact, but they also get space to see how that impacts the story. And it doesn't stop them from being able to interact in the future. The experience doesn't just shut down because you stopped interacting. Now, I could probably spend the rest of this hour, and maybe even the day, talking about interaction design for this platform. But instead, I'm going to take a break for a second, because I actually want to invite a very special guest speaker up on stage to share their experience designing stories for visionOS. Now, spoiler: you have seen some of this person's work on stage already. He's the co-founder of the immersive studio Targo, and most recently produced one of my personal favorite pieces for visionOS, D-Day: The Camera Soldier. So I'm very excited. And please, all of you, give a warm welcome to Victor on the stage.
Hello everyone. First of all, thank you, Serenity. Thank you, Elliott, and thank you everyone at Apple for holding this event. It's truly incredible to see the entire community, the immersive community, here, and I'm thrilled to be presenting. So my name is Victor Agulhon and I'm the co-founder of Targo. Today I'm here to talk about crafting interactive and immersive stories for Vision Pro. But before that, I'd like to start with a quick word of introduction on what we do at Targo. At Targo, we believe in the power of immersive technologies to connect with the real world. From the very beginning, we fell in love with the sensation of presence. We love the feeling of being there, of meeting real people, of exploring real places, of being inside real stories.
Targo is an immersive documentary studio. We create original, immersive experiences for mainstream audiences. I co-founded this company eight years ago with the immersive director of our films.
Our team handles everything in-house from concept to post-production, and we're constantly pushing the boundaries of immersive storytelling, both narratively and technically.
Over the last eight years, we have published dozens of documentaries: a series on women chefs in gastronomy, true crime investigations, historical experiences that take you back in time. And we leverage all technologies: immersive video, 3D video, 3D modeling, 3D real-time environments. We always start with the story, and then we choose the right technology. But today I'm here to share learnings about an experience that we created and designed specifically for Vision Pro called D-Day: The Camera Soldier. It is a 20-minute documentary that we produced in collaboration with Time Studios. We released it last May, and it's now available on the App Store.
The documentary tells the story of Jennifer Taylor. She is the daughter of the World War Two soldier Richard Taylor. For most of her life, she knew very little about what her father had done in the war. She knew that he'd been a soldier. She knew that he'd been a photographer, but he never spoke about what he'd seen. Until one day she received a message from a historian, who shared photos and films that her father took on D-Day. When she discovered the footage, everything completely changed for her. For the first time, she understands what he went through. He was in the very first waves. He filmed everything, but he never spoke about it.
His footage turns out to be the only existing film of the D-Day landings, and for Jennifer, it became a way to connect with her father, and this documentary follows her on this journey to reconnect with him.
Over the past years, we've really been shaping our own philosophy for immersive storytelling, and today I want to share three of its core ideas applied to Vision Pro, starting with the most important one for us: being intentional about immersion. For every project, we always ask ourselves this very simple question that you've heard before: why does this story need to be immersive? What can we achieve here that we couldn't possibly achieve with any other medium? With this documentary, we used immersion to transform time into a place that you can explore. More than showing you the past, it lets you literally move through it. In the documentary, the audience will find themselves in the exact same location 80 years apart, and it creates a sense of wonder that I've only ever experienced in immersive. The second concept was the idea that immersion should mirror the story. The whole documentary is built around this idea. The more Jennifer dives into her father's story, the more she's immersed in it, the more the viewer is immersed alongside her. So let's see what that means. The documentary begins with a 16-by-9 3D video. At this stage of the story, Jennifer doesn't have the full picture yet, so visually she is literally boxed in. To capture this moment in 3D and in 8K, we built our own custom Blackmagic 3D camera system. It creates this beautiful magic window effect, and it gives the feeling that the audience is looking right into her world.
The turning point in the documentary is when she learns about her father's role on D-Day. That's when the experience becomes fully immersive. She returns to Normandy to immerse herself in his memories. The frame expands and the audience is immersed alongside her. They are now part of the story with this beautiful, crisp, immersive video.
For us, immersive video is the most powerful way to capture the authenticity of a moment. It's the closest you'll ever get to meeting a person in real life, especially with the new Blackmagic URSA Cine Immersive camera. But crafting meaning through a 180° lens is a real art.
Finding the right framing, the geometry, the perspective, the distance: it's all very subtle. For us, the craft of immersive video is about mastering that feeling of belonging to a moment.
The final step of immersion in the documentary is about bringing the audience back in time. At one point, she returns exactly to the spot where her father landed on Omaha Beach. She closes her eyes to remember D-Day, and the audience is transported back to 1944, on the day of the landings. They see the photos and suddenly they're inside the frame.
They're in the exact moment the photo was taken with incredibly immersive sound design. You really have to picture an immersive and interactive 3D bullet time of iconic moments. And this actually builds on top of our immersive video capture, because the scenes here are fully built in 3D from the ground up. The audience can literally move inside of them.
This is a craft that we've developed at Targo over the years that truly brings the past back to life. For us, it is the ultimate immersion. And this is how, by being intentional with immersion, we can truly serve the story. But immersion in the documentary, or in any immersive experience, doesn't come only from expanding your field of view. It also comes from bringing people closer to the story physically, by letting them interact with it. For these experiences, there are two rules that guided our interactive design. The first is the belief that someone's attention is more important than their actions. Interactions should only enhance your experience. They should never take anything away from it. Even if you don't interact, you won't miss a crucial element of the story. There is no sense of winning. There is no sense of losing. When interactions are simple, they also let your story be the focus. Interactions in the experience make you connect with Jennifer. They're your literal touchpoint in the story. And this is why, in the documentary, we give you the sensation of being alongside her. Here, as you can see, she's looking at her father's letters, and a few seconds later you're going to be in front of the exact same thing in 3D. It's your turn to grab them, to hold them, to look at them.
You look at what she's doing, and naturally you want to do the same. You truly feel like you're her guest.
Later, she opens a box of historical objects to look at them. When it's your turn to hold them, you can literally touch history.
And during all these interactions, the narration always continues and ends after a set duration of time.
It ensures that the story keeps on moving forward, but interactions don't always require the viewer's input.
In the documentary, we wanted to convey the idea that we all live in a world that's been shaped by D-Day, that the legacy of the event still lives around all of us. We wanted to create a very personal experience with the events. So we decided to use the viewer's room as our canvas.
We used Vision Pro to automatically scan the surroundings of the user, which gives us a 3D map of the room on which we could apply our effects. At the start of the experience, the viewer's surroundings slowly become covered with photos and films of the D-Day landings, bringing that legacy to life all around you. For us, it's a great example of how you can create a very personal experience for each viewer without their input.
So now let's take a step back and recall how we used the field of view to increase immersion. With interactivity, we can now engage viewers even more. And you can see why featuring such a wide diversity of media, video, 3D models, interactions, all with this high fidelity, was absolutely essential. This truly is the core reason why we built this experience for Vision Pro, because it allowed us to unleash our creativity across all technologies without any trade-offs.
To bring together so many technologies into one single experience, we also created our own custom system behind the scenes. We created the experience in the game engine Unity, using Unity PolySpatial to build for visionOS, and we customized the timeline to create a next-generation editing tool for interactive experiences. This is truly what enabled us to bring this concept of growing immersion to life.
So now that we've seen how we can create highly differentiated experiences with interactions, I want to talk about how we can bridge the gap with what people already know and watch. At a high level, it doesn't matter how complex your technology is; it should always feel natural to viewers. And this is why, for this documentary, we decided to lean heavily into the language of film and TV. It creates a sense of familiarity.
The app design was inspired by the TV experience in visionOS. The main screen that you see when you open the app looks like a streaming service, and we give information that you would find in one: the duration, the description. We use words that people are familiar with: watch, play, start. People have to know what they're signing up for. The UI is also inspired by streaming services. It shows a timeline, a countdown that lets you know where you are in the film. It allows people to skip, to go to chapters, to play, to pause: all the things you would do when you're watching a film. And in case people wanted to show it to their friends, we also included a quick tutorial at the beginning to bring users up to speed on how to pinch and recenter their view.
Finally, this logic is also baked into the content itself. In this presentation, I've shown you a lot of framed content. It follows the exact same logic. It's about creating an environment that users are familiar with. It's how we guide people gently toward immersion. It also creates moments of contrast that make the immersion truly shine. For us, this is how we make immersive content more accessible and more mainstream. Before I wrap up, I want to share one final note with you. Since the release, we've been truly blessed by the reception of the experience. The film has gotten thousands of downloads on the App Store. It's been a finalist at the Emmy Awards. It was selected at the Venice Film Festival. It has only five-star ratings on the visionOS App Store. But this would have never been possible without the support of this immersive community. The immersive community truly is our most precious asset, and I really mean it.
In June 2024, we launched a prototype in a private beta on TestFlight. We had hundreds of Vision Pro users signing up and sharing their feedback with us. We learned a lot in the process. So today I want to thank everyone who contributed, for your time, your ideas, your conversations with us, because it truly changed the game. And finally, I want to say a special thanks to the incredible team at Targo, because you're brilliant, you're talented, and it's just a pleasure. I'm so proud of what we're building together. And I want to thank everyone here and online for your time today. Serenity, back to you.
Amen.
I get a little touched listening to Victor talk about this. This community is really special. I love seeing the support that everybody's been giving to each other over the last year or so that this platform has been active, especially the immersive media creators. And I really, really love seeing the work that some of you have already put out and what else has come out so far. And I'm looking forward to hearing what else you folks are cooking up. Now, immersion and interaction on this platform get a lot of glory, and rightfully so. It's really, really cool what you can do. But great interfaces and laying out your content well in 3D space can also make all the difference between your audience diving into the details of your story, or being yanked right out to the surface. So we're going to talk a little bit about those two areas now. Interfaces are the backbone of your apps and experiences on this platform. They're used for navigation and hierarchy, and to easily read about or find content. And on a spatial platform like visionOS, your interfaces will move off screens and live in the audience's world. They might be used in a small space like a train car or a wide-open living room. And whether you've designed interfaces for years or you're learning design to support your immersive media content, there are some unique considerations for this platform. I'm going to highlight two areas today that I think are particularly relevant for media apps and experiences.
Best practices for spatial placement of your content, as well as how to make your interfaces easy to use.
Let's start with placement. Now, because visionOS is a spatial platform, you can take advantage of depth and people's surroundings in laying out your content. If your experience starts in a window or a volume, it's placed automatically in front of the viewer on launch, centered in their field of view. This applies even if they're laying back at an angle, like on a couch. This way, your audience can immediately start enjoying content no matter where they are. When you're placing virtual objects, however, by default they're placed a little bit differently: they're relative to a coordinate space on the floor in front of the viewer.
Now, this does mean that because content is placed relative to the floor in front of the viewer, if they're angled in some way or standing, they might not quite get the scene that you originally planned. So there are a few different ways to solve for this in your apps. One way is to start in a window, then have your audience place content themselves. I want to check out how Paradise, an educational experience that you saw a little bit earlier, does this. The app lets people explore a fantastic vintage car collection. It begins the experience in a browsing window, which appears right in front of the viewer, and when they select a car, a small 3D model appears, attached to the window. So when people move the window, the car comes with it. But that's not where this experience ends. The viewer can tap and drag on the model to physically place it wherever they'd like. And if they want to make that a little bit bigger, they can pinch, or they can double tap and it becomes life size in their room. This is a great example of a progressive and scaffolded flow for your audience. They still get an awesome experience even if they just want to browse from a window. And if they have some free space, they can pinch and size the 3D model to where they'd like it. But if they want that full vintage garage experience, they can opt in. This is helpful if you're starting in a window, but you might be asking: what if your experience has 3D content from the start? What if you have an environment? Well, there are a few things you can keep in mind as a designer to ensure that your content gets laid out well for your audience. Vision Pro and visionOS let you request the current head position of your viewer. So instead of just placing content in front of them at the floor origin, you can also observe their head position to make sure they get the right picture. This is also really useful in experiences where you're building a fully immersive environment, and you're not quite sure if someone's sitting or standing.
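If you do want to take the head-position route in your own app, ARKit on visionOS can report the device pose while an immersive space is open. Here's a rough sketch; the type and function names are mine, and error handling is left out.

```swift
import ARKit
import QuartzCore

// A tiny helper for reading the viewer's head pose.
// Works only while an immersive space is open and the session is running.
final class HeadPoseReader {
    private let session = ARKitSession()
    private let worldTracking = WorldTrackingProvider()

    // Call once, for example when your immersive space appears.
    func start() async throws {
        try await session.run([worldTracking])
    }

    // The current head (device) transform in world space, if available.
    func currentHeadTransform() -> simd_float4x4? {
        worldTracking
            .queryDeviceAnchor(atTimestamp: CACurrentMediaTime())?
            .originFromAnchorTransform
    }
}
```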
Now, one last tip in this vein. Did you know that there's a way for people wearing Vision Pro to quickly recenter every piece of content in their space in front of them? No? Some of you? Yeah, I can see people miming the hardware button. It's the Digital Crown. So if you press and hold the Digital Crown, all of the content will snap in front of the viewer. And remember, the audience always has agency in visionOS experiences; this is really crucial. And that means that yes, they're going to have control over where content is placed relative to them. So at any time your viewer can press and hold the Digital Crown on Vision Pro to recenter all content, windows, volumes, and even 3D elements, in front of them. So if they start sitting or they later decide to lay back, they can press and hold the crown and move their content without ever having to physically pick it up and reposition it. Now, this works for windows and volumes: they automatically reappear in front of the viewer, while 3D content will recenter relative to the floor. Environments will not recenter, right? You're not going to recenter and all of a sudden have the floor shift up to meet you. That would feel a little bit disorienting to people. But if you want your 3D content to be positioned in other ways, you can observe this button press in your experience to let your experience know that a recenter is happening. I want to show you our Mount Hood environment, which does this. So if I'm watching media in Mount Hood and I decide to go from a sitting position to laying down, when I press to recenter, as I said, this environment's not going to change. The environment will stay at that position, but the screen itself will recenter to match my relative eyeline. This is a small thing, but these sorts of actions make all the difference to your audience as they experience your content.
Now, there are a few specific considerations if you're designing for the viewer's real-world surroundings when we're talking about placement. The first is considering someone's physical free space. For example, if you're designing a portal-based experience and you're like, oh, I want to put this up against a wall, that docking might look great in a large living room, but what happens if someone has a bunch of paintings on the wall, or they want to use it in a much smaller space? I want to show you how Encounter Dinosaurs approaches this, because I think it's really smart. In the background, as it launches, it makes a scan of the room, both to understand the space available to it and to find an appropriate surface to place that portal. But if there's not enough space to just stick the Cretaceous period on the back of a wall, the experience automatically adapts. In tight spaces, the portal turns into more of a wraparound-style experience, and in even smaller areas, say you want to watch Encounter Dinosaurs on a train, the portal will shrink and float in front of the viewer. Maybe you want to make content that feels like part of your audience's room, not just using a wall. Well, visionOS has a ton of tools to help here. It can recognize floors and walls, but it can also recognize angled planes and certain objects like tables. Victor just showed us how the D-Day documentary maps the walls and the furniture to really bring people into that narrative. But you can also use these sorts of tools to make content feel real in a scene, like building a tabletop story that snaps to an actual surface, or an experience where items have physics in your world and bounce off furniture and even get occluded.
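For the tabletop case, RealityKit's anchoring targets can do a lot of this work for you. Here's a minimal sketch that asks for a table-classified horizontal surface; "Diorama" is a placeholder asset name, and the minimum size is just an example value.

```swift
import SwiftUI
import RealityKit

struct TabletopStoryView: View {
    var body: some View {
        RealityView { content in
            // Anchor to a horizontal, table-classified surface
            // that's at least 0.5 m by 0.5 m.
            let tableAnchor = AnchorEntity(
                .plane(.horizontal, classification: .table, minimumBounds: [0.5, 0.5])
            )
            // "Diorama" is a placeholder for your tabletop content.
            if let diorama = try? await Entity(named: "Diorama") {
                tableAnchor.addChild(diorama)
            }
            content.add(tableAnchor)
        }
    }
}
```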
Finally, you'll want to consider whether you need movement in your experience and place content accordingly. For example, here the game Blackbox has a puzzle that requires the viewer to stand up and spin. Now, it needs to hint them in the right direction, so it actually provides a visual cue: when the puzzle launches, it's placed high above the viewer's head position if they're sitting. This encourages them to stand to see over the content, and if they shift at all, they might inadvertently begin the puzzle and realize what they have to do. These are just a couple of different ways that you can start to think about placement in your experiences. And once you're happy with that, it's important to understand how people are going to interact with it to get to the story. That's where your experience's interface comes in, and the good news is, in visionOS you get so much for free if you design with our system components: window and volume styles, materials, buttons and presentations, perfectly sized elements, hover effects, and even support for platform accessibility features. There's a lot here, and for some of you in the room who are newer to visionOS development, or even development in general, if you're a filmmaker or a creative technologist who wants to start building apps, this is a great place to get started without having to design everything from scratch. For example, if you're building a media player, you get the player controller UI for free, as well as controls not only in mixed immersion but also in more immersive playback. You also automatically get access to system environment playback for media experiences, and the option to invest in your own custom environment. And there are lots of resources on the developer website for building beautiful browsing experiences. So if you're new at this, you don't have to reinvent the wheel, or the button, when you're making an app. And yes, this is true even if you're a creative technologist who uses Unity: our user interface framework actually interoperates with Unity PolySpatial experiences, and it's a great starting point to check out if Unity is your preferred tool of choice.
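As one small example of getting that player UI for free, AVKit's SwiftUI video player gives you system playback controls with almost no code. This is a bare-bones sketch; the URL is a placeholder, and a full media app would typically reach for AVPlayerViewController for the complete visionOS playback experience.

```swift
import SwiftUI
import AVKit

struct FilmPlayerView: View {
    // Placeholder URL; point this at your own media.
    private let player = AVPlayer(
        url: URL(string: "https://example.com/your-film.m3u8")!
    )

    var body: some View {
        // System playback controls come along for free.
        VideoPlayer(player: player)
            .onAppear { player.play() }
    }
}
```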
But if you're going beyond the window and designing more immersive moments, you'll still need navigation. And fun fact: you can use system UI components here too. You don't have to reinvent buttons just because you're going fully immersive. Whether you're making an app with primarily 3D content, in a volume or an immersive environment, you can always think about creating custom window sizes for your content, or attach something like an ornament if you want to create great utility controls. This would be useful if you needed a panel close to the viewer or separate from the content. And I'll give you an example of something we already have on the platform that's doing that, and that's the Jupiter environment. This offers a control panel to adjust the time of day and watch the planet's rotation, and the audience can reposition it using the window bar if they want it out of the way, and they can even press Done when they're finished with it to continue enjoying the environment without closing it.
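If you want to experiment with an ornament-style panel like that, the SwiftUI ornament modifier is one way to sketch it. Everything here is a placeholder, including the stand-in content and the slider; this is not the actual Jupiter implementation.

```swift
import SwiftUI

struct PlanetWindow: View {
    @State private var timeOfDay = 0.5

    var body: some View {
        // Stand-in for your 3D planet content.
        Text("Planet scene goes here")
            .frame(width: 600, height: 600)
            .ornament(attachmentAnchor: .scene(.bottom)) {
                // A utility panel that floats near the window,
                // separate from the content itself.
                VStack {
                    Text("Time of Day")
                    Slider(value: $timeOfDay)
                }
                .padding()
                .frame(width: 320)
                .glassBackgroundEffect()
            }
    }
}
```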
Now, your 3D content can also support our system presentations. And those are things like menus, tooltips, popovers, alerts, and confirmations.
These can be useful in a narrative experience, like this little asteroid's speech bubble right here. But they're also a great way to offer controls tied to a specific piece of content.
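One way to pair a SwiftUI element like that speech bubble with a 3D character is RealityView attachments. A minimal sketch with placeholder names ("Hero", "speech"); this isn't necessarily how the system presentations themselves are built.

```swift
import SwiftUI
import RealityKit

struct CharacterSceneView: View {
    var body: some View {
        RealityView { content, attachments in
            // "Hero" is a placeholder asset name for your character.
            if let hero = try? await Entity(named: "Hero") {
                content.add(hero)
                // Pin the SwiftUI bubble just above the character.
                if let bubble = attachments.entity(for: "speech") {
                    bubble.position = [0, 0.3, 0]
                    hero.addChild(bubble)
                }
            }
        } attachments: {
            Attachment(id: "speech") {
                Text("Hi there!")
                    .padding()
                    .glassBackgroundEffect()
            }
        }
    }
}
```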
And of course, you can design custom experiences where needed. But it's important to make any custom components you're building feel familiar. D-Day is a great example of an experience that needed a custom solution, as Victor was saying, for their story. But what Targo did is design their media player to feel very similar to the system player, with things like tapping to show UI, as well as building in chapter support and easy play-pause controls. All of these things can help even custom components feel familiar to your audience.
Lastly, in fully immersive experiences that also have interactive content, you might want to keep the viewer largely focused on the content, but still offer them a way to easily enter or exit. Now, if your experience is highly interactive, you may already be using the tap gesture for something, so a tap to show and hide controls might conflict with something you're already doing. Instead, you can consider reserving the bottom of someone's field of view for persistent UI, like the way the Photos app does when showing a panorama. You've got that exit button at the very bottom. And you can also consider, again, building components onto 3D content. If I can leave you with one piece of advice on usability for interfaces: whenever possible, keep it simple. Simplicity will help power your storytelling and help people focus on your story, not on trying to navigate it.
Now, we've covered a lot about the look and feel of your content in visionOS. But there's one more feature you can incorporate to elevate your audience's experiences: sound.
Now, if you're a filmmaker, you know the power of visuals paired with a great soundscape. Whether you're setting the scene of the world, giving life to characters, or even building tension, without sound you lose a part of the story. And that's true for experiences in visionOS, too. Where other Apple platforms are often used muted, sound is on by default in visionOS, and that means it can be just as important as the visuals in helping people explore. Sound is also spatial on this platform. That means that someone wearing Vision Pro hears audio from an app or experience the same way that you hear my voice. Audio reflects off of surfaces, objects, and materials. It feels real and externalized to someone wearing Vision Pro. And you can make some pretty cool experiences as a result of that. Today I want to share three examples of experiences on the platform that are doing sound really, really well. We'll look at them and we'll listen to them too. First, we're going to explore how to build a peaceful soundscape for an immersive environment like Mount Hood.
Then we'll discover how apps like Disney+ are building environments that sound real, although maybe a little bit too real.
This is a little too real too. We also are going to consider how sound can help define characters in an interactive experience, like encounter dinosaurs. So let's go away from the scary creatures for a second and start our journey in the forests of Mount Hood. I'm going to ask you all to close your eyes for a moment and just listen to the soundscape.
What do you hear here? Light ripples on the water.
The drip of a stream.
A slight wind in the air.
Even without seeing this environment, this soundscape helps you start to get a sense of the space you're in. The world around you and what you can expect.
Go ahead and open your eyes. Now this environment, Mount Hood is a fully immersive experience for visionOS users, and it has a light and a dark version, as well as a realistic spatial soundscape to match both of those daytime and nighttime experiences.
Now, it doesn't demand the audience's attention. It's here to add subtle emotion to the scene, and that's perfect, because this environment is designed to live in the background as people use other apps. Now, I want to talk a little bit about how this environment was designed from a sound perspective. Our design team first took a trip to the actual Mount Hood, and they sampled some of the naturally occurring sounds there. As designers, this is a great starting point, but as our team found, you can't just expect to replicate the sounds of a place one-to-one. There are always surprises. For example, here's a clip that Danielle Price, one of our sound designers, got on location, right next to the lake in the picture we just saw. I'm going to go ahead and play this.
Hear that loud gurgling noise? That's a large water drainage system right off screen.
Yeah. The noise is so loud you can't actually hear any of that pretty nature we were listening to three or four slides ago. In fact, without context, it kind of sounds a lot like...
Yeah, not really the vibe we're going for.
So when you're designing soundscapes for your experience, reality can be a good place to start. But remember, you can also curate it if you want to make something more pleasant or weird or fun, you have the power to make your sounds complement their visuals in the best way possible.
There are two main categories of sounds our designers use when they're curating a soundscape. The first is spatial audio sources. These are sound elements that occupy a point in space, like birds, crickets, and frogs. We also like to think about ambient background audio. This is the overall ambiance of a place. This audio is usually a surround mix, anchored to play all around the viewer and loop continuously without being noticeable. In 2023, at our Worldwide Developers Conference, we actually had a few members of our sound design team share how they use these categories to build the realistic soundscape of Mount Hood. And I'd like to play that excerpt for you today, because I think it's a really great example of how you can put all of these pieces together. So without further ado, I'm going to introduce Danielle from two years ago. Take it away, Danielle.
Maybe. Let's see. There we go. First, let's try placing the sounds of the crickets and frogs on the left and the right.
Now let's adjust this. It's way too loud and feels too close. So we need to do two things first. Turn down the volume of each a few decibels, and then push the location of the sound into the distance so it can sound further away. Let's listen again.
It's starting to feel more natural. Now let's add a couple frogs in the foreground on the shoreline.
That sounds really great, but we don't want to hear the same frog over and over again from the same spot. So if we use randomization, we can create a more natural soundscape. We could achieve that by alternating between a collection of different frog recordings, as well as the location they are playing from. From there we can randomize the timing of when they are played. Now let's listen to everything together.
There's still one more thing we need to add the overall ambiance of a place. These sounds are played and surround softly, adding ambiance to the space around you.
Now let's listen to all the sounds together as it's experienced in the mix.
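If you want to try layering a soundscape like this in RealityKit, here's a rough sketch of the two categories: an ambient surround bed plus a spatial point source with randomized clip choice and timing. The asset names are placeholders and the gain and timing values are arbitrary; treat this as a starting point, not the Mount Hood implementation.

```swift
import SwiftUI
import RealityKit

struct PondSoundscapeView: View {
    var body: some View {
        RealityView { content in
            // 1. Ambient background: a looping surround bed around the viewer.
            let ambience = Entity()
            ambience.components.set(AmbientAudioComponent())
            if let bed = try? await AudioFileResource(named: "PondAmbience",
                                                      configuration: .init(shouldLoop: true)) {
                ambience.playAudio(bed)
            }
            content.add(ambience)

            // 2. A spatial point source on the "shoreline", randomized in
            //    clip choice and timing so it never sounds looped.
            let frog = Entity()
            frog.position = [1.5, 0, -2]
            frog.components.set(SpatialAudioComponent(gain: -6))
            content.add(frog)

            let frogClips = ["Frog01", "Frog02", "Frog03"]
            Task {
                while !Task.isCancelled {
                    try? await Task.sleep(for: .seconds(Double.random(in: 4...12)))
                    if let clip = try? await AudioFileResource(named: frogClips.randomElement()!) {
                        frog.playAudio(clip)
                    }
                }
            }
        }
    }
}
```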
That's pretty cool, right? It all comes back together in a really, really nice way. And if you haven't checked out that session before, I really encourage it. Danielle goes into some really, really great detail about how to build spatial soundscapes for visionOS. Now, not every environment is designed to be subtle like Mount Hood. The environments of Disney+ are incredible standalone experiences for fans of shows and films. They're great for watching content in, but they're also rewarding to explore. And FX's Containment Room, which places us inside the world of Alien: Earth, is one of the absolute best examples of this. Let's watch and listen.
Inside the containment room, we hear the sounds of the room, the crackle of electricity.
Some clanking of chains. Some dripping water.
And then we hear the hiss.
Cool. Anybody have a flamethrower? So the goal for any environment here is to use sound and visuals to place you in the scene, and the Containment Room nails this on both fronts. The ambient background makes us feel right at home in this sci-fi universe. And if you know Alien, or any of the Alien properties, you also know what's lurking the second you hear that hiss. I want to look at this scene again, though, and I'm going to take the soundscape away.
So here we are. We still have strong visuals in FX's Containment Room: flashing lights, creepy containers, dripping water. But there's less build-up here. The tension is just not the same. And when I finally look around, if I see that Xenomorph, it feels more like a jump scare than a wary search. Listening to my environment without the cues that the sound brings, we lose out on some of the story FX is trying to tell here.
Sound can help direct people's attention, and that's really important for an experience like this one. And I want to share something even cooler, or more terrifying, depending on your feelings about Xenomorphs, that FX did here to direct attention. The alien's not just hiding behind that pillar watching you; it actually stalks you around the environment. Let's go back and return to where we left it behind this pillar, so you can start to hear the Xenomorph's clicks and hisses, and in Spatial Audio it feels right above you.
But in addition, as you hear that, you hear it move around the environment. You hear it crawling around. You can tell exactly where it might be lurking, including, yes, when it climbs over you. And this sound not only directs attention when you're wearing Vision Pro, but it also helps set expectations as the viewer. Now, for me personally, I love the Alien films and I am also terrified of horror and thrillers. It's a weird dichotomy, I know, but I actually feel really good in this environment, and it hits that exact right balance for me, because I can hear the Xenomorph, I can hear the hisses and growls, but in hearing those sounds, because it's Spatial Audio, it's positioned in such a way, and it's far enough from me, that I know it's very unlikely the Xenomorph is going to jump in my face and try to eat me. I know exactly where it is in the room, and exactly how scared or how wary I should be. And especially if you're building something that's thriller or horror in nature, sound can just add that extra beat to really help ground your viewer and make them feel comfortable in their environment, or at least as comfortable as they can be if there's an alien stalking them. Let's leave our immersive environments for now, I think we've talked enough about Xenomorphs, and head back in time, to the Cretaceous period specifically, and take a listen to the interactive world of Encounter Dinosaurs. As a narrative experience that lives partially in the real world, Encounter Dinosaurs has a really interesting balance to strike. Even though it's interactive, it's still a very cinematic experience. It's a narrative story first and foremost, and it's important that sound reflect these cinematic environments. And the experience also has a job to do, though: it needs to make viewers feel transported millions of years into the past, and to make characters from that past world feel real in someone's space. So I want to look at how Encounter Dinosaurs approached each of these three beats. Now, while the entire experience isn't scored with music (that might feel a little bit heavy), there is a brief musical introduction to help set the tone of the narrative. Let's go ahead and listen.
Now, this plays in sort of that surround ambient background mix that we talked about earlier while you have this interaction with the butterfly, and the score continues while you're interacting with it. But the second that the butterfly leaves you to return to the Cretaceous period, as that portal's opening, the sound fades away. We transition to the sounds of prehistoric volcanic life, and we get thunder. Instantly, we have been moved inside this environment. And one of the reasons this works is that the sound is reverberating inside the environment of the portal, but it's also bleeding out into the viewer's space. It creates the sense of a boundary being breached between the inside and the outside. And it sounds really, really cool. And now that the scene is set, Encounter Dinosaurs can use specific sound sources with its characters to make them feel real.
So you'll see, as characters enter the scene, there are audio emitters on their feet, their mouths, their tails to ensure that every stomp on the ground or guttural growl feels like it's coming from that point in space. When something sounds real, it can feel that much more real.
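In RealityKit terms, attaching an emitter to a specific point on a character might look roughly like this. The entity and asset names are assumptions for illustration; this is not how Encounter Dinosaurs itself is implemented.

```swift
import SwiftUI
import RealityKit

struct DinoAudioView: View {
    var body: some View {
        RealityView { content in
            // "Rajasaurus" and "Foot_L" are placeholder names for illustration.
            guard let dino = try? await Entity(named: "Rajasaurus"),
                  let foot = dino.findEntity(named: "Foot_L") else { return }
            // Give the foot its own spatial emitter so each stomp
            // comes from that exact point in space.
            foot.components.set(SpatialAudioComponent(gain: -3))
            if let stomp = try? await AudioFileResource(named: "Stomp") {
                foot.playAudio(stomp)
            }
            content.add(dino)
        }
    }
}
```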
These sounds can help give weight and dimension to the characters, and they're also used to direct attention in the scene, just like Alien: Earth does. Before the first dinosaur arrives, for example, the audience will hear a small chirp erupt from the fissure.
It's really cute. When the larger Rajasaurus is first introduced, the experience plays loud, booming steps coming from the left before you ever see it.
By the time the experience ends, the audience knows these dinosaurs by sound almost as well as by their visuals. And I don't have the sounds of the ending here, because I don't want to spoil how cute this moment is. If you haven't gotten this ending yet, I encourage you to go back and check it out in real life, because it's very sweet.
Now, no matter what experience you're looking to make in visionOS, I hope you're starting to get an idea that sound is a crucial part to help you bring your story to life and make it feel real.
We've covered a lot of ground today, and no matter your experience, I do hope you've learned at least one new thing about designing for this platform. And if you want to learn more, we've got a ton of incredible resources on the developer website to help you get started. For those of you in the room, we'll also be hosting design consultations on Wednesday evening and all day Thursday. So if you want to talk in more detail about how you can apply these design principles and many more to your own experiences, I'd love to chat with you. In part, I just love nerding out with this group. It's really fun. Now, if I can leave you with one final piece of guidance: start with what you know well. There's a lot of room to experiment here. Many of you already have great ideas you've produced for other mediums or platforms, and it's worth starting your explorations, or continuing your explorations if you're already on Vision Pro, right there. And as you're starting to think about making something new or going beyond that, ask yourself: what excites you about making work for this platform? What elements have we talked about today that can help your storytelling? How do you want to use the spectrum of immersion to better tell your story? How are audiences going to interact in your story, and how can you use sound and other tools to really enrich your content for Vision Pro? I'm looking forward to speaking more with everybody about this at the mixer this evening, but for now, we're going to take a quick break, and then we'll return as Nathaniel shows us how to build some of the things that I've just been talking about and takes us behind the scenes of creating an experience for Vision Pro with Reality Composer Pro. Enjoy, folks, and thank you so much.
So, next up: we've been talking a lot about conceptualization for visionOS, right? How we can make really great immersive video, how we can design really great interactive experiences. The second half of the day is all about turning those concepts into action. So I'm really excited to introduce Nathaniel. He works on RealityKit and helps developers and creators make really, really cool experiences for Vision Pro. Nathaniel is going to take us through a little bit of what went into building one of the experiences that we released as a sample project this last summer. So without further ado, Nathaniel.
Prompter I need the clicker. The clicker. I got you.
Hello, my name is Nathaniel, and I'm an engineer here at Apple. So, you've dreamed up the perfect experience you want to bring to life on visionOS. Apple Vision Pro is the most capable platform for bringing spatial experiences to your audience. But how do you actually start building? Today, we'll explore how you can bring 3D immersion to your storytelling experience on Vision Pro with RealityKit.
During this presentation, I'll be using assets and examples from a sample project Petite Asteroids, that we released earlier this year. This is a vertical slice of a complete spatial experience on Vision Pro.
The full Xcode project, including the assets you'll see today, is available for download on the Apple Developer website. You can build the project for yourself in Xcode and experience it in the simulator or on device.
There's a lot more to this project that I won't be able to get into today, so I hope you're able to download this project after the presentation and try it for yourself. We know many of you are eager to see your content on visionOS.
We want you to know that you don't have to start your project from scratch. You can bring your existing experiences to visionOS. There are several different pathways to getting your content on device, so let's go over a couple of them. First, existing projects built with native Apple frameworks like RealityKit will have the easiest time bringing their experience to visionOS, because RealityKit works great across Apple platforms.
You can also build new experiences from the ground up with Apple frameworks. The Encounter Dinosaurs experience was built with RealityKit.
Alternatively, if you have a project built with Unity, Unreal, or Godot, you can configure your project to target visionOS.
These game engines all have a well-established track record for building triple-A games and fully immersive experiences using workflows many people are already familiar with, and you can also take advantage of your existing skills in these game engines to create groundbreaking experiences for Vision Pro, like Targo did with D-Day: The Camera Soldier. The studio used Unity PolySpatial to design a custom immersive player for their experience.
Let's talk about why you might want to go with native RealityKit over other options, like third-party game engines.
First, Apple frameworks naturally work great across Apple platforms.
If you've already built an AR app with RealityKit for iOS and iPadOS, you can bring that same code to visionOS. And new this year, RealityKit is now available on tvOS.
RealityKit is built to work together with other Apple frameworks to create unique experiences. This means you can use tools like AVKit and SwiftUI to bring additional functionality to your app. Let me give you some examples.
AVKit gives you tools for playing back audio and video on visionOS. To the right is a recording of an app you can make to play back spatial videos.
If you bring RealityKit into your app, you can elevate the immersion with rendering effects and immersive environments.
Another powerful framework is SwiftUI.
SwiftUI gives you everything you need to define your app's windows, volumes, and spaces, as well as the content that exists inside them.
And when you combine SwiftUI with RealityKit, you can build UI controls that directly interact with 3D models in your scene.
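As a small illustration of that combination, here's a sketch of a SwiftUI slider driving the scale of a RealityKit model. "Robot" is a placeholder asset name.

```swift
import SwiftUI
import RealityKit

struct RobotView: View {
    @State private var robot: Entity?
    @State private var scale: Float = 1.0

    var body: some View {
        VStack {
            RealityView { content in
                // "Robot" is a placeholder asset in the app bundle.
                if let model = try? await Entity(named: "Robot") {
                    robot = model
                    content.add(model)
                }
            }
            // A standard SwiftUI control driving the 3D model.
            Slider(value: $scale, in: 0.5...2.0)
                .padding()
                .onChange(of: scale) { _, newValue in
                    robot?.setScale(SIMD3(repeating: newValue), relativeTo: nil)
                }
        }
    }
}
```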
Of course, going with Apple's first party APIs means you'll always have access to the latest technologies as we make them available, like Nearby Sharing, which you'll hear more about from Ethan and Alex later this afternoon.
Today, I'm going to introduce you to one of my favorite frameworks: RealityKit.
We announced RealityKit in 2019. Since then, we've continued to improve RealityKit, and today it is more capable than ever, works across Apple devices, and is part of a complete suite of frameworks for building interactive apps.
So what kind of experiences can you create with these technologies? You can make robots shoot sparks and fall in love without ever leaving a window.
You can explore the Grand Canyon from your tabletop.
You can build an amazing theater environment for your audience to enjoy your movies.
You can help a little chondrite get back to her friends from the comfort of your living room.
Or fully immerse someone in the beauty of the Wild West.
We built all of these experiences with RealityKit, and today I'm going to show you how you can get started.
For this presentation, I'll first go over how to work with 3D assets.
Then after a quick demo, I'll move on to talk about designing immersive scenes.
Next, I'll talk about bringing interactivity to our experiences, and then we'll take what we've learned and apply it in another demonstration. So let's get started.
In order to build apps that take advantage of a 3D space, you'll need to understand how to work with 3D assets. 3D assets can come in many different file formats, so let's take a high level look at some of the other formats out there today.
The most basic format is OBJ, which essentially contains just a single 3D model. It has limited support for materials and no support for animations.
Then there's a large group of more modern formats, including FBX and glTF. These usually support multiple models that can be laid out in a scene graph, and they have varying levels of support for material definitions. Many are tied to proprietary tools.
USD supports all of this and is additionally designed to be highly scalable. Pixar developed USD for its use in films, so representing millions of objects is the typical case.
And USD is built with collaboration as a core feature, allowing for many artists to work on a single scene without getting in each other's way.
USD stands for Universal Scene Description.
It was originally created by Pixar for movies or as we like to call it, linear media.
Today, USD is quickly becoming an industry standard for 3D content thanks to widespread adoption by companies like Apple and Nvidia.
USDZ, USDC, and USDA are types of USD files, each for different purposes.
To create USD files, you'll need to use a digital content creation, or DCC, app like Blender, Maya, Houdini, or ZBrush.
These are professional-grade tools with complex workflows, and they're used to create assets for film and triple-A video games.
With the DCC, artists can manipulate the vertices of an object directly or even sculpt the shape out of virtual clay.
When the asset is ready, these apps can export 3D content to USD.
But what do you do with the USD file? That's where Reality Composer Pro comes in.
Reality Composer Pro is Apple's scene authoring tool for building spatial content with USD for RealityKit, without code.
Reality Composer Pro is included with Xcode. When you create a new project targeting visionOS, Xcode will create a Reality Composer Pro package inside your project, and this is where you'll organize your spatial assets.
Here's a screen capture of Xcode just after I created a new project. On the left side, under Packages, is the RealityKitContent package. So let's click to expand it.
Inside you can see the source assets in my project. There isn't much here yet because this project is new. But as I import assets into Reality Composer Pro, they will begin to show up here.
And I can open this in Reality Composer Pro by clicking this button.
Reality Composer Pro lets you design scenes for spatial apps and experiences without code.
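And once a scene is saved into that package, loading it from code takes only a few lines. A minimal sketch, assuming the default package and scene names from the visionOS app template:

```swift
import SwiftUI
import RealityKit
import RealityKitContent // The package Xcode generates alongside a new visionOS project.

struct ImmersiveView: View {
    var body: some View {
        RealityView { content in
            // "Scene" is the default scene name in a new Reality Composer Pro package.
            if let scene = try? await Entity(named: "Scene", in: realityKitContentBundle) {
                content.add(scene)
            }
        }
    }
}
```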
So let's jump to the first live demo of this presentation and see what Reality Composer Pro looks like and how you can use it to build scenes from 3D assets. For this demonstration, I'll be using assets from a larger sample project, Petite Asteroids. There's a lot to this project. It's a vertical slice of a complete volumetric experience on Vision Pro.
The full project, including all the assets you'll see today and more, is available to download on Apple's developer website. So let's go over to the podium and I'll start the demonstration.
So I'm going to go ahead and open up this project that I prepared ahead of time. What you see on screen now is Xcode, and this is very early on in the development of Petite Asteroids. We've started with one of the default visionOS templates that Xcode provides. And on the left here, all we've done so far is bring some assets into our RealityKit content package. To show you that, I'll just go down here and click on this package, and let's open up Reality Composer Pro.
I'm just going to quickly close this tab so we can start from a fresh canvas.
So here we have Reality Composer Pro open in front of us. We don't have any scenes open yet, so we can change that. I'm going to go to the models folder here. We have our source assets in our project. And you'll notice they're all pink. They're pink and that means they're missing textures. This pink striped color means they don't have a text or don't have a material yet applied. So in order to apply a material, we'll create a new scene in Reality Composer Pro. So in the project browser I'll click this button here.
And what I want to do is bring in a USD reference to one of my source assets. And so I can do that using the plus button here on the left side in the scene hierarchy window. I'll go down to reference.
I'm going to look for the model that I want to edit. I'm going to select this five container.
You'll notice that it comes in offset from the origin. That's just how the artist had prepared it ahead of time. You'll see with the other pieces in this butte that I'm going to create later, that they'll kind of fit together like a puzzle piece. And that's thanks to the way the artist was able to offset the mesh ahead of time. But what we want to do for this demonstration is apply a material to this pink striped rock. And so we can do that with another USD reference.
So I'm going to reference a material that I've already prepared ahead of time. And later on I'll show you how to create materials. But for now let's use a reference to an existing material. And I'll go here to my scene. I'm going to click this material I've created.
So now in Reality Composer Pro we've created a new scene, using USD references to a source asset and a prepared material. And in order to apply the material, I'll click the entity, go over here to the right side, and in the materials binding section I can now select the root material.
And there we go. So that's one way we can start to modify our source assets. But we don't want to stop here. We can continue to use USD and USD references to build up layers of complexity in our scene. So for this demonstration, I'm going to show you how we can build a fully immersive environment in Reality Composer Pro. And I'll start with a new scene.
A great place to start when you're building fully immersive environments is with a SkyDome asset, and Reality Composer Pro does provide a default one out of the box, and you can add it to your scene with this plus button. You can go down to environment, click SkyDome.
You'll notice our viewport just changed. That's because we've added a new entity to our scene. And if I zoom out a little bit we can see what that looks like.
So really all this SkyDome is, is just a very large asset that completely surrounds the user and all its faces are facing inward. But this is kind of a gray void. It's not very interesting yet. And of course I've prepared a more interesting asset ahead of time. So I'm going to delete this default skydome.
And now I'm going to bring in a USD reference to an asset I've prepared ahead of time. So reference scenes, find my SkyDome scene here and I'll select that.
Now if I zoom in to the center.
I can use the controls in the bottom left to sort of look around the scene as if I was on device in the headset. And you can see that this skydome now fully immerses me in this environment from all angles. I can look up and I can see the sun, and I can see this kind of dry desert landscape in all directions. But I'm going to show you how we can continue to bring in our other prepared assets, like the butte five piece that I prepared earlier. So we're going to skip ahead a bit. I prepared these other scenes for us.
I'm going to show you a quicker way we can bring in a USD reference to our scene. And that's just dragging and dropping into the scene hierarchy. Let me zoom out a bit.
So that's a distant terrain piece that kind of blends in the color of the background. I'm going to bring in another piece, a ground piece.
And now you can see how I mentioned these are kind of starting to fit together like puzzle pieces. And that's just because of the way the artist had prepared these assets ahead of time. So I'm going to shift-select to select multiple assets.
I'll select the platforms as well. I'll drag all of those into my scene.
And there we go. So now we've assembled an entire immersive scene using these prepared assets in Reality Composer Pro without using any code. But of course, we want to see what this looks like on device or in the simulator. So I'm going to switch back to Xcode.
And open up this project that I've built ahead of time.
And if I click play.
It should build very quickly and launch inside the simulator.
If I show the immersive space, there we go. And so if you were on device, you could imagine yourself standing here at the base of this enormous butte. You could look around in every direction.
And that's one way you can really quickly build a fully immersive environment in Reality Composer Pro without using any code.
So that's the end of the first demo. I'm going to jump back to the presentation, and we'll learn more about how we can use Reality Composer Pro to add more interest to this scene.
So now that we know how we can bring our content into Reality Composer Pro, and we've even built our first spatial scene, let's dive deeper into the powerful scene design tools available in Reality Composer Pro.
To the right is a screen capture of the project browser in Petite Asteroids. It shows all the USD assets I've already imported into this project.
You know from the previous demo that I like to call these source assets, because they haven't yet been further refined with added materials or custom behavior.
To build with source assets, create a new scene and use USD references to bring in other USD files. Here's our butte five piece open in a new scene.
You can tell that we're working with a USD reference to our entity by the blue arrow and italicized entity names.
Let's explore how we can apply a material to these USDs.
The pink stripes on our model mean no material is applied. To apply a material, select the entity, and if there are materials in your scene, you can select them in the drop-down inside the materials binding section of the inspector. This will apply to all descendant entities as well.
That looks great.
You can use different types of materials depending on what you want your model to look like. So let's dive deeper into how you can create materials inside Reality Composer Pro.
We built these materials with Shader Graph.
Shader Graph lets you work visually with material properties, inputs, and outputs to design the logic that renders your spatial content.
In petite asteroids, we use Shader Graph to optimize the look and feel of the characters on the butte. The shadows below the characters you see on screen are actually being drawn by a custom shader graph material. They aren't real shadows like what you'd expect from a stage light.
And for the character, when she jumps, we apply a subtle squash and stretch effect using a shader graph material as well. There's a lot to cover, but let's look at some of the basics so you can get started.
Here's one of the scenes from the earlier demo. This is the Butte three piece, and it's somewhere in the middle of the structure.
At the bottom of the window, you can see the Shader Graph editor open. The panels inside represent the rendering logic for my material.
Let's take a moment to walk through what this material is doing because it isn't very complex.
You can start to understand the shader graph by reading from right to left, starting from the outputs node.
The outputs node has a single input for the surface: an unlit surface. Unlit surfaces are simple, highly performant Shader Graph nodes because they don't interact with lighting in our scene.
The unlit surface has a single input for color: an image node. This is where you can input the textures for your model.
This is the texture of one of the butte pieces from Petite Asteroids. You might be able to imagine wrapping this image around the surface of an object, kind of like papier-mâché, so that the cracks and shadows in the texture perfectly match the geometry of the model.
And that's exactly what we're doing. This source asset was prepared by an artist who also provided the texture, so the faces on the model will line up exactly with the pixels inside the image.
This process is called UV mapping.
When you configure a simple unlit shader like this and apply it to the source asset, the model instantly appears beautifully rendered in our scene, just as the artist intended.
The materials in Petite Asteroids were created in Reality Composer Pro with Shader Graph, in combination with other advanced shader techniques made possible by RealityKit, to create these beautiful scenes in your living room.
Next, I want to talk about the Timeline feature in Reality Composer Pro. Timeline is a no-code solution for sequencing actions and behaviors in your scene.
For Petite Asteroids, we wanted to tell the story of a chondrite after she crash lands on Earth and is separated from her rocky friends. When a person opens Petite Asteroids for the first time, they're presented with this intro sequence, which was made possible by the Timeline feature in Reality Composer Pro.
Here's what the timeline editor looks like.
Timeline lets you sequence actions to be executed in a particular order at a particular time.
On the left panel is a list of all the timelines.
The center is the main timeline editor, and the right panel is a list of all the available built in actions.
And the play button will let you preview your timeline in the viewport. You can go further with timeline by sending notifications that you respond to in code.
In Petite Asteroids, we use timeline notifications to trigger procedural animations, like the ramping fire particles trailing behind this meteorite, and to show UI, like the speech bubbles above characters.
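Here's a rough sketch of how you might listen for one of those notifications in code; the "IntroFadeIn" identifier is a placeholder, and the notification name follows the pattern Reality Composer Pro uses for Notification actions.

    import Foundation

    // Respond to a Notification action fired from a Reality Composer Pro timeline.
    final class TimelineNotificationObserver {
        private var token: NSObjectProtocol?

        func start() {
            token = NotificationCenter.default.addObserver(
                forName: Notification.Name("RealityKit.NotificationTrigger"),
                object: nil,
                queue: .main
            ) { notification in
                guard let identifier = notification.userInfo?["RealityKit.NotificationTrigger.Identifier"] as? String,
                      identifier == "IntroFadeIn" else { return }
                // Kick off the procedural animation, show a speech bubble, and so on.
            }
        }
    }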
Now let's talk about how RealityKit enables you to add interactivity to the entities and scenes you design inside Reality Composer Pro.
We've been working with entities in the previous demo and throughout this presentation.
For example, when you load a USD asset into your scene using RealityKit APIs, an entity gets created with components like ModelComponent and Transform.
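For a sense of what that looks like in code, here's a minimal sketch; the scene name is a placeholder, and the RealityKitContent bundle is the one Xcode generates for your project.

    import RealityKit
    import RealityKitContent  // the package Xcode generates alongside your visionOS project

    // Loading a USD scene yields an Entity hierarchy; mesh-bearing descendants carry a
    // ModelComponent, and every entity has a Transform. "ButteFive" is a placeholder name.
    func loadButte() async throws -> Entity {
        let butte = try await Entity(named: "ButteFive", in: realityKitContentBundle)
        print(butte.transform.translation)  // the position authored in Reality Composer Pro
        return butte
    }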
Let's fast forward a bit in the development of Petite Asteroids, and take a look at how you can attach components to your entities to give them additional functionality. Here's a new scene we've created for our main character.
With the entity selected, the Inspector now shows information about the components attached to the entity.
The transform contains data for the entity's position, rotation, and scale in 3D space. If I were to move or scale this entity, that change would be reflected here.
I further customized this entity by giving it a CharacterController component. The CharacterController component works great for entities that move and slide around on surfaces in response to input.
To learn more about the different components available to you, visit the developer website.
Let's build a mental model of how to design entities in RealityKit. Here's a simplified view of our character entity.
RealityKit lets you build up functionality and behavior by composing entities from sets of components.
All entities in RealityKit have at least a Transform component.
This makes it possible for the entity to have a position, orientation, and scale in the scene.
To create a ball entity, you can give an entity a model component for the mesh and materials, a collision component to register collision events, and a physics body component so that it can move in response to forces.
For a character entity, you can also use a model component, but instead of using a collision or physics body component, use a character controller component to interact with the physics simulation.
And then for a portal entity, the key is to use a portal component in combination with a model component.
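Here's a rough sketch of the ball and character cases in code; the meshes, sizes, and materials are placeholders. We'll sketch the portal case after the walkthrough below.

    import RealityKit

    func makeEntities() -> (ball: Entity, character: Entity) {
        // A ball: mesh + collision + physics body.
        let ball = Entity()
        ball.components.set(ModelComponent(mesh: .generateSphere(radius: 0.1),
                                           materials: [SimpleMaterial()]))
        ball.components.set(CollisionComponent(shapes: [.generateSphere(radius: 0.1)]))
        ball.components.set(PhysicsBodyComponent(massProperties: .default,
                                                 material: .default,
                                                 mode: .dynamic))

        // A character: mesh + character controller instead of a physics body.
        let character = Entity()
        character.components.set(ModelComponent(mesh: .generateSphere(radius: 0.1),
                                                materials: [SimpleMaterial()]))
        character.components.set(CharacterControllerComponent())

        return (ball, character)
    }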
As an exercise, let's investigate how we can create portals using entities and components.
Petite Asteroids uses an L-shaped portal to frame the content inside a volumetric window, and this looks amazing on device.
To create a portal setup, start with an entity with a mesh in the portal shape you want. I created this L shape inside Blender.
Then add a portal component to this entity.
Next, add a portal material to your model component.
But without a world to render, a portal material doesn't do much on its own.
Choose a second entity as your world entity and give it a world component.
For Petite Asteroids, our world entity is the container entity for our SkyDome, the butte assets, and everything else in our scene.
You then specify this root entity as the target entity for your portal.
So when you start with a portal entity and your world entity.
You'll get something like this. Hold on.
The butte is getting clipped when we look at it from certain angles.
Add a portal crossing component to the entities that should be rendered outside the portal.
Perfect.
Additionally, we've set the crossing mode on this portal component to be positive Y. This ensures the entities cross the portal in the way that we expect.
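As a sketch, the whole setup described above might look roughly like this in code; the plane mesh stands in for the L-shaped mesh from the real project, and the positive-Y clipping and crossing values are assumptions based on what we just configured.

    import RealityKit

    func makePortal(characterEntity: Entity) -> (world: Entity, portal: Entity) {
        // Everything parented under `world` renders "inside" the portal.
        let world = Entity()
        world.components.set(WorldComponent())
        // world.addChild(skyDome); world.addChild(buttePieces); ...

        // The portal itself: a mesh with a PortalMaterial, plus a PortalComponent
        // that targets the world entity.
        let portal = Entity()
        portal.components.set(ModelComponent(mesh: .generatePlane(width: 1, depth: 1),
                                             materials: [PortalMaterial()]))
        portal.components.set(PortalComponent(target: world,
                                              clippingMode: .plane(.positiveY),
                                              crossingMode: .plane(.positiveY)))

        // Entities that should still render when they poke out of the portal
        // need a PortalCrossingComponent.
        characterEntity.components.set(PortalCrossingComponent())

        return (world, portal)
    }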
You may have heard of entities and components in the context of ECS, or entity component systems.
Systems contain logic for working with entities and components, and are the third piece of the way RealityKit enables you to create interactive spatial experiences.
Custom systems are written in Swift inside your Xcode project. Systems allow you to provide custom logic to your spatial experience.
Create a system by defining a new Swift type that conforms to the System protocol.
Systems have an update function that is called each frame. This is where you can modify your entities over time.
Create queries to search for specific entities, such as all entities that share a specific component. Then each frame you can use the query to iterate over your desired set of entities.
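Here's a minimal sketch of that pattern; SpinComponent is a hypothetical component defined just for this example, and both types would be registered once at app launch.

    import RealityKit

    // Register once at launch:
    // SpinComponent.registerComponent(); SpinSystem.registerSystem()
    struct SpinComponent: Component {
        var speed: Float = 1.0
    }

    struct SpinSystem: System {
        // A query for every entity that carries a SpinComponent.
        static let query = EntityQuery(where: .has(SpinComponent.self))

        init(scene: RealityKit.Scene) {}

        // Called every frame; iterate the entities matched by the query.
        func update(context: SceneUpdateContext) {
            for entity in context.entities(matching: Self.query, updatingSystemWhen: .rendering) {
                guard let spin = entity.components[SpinComponent.self] else { continue }
                let rotation = simd_quatf(angle: spin.speed * Float(context.deltaTime), axis: [0, 1, 0])
                entity.transform.rotation = rotation * entity.transform.rotation
            }
        }
    }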
Let's move on from looking at code and talk about how you can bring interactivity to your RealityKit entities with SwiftUI gestures.
With gestures. People can pick up virtual content naturally using the advanced hand tracking capabilities of Vision Pro.
You can use gestures to allow people to interact with your content in ways that they expect, like pinching and dragging to move items around a space.
In Petite Asteroids, we use a spatial tap gesture to tell our character to jump. When the person holds and drags, the character rolls in the direction of the drag.
When using the simulator, you can simulate gestures with the mouse and keyboard. In the next demo, I'll show you how you can wire up the gesture to your RealityView in code so you can bring gestures to your own experience.
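As a preview, here's a rough sketch of what that wiring can look like; the scene name and the handleSpatialTap function are placeholders for this example.

    import SwiftUI
    import RealityKit
    import RealityKitContent  // the generated package; "StaticScene" is a placeholder name

    struct GameView: View {
        var body: some View {
            RealityView { content in
                if let scene = try? await Entity(named: "StaticScene", in: realityKitContentBundle) {
                    content.add(scene)
                }
            }
            .gesture(
                SpatialTapGesture()
                    .targetedToAnyEntity()
                    .onEnded { value in
                        handleSpatialTap(on: value.entity)
                    }
            )
        }

        private func handleSpatialTap(on entity: Entity) {
            // Tell the character to jump, for example.
        }
    }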
We just went over a lot. Shader Graph and Timeline let you design your scenes in Reality Composer Pro without code.
You can build entities through composition by mixing and matching different components.
We just saw how SwiftUI gestures can add interactivity to our entities. So let's revisit the Petite Asteroids project to see what has changed during development, and see how we can use all that we've learned to enhance the experience we've created so far. So on to the demo.
So let me go over to the next project that I prepared for this demonstration.
So we have been hard at work on Petite Asteroids. We've added some code to our project, and we've added some more USD files to the project. But first, I want to show you exactly what's changed since we last left off. So I'll just go ahead and click play.
Let me move our view up a little bit.
Look down.
So there we go. So right off the bat, you'll notice we're no longer in a fully immersive scene. We're now inside a volumetric window. And you can see at the bottom of this window there's a menu bar that the person can use to move the volume around their space, so they can place it on the floor or their table.
Additionally, we framed our scene inside that portal I just talked about earlier. So we're using this L-shaped portal that I recreated in Blender. And inside the simulator, I can use these controls at the right to kind of swivel my head around.
And you can see what that effect looks like in the simulator. And it'll look amazing on device as well.
Let me zoom in a bit so you can see our character.
There she is.
But if we click around, nothing's really happening. That's because we haven't hooked up our SwiftUI gestures yet. So let me show you how we can do that in code very quickly.
So.
I'll go over to ContentView. So if you've used SwiftUI gestures before, this will look very familiar, because it's just like any other gesture you might create using SwiftUI.
So we're creating a tap gesture here. It is a spatial tap gesture, and it's targeted to any entity, so we can tap anywhere in our scene and we'll be able to execute the code that we want. And that code is: when this gesture ends, we call a function called handleSpatialTap that we wrote. And this is where we'll tell our character to jump.
Because this is a SwiftUI gesture, we'll need to install it on our SwiftUI view, and we can do that with the SwiftUI gesture modifier. And that's all you'll need to do to install a gesture onto your RealityView. But of course, we don't want to do too much in code. We want to go over to Reality Composer Pro and see what we can add to the scene. So let's just go ahead and open up our project.
One second. There we go.
I'm going to close out this tab. So what I want to do for this demonstration is create an intro sequence using timeline in Reality Composer Pro. So I'm going to find the scene that we were working on earlier. It's going to be in our scenes folder.
It's the static scene object.
And once it's finished loading there we go. So it looks just like we left off. However, I've been organizing our scene a bit. I've got these container entities here.
I've got a container entity here, and we can see we have a USD reference to our character.
And so like I mentioned, I want to create an intro sequence using Timeline. And what I'm imagining is that this butte container entity here, we can have it sort of slowly animate up out of our portal when the user loads the scene for the first time. So to set this up, I'm going to give this butte an offset of two meters. Because we're working in centimeters, I'll do 200, and you can see in the transform component here I'm using centimeters. So that's two meters below our scene. And now we can create a timeline by going over to the timelines tab. I'll click this plus button to create a brand new timeline. And in order to move our entity around the scene, we can use the Transform To timeline action that we can find over here in the right panel.
So let me drag this Transform To action into the timeline.
I'll slide it all the way over to the left so that it starts immediately when this timeline plays. This yellow icon means we haven't set our target yet. And of course we want that to be this container entity that I was just sliding around.
I'm going to click done.
It took that default -200 position when I added this Transform To action. So I'm just going to reset that to zero, because I want to transform it back to zero from that offset.
And if I go over here, I should be able to click play. There we go. It's a little fast, so I can speed this up by stopping the timeline, going back over to the inspector, and setting this duration to seven seconds instead of one second. Enter. Let me zoom out the timeline. Let's play that and see what that looks like.
There we go.
One thing I'm noticing right away that looks a little bit awkward is that our character entity is sort of floating around in space while this butte is rising up around her. So to quickly fix that, I'm going to use another timeline action.
I'm going to go over to this actions panel. I'm going to find the disable entity action. I'll drag that into my timeline.
I'll slide that over to the left because I want to disable this entity immediately when the timeline starts. Again, there's this yellow icon, which means I need to select a target.
I'm going to choose this character entity. And done.
Now if I click play.
She's disappeared.
Of course we want to bring her back.
With an enable entity action.
Again, I need to choose a target.
And I want to bring her back about six seconds into our intro sequence. So I'll set the start time to six.
So now if I press play, I scroll over here, press play. Let me stop that.
Try one more time. She's vanished. Our butte's rising up.
And then right around six seconds. There she goes.
One more thing I want to show you that you can do in Timeline, which is very powerful, is to use notification actions to communicate with your code base. So for this demo, I've prepared a procedural animation to fade the whole scene from black when the intro sequence starts. But in order to trigger that animation, I need to send a notification to my code base. So I'll go over to the right under actions and I'll bring in a notification action.
Again, I need to choose a target, but for this notification the target doesn't really matter, so I'll just choose the root of the scene.
Done. And I'll set the start time at about two seconds so that the fade from black happens two seconds after the timeline animation starts.
But of course, if I press play now, we're not going to see that fade from black effect yet because our code isn't running. So we'll need to jump back into Xcode and press play. And we can see this running in the simulator.
So I'll jump back over to the Xcode project.
So just to reiterate what we've done: we've installed a tap gesture on our RealityView.
And we've also created a timeline sequence in Reality Composer Pro without any code.
So now if I press play, the project should build.
One moment. Let me close the previous demo that's running.
Just so we aren't confused about what we're looking at.
Let me go back here. Press play.
There we go. Our scene is black. We're playing that intro sequence we just created inside Reality Composer Pro. Our character disappeared, but she came back. And if we zoom in on our character, let's test to see if we've installed our SwiftUI gestures correctly. If I click around, there we go.
So of course we can continue to refine and tune this game mechanic to make it feel the way we want. But that's all I have for this demo today. Let's go back to the presentation and finish up from there.
So there's a lot more to Petite Asteroids, and you can find the complete project and all of these assets on the Apple Developer website.
You can build Petite Asteroids in Xcode and experience it for yourself on Vision Pro.
We've gone over a lot. So let's take a moment to recap.
We walked through how you can work with 3D assets to build spatial experiences.
USD enables collaboration across large teams when working on your scenes, and lets you build up complex scenes from smaller pieces.
Reality Composer Pro is Apple's scene authoring tool for spatial content.
Shader Graph and Timeline enable you to design your scenes without code.
Finally, we talked about how RealityKit's entity component system makes all of this possible, and how you can bring interactivity to your entities by taking advantage of SwiftUI gestures.
I want to leave you with one more piece of advice. Find us on the developer forums. Apple engineers, including myself, hang out on the forums answering questions about APIs and bugs. So please reach out to us during development with any questions you might have. Thank you.
We've got one more session for you. I hope you're ready.
Those of you who are in the room, those of you online, I think we love all of our sessions equally, like children, over the next two days. But what I will say is I'm really excited about this session. I've been excited about this session since I first talked to these presenters. One of the things that visionOS does incredibly well is the way that it can bring people together and connect them, and as Ethan and Alex are going to show you in just a few seconds, I find that this can be incredibly impactful when we're talking about media experiences and interactive experiences where you're doing things together, whether that's sharing video content or playing a game together or discovering something. So I'm really, really excited to introduce Ethan to you. He is a member of the SharePlay team, and he's going to talk a little bit about how we can connect people through experiences. Ethan.
Hello. My name is Ethan Custers and I'm an engineer on the visionOS FaceTime team. Today I'll be joined by my colleague Alex Rodriguez, and we are really excited to talk to you all about how to build compelling shared experiences for Apple Vision Pro.
One of the most powerful features of visionOS is its deep integration with FaceTime and SharePlay, which allows apps to create a shared context among participants in a group activity.
You can build experiences where people who are both nearby and remote see the same virtual content in the same place, and visionOS does all the heavy lifting of creating that shared space for you.
I'll kick things off today with an intro to SharePlay, the underlying technology that enables shared experiences on visionOS.
Next, we'll take a look at some existing SharePlay experiences available today and how they intersect with the API available to you to build similar functionality into your own apps.
Finally, I'll hand things off to Alex, who will dive into a case study prototyping a shared escape room for Vision Pro.
So what is SharePlay? Well, SharePlay is the core technology that enables real time collaboration with others on Apple platforms. If you've ever played Apple Music or watched Apple TV together with someone on a FaceTime call, you've used SharePlay. Typically, SharePlay sessions happen as part of a FaceTime call. For example, if you're in a FaceTime call and start to play a song in Apple Music, the app will prompt you and ask if you'd like to SharePlay that song with the group. If you decide to SharePlay it, the music app will be launched on everyone's devices and everyone will hear the selected song played in sync across the group together.
visionOS takes SharePlay even further by allowing participants to experience these activities together in a shared coordinate system. So when you watch TV with someone on visionOS, the player window is in the same position for all participants.
I think it will be helpful to take a high-level look at how developers actually adopt SharePlay in their apps, starting with the non-spatial components of a SharePlay session.
When adding SharePlay support to an app, you start with Group Activities, the core framework that powers SharePlay. Within Group Activities, you'll find a number of helpful APIs, but to start, you should be aware of at least two: GroupSession and GroupSessionMessenger. A GroupSession is provided to your app whenever SharePlay is active, and it provides information about the state of the activity. It's also where you'll access the session's GroupSessionMessenger, which is used to send messages between participants to synchronize the app's state.
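In code, those two pieces come together roughly like this; MediaTogether is a hypothetical activity defined just for the sketch.

    import GroupActivities

    // A custom activity your app defines and activates to start SharePlay.
    struct MediaTogether: GroupActivity {
        var metadata: GroupActivityMetadata {
            var metadata = GroupActivityMetadata()
            metadata.title = "Listen Together"
            metadata.type = .listenTogether
            return metadata
        }
    }

    // Wait for sessions of that activity, grab the messenger, and join.
    func observeSessions() async {
        for await session in MediaTogether.sessions() {
            let messenger = GroupSessionMessenger(session: session)
            session.join()
            // Keep `session` and `messenger` around and start syncing state.
            _ = messenger
        }
    }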
An important thing to understand about SharePlay is that state synchronization is entirely up to your app, and it's done in a distributed way. Let's walk through an example of how a music SharePlay session might work.
Here Alex is in a FaceTime call with Willem and Gabby.
Gabby recently started playing music and when prompted, decided to share it with the group, starting a SharePlay session.
Alex's device then receives a message about the new activity, causing FaceTime to launch the music app on his behalf and providing the app with a group session.
At this point, Alex's music app is unaware of Willem and Gabby, but then it joins the group session and the system establishes a communication channel with the other devices.
So this is now a representation of an active music SharePlay session with three participants. Notice that each participant is running their own independent copy of the music app, just as they would outside of SharePlay. So the only thing that makes this a SharePlay session from the music app's perspective is that it was given a group session and has access to a group session messenger.
Next, let's say one participant, Alex, presses pause on his music player. His music app will pause the song locally, and then will send a message to the other participants informing them of the new state. They'll receive that message, process it, and each pause their own local players, so that it feels like Alex has direct control of a single copy of the music app that is just shared among three participants.
If Willem decides to press play, the same thing will happen from his perspective. His local player starts playing and a message is sent to Gabby and Alex so that they know to match his state.
You can imagine then the music app implementing support for a number of messages for controlling play/pause state, track selection, shuffle mode, and more. There might also be state that the music app decides not to synchronize. Maybe the team decides it makes for a better user experience if one person can be reading lyrics without activating lyrics mode for all participants.
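As a sketch of that distributed message flow, with a hypothetical PlaybackMessage type (a real media app would usually lean on AVPlayer's coordinator instead, which we'll get to):

    import GroupActivities

    // Each participant applies changes locally and broadcasts a small Codable message.
    struct PlaybackMessage: Codable {
        enum Action: String, Codable { case play, pause }
        let action: Action
    }

    func pauseTapped(messenger: GroupSessionMessenger) {
        // 1. Pause the local player immediately, e.g. player.pause().
        // 2. Tell everyone else.
        Task {
            try? await messenger.send(PlaybackMessage(action: .pause))
        }
    }

    func listen(messenger: GroupSessionMessenger) {
        Task {
            for await (message, _) in messenger.messages(of: PlaybackMessage.self) {
                // Apply the remote participant's change to the local player.
                switch message.action {
                case .play: break   // player.play()
                case .pause: break  // player.pause()
                }
            }
        }
    }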
A major benefit of this architecture, versus something like screen mirroring, is that Group Activities allows the experience to scale across different platforms. I can be on FaceTime on my Mac and SharePlay the Music app with my friend on an iOS device, and we'll both have a great app experience designed for the platform we're on.
A huge reason to build for visionOS with the native SDK, adopting technologies like SwiftUI and RealityKit directly, is that SharePlay is integrated into many of the APIs you'll be using, and this is especially true for the media experiences that many of you are working on. So if you were building a music player experience, the platform's built-in player API has SharePlay support that handles all of the synchronization I just talked about for you. I'm going to be covering many of these APIs in the next section. But first, let's take a closer look at the spatial component of SharePlay on visionOS.
On visionOS, SharePlay apps are placed in a shared coordinate system, allowing you to make use of a shared context with both spatial and visual consistency.
When you're in a FaceTime call on visionOS using your spatial Persona, you'll be able to point and gesture at content in the shared app, and all other participants will see that content in the exact same place. It really creates an incredible sense of presence with others.
And with visionOS 26, those participants can be a mix of people who are nearby, appearing naturally via pass through and remote people who appear as their spatial persona.
Critically, your app doesn't need to do anything to get this. Windowed apps like Freeform in this video are automatically placed by the system in a single coordinate system that guarantees spatial consistency.
So, as Serenity points to an object on the board, Anish knows exactly what she's referencing, and you can also choose to support multiple platforms. Here I'm joining the call from macOS, and I'm able to seamlessly collaborate with Serenity and Anish.
When building for visionOS, there are two additional APIs you should be aware of in Group Activities. The SystemCoordinator is an API unique to visionOS that enables apps to coordinate the spatial placement of elements in a SharePlay activity.
A key thing you'll configure there is your spatial template preference, which is how you influence where spatial participants are placed relative to your app.
So a typical visionOS FaceTime call with spatial personas begins with them arranged in a circle, allowing participants to easily interact with one another. But at any moment, a participant could start a SharePlay activity and the call will transition into a spatial template, rearranging each participant's spatial Persona relative to that newly shared app.
This placement is handled automatically by the system and is what establishes spatial consistency among participants.
Importantly though, this template is just defining the starting point or seat of each participant. Participants can get up and walk around leaving their assigned seat, but maintaining the spatial consistency.
If a participant recenters by holding down their Digital Crown, or if the app changes its spatial template preference, the participant will be placed back in their seat.
There are a handful of templates provided to you by group activities that your app can use to influence the way spatial personas are arranged relative to your content during SharePlay. And if you're building something that doesn't fit into a system standard template, you can design your own custom template, deciding exactly where each persona in the session should be placed. Alex will cover this in more detail later on. This shared context doesn't just apply to windowed apps either. Immersive apps also benefit from the system's handling of shared placement and the use of spatial templates, allowing you to decide where participants should be placed in a shared immersive space with a shared coordinate system.
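Configuring that on visionOS is fairly compact; here's a sketch, assuming you already have a group session for your activity.

    import GroupActivities

    func configureSpatialSharing<Activity: GroupActivity>(for session: GroupSession<Activity>) async {
        guard let systemCoordinator = await session.systemCoordinator else { return }

        var configuration = SystemCoordinator.Configuration()
        configuration.spatialTemplatePreference = .sideBySide   // or .conversational, or .custom(...)
        configuration.supportsGroupImmersiveSpace = true         // see the group immersive space discussion later
        systemCoordinator.configuration = configuration

        session.join()
    }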
So that's SharePlay, and what makes SharePlay special on visionOS. I think it's time to dig into some more concrete examples. Let's start with media playback.
The TV app is a great way to watch both immersive and non-immersive media together. It really exemplifies how to build a cohesive end-to-end SharePlay experience. A critical piece of TV is synchronized real-time media playback. It's so important that all participants are seeing the exact same frame of video and hearing the same audio at the same time. Fortunately, there are system standard APIs for this.
AVPlayer is an API provided by the AVFoundation framework, and is the most common way to play media on Apple platforms. It's a powerful and flexible API, and in my opinion, one of its coolest features is SharePlay support.
You're already familiar with group session. This is your link to group activities and SharePlay.
So to synchronize media playback across your group, you just pass that group session to your AVPlayer via the coordinate with session method.
With just this handful of APIs, you can create a really compelling shared media experience for visionOS. Your shared window will be placed in the same location for all participants, the correct spatial template will be applied to give every participant a good view of each other and the app, and AVFoundation will synchronize playback as participants pause and play, creating the illusion that all participants are looking at the same video window together. I know many of you are especially interested in Apple Immersive Video playback, and AVPlayer supports shared playback of immersive video as well. When playing content like AIV that has a single good viewpoint, it will automatically hide spatial personas so that all participants can be placed in the ideal viewing location. Playback of the video will still be in perfect sync, and participants will be able to hear each other as they view the content in their private, immersive space.
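The hookup itself is essentially one line; here's a sketch.

    import AVFoundation
    import GroupActivities

    func startSharedPlayback<Activity: GroupActivity>(player: AVPlayer, session: GroupSession<Activity>) {
        // Hand the group session to AVPlayer; play, pause, and seek are now
        // mirrored across every participant's player.
        player.playbackCoordinator.coordinateWithSession(session)
    }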
If you're building a more custom, immersive experience and you'd like to include spatial personas, you'll likely want to use RealityKit's VideoPlayerComponent.
You can use VideoPlayerComponent to embed an AVPlayer into a RealityKit entity and create fully custom video experiences, like 180-degree wraparound video or projecting video onto a 3D model. And the placement of your immersive space will be synchronized across the SharePlay session, meaning all participants will see each other and the entities in the space in the same place.
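A minimal sketch of that setup, with a placeholder URL:

    import AVFoundation
    import RealityKit

    func makeVideoScreen() -> Entity {
        let player = AVPlayer(url: URL(string: "https://example.com/immersive-video.m3u8")!)
        let screen = Entity()
        // Embed the player so the video renders as part of the RealityKit scene.
        screen.components.set(VideoPlayerComponent(avPlayer: player))
        player.play()
        return screen
    }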
Now this is all great if your media is stored on a server or ships with your app. But what if you want to bring personal media into a SharePlay session? Experiences like Photos that involve sharing your own media can be really powerful, especially when combined with the feeling of physical presence provided by spatial personas and nearby sharing. Group Activities provides a dedicated API for use cases like Photos, where participants can share larger files with each other. The GroupSessionJournal enables efficient file transfer between participants and solves common problems, like providing files to participants who joined even after that file was initially shared. You can think of GroupSessionJournal as a companion to GroupSessionMessenger: you should use the messenger for small real-time messages and the journal for larger, experience-establishing file transfers.
I'd really encourage you to think of the types of experiences that the journal unlocks. Participants can bring their own media to an experience, or even create media on the fly that is shared with others.
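Here's a sketch of that pattern, assuming the payload is something Transferable like Data.

    import Foundation
    import GroupActivities

    func configureJournal<Activity: GroupActivity>(for session: GroupSession<Activity>, photoData: Data) {
        let journal = GroupSessionJournal(session: session)

        // Late joiners automatically receive the current set of attachments.
        Task {
            for await attachments in journal.attachments {
                for attachment in attachments {
                    // let data = try await attachment.load(Data.self)
                    _ = attachment
                }
            }
        }

        // Share a larger payload with everyone in the session.
        Task {
            try? await journal.add(photoData)
        }
    }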
Finally, for a bit of extra magic, I recommend checking out the image presentation component. You can use this component to generate a spatial scene from an existing 2D image, and these scenes can be even more amazing when viewed with others.
Possibly my favorite new experience with visionOS 26 is sharing a 3D model with someone nearby. Physically handing a virtual object to someone standing next to me is just truly mind bending.
The gestures used by Quick Look to enable these kinds of interactions are available as APIs, so you can integrate them into your own apps.
If you're building an experience that involves manipulating 3D content, you should start by looking at the ManipulationComponent API offered by RealityKit. Adarsh went in depth on this earlier today. You can use this API to enable rich gestures that will be familiar to your users.
ManipulationComponent is designed for local interactions though, so you'll need to observe your entity's transform as it's being manipulated, and use GroupSessionMessenger to synchronize those updates to remote participants. By default, any interactions done with the manipulation component will only be seen by the person actually performing the interactions.
When you're using GroupSessionMessenger to synchronize real-time interactions like this, I recommend taking a look at the unreliable delivery mode. This removes some of the network overhead involved in the messenger's default mode, so for scenarios like this, where you're sending a lot of transform updates in real time and none of the individual transforms are very important, it's perfect. Use of the unreliable mode is a place where your app can really differentiate itself. It will make your app more complicated, but when done correctly, the performance improvements can be the difference between an experience that feels lifelike and real time, and one that feels slow and immersion breaking.
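Here's a sketch of that approach; TransformUpdate is a hypothetical message type, and how you choose to identify entities is up to your app.

    import GroupActivities
    import RealityKit

    // The quaternion is stored as its SIMD4 vector so the whole struct stays Codable.
    struct TransformUpdate: Codable {
        let entityName: String
        let position: SIMD3<Float>
        let rotation: SIMD4<Float>
    }

    // Unreliable delivery trades guaranteed arrival for lower latency, which suits a
    // high-frequency stream of transform updates.
    func makeRealtimeMessenger<Activity: GroupActivity>(for session: GroupSession<Activity>) -> GroupSessionMessenger {
        GroupSessionMessenger(session: session, deliveryMode: .unreliable)
    }

    func broadcastTransform(of entity: Entity, via messenger: GroupSessionMessenger) {
        let update = TransformUpdate(entityName: entity.name,
                                     position: entity.position(relativeTo: nil),
                                     rotation: entity.orientation(relativeTo: nil).vector)
        Task { try? await messenger.send(update) }
    }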
Game Room is a standout SharePlay experience on the platform. I love getting together with friends who might be across the country and playing hearts or Battleship or chess while catching up and feeling like we're all actually together. This is an experience that works really well with a mix of people who are nearby and remote, and this is partly due to its use of a volume-based virtual tabletop. This design pattern of placing interactive elements on a virtual tabletop is a great one to consider when building any kind of interactive experience for visionOS, whether it's a game or not. With visionOS 2, we released a new purpose-built framework to make it even easier to build experiences like this. It's called TabletopKit.
TabletopKit automatically handles state synchronization during SharePlay by keeping items like player tokens and scenery in perfect sync during a multiplayer session, and it has great support for gestures as well.
If you can think of a way to model your experience in the language of a tabletop game, TabletopKit can solve a lot of problems for you.
Beat Punch is one of my personal favorite SharePlay experiences. It's a great example of an app that really pushes the limits of what's possible with shared experiences on visionOS, and the result is something really special. It's thrilling to look over and see a friend struggling to keep up with the beat in perfect sync with me in my experience, and it scales really well up to five spatial personas, making it great for group get togethers.
I think this example is really relevant for this audience in particular. Even if you're not thinking of building a game, the combination of a custom spatial template with a large virtual environment, where the app is placing personas in custom locations around the space, is really inspiring to me.
Beat Punch is the first app we've looked at today that uses a group immersive space.
Group immersive spaces are really powerful on visionOS. When your app has an active SharePlay session and opens an immersive space, the system can automatically position your space so that its origin is in the center of your spatial template, and all participants have a shared context. To opt into this behavior, you just set the supportsGroupImmersiveSpace flag on the system coordinator, and the system will move your immersive space to a shared origin. This means that with no additional changes to your app, you can go from an incredible solo environment to a shared one.
You can combine this with a custom spatial template to place participant spatial personas in specific positions of your immersive space. So in Beat Punch, they are able to place each player on their own platform while maintaining the immersion of a fully custom environment.
Finally, if you're considering building an immersive interactive experience, I would definitely take a look at AVDelegatingPlaybackCoordinator. This is maybe the API I'm most excited about for this audience.
It's an API from AVFoundation that gives you direct access to the underlying system that powers AVPlayer's sync, and allows you to apply it to any time-based synchronization. So, for example, if your immersive space contains some animations, you can use AVDelegatingPlaybackCoordinator to guarantee that everyone in the session sees the same frame of that animation at the same time.
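A rough sketch of that shape; the animationClock calls in the comments are placeholders for however your app actually advances its own timeline.

    import AVFoundation
    import GroupActivities

    final class AnimationSync: NSObject, AVPlaybackCoordinatorPlaybackControlDelegate {
        lazy var coordinator = AVDelegatingPlaybackCoordinator(playbackControlDelegate: self)

        func join<Activity: GroupActivity>(_ session: GroupSession<Activity>) {
            coordinator.coordinateWithSession(session)
        }

        // The coordinator issues play / pause / seek / buffering commands; apply each
        // one to your own timeline, then call the completion handler.
        func playbackCoordinator(_ coordinator: AVDelegatingPlaybackCoordinator,
                                 didIssue playCommand: AVDelegatingPlaybackCoordinatorPlayCommand,
                                 completionHandler: @escaping () -> Void) {
            // animationClock.play(matching: playCommand)
            completionHandler()
        }

        func playbackCoordinator(_ coordinator: AVDelegatingPlaybackCoordinator,
                                 didIssue pauseCommand: AVDelegatingPlaybackCoordinatorPauseCommand,
                                 completionHandler: @escaping () -> Void) {
            // animationClock.pause()
            completionHandler()
        }

        func playbackCoordinator(_ coordinator: AVDelegatingPlaybackCoordinator,
                                 didIssue seekCommand: AVDelegatingPlaybackCoordinatorSeekCommand,
                                 completionHandler: @escaping () -> Void) {
            // animationClock.seek(matching: seekCommand)
            completionHandler()
        }

        func playbackCoordinator(_ coordinator: AVDelegatingPlaybackCoordinator,
                                 didIssue bufferingCommand: AVDelegatingPlaybackCoordinatorBufferingCommand,
                                 completionHandler: @escaping () -> Void) {
            completionHandler()
        }
    }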
Okay, so that was a lot. Before I hand things off to Alex to take us through the prototyping process for designing a new SharePlay app, I'd like to really encourage you to check out our SharePlay sample code. If you go to the sample code section of developer.apple.com and look for a guessing game, you'll find the Guess Together sample app, which is a really great end-to-end example of how to build a SharePlay experience for visionOS.
I think it will help put a lot of the API I've been talking about in context, and it's the way I recommend all developers get started with SharePlay on the platform.
With that, let's see how we can put these APIs together in a new app my colleague Alex has been working on. Alex.
Thanks, Ethan. So we just took a look at several great SharePlay experiences already available on visionOS today. Now I want to walk you through my process for prototyping a new experience built from the ground up for excellent SharePlay support and collaboration. The first core principle of a great SharePlay experience on visionOS is shared context. Spatial personas have a unique ability to allow participants to interact with virtual content as if it was really physically there. I can point at features on a window or in an immersive space, and everyone will see exactly what I'm gesturing at. I can draw directly on a canvas in Freeform, and everyone will see my strokes appear at the end of my persona's fingertip. I can even pick up a chess piece in Game Room, and everyone will see exactly where I place it on the board.
SharePlay will handle positioning the shared app in the same place for all participants, but it's the developer's responsibility to preserve shared context within your app. Shared context consists of two key pieces. First, visual consistency. Visual consistency means that everyone will see the same UI at the same time. Think back to the Freeform example. Freeform handles placing each sourdough image in the same place for everyone. So when one person resizes a drawing, everyone will receive an update and Freeform will resize the image as well.
Second, spatial consistency. Spatial consistency means all content will be placed in the same place for all participants. Think of the Quicklook example. When one participant picks up the airplane model, the others see it in their hand directly. The developer must preserve both of these types of consistency by syncing transforms and state updates to maintain shared context in their app.
The second core principle is to avoid surprises. SharePlay and visionOS let developers control so much. All of the API we've explored today enables amazing custom experiences, but you can easily overwhelm a SharePlay participant by trying to do too much all at once.
So here are a few good strategies for avoiding surprises in your experience.
First, it's always best to minimize transitions. For example, let participants initiate big transitions like switching their template seat. Beat Punch does a great job of this by offering a dedicated button to allow players to enter their custom spatial template.
Second, consider placements of all content in your experience. Don't assume anything about a player's physical space. They might be in a small bedroom or in a giant conference room. When you place your content, don't require the players to have to physically move to interact with your UI, either present your UI close enough to read and to touch, or offer ways to virtually move throughout your space.
And finally, if your app is immersive, it's always best to start out simple. Try starting with a window or a volume before you transition into your immersive space. Beat Punch starts its SharePlay activity with a window, which gives players a great chance to get set up in their physical space before diving into some pretty intense punching.
And the final principle is to simplify whenever you can. We have talked a ton about all of the excellent API available to build exactly the experience you're imagining, but that doesn't mean you have to customize everything all the time. APIs like AVPlayer give you great SharePlay experiences out of the box, and system templates like Side by Side often are a great fit for the type of window you're trying to share.
So where do I start? Maybe you have a great single player experience you want to bring personas into. Or maybe you're starting from scratch with a new multiplayer story, or maybe you're just excited about a type of interaction you've imagined based off of some new visionOS API. Let me show you how I prototype my SharePlay experiences by bringing together storytelling and the unique capabilities of visionOS as a platform.
Today I'm going to design a sci-fi themed escape room that uses all of the API we've talked about today to tell an exciting story with three great interactive puzzles. My players are a ragtag crew of astronauts adrift in space. Their warp drive is broken, when all of a sudden an alien craft appears.
Players will first work together to decode a secret message the aliens are broadcasting. Then they will work together to use the transmitter to respond to the aliens. And finally, if the aliens are satisfied with their response, they will lend them a battery so that they can repair the warp drive and return to their home planet.
So, to give an overview of the prototype I'm designing, I'm going to start out with a waiting room where players can get set up before they enter their immersive space. Next, I'm going to design the space itself and all of the mechanics necessary for the players to move between puzzles and start solving the escape room. For the puzzles, I'm going to work with AVFoundation APIs to play the aliens' message. Then I'll work with the players' local content to transmit the response. And finally, I'll make each piece of the engine manipulable so players can interactively work together to repair it. But before we throw players into the deep end, I want to make sure that I give them a good, simple experience to start. I'll start my app with a window where players can get started, talk about the game, and hear about the background story of the escape room. So first, I'll start out with a simple text field where players can enter their username. I'll also add a start button so that players can start sharing their game, and when they do, I can call activate on the group activity to start SharePlay. As players join the SharePlay activity, I'll be able to display their username on the window, just like a waiting room in any other game. Then, since I now have an active group session, I can use AVPlayer and its playback coordinator to coordinate with the session for synced playback. Here I can now play a video from Ground Control, where they tell the astronauts about their mission and the broken warp drive. This way, if players arrive late, the team can pause and rewind and make sure that everyone's seen anything they might have missed. And then finally, I'll add a lift off button so that everyone can mark themselves as ready before they jump into the escape room. Once everyone presses it, I'll get a final countdown and we'll jump into the immersive space.
So that's the waiting room. Now I have to build the ship's actual immersive space.
I'm going to start out with a little bit of brainstorming, using Image Playground to get a futuristic spaceship that I have in mind for my escape room. So now that I have it, I can jump into Reality Composer Pro to construct a reality scene just like you just heard about. And Reality Composer Pro will give me all the amazing abilities like scene editing and timelines and animations, but it will also let me place excellent assets for each one of my puzzles throughout my scene. First, I can place a futuristic monitor in one corner where players can go to decode the secret message. Then I can place a radio in another corner where players can transmit the response to the aliens. And then finally I'll have an engine model where players can come together to reassemble the engine before they return home. So with my spaceship complete, I now need to configure my group session to support an immersive space. This way, players will be able to see the ship's immersive space with shared context. When one player starts and enters the ship, all the rest will automatically follow so that everyone preserves their shared context, whether that's the window or the ship itself. And then inside the spaceship, they'll see each puzzle in the same place, and if one points to a puzzle, everyone else will see exactly what they're pointing at.
Now I have the immersive space itself, and I need to design the mechanics for players to be able to solve the escape room. So I'm starting out with three puzzles spread throughout my immersive space: the secret message, the transmitter, and the warp drive. Now I can leverage custom spatial templates, via the custom spatial template preference, to allow players to move throughout the room. I first need a place for players to arrive when they enter the space, so I'll set up a row of seats facing all three of the puzzles. I'll preconfigure five seats so that no matter how many players are spatial in FaceTime, there are always enough seats for everyone to enter the immersive space. Next, I want to be able to let players move from the starting area to any puzzle they want to work on. So to do this, I can set up specific seats next to each puzzle with spatial template roles. In visionOS, participants with spatial personas can have spatial template roles. These roles assign a purpose to a participant, whether that's a specific team on a board game or a presenter in a slideshow app. I can then specify seats with these roles in my custom spatial template, and allow participants to move into the correct position for their role in the shared experience. In the context of my escape room, participants will have roles corresponding to the puzzle they're trying to solve. Then I can add seats next to each puzzle for those corresponding roles. So let's take a closer look at my custom spatial template. I'll start out by putting five seats next to the message monitor. This way, if everyone wants to come work on the monitor at the same time, there's enough space for everyone. From there, I can add five seats by the transmitter and then another five seats by the engine. So for my escape room, players are going to come together to solve these puzzles one at a time. But all of these seats let each player explore the entire immersive space however they want. This way, they can discover which puzzle comes first and the order that they need to solve them in, to truly make it feel like a real escape room. So now if we zoom back out a little bit, all players will start without an assigned role. This means that they will start in the seats that I set up at the beginning. And if a player wants to work on a specific puzzle, for example, maybe the secret message puzzle, my app just needs to assign the correct role to the player, and SharePlay will handle automatically moving that player into an open seat next to the puzzle. But how will players choose which puzzle they want to work on? This is where I can utilize private UI for choosing your position in the room. I can make a custom control panel with buttons for each puzzle in the room, and then I can present my panel with a utility panel window placement.
This way, visionOS will automatically handle presenting my controls close to the player, in a convenient place for them to quickly switch between whichever puzzle they want to work on. Here, I can also abstract away the concept of custom spatial roles to just allow players to pick exactly which puzzle they want to work on, or if they want to be in the audience. So, for example, if one player wants to work on the secret message monitor, they click the monitor button, and my app will assign the role to them. From there, SharePlay will automatically move them into the seats corresponding to the puzzle for the monitor.
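Here's a sketch of what a template like that can look like in code; the offsets are placeholders, only the monitor seats are shown, and the role-assignment call at the end is how a seat gets claimed.

    import GroupActivities

    struct EscapeRoomTemplate: SpatialTemplate {
        enum PuzzleRole: String, SpatialTemplateRole {
            case monitor, transmitter, engine
        }

        var elements: [any SpatialTemplateElement] {
            [
                // The starting row, facing the puzzles.
                .seat(position: .app.offsetBy(x: -2, z: 3)),
                .seat(position: .app.offsetBy(x: -1, z: 3)),
                .seat(position: .app.offsetBy(x: 0, z: 3)),
                .seat(position: .app.offsetBy(x: 1, z: 3)),
                .seat(position: .app.offsetBy(x: 2, z: 3)),
                // Seats next to the message monitor, reserved for that puzzle's role.
                .seat(position: .app.offsetBy(x: -4, z: -1), role: PuzzleRole.monitor),
                .seat(position: .app.offsetBy(x: -3, z: -1), role: PuzzleRole.monitor),
                // ...and similar seats for the transmitter and engine roles.
            ]
        }
    }

    // Applying the template and claiming a seat:
    // configuration.spatialTemplatePreference = .custom(EscapeRoomTemplate())
    // systemCoordinator.assignRole(EscapeRoomTemplate.PuzzleRole.monitor)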
Great. So now we have a rich, immersive space and a great set of mechanics for players to easily move between puzzles and explore the spaceship. Let's jump into building some of the puzzles themselves. We're going to start out with the secret message puzzle. For the first puzzle, players will receive a cryptic transmission from an alien craft. And just like I did in the waiting room, I can use the AVPlayer coordinator to coordinate with the session to sync the video across all the players as the aliens play it.
This time though, we're inside of an immersive space and I already have an excellent 3D asset for my monitor. So here I can use the VideoPlayerComponent to actually render the video itself inside of my monitor model. This way, the secret message puzzle will truly feel like part of the immersive space that players are in. But it's a trick: aliens don't speak English, so the players can't possibly understand what they're saying in the recording. So I'm actually going to encode the secret message via morse code in the monitor's power button. I'll make the monitor light blink the code of the message. But then how will I sync the message for all of the players? I can't just use AVPlayer here because it's not a video. This is where AVDelegatingPlaybackCoordinator comes in. By coordinating with a session, I can sync any time-series content between players in the group session. That way I can play the power light blinking for all players in sync, so that they can work on decoding it together.
So now, if the players successfully decode the message, they'll discover that the question is: does your planet have water? And now we have one puzzle complete, so we can jump into the next: how do the players respond to this message? When players decode the message, they now need to provide proof that their planet has water. So to do this, if a player interacts with my transmitter model, I'll open the photos picker to allow them to transmit a photo from their own private library as proof that their planet contains water.
But I want to give the whole team a say in which photo they choose to respond to the aliens with. So this is where I can use the GroupSessionJournal to share locally generated content, like photos, with other players in the SharePlay session. One player can upload a photo of them swimming, for example, and then my app can distribute that shared photo via the group session journal and display it in everyone's immersive space. So even though we're building an experience for a great SharePlay interaction, we can also utilize all of the other frameworks available on visionOS. So when the team agrees to transmit their image, my app can use the Vision framework to classify the image. I can make a classify image request and check for classified keywords relating to water. I'll look for keywords like harbor or swimming pool or river to see if the photo has the correct information. Then, if the players succeed in choosing the correct photo, I'll apply the image presentation component to convert their picture into a spatial scene. This will give a satisfying conclusion to the puzzle, and really make the photo feel like part of the immersive space. So now we only have one puzzle left: the engine itself.
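Before moving on to the engine, here's a sketch of that water check with the Vision framework; the keyword list and confidence threshold are placeholders.

    import Foundation
    import Vision

    func photoLooksLikeWater(_ imageData: Data) async -> Bool {
        let waterKeywords: Set<String> = ["water", "harbor", "swimming_pool", "river", "ocean"]
        let request = ClassifyImageRequest()
        guard let observations = try? await request.perform(on: imageData) else { return false }
        return observations.contains { observation in
            observation.confidence > 0.6 && waterKeywords.contains(observation.identifier)
        }
    }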
So, thanks to the players' smart decoding, the aliens have been kind enough to donate a charged battery to return to Earth. My first step for this puzzle is to make sure that each piece of the engine becomes interactive. To do this, I can apply the ManipulationComponent to each engine entity to let a player directly grab and reposition pieces into the right configuration. But there's a problem. The ManipulationComponent is designed for local interactions, so if one player comes over to visit another while they're working on the warp drive puzzle, they won't see the pieces move while the other player is holding and repositioning them.
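Before getting to that problem, here's a tiny sketch of the local setup just described, assuming the visionOS 26 RealityKit ManipulationComponent; enginePieces is a placeholder for the loaded entities, and the exact configuration helpers may differ by SDK version:

```swift
import RealityKit

// Let players grab, move, and rotate each engine piece directly.
func makeEngineInteractive(_ enginePieces: [Entity]) {
    for piece in enginePieces {
        // The manipulation gesture also needs input-target and collision
        // components on the entity; a convenience like
        // ManipulationComponent.configureEntity(_:), where available,
        // can add those for you.
        piece.components.set(ManipulationComponent())
    }
}
```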
This is where shared context gets broken, and preserving shared context is so critical to SharePlay on visionOS that we must make sure we sync it. Players will end up very confused if they come over and one player is moving a piece that they can't see moving. So here we need to make sure that we have real collaboration: I need to sync each piece of the engine as it's being manipulated. GroupSessionMessenger is actually a perfect fit for this, since it's easy to send quick updates of the positions of each piece. It even has an unreliable delivery mode so that I can get the best network performance and the smoothest movements as players move the pieces.
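Here's a sketch of that position syncing with the group session messenger's unreliable delivery mode (the message type and piece IDs are illustrative):

```swift
import GroupActivities
import RealityKit
import simd

// A lightweight, Codable snapshot of one engine piece's transform.
struct PiecePositionMessage: Codable {
    let pieceID: String
    let position: SIMD3<Float>
    let rotation: SIMD4<Float>   // quaternion components
}

// Unreliable delivery trades delivery guarantees for lower latency, which
// keeps the pieces moving smoothly on everyone's device.
func makeMessenger<A: GroupActivity>(for session: GroupSession<A>) -> GroupSessionMessenger {
    GroupSessionMessenger(session: session, deliveryMode: .unreliable)
}

// Sender: broadcast a piece's transform while it's being manipulated.
func broadcast(_ piece: Entity, id: String, with messenger: GroupSessionMessenger) async throws {
    let message = PiecePositionMessage(pieceID: id,
                                       position: piece.position,
                                       rotation: piece.orientation.vector)
    try await messenger.send(message)
}

// Receiver: apply incoming transforms to the local copies of the pieces.
func receiveUpdates(with messenger: GroupSessionMessenger, pieces: [String: Entity]) async {
    for await (message, _) in messenger.messages(of: PiecePositionMessage.self) {
        pieces[message.pieceID]?.position = message.position
        pieces[message.pieceID]?.orientation = simd_quatf(vector: message.rotation)
    }
}
```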
This way, as one player moves an engine piece, they can quickly send the position to other players who can then have their apps update the parts in real time. It'll really feel like players are working with the same exact pieces. So now players can finally come together to complete the warp drive and finish the escape room.
So as they finish the escape room, they can return to Earth, and hopefully they make sure the aliens don't follow them, because that would make for a pretty interesting sequel to my escape room.
So earlier, Ethan showed you some of the amazing visionOS apps available today and some of the powerful APIs behind those experiences. And now I've taken you through a prototyping process to conceptualize a similar SharePlay experience.
We started out our prototype using AVPlayer, coordinating it with the session to play a tutorial video. Then I drew out the custom spatial template for our ship, and I talked about the role API I'd use to enable movement around the ship.
And then we used AVPlayer again, but with AVDelegatingPlaybackCoordinator, to sync not just video but my own custom secret message. And then we incorporated GroupSessionJournal so the players could transmit their own private photos as part of their response to the aliens.
And then finally, I used GroupSessionMessenger and the ManipulationComponent together to allow players to repair their warp drive in a truly hands-on way. In just one escape room, we touched on almost all of the APIs Ethan showcased earlier. It really goes to show that some of the richest and most exciting experiences on visionOS bring together many of these tools to build entirely new stories.
So what's next for you? I really hope this talk got you excited to go bring your assets and your media into visionOS. For a great overview of SharePlay, go check out our session from 2023. For custom spatial template information, go check out 2024. And finally, for creative ways to incorporate others nearby with you, take a look at our most recent talk from 2025.
There are countless ways to combine all of the APIs available on visionOS, and spatial Personas and others nearby truly bring another level of presence to your experience. So think back on old stories, or maybe look forward to new ones, because truly, the best experiences are the ones that are worth sharing with others. Thank you.
Oh, gosh. Well, that is almost day one, everybody. Thank you so much for coming. We have one final thing here in the theater today, and that is a Q&A. All day, people have been sending in some amazing questions. I'm going to move around here so I don't get knocked over by a chair. We've got some amazing questions that folks have been submitting and upvoting the ones they're really excited to hear about, so we've got those ready to answer for you today. The panelists you'll see up here are, in fact, the same presenters you saw all day today. I just want to call out one thing: we got a number of questions on Apple Immersive Video production workflows. We're going to hold those questions until tomorrow's Q&A, since all day tomorrow we're going to be going into the depths of production, post-production, all that good stuff, and we want to tie them to those sessions. But for today, we're going to talk about some of the content that we saw on screen and some of your great interactive questions. So without further ado, I'm going to bring our panelists up on stage if they're ready. Possibly. Here we go.
Yes, here we go. We've got Elliott coming back, as well as Tim and Adarsh.
Thank you. And Nathaniel and Ethan. I also want to call out that we'll be doing camera tours for folks in the house at the conclusion of this Q&A, so if you're in group two, that's when we'll go. All right. Well, thanks for joining us back up on stage in a slightly more comfortable position, everybody. Lots and lots of great content today. One of the questions we got asked quite a few times was about the different video formats that are available for visionOS, and about immersive video and what it looks like if you want to create amazing content for Vision Pro but might also want to publish that content elsewhere. What's our philosophy on this? Do you want to take it? You want me to take it? I'll take it. Okay. So obviously, Apple Immersive Video is the highest quality of immersive video.
If you're editing your Apple Immersive Video in Resolve, there are presets to export the bundle format as well as the final delivery format. They've also added a preset for VR180, so you can start up here with a really high resolution and still end up with a VR180 output that would work on other platforms. That's really cool. And you touched on this a little bit in your presentation earlier, Tim, but can you go into a little more detail? I think Wei asked the question: what is the difference between 180 stereo video and Apple Immersive Video? Wow, such a huge difference. If I were to choose one word, it's metadata. Absolutely. So regular 180 video, as I mentioned earlier this morning, is just a half-equirect projection, and really it has one piece of metadata that says, hey, I'm half-equirect. That's it. With Apple Immersive Video, we try to maintain the actual fisheye of the original image that's recorded by the camera. So the image is fisheye.
The coolest thing about it is that at the factory, the cameras are calibrated, and there's this little calibration file called an LPD, which is a lens calibration. That is written into each file as you record on the camera, and it's really important that that little file, the LPD, follows all the way through post-production until the final deliverable package. So what happens is when the Vision Pro plays the file, it reads the LPD and says, okay, I know that this exact lens has these exact parametric distortions, and it can project the image correctly on a shot-to-shot basis, even if you combine 3 or 4 different cameras. And then we have all kinds of other cool stuff like dynamic transitions and dynamic flips and the edge blends and static foveation. There's so much going on. It's very rich in metadata and, of course, extremely high resolution. Yeah, I'm very excited about the static foveation. Yeah. Tomorrow we'll have Ryan Sheridan joining us on stage, so he'll be able to give everyone the deep dive on that format. It's pretty powerful. Yeah. Elliott, speaking of the formats, I know that we've got quite a number of them available on visionOS right now. Do we have any samples that we can share with people so they can check them out? Yeah, that's a good question. If you're looking to test out these formats, we do have sample streams available on Developer.apple.com. If you search for HLS stream examples, you should find some pretty cool examples across APMP content, AIV content, spatial content.
And if you're on the front end of that, if you're looking for what those files look like and how they work, I'd invite you to check out some of the sample content you can download from the Blackmagic Design website. That way you can get your hands on some B-roll if you don't have access to an Apple Immersive camera at the moment. That's awesome. Let's switch over to interactive for a moment. Nathaniel, you did a great job talking us through the ins and outs of Xcode and Reality Composer Pro. We had a couple of questions, but I want to call out one from Jerome about how you define whether the project starts as fully immersive or starts in a window. Where is that, for people who are newer to these platforms? So that's going to be in the Info.plist of your Xcode project. I believe it's in the Application Scene Manifest section of your Info.plist, and it's the preferred default scene, or something like that. You'll need to set that for, say, a volumetric window or an immersive scene. And if you recall, in the demo, the first version was this fully immersive scene, and I had to transition that to a volumetric window for the second demo. So I went into the Info.plist, found that key-value pair under the Application Scene Manifest, and changed it. One thing I will call out is that you need to make sure you have the correct window group in your app, or else you'll get an error. But yeah, that's all you need to do to convert your app to one of the other types of scenes. Awesome. And Adarsh, as a technology evangelist, do you have any sessions that you can recommend to people who are excited to learn more about starting to create immersive media? Oh, there's so many! Well, I would definitely recommend checking out the SwiftUI sessions. There's a lot of them. Like, with visionOS 26, there's Better Together: SwiftUI and RealityKit. It really gives you a good sense of 3D interfaces interacting with 3D content. A really useful session for a lot of the storytellers here in the room. Awesome. Talking a little bit about resolution here: you know, Tim, you called out that the Blackmagic Cinema Immersive camera has an incredible resolution. And also, if you're thinking about creating high-resolution content for a custom experience, whether that's 3D content or existing formats you're bringing in, I know we can use Metal via Compositor Services to get a little bit higher resolution to the device. Can you talk a little more about how you can push the resolution if you want to create, say, a fully immersive game or share some existing media? Yeah, there's actually a new render quality API in Compositor Services, and you can tweak the render quality value. If you set it to one, it will give you the highest resolution. It also works with foveation, so it uses head-based foveation when you set it to one. But it will come at a cost: you'll consume some GPU and CPU resources and memory, so make sure that you profile your app when you do that. But yeah, that should get the frame buffer up and give you a high resolution. Awesome. Ethan, I want to throw a SharePlay question your way. Yuri reached out and was asking: when you're doing a shared experience and thinking about latency and data rate, what should you keep in mind and design for when you're considering fast, quality interactions? Yeah.
So in general, we design SharePlay, and the group session messenger API in particular, for these kinds of real-time interactions. So I would start off just using the group session messenger in its default mode. But then as you start to push things and you're doing things more in real time, like I said in our session, take a look at the unreliable delivery mode. You can think of this as a bit like a TCP/UDP divide, where we remove some of the network overhead to reduce latency, but also provide fewer guarantees on the delivery of the message. So a mix of those is what we tend to use to create kind of the best, most real-time-feeling experiences. Awesome. Let's go back to video for a second. We got a couple of different questions on capturing different video types.
Elliott, I... actually, no, Tim, I'll start with you and then I'll go to Elliott. Tim, Steve wants to know: do you have an ideal workflow recommendation for shooting and editing spatial video from iPhone 17 Pro? Well, I would recommend Final Cut Pro.
I'm shocked. I'm shocked, I tell you. Resolve also supports it as well. With Final Cut Pro, I think we did a pretty good job at designing a very easy workflow, so you can just bring that spatial video in; it just looks like regular 2D video when you first bring it in. But then there are a bunch of different viewing modes. So even if you don't have, you know, a Vision Pro to check stuff, as an editor you can still edit, even with anaglyph glasses, and there's a difference mode. All that stuff is just in there in the viewer. And then the really cool thing is you can add titles, you have convergence control, you can place those titles anywhere in space, and then export a video directly to MV-HEVC. That's awesome. Are there any tips you'd share for folks who are shooting on iPhone specifically as they bring it into this post workflow? Get closer with the iPhone for spatial, but not too close. You're shooting with two lenses, but they're of different focal lengths, so they actually have different depth of field. So if you're about that far away from your subject, you're going to get pretty decent depth but still have everything pretty much match as far as field of view. As soon as you start getting closer and the focus comes in closer, you'll notice that the depth of field is a lot shorter on the 1x lens than it is on the 0.5x lens. So there are a couple of things to watch out for. And maybe if you are looking to get closer, some of the Canon lenses can do that really nicely. Yes, the distance between the lenses is even a little bit smaller than it is on iPhone, so you can get just a fraction closer. Yeah, and those lenses are exactly the same.
Cool. You're talking about other ways to capture spatial video. Mark Swanson had a particularly fun one: how is spatial video capture using Vision Pro itself? What are the opportunities there? Should I take that? Yeah. Yeah, I'm looking at you. It's probably overlooked, right? Shooting spatial video on Apple Vision Pro, because a lot of people want to go to a device they're used to shooting with. I think one of the interesting things, if you did shoot spatial video on Apple Vision Pro, is that you'll get a slightly taller aspect ratio, so roughly 1:1; I think it is 2200 by 2200, still 30 frames per second. That slightly taller aspect ratio can be pretty helpful at revealing just a little bit more of the scene than perhaps you get on iPhone or on some of the Canon lenses. So yeah, I mean, if you were going to do something creative where you wanted a POV, you know, someone cooking or maybe doing a sport of some sort, and you're willing to wear a Vision Pro whilst you're capturing it, then it could be a pretty cool thing to check out. That's awesome. I hadn't thought about this before, and now I kind of want to try it, although maybe not with the sport. I feel like that might be challenging. Thinking about how to bring video into an interactive experience, Adarsh, I'm looking towards you and Elliott; you might have some tips here as well. Michael was asking: is there a way to use transparency with 3D video inside an app? For instance, to have a 3D video of somebody appear in the viewer's space? From what I know, I think you'll have to create a custom material or custom shader to render transparency, and you would use a custom shader in RealityKit or Unity. I think I would start there. Do you have something you want to add? Yeah, I mean, depending on where the question is coming from, if you have an idea and you just want to execute it, I think the biggest gap we sometimes see is if you're working in a creative capacity: try to find a really keen developer, someone who's comfortable in Xcode with RealityKit, or perhaps a game dev in Unity. They'll be comfortable with that type of language and the engineering documentation that Adarsh and the team have, to be able to get to the bottom of it and hopefully bring it to life. Hopefully that's helpful. Yeah, I think that makes a lot of sense. For developers who do want to include other video formats in their experiences: is there a way to have an entity in the entity component system for wide-field-of-view or APMP video? They're trying to understand what you need to do to show APMP video inside a RealityKit experience.
I mean, RealityKit supports video playback across a lot of formats, and there's a feature called VideoPlayerComponent, and VideoPlayerComponent also has support for wide-field-of-view video.
So I would start there. I think that should give you good support. If you want to learn more, I would also point you to the Support Immersive Video Playback on visionOS session; that's a WWDC session from this year. Do check that one out. It will give you a good understanding of what VideoPlayerComponent supports. Yeah, and that shows you how to adopt it into your apps as well. Correct, yeah, it goes through all the APIs. Very cool.
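For reference, a minimal sketch of that approach, with a placeholder videoURL. The component reads the projection from the media itself, and the immersive-presentation options it exposes vary by OS version, so treat this as a starting point rather than the exact API surface:

```swift
import AVFoundation
import RealityKit

// Attach a projected (e.g. 180/360/wide-FOV) video to an entity
// using RealityKit's VideoPlayerComponent.
func attachVideo(to entity: Entity, from videoURL: URL) {
    let player = AVPlayer(url: videoURL)
    entity.components.set(VideoPlayerComponent(avPlayer: player))
    player.play()
}
```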
Thinking again about more interactive experiences, Nathaniel, I think I've got one for you. Frank is asking: can visionOS find the light sources in the room and use them to add light and shadow to virtual objects? Sure. So one of the ways that Vision Pro is able to render things very realistically in your space is that it can take the light that's in your room and create an IBL, or image-based light, to light models and materials in a way that the color is realistic in the environment. But I think what this question is getting at is maybe detecting a stage light in your environment and then having it cast a shadow. That isn't something that happens automatically on Vision Pro. So you'll get these really accurate colors, but you're not going to get out-of-the-box shadows cast from, say, a point light like a lamp that you have in your room. I wouldn't say it's not possible to achieve; I'm sure there are clever solutions, but it's not something we offer out of the box. Yeah, that makes a lot of sense. Ethan, when you're thinking about doing SharePlay for immersive video on visionOS, can you talk a little bit, just at a broad scale, about what it looks like when you're moving into AIV? How is that different from some of the experiences we showed on stage earlier? And then also, Oli had a specific question about handling drift and late joiners. Yeah, yeah.
So what's great is that because AVPlayer supports AIV out of the box and AVPlayer has great SharePlay integration, a lot of this doesn't change when you go to AIV. So I'll talk about the AVPlayer integration first. I talked about that a bit in the session, but when you have a group session, you can coordinate your AVPlayer with it, and it handles things like late-join catch-up or somebody seeking the video. It's also optimized for nearby sharing. So when you have two Vision Pros in the same room that are nearby sharing with each other, the latency tolerances are very tight, so that audio playback is very in sync and you don't hear bleed or anything. So there's a lot of great optimization built into AVPlayer. When you bring AIV in, all of that still applies: you get the great sync and audio playback and catch-up support. But whenever you're watching content like AIV that has a single good viewpoint, we will hide spatial Personas. So each participant will be in, like, a private immersive space. They'll still be able to hear each other and it'll be a shared experience, but they won't be able to see each other until they leave that video. And there will be an indication in the UI to the other participants when people enter and leave that immersive space. That's awesome. All right, we've got time for a few more questions, so we'll call this a lightning round. Tim, here's one from Christian relating to Final Cut export; maybe you can help. Christian's running into an issue where they're exporting stereoscopic video from Final Cut, but currently it's playing back as two side-by-side video screens. Do you have any idea what's going on there? It sounds like it's not MV-HEVC. So there are two ways to share your stereo, your spatial project, in Final Cut Pro. What you want to look for is the Apple Vision Pro preset: you go to the Share menu, File, Share, Apple Vision Pro, and then you'll automatically get MV-HEVC. If you do a regular File, Share, Export, that's really good for archiving, and it outputs side-by-side, so you can archive to ProRes and keep that at its highest quality without going down to HEVC. So that's probably all that's happening. Awesome. That could be a good bit of tech advice to close out some of our questions. Now, before we go, I've got a question for all of you. We've talked about some really cool things on stage. Is there anything that you wanted to mention in your presentation but didn't get to, some cool thing about what you spoke about today that you want this audience to know about that we didn't have time for? Elliott, I'm going to spring it on you first.
It's a good question. I think a lot of creators have a lot of questions around the Blackmagic URSA Cine Immersive for Apple Immersive Video, and one of the things that we didn't talk about, that I would encourage all creators to do, is obviously to get out and test things, right? Try out ideas, figure out what works. Previz, as we showed today, is one way of doing that. But with Apple Vision Pro also supporting APMP, it's not a bad idea to head out with perhaps a camera that you already have, like one of the Canon VR lenses, and just start testing, you know, bringing that into the ecosystem. The more iteration we can do as a community, and the more we can share footage, the better. I thought Victor from Target did a great job at encouraging us to just share and be open about our projects. And that might be a lower bar for entry if you just want to play around and test things. Yeah, maybe a little thought. I think that's great. Tim, I know there are 70 different things in AIV. I can think of one, though, that I should have had a slide for. So I mentioned that you can uplift your archival VR180 and 360 videos, you know, from the past decade or so, right? And I mentioned that you can use avconvert, which is either a command-line app or available through the Finder: you can just right-click, bring up the contextual menu, say encode the selected video files, and choose MV-HEVC. But what I forgot to mention is that if you just put one of those Google spherical files into visionOS (if it's visionOS 26), it first goes into Photos, and then if you export it, say export unmodified file, and put it in Files, you'll get an automatic conversion to MV-HEVC APMP. So yeah, it's a lot easier to do it on a Mac, but still pretty fun. I wish I had mentioned that at our show.
So, remember I had mentioned that you can have persistent experiences, so you can leave windows and volumes pinned to surfaces, and when you come back, even after a reboot, everything's in the same place where you left it. You can also do that with widgets, which I didn't cover today, but widgets could be a really interesting way to bring your interactive characters into people's rooms, something that changes every day or maybe evolves over time. So, something to look into. It's so seamless; it blends right into your wall or onto your desk. It's a really cool feature, which I couldn't go into today because of time and such, but do check that out. Awesome. Cool. Nathaniel? Yeah. So I'm a programmer and game developer at heart, and one of the things I really would have loved to talk more about in this demonstration is how we built all the different shaders we created for Petite Asteroids, especially the squash-and-stretch shader for the effect. But I just want to highlight that these are available to you on the Apple Developer website, so you can download the complete project for Petite Asteroids. We also have another sample that we released, Canyon Crosser, and these use so many different pieces of all the different Apple frameworks, and show how they work together in a way you don't really get from just a small or isolated sample. So I really hope that you guys are able to check out the complete code base on your own and are inspired by it. Yeah. Awesome.
Yeah. So Alex and I covered a lot of APIs, but one of my favorites we had to cut. There's a new API in visionOS 26 for dynamic scene association. Scene association is specific to SharePlay on visionOS; like when we showed that Freeform example, it's what gives you that spatial consistency. You have one scene that's associated with the activity so that everybody sees it in the same place, and new in visionOS 26, you can dynamically change which scene is associated. So you could start with a vertical scene and switch to a volumetric scene, or you could start with a vertical scene and enter an immersive space and keep the vertical scene associated. So you have a lot more power and flexibility in visionOS 26. It's just exciting. That's really cool. And thinking about the prototype that you and Alex were playing around with, the escape room, I can definitely see some cool opportunities there. Yeah, it unlocks a lot. Awesome. Well, thank you all so much for joining us back on stage again. And thank you all for coming to our first day of media experiences. Tomorrow we will be back at 10 a.m. Pacific, both in person and online, to dive deep into Apple Immersive Video, and I am not using the dive-deep metaphor lightly. I think we really have a full day of AIV content for you all, so we're really excited to share all of that. So please join us again tomorrow, both in the room and online. And for now, to the folks online, thank you for sticking around. If you're calling in, viewing, live streaming... what is this, a radio show? If you're live streaming and it's late out there, have a good night. If it's your dinner time, go have some dinner. Otherwise, good morning and good afternoon to all of you. We'll have some camera tours right after this; please check the back of your badge. We'll start with group two, and also a nice little community mixer. See you all out there.
-