WWDC 2023

Hands-on with Apple Vision Pro: This is not a VR headset

This was the best headset demo I’ve ever seen. But there’s room for improvement.

The best part was the interface

Here you can see cameras on the bottom of the headset that read your body language for the interface and for FaceTime calls.
Samuel Axon
I was able to touch, examine, and wear the headset. An Apple representative walked me through the basic interface, and I browsed a home screen full of apps.

Vision Pro’s interface is all about eye tracking. Whenever you look at a UI element (like an X to close a window or a photo within a gallery in the Photos app), it is subtly highlighted in your view. To actually make a selection—to click, if you will—you simply tap two of your fingers together. You don’t have to hold your hand in front of the headset to do this; as long as your hand is not hidden completely behind you, it can be pretty much anywhere. To scroll, you pinch and move your fingers up and down or side to side. It feels a bit like pulling a string to open window blinds.

In my testing, the eye tracking was perfectly accurate and responsive. It reminded me of using a similar feature in PlayStation VR2, but it felt just a bit more accurate. If you’ve used well-implemented eye tracking in VR before, you know it becomes intuitive and natural almost immediately.

I’ve used headsets that required hand gestures, but they never felt very natural. With Vision Pro, it feels just right. The fact that your hand can go anywhere, and that you can pinch subtly instead of making some kind of dramatic gesture, goes a long way.

If you’ve used a Meta VR headset, a PlayStation VR, or almost any PC VR device, you know how awkward it can be to carry controllers in your hands. Now that I’ve used Apple’s interface, it will be hard to go back to using controllers again. This approach is not only more immersive; it’s much more practical.

I was able to launch windows for multiple apps and arrange them around me. Moving one involved simply gazing at a small white line beneath it, pinching (the equivalent of holding down the left mouse button on a desktop), and turning my eyes to where I wanted the window to go. I was able to place windows in an array around me, and I was even able to overlap them. Whichever one I looked at appeared in front in that moment. All of this worked well, and I had no complaints. In this respect, there’s nothing to criticize: Apple has nailed the interface.

There’s one other aspect to the interface worth noting: adjusting your immersion level.

Turning the headset’s digital crown smoothly transitions between total immersion at one extreme and absolute passthrough at the other. The headset captures your surroundings (depth perception included) and displays them on the two screens in front of your eyes. When you’re turned all the way to passthrough, you see what you’d see if you weren’t wearing Vision Pro at all—albeit a bit darker and with just the slightest bit of softness.

As you turn the knob from immersion to passthrough, the digital elements slowly crossfade to whichever in-between state you want; it’s like changing the transparency level on a UI element in a 2D interface. If you turn the crown all the way, the digital objects disappear completely via a sort of vignette effect—kind of like those transitions in Star Wars where the initial shot disappears into a shrinking circle, revealing the next scene.

Further, Vision Pro recognizes when someone is standing or sitting near you and crossfades them into partial view, even if you’re far along the scale toward immersion. This is effective but very surreal. The face of an Apple rep sitting next to me was clearly visible, but he looked like a semi-transparent apparition floating strangely in my virtual reality environment—it was a bit like the visual effects you’ve seen movies use to represent ghosts or spirits, sort of there but sort of not.
