2. Being There
• Conventional 3D graphics is cinematic
– Shows you something
• On a display, in your environment
• VR graphics is immersive
– Takes you somewhere
• Controls everything you see, defines your environment
• Very different constraints and challenges
3. Realism and Presence
• Being there is largely about sensor fusion
– Your brain’s sensor fusion
– Trained by reality
– Can’t violate too many hard-wired expectations
• Realism may be a non-goal
– Not required for presence
– Expensive
– Uncanny valley
5. Low Persistence
• Stable image as you turn - no motion blur
• Rolling shutter
– Right-to-left
– 3ms band of light
– Eyes offset temporally
6. Positional Tracking
• External camera, pointed at user
• 80° x 64° FOV
• ~2.5m range
• ~0.05mm @ 1.5m
• ~19ms latency
– Only 2ms of that is vision processing
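To get a feel for what an 80° x 64° field of view covers, here is a rough sketch that computes the cross-section of the camera's view at a given distance. It assumes an ideal pinhole camera with a symmetric frustum, which the slides do not state; `frustum_cross_section` is a made-up helper name.

```python
import math

def frustum_cross_section(h_fov_deg, v_fov_deg, distance_m):
    """Width and height (metres) of the camera's view at a given distance.

    Assumes an ideal pinhole camera with a symmetric frustum (an assumption,
    not something the slides specify).
    """
    width = 2.0 * distance_m * math.tan(math.radians(h_fov_deg) / 2.0)
    height = 2.0 * distance_m * math.tan(math.radians(v_fov_deg) / 2.0)
    return width, height

w, h = frustum_cross_section(80.0, 64.0, 1.5)
print(f"{w:.2f} m x {h:.2f} m at 1.5 m")
```

Under those assumptions the tracked cross-section at 1.5m is roughly two and a half metres wide and a bit under two metres tall.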
8. Image Synthesis
• Conventional planar projection
– GPUs like this because
• Straight edges remain straight
• Planes remain planar after projection
• Synthesis takes “a while”
– So we predict the position / orientation
– A long range prediction: ~10-30ms out
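One simple way to make that kind of prediction is to extrapolate the current orientation using the measured angular velocity, held constant over the prediction interval. The sketch below does that with quaternions. It is only an illustration under that constant-rate assumption, not the tracker's actual predictor, and every name in it is hypothetical.

```python
import math

def quat_mul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )

def quat_from_rotation_vector(rx, ry, rz):
    """Unit quaternion for a rotation of |r| radians about the axis r/|r|."""
    angle = math.sqrt(rx * rx + ry * ry + rz * rz)
    if angle < 1e-12:
        return (1.0, 0.0, 0.0, 0.0)
    s = math.sin(angle / 2.0) / angle
    return (math.cos(angle / 2.0), rx * s, ry * s, rz * s)

def predict_orientation(q_now, omega_rad_s, dt_s):
    """Extrapolate orientation dt_s seconds ahead.

    omega_rad_s is the angular velocity in the head's own frame, assumed
    constant over the interval (the simplifying assumption of this sketch).
    """
    delta = quat_from_rotation_vector(
        omega_rad_s[0] * dt_s, omega_rad_s[1] * dt_s, omega_rad_s[2] * dt_s
    )
    return quat_mul(q_now, delta)

# Head turning at 2 rad/s about the vertical axis, predicted 20 ms out.
q_pred = predict_orientation((1.0, 0.0, 0.0, 0.0), (0.0, 2.0, 0.0), 0.020)
print(q_pred)
```

The further out the prediction, the more any change in angular velocity during the interval shows up as error, which is why a 10-30ms prediction counts as long range.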
9. Note on Sample Distribution
• Conventional planar projection is not great for very wide FOV
– Big angle between samples at center of view
10. Alternative Sample Distributions
• Direct render to cube map may be appealing
• Tiled renderers could do piecewise linear
– Brute force will do in the interim
– But not much FOV room left at 100°
13. Optical Distortion
• HMD optics cause a different sample distribution, and chromatic aberration
• Requires a resampling pass
– Synthesis distribution -> delivery distribution
– Barrel distortion to counteract lens’s distortion
• Could be built in to a “smarter” display engine
– Handled in software today
• Requires either CPU, separate GPU, or shared GPU
14. Display Engine (detour)
• In modern GPUs, the 3D synthesis engine builds buffers to be displayed
• A separate engine drives the HDMI / DP / DVI output signal using that buffer
• This engine just reads rows of the image
• More on this later…
15. Time Warp
• Optical resampling provides an opportunity
– Synthesized samples have known location
• Global shutter, so constant time
– Actual eye orientation will differ
• Long range prediction had error
• Better prediction just before resampling
• Both predictions are for the same target time
• So resample for optics and prediction error simultaneously!
• Note: This just corrects the view of an “old” snapshot of the world
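The orientation part of that correction can be sketched as follows: take the ray for a display sample in the newly predicted head frame, rotate it into the frame the image was synthesized in, and project it back onto the rendered image plane. This is a minimal illustration that assumes a camera looking down -Z with +X to the right, uses yaw-only rotations for the demo, and ignores translation and the optics. All names are hypothetical.

```python
import math

def yaw_matrix(angle_rad):
    """Rotation about +Y; positive angles turn the view to the left."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def transpose(m):
    return [list(row) for row in zip(*m)]

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def mat_vec(m, v):
    return tuple(sum(m[i][k] * v[k] for k in range(3)) for i in range(3))

def timewarp_sample(x, y, r_render, r_latest):
    """Where in the rendered image to fetch the display sample at (x, y).

    (x, y) are tangent-plane coordinates (x = tan of the horizontal angle).
    r_render is the orientation the image was synthesized with, r_latest is
    the better prediction made just before resampling.
    """
    ray_in_latest_frame = (x, y, -1.0)
    # Rotate from the latest head frame into the frame used for synthesis.
    delta = mat_mul(transpose(r_render), r_latest)
    dx, dy, dz = mat_vec(delta, ray_in_latest_frame)
    return (dx / -dz, dy / -dz)

# The head ended up 1 degree further left than the long-range prediction.
r_render = yaw_matrix(math.radians(10.0))
r_latest = yaw_matrix(math.radians(11.0))
print(timewarp_sample(0.0, 0.0, r_render, r_latest))
```

In this example the centre of the display fetches from slightly left of the centre of the rendered image, which is where the world actually is for the corrected view direction.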
16. Time Warp + Rolling Shutter
• Rolling shutter adds time variability
– But we know time derivative of orientation
• Can correct for that as well
– Tends to compress sampling when turning right
– And stretch out sampling when turning left
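Here is a small one-dimensional sketch of why that happens. Each column lights at a different time, so the head has turned by a different amount for each one. The sketch assumes a right-to-left scan, a constant yaw rate, equal angle per column, and a scan-out that takes a full 75 Hz frame period. The last two are simplifications chosen for illustration, not figures from the slides.

```python
def column_light_time(col, width, scanout_s):
    """Seconds after scan-out starts that a column lights (right-to-left)."""
    return (width - 1 - col) / (width - 1) * scanout_s

def world_yaw_of_column(col, width, h_fov_deg, yaw_rate_right_deg_s, scanout_s):
    """World-space yaw (degrees, positive to the right) seen by a column."""
    # Angle of the column within the head's view, positive to the right.
    head_angle = (col / (width - 1) - 0.5) * h_fov_deg
    # How far the head has turned by the time this column lights up.
    head_turn = yaw_rate_right_deg_s * column_light_time(col, width, scanout_s)
    return head_angle + head_turn

def sampled_span_deg(width, h_fov_deg, yaw_rate_right_deg_s, scanout_s):
    """World-space angle covered between the leftmost and rightmost column."""
    right = world_yaw_of_column(width - 1, width, h_fov_deg,
                                yaw_rate_right_deg_s, scanout_s)
    left = world_yaw_of_column(0, width, h_fov_deg,
                               yaw_rate_right_deg_s, scanout_s)
    return right - left

SCANOUT_S = 1.0 / 75.0  # assumed scan-out duration
for rate in (0.0, 100.0, -100.0):  # still, turning right, turning left
    print(rate, sampled_span_deg(1920, 100.0, rate, SCANOUT_S))
```

Turning right, the leftmost column lights last and has been carried furthest to the right, so the sampled span shrinks. Turning left, the same effect widens it.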
17. Asynchronous Time Warp
• So far, we have been talking about 1 synthesized image per eye per display period
– @75 Hz, that’s 150 Hz for image synthesis
– Many apps cannot achieve these rates
• Especially with wide-FOV rendering
• Display needs to be asynchronous to synthesis
– Just like in conventional pipeline
– Needs to be isochronous – racing the beam
– Direct hardware support for this would be straightforward
18. Asynchronous Time Warp
• Slower synthesis requires wider FOV
– Will resample the same image multiple times
• Stuttering can be a concern
– When display and synthesis frequencies “beat”
– Ultra-high display frequency may help this
– Tolerable synthesis rate still TBD
• End effect is, your eyes see the best information we have
– Regardless of synthesis rate
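The “beat” between display and synthesis frequencies is easy to see with a toy schedule. The sketch below works out which synthesized image is the newest one available at each display refresh, and how many refreshes each image gets resampled for. It is idealized: image i is assumed to be ready at exactly i / synth_hz with no other latency, and the function names are invented for this example.

```python
from collections import Counter

def frames_shown(display_hz, synth_hz, num_refreshes):
    """Index of the newest finished synthesized image at each refresh.

    Integer arithmetic avoids floating-point rounding at exact multiples.
    """
    return [(k * synth_hz) // display_hz for k in range(num_refreshes)]

def repeat_counts(display_hz, synth_hz, num_refreshes):
    """How many consecutive refreshes each synthesized image is reused for."""
    counts = Counter(frames_shown(display_hz, synth_hz, num_refreshes))
    return [counts[i] for i in sorted(counts)]

print(repeat_counts(75, 25, 15))  # synthesis divides the display rate evenly
print(repeat_counts(75, 30, 15))  # synthesis does not divide it evenly
```

When the synthesis rate divides the display rate evenly, every image is reused the same number of times. When it does not, the reuse count alternates, and that uneven cadence is one way the stutter shows up.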
You see a movie, watch a TV show, even play a 3D game. You’re not there. You’re looking at a sequence of images captured by a physical or virtual camera.
And you’re looking at it displayed on a screen of some sort, somewhere in your environment.
When you enter virtual reality, the system provides the environment directly. Total ocular override.
It’s a big responsibility. With substantially different constraints and challenges.
Most of you are experiencing reality at this very moment!
A variety of sensors are telling you what your environment looks like, sounds like, the temperature, direction of gravity, orientation of your body, eyes, approximate rates of motion…
Lots of stuff. And it does it at a pretty incredible rate, pretty much all the time you’re conscious. It’s exhausting.
But reality trains you in what to expect.
Ideally we would control all these inputs and provide believable stimuli. In practice we have to start with the most important ones first, and figure out what kinds of margins we have for error on them.
Inputs that violate our hard-wired expectations can often result in unpleasant user experiences.
The interesting thing is, obviously synthetic virtual environments don’t detract from presence. And in many ways, going after total visual realism can be a distraction that doesn’t enhance the user’s experience.
The DK2 makes some substantial and important improvements over DK1.
Specifically, resolution, low persistence, higher refresh rate, and full 6DOF tracking.
For each pixel, we can predict which direction vector it corresponds to at the time the pixel lights up.
This is dead simple with a global shutter, but not too bad with a rolling shutter.
Start with some IR LEDs on the HMD.
Add in a USB camera.
Plus a little invisible software, and alakazam! You have position tracking!
Dov Katz might kill me for pretending it’s that simple. Suffice it to say, it’s not. But also, you don’t need to worry about it.
The mechanisms will change, but mostly developers don’t have to care.
For wide FOV, conventional planar projection has awful sample density in the view direction, and great sample density at the periphery.
Notice the angle subtended by each dash in the diagram. Larger angles (like in the center) mean poorer sampling density.
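To put numbers on that, here is a short sketch that computes the angle each pixel subtends under a planar projection, where pixels are evenly spaced on the image plane rather than evenly spaced in angle. The 100° field of view and 1000-pixel width are just example inputs, and `pixel_angles_deg` is a hypothetical helper.

```python
import math

def pixel_angles_deg(fov_deg, num_pixels):
    """Angle (degrees) subtended by each pixel across a planar projection."""
    half_width = math.tan(math.radians(fov_deg) / 2.0)
    # Pixel boundaries are evenly spaced on the image plane at distance 1.
    edges = [(-1.0 + 2.0 * i / num_pixels) * half_width
             for i in range(num_pixels + 1)]
    return [math.degrees(math.atan(edges[i + 1]) - math.atan(edges[i]))
            for i in range(num_pixels)]

angles = pixel_angles_deg(100.0, 1000)
center = angles[len(angles) // 2]
edge = angles[0]
print(f"center: {center:.4f} deg/pixel, edge: {edge:.4f} deg/pixel")
print(f"center/edge ratio: {center / edge:.2f}")
```

The ratio between center and edge approaches 1 + tan²(FOV/2), so it grows quickly as the field of view widens. That is the opposite of what you want, since the center of view is where detail matters most.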
Coming from the other side, the optics tell us the ray direction that each pixel corresponds to.
And it’s not the same as the planar projection.
And because of chromatic aberration, the ray is different for each component of each pixel.
This picture shows a DK2 eyepiece resting on a monitor. You can clearly see both the pincushion effect and the chromatic aberration that the lens produces.
The barrel distortion you see when looking at HMD rendering without the optics cancels out this pincushion effect.
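A common way to model this in the resampling pass is a radial polynomial: for each display sample, scale its distance from the lens center to find where to fetch from the synthesized image, with a slightly different scale per color channel. The sketch below shows that idea only. The coefficients in `K` and `CHROMA` are illustrative placeholders, not measured DK2 values, and the function names are made up.

```python
# Illustrative placeholder coefficients; real values come from lens calibration.
K = (0.22, 0.24)                             # radial terms for r^2 and r^4
CHROMA = {"r": -0.006, "g": 0.0, "b": 0.014}  # extra per-channel r^2 term

def radial_scale(r_sq, k):
    """Polynomial radial scale 1 + k1*r^2 + k2*r^4."""
    return 1.0 + k[0] * r_sq + k[1] * r_sq * r_sq

def source_coords(x, y, k=K, chroma=CHROMA):
    """Map a display point to one source-image sample point per channel.

    (x, y) are lens-centred, normalised display coordinates. Fetching from
    further out as the radius grows squeezes the image toward the centre,
    which is the barrel distortion seen without the optics.
    """
    r_sq = x * x + y * y
    base = radial_scale(r_sq, k)
    coords = {}
    for channel, c in chroma.items():
        scale = base * (1.0 + c * r_sq)
        coords[channel] = (x * scale, y * scale)
    return coords

print(source_coords(0.0, 0.0))  # lens centre: all channels coincide
print(source_coords(0.5, 0.0))  # off-centre: channels separate slightly
```

At the lens center the three channels sample the same point. Further out they drift apart, which is the software counterpart of the color fringing visible in the picture.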
Methods of computing and predicting eye position and orientation are important. But we don’t really have to care, as long as the information we get is good.