This cohesive overview of the advanced rendering techniques developed for Rise of the Tomb Raider presents a collection of diverse features, the challenges they presented, where current approaches succeed and fail, and solutions and implementation details.
Rendering Techniques in Rise of the Tomb Raider
1. Labs R&D: The Rendering Techniques of Deus Ex: Mankind Divided and Rise of the Tomb Raider
SIGGRAPH 2015
www.eidosmontreal.com/www.square-enix-montreal.com
2. With great graphical features come great responsibilities
Creating a safety net for future problems with web tools
Production Track
Robbert-Jan Brems
Technical Artist from Eidos-Montreal
3. Why
❖ Supporting new features
❖ Improving iteration time
❖ Debugging scenes
❖ Data visualization
❖ Training
4. Programmers
❖ Prototypes that give clear feedback for improving tools
❖ Outsourcing tool development to user
❖ Find problems worth solving
Artists and Designers
❖ Choosing personal workflow
❖ Designed for the user by the user
❖ Introduction to computer science
5. Why web programming
❖ Reusable skill
❖ Plenty of resources
❖ Lower entry threshold level
❖ Clear difference between UI and functionality
6. How we achieve this
❖ HTML5
❖ KnockoutJS
❖ Engine API (C#)
7. Web Tool Example: FeedbackNote
Connect the people through the problems they share
All Rights Reserved SquareEnix Eidos Montreal 2015
8.
9. RISE OF THE TOMB RAIDER
www.eidosmontreal.com
www.square-enix-montreal.com
Volumetric Lights
Presented by Peter Sikachev
3D Programmer at Labs, Eidos Montreal
Samuel Delmont
Senior 3D Programmer at Labs, Eidos Montreal
10.
11. Overview
❖ Light diffusion in air, caused by particles (dust/smoke) in the atmosphere
❖ Needed for realistic rendering
❖ Also known as:
❖ Light shaft
❖ God rays
❖ Light scattering
12. Past Approaches
❖ Post process (screen space light shafts):
❖ Good for one distant light source like the sun.
❖ Doesn’t support local lights, and works only when the light source is visible on screen.
❖ Screen space 2D raymarching:
❖ Good: works with any sort of light, and efficiently uses the scene depth to do the work only on visible fragments.
❖ Bad: poorly uses GPU parallelism, keeps information for only one depth (bad compositing with transparency), and the half/quarter-resolution approach requires bilateral upsampling.
13. Inspiration
Assassin’s Creed IV, volumetric fog
❖ Voxelized approach: uses a 3D texture as storage.
❖ Supports directional and local lights.
❖ Uses a compute shader for raymarching, with the texture’s slices as steps.
❖ Keeps information for multiple depths in the 3D texture, which can be sampled by any kind of primitive, such as transparent particles.
14. Overview
❖ The camera frustum is divided into sub-volumes stored as voxels of a 160×90×64 3D texture. The depth is exponentially distributed among the slices over 64 meters.
❖ The idea is to “light” each voxel with the scene lights (light density estimation) before doing the final scattering.
❖ The method is compatible with deferred lighting.
❖ Since the algorithm is independent of the scene depth, most of the work runs asynchronously after the shadowmaps have been rendered.
❖ Temporal reprojection effectively increases the number of steps of the light density estimation pass.
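The slides state that depth is distributed exponentially over 64 slices and 64 meters, but they do not give the mapping itself. The Python sketch below shows one common choice under stated assumptions: the near distance `NEAR_M` and both function names are hypothetical and do not come from the talk.

```python
import math

NUM_SLICES = 64   # from the slides
FAR_M = 64.0      # from the slides
NEAR_M = 0.1      # assumption: the slides do not state a near distance


def slice_to_depth(slice_index: float) -> float:
    """View-space depth (meters) at a slice boundary, exponential spacing."""
    return NEAR_M * (FAR_M / NEAR_M) ** (slice_index / NUM_SLICES)


def depth_to_slice(depth_m: float) -> float:
    """Inverse mapping: fractional slice index for a view-space depth."""
    clamped = min(max(depth_m, NEAR_M), FAR_M)
    return NUM_SLICES * math.log(clamped / NEAR_M) / math.log(FAR_M / NEAR_M)
```

With this kind of mapping, slices near the camera are thin and slices far away are thick. The inverse function is what a pass that samples the volume would need in order to turn a depth into a texture coordinate along the slice axis.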
Volume texture: R11G11B10 Float, 160×90×64
15. Separated passes
❖ Light influence in the volume texture per block of 4×4×4 voxels.
❖ Light density estimation + temporal reprojection.
❖ Light density estimation texture blur.
❖ Light scattering.
❖ Apply volumetric light on opaque.
❖ Render transparent primitives using the volumetric light.
16. Light influence in the volume texture
❖ Goal: build a list of lights for each block of 4×4×4 voxels, to filter out lights in the light density estimation pass and drastically reduce its cost.
❖ For local lights only.
❖ Subdivide the camera frustum into another R32_UINT volume texture, which is 4×4×4 times smaller than the volumetric light texture (resolution 40×23×16).
❖ Each voxel of this texture represents a block of 4×4×4 voxels of the volumetric light texture.
❖ In one 3D compute shader dispatch (1 thread = 1 voxel), cull each light volume against each voxel, then pack into the texture the number of lights influencing the voxel and the index of each influencing light.
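The slides say that the light count and the light indices are packed into a single R32_UINT voxel, but they do not describe the bit layout. The sketch below uses a purely hypothetical layout (a 4-bit count followed by up to four 7-bit indices) to illustrate the idea. The function names echo `UnpackLightNum` and `UnpackLightIndex` from the shader on the Code slide, but the bodies are assumptions.

```python
COUNT_BITS = 4    # assumption: room for a count of up to 4 lights
INDEX_BITS = 7    # assumption: light indices in [0, 127]
MAX_LIGHTS = 4    # 4 + 4 * 7 = 32 bits in total


def pack_light_influence(light_indices):
    """Pack a short list of light indices into one 32-bit value."""
    if len(light_indices) > MAX_LIGHTS:
        raise ValueError("too many lights for this hypothetical layout")
    packed = len(light_indices)
    for slot, index in enumerate(light_indices):
        if not 0 <= index < (1 << INDEX_BITS):
            raise ValueError("light index out of range")
        packed |= index << (COUNT_BITS + slot * INDEX_BITS)
    return packed


def unpack_light_num(packed):
    """Number of lights influencing the block."""
    return packed & ((1 << COUNT_BITS) - 1)


def unpack_light_index(packed, slot):
    """Index of the slot-th influencing light."""
    shift = COUNT_BITS + slot * INDEX_BITS
    return (packed >> shift) & ((1 << INDEX_BITS) - 1)
```

A real implementation would pick the field widths from the maximum number of visible local lights it needs to support.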
17. Light density estimation
❖ In one 3D compute shader dispatch (1 thread = 1 voxel), loop
through each influencing light using the previously generated
light influence texture. Since the number of threads per group
is (4,4,4), the groupId is the coordinate in the light influence
texture.
❖ Use the deferred light data (shadowmaps, modulation maps,
distance attenuation, etc.) to compute and accumulate the
density of each light, using the voxel center as the “lit point”.
18. Light density estimation
❖ Use a 2×2 Bayer matrix to add dithering on Z axis and
trade banding (due to the high frequency information
of shadowmaps) for noise.
❖ Use an additional frame offset on X,Y,Z for temporal
supersampling.
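As an illustration of the dithering step, the Python sketch below derives a per-voxel Z offset from the standard 2×2 Bayer matrix. How the talk's shader applies the offset (and the extra per-frame offset) is not shown in the slides, so the function name and the offset normalization are assumptions.

```python
# Standard 2x2 Bayer (ordered dithering) matrix.
BAYER_2X2 = ((0, 2),
             (3, 1))


def dither_z_offset(x: int, y: int) -> float:
    """Offset in [0, 1) along Z inside the voxel at column (x, y).

    Neighbouring columns sample four different depths inside their voxel,
    so shadow banding becomes a 2x2 noise pattern that a small blur removes.
    """
    return (BAYER_2X2[y & 1][x & 1] + 0.5) / 4.0
```

Each 2×2 block of columns then covers four evenly spaced depths within a slice.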
[Figure: offset samples inside one voxel of the light influence texture, with an occluder. Black dots are “unlit” samples, filtered out because of light parameters (cone/distance attenuation) or the shadowmap. Yellow dots are “lit” samples.]
19. Light density estimation
❖ The light density is computed using the Henyey-Greenstein phase function (𝑔 is the anisotropy factor, 𝜃 is the angle between the eye direction and the light direction).
❖ Apply the temporal reprojection at the end of the loop, using the previous frame’s light density estimation texture (kept before the blur pass).
$$p(\theta) = \frac{1}{4\pi} \cdot \frac{1 - g^2}{\left(1 + g^2 - 2g\cos\theta\right)^{3/2}}$$
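For reference, here is a small Python version of the Henyey-Greenstein phase function as defined above, together with a numerical check that it is normalized over the sphere. The function names and the integration helper are illustrative and are not part of the talk.

```python
import math


def henyey_greenstein(cos_theta: float, g: float) -> float:
    """Henyey-Greenstein phase function for anisotropy g in (-1, 1)."""
    denom = (1.0 + g * g - 2.0 * g * cos_theta) ** 1.5
    return (1.0 - g * g) / (4.0 * math.pi * denom)


def integrate_over_sphere(g: float, steps: int = 20000) -> float:
    """Midpoint-rule integral of the phase function over all directions."""
    d_mu = 2.0 / steps
    total = 0.0
    for i in range(steps):
        mu = -1.0 + (i + 0.5) * d_mu   # mu = cos(theta)
        total += henyey_greenstein(mu, g) * d_mu
    return 2.0 * math.pi * total
```

With g = 0 the function reduces to the isotropic value 1/(4π). Positive g favors forward scattering, which is what makes light shafts brighter when looking toward the light.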
20. Code
[numthreads(4, 4, 4)]
void main(uint3 dispatchThreadID : SV_DispatchThreadID, uint3 groupId : SV_GroupID)
{
    float3 lightDensity = 0.0f;

    // One thread group covers one voxel of the light influence texture
    PackedLightInfluence packedLightInfluence = InTextureLightInfluence[groupId];
    uint numberOfLights = UnpackLightNum(packedLightInfluence);

    // Compute view position from this voxel coordinate (identical for every light)
    float3 viewPosition = ComputeViewPosition(dispatchThreadID);

    for (uint light = 0; light < numberOfLights; ++light)
    {
        // Get actual light index in constant buffer light list
        uint lightIndex = UnpackLightIndex(packedLightInfluence, light);
        LightData lightData = g_LightsData[lightIndex];

        // Compute light density for this light, at this position, using shadowmap, attenuation, etc.
        lightDensity += ComputeLightDensity(lightData, viewPosition);
    }

    // Output to the light density estimation texture
    OutLightDensityEstimationUAV[dispatchThreadID] = lightDensity;
}
21. Light density estimation texture blur
❖ The light density estimation texture is blurred on X,Y using a 3×3 kernel to get rid of the noise introduced by the previously applied 2×2 Bayer matrix.
❖ The separable blur is done in a single 3D compute shader (numthreads per group (32, 2, 1)), using LDS as intermediate storage to reduce memory bandwidth.
[Figure: vertical blur, 3 taps, read from texture and store in LDS; horizontal blur, 3 taps, read from texture and write to UAV]
22. Optimizations
❖ Effective for both light density estimation and blur passes :
❖ Compute a 3D bounding box (AABB) which encompasses all
visible lights.
❖ Dispatch a number of threadgroups according to the bounding
box size.
❖ In the compute shaders, add the AABB offset to the
dispatchThreadID variable to have the actual voxel coordinates.
❖ Recompute the groupID variable to have the Light influence
texture coordinates.
❖ Reduce number of wavefronts and number of updated voxels in the UAV
[Figure: the lights’ AABB inside the 160×90×64 volume texture; the DispatchThreadID offset maps the actually dispatched threads onto the AABB]
23. Light scattering
❖ Dispatch a 2D compute shader (numthreads per group
(8,8,1)) to solve the scattering (prefix sum). 1 thread outputs
values for all 64 slices.
❖ March through slices and accumulate light density.
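The scattering pass is described as a prefix sum along the depth slices. A minimal Python sketch of that accumulation for a single column follows. It is a plain running sum: any weighting by slice thickness or extinction that the shipped shader may apply is not described in the slides and is left out. The function name is hypothetical.

```python
from itertools import accumulate


def scatter_column(densities):
    """Front-to-back running sum of light density for one (x, y) column.

    Entry i of the result is the light accumulated between the camera
    and slice i, which is what a surface at that depth should receive.
    """
    return list(accumulate(densities))
```

In the talk, one thread handles one column and writes all 64 slices, so the 2D dispatch covers the whole 160×90 footprint.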
24. Apply volumetric light
❖ Simple bilinear 3D texture fetch
❖ On opaque : screen space post process and use the
scene depth to compute the W coordinate.
❖ On transparent (alpha blend and additive): in the
primitive’s pixel shader, use the fragment depth to
compute the W coordinate.
25. Performance
❖ Cost is highly dependent on number of visible local lights, whether or not they are using shadowmaps, size of the lights on screen,
complexity of lighting.
❖ Since most of the work is running with asynchronous compute shaders, part of the cost is likely hidden.
❖ When running synchronously (for profiling purposes only), for the light shown on the previous slide, the cost is ~0.8 ms (Xbox One, 1080p):
❖ Light influence : 0.015 ms
❖ Light density estimation : 0.44 ms
❖ Light density estimation blur : 0.07 ms
❖ Light scattering : 0.12 ms (fixed cost)
❖ Apply : 0.17 ms (fixed cost)
26. Sunlight Shadow
Uriel Doyon
Technical Director at Labs, Eidos Montreal
Peter Sikachev
3D Programmer at Labs, Eidos Montreal
27. Cascade Shadow Maps (CSM) Refresher
❖ Use several shadow maps to model a light shadow
❖ Required because perspective projection changes the
pixel/texel ratio
❖ Each shadow map uses texels of different world size
[Figure: conventional shadow map, light view, with the camera frustum]
28. Cascade Shadow Maps (CSM) Refresher
[Figure: conventional shadow map, camera view, with regions of insufficient and sufficient resolution]
29. Cascade Shadow Maps (CSM) Refresher
[Figure: cascaded shadow map, light view, with the camera frustum]
30. Cascade Shadow Maps (CSM) Refresher
[Figure: cascaded shadow map, camera view: sufficient resolution everywhere!]
31. Cascade Shadow Maps Issues
❖ Optimal cascade distribution
❖ Wasted resolution in occluded areas
❖ Can we do better?
[Figure: optimal vs. suboptimal zNear and zFar planes; wasted resolution in an occluded area]
32. Sample Distribution Shadow Maps (SDSM)
❖ Proposed by Intel [1]
❖ Finds min/max z in depth buffer to optimally distribute cascades
❖ (More importantly) Tightens cascade bounds to visible pixels only
33. SDSM Idea
❖ Find min/max depth
❖ Distribute cascades
❖ Tighten cascade bounds based on depth buffer visibility
❖ …
❖ PROFIT!
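The first two steps of the list above can be sketched on the CPU as follows. This is a Python illustration only: in the talk the reduction runs in a compute shader over the depth buffer. The logarithmic split scheme is an assumption (the slides only say the cascades are distributed between the min and max depth), and both function names are hypothetical.

```python
def depth_bounds(depths, far_plane):
    """Min/max view-space depth over visible pixels, ignoring sky at the far plane."""
    visible = [d for d in depths if d < far_plane]
    return min(visible), max(visible)


def log_cascade_splits(z_min, z_max, num_cascades):
    """Split [z_min, z_max] so each cascade spans the same depth ratio."""
    ratio = z_max / z_min
    return [z_min * ratio ** (i / num_cascades) for i in range(num_cascades + 1)]
```

For example, depths spanning 2 m to 32 m with four cascades give split planes at roughly 2, 4, 8, 16 and 32 m. The third step, tightening each cascade's light-space bounds to the visible samples that fall inside it, is a second reduction of the same kind.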
34. SDSM Caveats
❖ Needs a depth pre-pass
❖ But who doesn’t have it these days…
❖ Quality issues
❖ Snapping, transparent objects, cascade transitions etc.
❖ Additional costs
❖ Parsing depth buffer and some other hidden costs
❖ Wait, how exactly do we apply cascade bounds calculated on GPU?
36. Basically, you have two choices…
❖ Take a blue pill, and sync with CPU & enjoy 1/2/3-frames latency, cutscenes
❖ Take a red pill, and learn how to do it without the nasty CPU sync
38. API Overview
❖ New calls in DirectX 11 API
❖ DrawInstancedIndirect
❖ DrawIndexedInstancedIndirect
❖ DispatchIndirect
❖ Pass a pointer to a buffer holding the same parameters as the non-indirect call
❖ Using ID3D11Buffer descriptor
39. Instance Matrices
❖ Update view/proj matrices after bounds update
❖ Move matrices from CB to UAVs
❖ Use static/dynamic branching to use the desired matrix
❖ Minimal codebase refactoring
40. Potential Increased Overhead in Draw Calls
❖ Each primitive is drawn in every cascade
❖ Could increase number of draw calls
❖ Instance count increase does not incur overhead
❖ Can (almost) always ignore material setup, thus reducing draw call count
41. Building a Reusable Command List
❖ Generate a single draw list for all cascades
❖ Build once, reuse multiple times
❖ Material params are bound to CB every frame in RotTR
❖ Using bindless resources could have reduced the state change count
❖ Use draw indirect buffers and matrices from RW structured buffers, and render all cascades to the same RT
❖ Resolve the resulting cascade depth buffer into the global shadow buffer
42. GPU Culling
❖ Change the instance count of each draw call to render only the primitives intersecting the shadow cascade frustum
❖ Use draw indirect buffers updated in a GPU culling compute task
❖ Build a list of instance indices to remap the HW instance value to the actual usable instance index
❖ For instance set { 0, 1, 2, 3, 4, 5 }, if only { 1, 4, 5 } intersect a given frustum, we draw HW instances { 0, 1, 2 } and remap them to { 1, 4, 5 }
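The remapping in the last bullet is a stream compaction. The Python sketch below reproduces the slide's example on the CPU. In the talk this runs in a compute task that writes the count into the draw indirect buffer, and the function name and the visibility callback here are hypothetical.

```python
def cull_instances(instance_ids, is_visible):
    """Compact the visible instances.

    Returns (instance_count, remap), where remap[hw_instance] is the
    original instance index to use for hardware instance hw_instance.
    """
    remap = [i for i in instance_ids if is_visible(i)]
    return len(remap), remap


# Example from the slide: only instances 1, 4 and 5 intersect the frustum.
count, remap = cull_instances(range(6), lambda i: i in {1, 4, 5})
drawn = [remap[hw] for hw in range(count)]
```

The draw call is then issued with `count` instances, and the vertex shader looks up `remap[SV_InstanceID]` (or the equivalent) to fetch the right per-instance data.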
43. GPU Culling Overhead
❖ Culled primitives will have significant overhead if there are
too many 0 instance draw calls following each other
❖ Otherwise, the overhead is hidden by other rendering
tasks
Matrix Buffer
❖ Build a matrix buffer containing the concatenated world-to-projection transform for each drawn primitive
❖ Store it in a separate structured buffer updated by the culling task
44. GPU Culling Resources
[Figure: GPU culling resources. Command buffer: DrawCall, DrawCall, … Draw indirect buffer: draw indirect params followed by the drawn instance indices, e.g. Inst Count = 3 (Inst0 = 1, Inst1 = 4, Inst2 = 5), Inst Count = 1 (Inst0 = 1), Inst Count = 1 (Inst0 = 0), Inst Count = 0, …, with unused space left by culled indices. Matrix buffer: projection matrix, then the world-to-projection matrix of each drawn primitive (WorldToProj1, WorldToProj4, WorldToProj7, …, WorldToProj5, …, WorldToProj8), with unused space left by culled matrices.]
46. SDSM Optimizations and Improvements
❖ Tight bounds compute shader optimization
❖ Smart cascade selection
❖ Shadow map snapping
❖ Transparent objects handling
❖ Dynamic number of cascades
47. Tight Bounds CS Optimization
❖ We use shared memory and atomics to find min/max values
❖ SM 5.0 allows atomics only for integers
❖ Round up all distances, as they are in ‘centimeters’ in TR
❖ Use LDS for min/max coords instead of local variables, devectorize
❖ Fewer microcode instructions
❖ Global zMin and zMax for all cascades instead of one per cascade
❖ Makes no visual difference
❖ ~30% cost reduction
❖ Allows more detailed cascades to affect more pixels
❖ Use ShaderFastMath [2]
❖ Careful, there are places where precision DOES matter
48. Smart Cascade Selection
❖ For CSM we will just select a cascade based on the fragment’s
depth
❖ For SDSM options are:
❖ Do the same
❖ Find the closest cascade containing the fragment
❖ A filter-size offset is needed if using shadow filtering
[Figure: regular cascade selection]
49. Smart Cascade Selection
[Figure: smart cascade selection]
50.
51.
52.
53.
54. Shadow Map Snapping
❖ Snapping light camera to avoid ‘moving staircase’
❖ But shadow map size is constantly changing…
❖ Need to snap shadow map size as well
❖ With an exponential step
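One way to read "snap the size with an exponential step" is to round the cascade's world-space extent up to the next power of a fixed factor, so that the size (and therefore the texel size) changes only occasionally, and then to snap the light camera origin to whole texels. The Python sketch below illustrates that reading. The step factor of 1.25 and both function names are assumptions and are not values from the talk.

```python
import math


def snap_extent(extent: float, step: float = 1.25) -> float:
    """Round a cascade extent up to the next power of `step`."""
    exponent = math.ceil(math.log(extent) / math.log(step) - 1e-9)
    return step ** exponent


def snap_origin(origin: float, extent: float, resolution: int) -> float:
    """Snap one light-space coordinate of the cascade origin to whole texels."""
    texel = extent / resolution
    return math.floor(origin / texel) * texel
```

With a 1.25 step, extents of 10.0 and 10.5 both snap to the same size, so small frame-to-frame changes in the tight bounds do not change the texel grid and the shadow edges stay stable.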
55. Transparent Objects Handling
❖ Transparent objects might be left out as they don’t write into z
❖ Solution #1: pad cascades with transparent objects
❖ Need to cull occluded transparent objects
❖ In RotTR some fx have really big BBs, so…
❖ Solution #2: use static global shadow map
❖ Solution #3: pad last cascade to fit all possible transparent objects
56. Rendering Objects to Multiple Cascades
❖ Big performance issue (foliage, characters)
❖ Reduce cascade count depending on MaxDepth threshold
❖ Merge cascades of similar size/overlapping
❖ Do not re-render objects that are fully contained in one cascade
❖ Still working on this issue
57. SDSM Surplus Cost (Xbox One, 1080p)
❖ Processing + GPU culling: 0.5 ms
❖ Rendering objects to multiple cascades: up to 1 ms overall
58. References
❖ [1] A. Lauritzen, M. Salvi, A. Lefohn. Sample Distribution Shadow Maps.
❖ https://software.intel.com/en-us/articles/sample-distribution-shadow-maps
❖ [2] M. Drobot. ShaderFastMathLib.
❖ http://http.developer.nvidia.com/GPUGems3/gpugems3_ch18.html
59. Broad Temporal Ambient Obscurance
Anton Kai Michels
R&D Programmer for Labs at Eidos-Montreal
60. Motivation
❖ HBAO is 2.3 ms on Xbox One (1080p, half res)
❖ Presents temporal artefacts
❖ Better large-scale AO for big vistas
❖ Better small-scale AO for details
61. HBAO vs BTAO
68. Motivation
❖ High-quality, temporally stable AO
❖ Large scale AO
❖ Small scale AO
❖ Fixed cost of 1ms on Xbox One (1080p, half res)
❖ Ghosting free
69. SAO [1.1] influences
❖ Use normal calculated from depth
❖ Spiral sample pattern
❖ Removed mip mapped depth
70. Smarter Sampling
❖ Large fixed screen space sample radius
❖ Near samples contribute > 1 AO
❖ Not physically accurate
❖ Works great, lots of detail
❖ Artists define ‘near’ and near falloff
71. Large and Small Scale AO
76. Textures
❖ Both AO texture and velocity buffer are R10G10B10A2 UNORM
❖ AO texture
❖ R10 – Ambient Obscurance
❖ G10B10 – Linear Depth
❖ A2 – Dynamic Object flag
❖ Motion vector
❖ R10G10 – UV Motion
❖ B10 – Depth Difference
❖ A2 – Dynamic Object flag
77. Temporal Reprojection
❖ Influenced by Bartłomiej Wroński’s blog post[1.2]
❖ Motion vectors → previous frame AO
❖ Weight with depth difference
❖ Accumulate AO over time
❖ Fewer samples per frame (7)
❖ Allows for temporal supersampling
[Figure: a motion vector links the pixel in the current frame to the pixel in the previous frame]
78. Temporal Supersampling
❖ Temporal supersampling = AO accumulation + pattern variation
❖ Rotate pattern 121 degrees/frame
❖ ~ Same pattern every 3 frames, no flickering
❖ Hundreds of samples over time
❖ Introduces ghosting
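The choice of 121 degrees can be checked with a few lines of arithmetic. Three steps add up to 363 degrees, so the pattern returns to within 3 degrees of where it started every 3 frames (hence no visible flicker), while the small drift keeps introducing new sample positions. The Python sketch below assumes the rotation is simply the frame index times 121 degrees, modulo 360.

```python
ROTATION_STEP_DEG = 121  # from the slides


def pattern_rotation(frame: int) -> int:
    """Rotation of the sample pattern, in degrees, for a given frame index."""
    return (frame * ROTATION_STEP_DEG) % 360


first_frames = [pattern_rotation(f) for f in range(7)]
distinct_rotations = len({pattern_rotation(f) for f in range(360)})
```

Under that assumption, `first_frames` is 0, 121, 242, 3, 124, 245, 6. Because 121 and 360 share no common factor, the rotation only repeats exactly after 360 frames, which at 7 samples per frame is consistent with the "hundreds of samples over time" mentioned above.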
79. Ghosting
[Figure, three panels. Frame 1: the pixel receives AO from a dynamic object (dynamic obscurer). Frame 1-2: the object moves away from the obscured point. Frame 2: temporal reprojection applies the previous AO, so the pixel is still obscured.]
80. Anti-ghosting Frame 1
❖ Temporal supersampling = AO accumulation + pattern variation
❖ Rotate pattern 121 degrees/frame
[Figure: AO hits and AO misses around a dynamic obscurer, with the resulting obscurance direction]
81. Anti-ghosting Frame 1
❖ Sample motion vector at obscurance point
❖ MotionVector.a = dynamic object flag
❖ Store in alpha channel of AO
❖ Now we know AO came from dynamic object
❖ Use next frame
[Figure: motion vector sampled on the dynamic obscurer (A2 flag = 1 because the object is dynamic)]
82. Anti-ghosting Frame 2
❖ Look up reprojection texture for pixel
❖ If A2 flag = 1
❖ Last frame’s AO came from dynamic object
❖ Reject accumulated AO
❖ Do not apply 121 degree supersampling rotation (prevents flickering)
❖ Enjoy ghost-free supersampling
83. Anti-ghosting video: Before and After
AO darkened for visibility
84. Blur
❖ Only AO texture as input (contains depth)
❖ Horizontal & Vertical bilateral blur
❖ Compute Shader and Local Data Share (LDS)
❖ 192 threads/group + 6 pixel blur radius = 204 float2 LDS
[Figure: LDS layout, 204 × float2 (depth, AO). Each of the 192 threads samples the texture, unpacks the linear depth and stores the result in LDS; 12 threads fetch the extra samples.]
86. Timings
❖ Timings are taken from a 1080p capture on Xbox One (AO rendered at 540p)
Pass              Time (ns)
AO Render           787,426
Horizontal Blur     119,890
Vertical Blur       125,543
Total             1,032,859
87. Bonus Comparison
1 ms BTAO vs 1 ms SSAO (Microsoft Xfest [1.3])
90. Procedural Snow Deformation
Peter Sikachev
3D Programmer for Labs at Eidos Montreal
Anton Kai Michels
R&D Programmer for Labs at Eidos Montreal
93. Motivation
❖ Defining graphical feature
❖ Trail depression + elevation
❖ Support forests, slopes, mountains
❖ Scale with NPCs and animals
❖ Fill in blizzard
94. Past Approaches
Assassin’s Creed III [2.1]
❖ Last gen title
❖ No compute or tessellation on PS3/360
❖ Render to vertex buffer trick for GPU tessellation
❖ Replaces large triangles with small, displaced ones
❖ Same tessellation factor for all replaced triangles
95. Past Approaches
Assassin’s Creed III [2.1]
❖ Pros
❖ Trick for GPU tessellation on last gen
❖ Persistent tracks
❖ Supported slopes and terrain
❖ Cons
❖ Not very detailed
❖ No elevation on trail edges & no filling over time
❖ Mesh encodes max deformation to not reveal ground
96. Past Approaches
Batman: Arkham Origins [2.2]
❖ Snow only on flat, rectangular rooftops
❖ Uses these as orthogonal view frustums
❖ Renders dynamic affecters in frustum
❖ Use the render target as a height map for the snow
97. Past Approaches
Batman: Arkham Origins [2.2]
❖ Pros
❖ Very accurate
❖ Supports snow filling over time
❖ Cons
❖ No elevation on trail edges
❖ Only works on flat, rectangular surfaces
❖ Need to track which actors affect which surfaces.
102. Heightmap Writing
❖ Approximate objects with points
❖ Lara has points on hands and feet
❖ Accumulate points into buffer
❖ Dispatch single compute shader
❖ Number of groups = number of points
❖ Groups write 32 × 32 pixels around points
❖ Atomic min
❖ Deformation height = point height + distance(pixel.xy, point.xy)²
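The write pass can be mimicked on the CPU as follows. This is a Python sketch under simplifying assumptions: heights and distances are in the same integer units with no scale factor, the heightmap is a plain list of lists, and Python's `min` stands in for the shader's atomic min. The function name is hypothetical.

```python
TILE = 32  # each group writes 32 x 32 pixels around its point (from the slides)


def write_point(heightmap, point_x, point_y, point_height):
    """Stamp one deformation point into the heightmap with a quadratic profile."""
    size = len(heightmap)
    half = TILE // 2
    for py in range(point_y - half, point_y + half):
        for px in range(point_x - half, point_x + half):
            if 0 <= px < size and 0 <= py < size:
                dist_sq = (px - point_x) ** 2 + (py - point_y) ** 2
                candidate = point_height + dist_sq
                # min keeps the deepest deformation when trails overlap
                heightmap[py][px] = min(heightmap[py][px], candidate)


heightmap = [[1000] * 64 for _ in range(64)]
write_point(heightmap, 32, 32, 10)
```

The quadratic term turns each point into a small paraboloid, so a foot leaves a bowl-shaped dent whose rim rises back toward the undisturbed snow height.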
103. Heightmap Reading
❖ Snow rendering samples heightmap
❖ Snow height = vertex.z
❖ New height = min( snow height , deformation height )
❖ Pass snow height to pixel shader
❖ Per-pixel normals with reoriented normal mapping [2.3]
104. Elevation
[Figure: trail with depression vs. trail with depression + elevation]
106. Elevation
❖ abs(deformation height – snow height) not enough
❖ Foot height needed
[Figure: snow vertex Z, deformation height and foot height. Elevation appears on the side of the trail; no depression = no elevation.]
107. Constructing the elevation
[Figure: UINT32 texel layout: deformation height | foot height]
108. Elevation Distance
[Figure: cross-section of a trail. The texel stores the deformation height in its first 16 bits and the foot height in its second 16 bits.]
$Y = \text{snow height} - \text{foot height}$ (depression depth)
$X = \sqrt{Y}$ (depression distance)
$\text{distance from foot} = \sqrt{\text{deformation height} - \text{foot height}}$
$\text{elevation distance} = \text{distance from foot} - X$
109. Texture selection
❖ Change look of deformed snow
❖ Use multiple textures
❖ Texture selection float:
❖ 0.0: bottom of depression
❖ 1.0: beginning of elevation
❖ 2.0: regular snow
[Figure: texture selection values across a trail: 0.0 at the bottom of the trail depression (foot height), 1.0 where the trail elevation begins, 2.0 on regular snow]
110. Sliding Window Heightmap
❖ Can’t cover entire level
❖ Heightmap is 32-bit 1024 × 1024 texture (4MB)
❖ Resolution of 4cm per pixel (covers ~20m around Lara)
❖ Sliding window heightmap
❖ Reading from texture: wrap sampler
❖ Writing to texture: compute shader + modulus function
float2 Modulus(float2 WorldPos, float2 TexSize) {
    // Floor-based modulus: the result stays in [0, TexSize) for negative positions too
    return WorldPos - (TexSize * floor(WorldPos / TexSize));
}
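The behavior of the modulus function above can be checked with a direct Python transcription. The floor-based form wraps negative world positions into the positive texel range, which matches what a wrap sampler does on the read side. The HLSL `fmod` intrinsic, by contrast, keeps the sign of its first argument, which is presumably why a custom function is used.

```python
import math


def modulus(world_pos, tex_size):
    """Component-wise floor-based modulus, mirroring the shader function."""
    return tuple(p - s * math.floor(p / s) for p, s in zip(world_pos, tex_size))


wrapped = modulus((1030.0, -5.0), (1024.0, 1024.0))
```

Here `wrapped` is `(6.0, 1019.0)`: a position 6 texels past the right edge lands in column 6, and a position 5 texels before the origin lands 5 texels from the opposite edge.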
111. Sliding Window Heightmap
[Figure: deformation heightmap as a sliding window. When Lara moves from her old position to her new position (delta position), the old pixels that leave the window become the new pixels that enter it.]
112. Early Exit
❖ 1. Point outside sliding window
❖ 2. Snow height and deformation height too distant (> 2m)
❖ #2 allows multiple levels of deformable snow
113. Vertical Sliding Window
❖ Heightmap values are U16 → range limited
❖ Still want detail
❖ → Vertical Sliding Window
❖ If (player.Z > window max) move window up
❖ If (player.Z < window min) move window down
❖ Window moved by range / 2
❖ Deform. height += window min
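The window logic above can be sketched in a few lines of Python. The linear mapping from the stored 16-bit value to world height, and the names used, are assumptions; the slides only state the move rule and that the window minimum is added back to the deformation height.

```python
U16_MAX = 65535


def update_window(window_min: float, player_z: float, height_range: float) -> float:
    """Move the vertical window by half its range when the player leaves it."""
    window_max = window_min + height_range
    if player_z > window_max:
        return window_min + height_range / 2
    if player_z < window_min:
        return window_min - height_range / 2
    return window_min


def decode_height(stored_u16: int, window_min: float, height_range: float) -> float:
    """World-space deformation height from a stored 16-bit value."""
    return window_min + (stored_u16 / U16_MAX) * height_range
```

Moving by half the range rather than the full range keeps recently written deformation representable after the move, since the old and new windows overlap.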
[Figure: vertical window (window min to window max) inside the U16 range]
114. Filling over time
❖ Simulate trails disappearing in blizzard
❖ Compute shader adds constant to heightmap
❖ Covers entire heightmap (1024² threads)
❖ Erase sliding window edge with extra fill
❖ Exponential function makes window edge smooth
[Figure: deformed snow with constant edge erase vs. exponential edge erase]
115. Filling over time
❖ Filling breaks overlapping snow meshes (bridge scenario)
❖ Only happens after very long fill
❖ Prevent with per-texel lifetime (steal 6 bits from foot height)
❖ When lifetime is max, clear deformation and set texel = U32 max
❖ Increase precision: store foot height - deformation height
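The 16/10/6-bit texel described above can be packed and unpacked as shown in this Python sketch. Placing the deformation height in the most significant bits is an assumption (it would let an integer min compare deformation heights first), and the function and field names are hypothetical.

```python
U32_MAX = 0xFFFFFFFF  # value used to mark a cleared texel


def pack_texel(deformation: int, foot_delta: int, lifetime: int) -> int:
    """Pack deformation height (16 bits), foot delta (10 bits), lifetime (6 bits)."""
    if not (0 <= deformation <= 0xFFFF and 0 <= foot_delta <= 0x3FF
            and 0 <= lifetime <= 0x3F):
        raise ValueError("field out of range")
    return (deformation << 16) | (foot_delta << 6) | lifetime


def unpack_texel(texel: int):
    """Return (deformation, foot_delta, lifetime) from a packed texel."""
    return texel >> 16, (texel >> 6) & 0x3FF, texel & 0x3F
```

Storing the foot height as a difference from the deformation height is what allows it to fit in 10 bits after 6 bits are given to the lifetime counter.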
[Figure: UINT32 texel layout: deformation height (16-bit) | foot – deformation (10-bit) | lifetime (6-bit)]
116. Compute Shader Timings
❖ Timings are captured from an Xbox One
Pass                      Time (ms)
Snow Deformation Shader       0.011
Snow Fill Shader              0.175
Total                         0.186
117. Snow Tessellation
❖ Snow vertex shader is expensive
❖ Adaptively tessellate in image space (maxtessfactor = 10)
❖ Frustum culling done in HS
❖ No backface culling, snow is mainly flat
❖ Normals generated using derivatives [2.4]
❖ Using ShaderFastMathLib [2.5]
118. Snow Tessellation - Performance
❖ Untessellated
❖ Normal pass: 3.07 ms
❖ Composite pass: 2.55 ms
❖ Tessellated
❖ Normal pass: 1.6 ms
❖ Composite pass: 1.14 ms
119. Future Applications
❖ Technique doesn’t care about geometry
❖ Very flexible
❖ Single deformation heightmap → snow, mud, sand, dust, grass,
❖ We hope to see this technique adopted and improved in future AAA titles.
120. References
❖ [1.1] McGuire et al., Scalable Ambient Obscurance, 2012
❖ [1.2] Bart Wronski, Temporal Supersampling pt. 2 – SSAO demonstration, 2014
❖ [1.3] James Stanard, Applied Compute Shaders, Xfest 2015
❖ [2.1] Jean-Francois St-Amour, Rendering Assassin’s Creed III, GDC 2013
❖ [2.2] Colin Barre-Brisebois, Deformable Snow Rendering in Batman: Arkham Origins, GDC 2014
❖ [2.3] Stephen Hill, Blending in Detail, 2012
❖ [2.4] Morten Mikkelsen, Derivative Maps, 2011
❖ [2.5] Michał Drobot, ShaderFastMathLib, 2014