Screen Space Reflections in The Surge
Michele Giacalone
Graphics Programmer @ Deck13
▶ Current working title
▶ Sci-Fi third person action RPG
▶ Exoskeletons!
▶ In-house engine “Fledge”
▶ Multiplatform
▶ PC (D3D11)
▶ Xbox One
▶ PS4
Deck13 - The Surge
What this talk is NOT about:
▶ Novel rendering technique
▶ Accurate physically based approach
▶ Heavy math formulas
Disclaimer
What this talk IS about:
▶ What worked for us
▶ How we approached the problem
▶ Share ideas that can be used for other techniques
Disclaimer
SSR OFF
SSR ON
SSR ON
▶ Overview
▶ Rendering
▶ Async Compute
▶ Conclusions
Agenda
Overview
▶ Performance
▶ Not that much frame time to spare
▶ Maximum budget allowed < 2 ms in worst case
▶ Still other features to implement
▶ Particularly true on Xbox One platform
▶ Quality
▶ Plausible BRDF match with our IBL
▶ Contact hardening reflections
▶ Ambient specular occlusion approximation
▶ No aggressive masking based on roughness
Overview - What we wanted
▶ Compute reflection vector from view direction
▶ Use GBuffer normals
▶ Ray marching against depth buffer
▶ Iterate until ray ‘intersects’ the depth buffer
▶ Use hit coordinate to resolve reflection color
▶ Reproject from previous frame
Overview - Screen Space Reflections
Hit Point
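The steps above can be sketched on the CPU as follows. This is a minimal illustration, not the shader used in The Surge: the function names, the 1D depth buffer, and the (x, depth) ray representation are assumptions chosen to keep the example short.

```python
def reflect(incident, normal):
    """Mirror `incident` about a unit-length `normal`: R = I - 2 (N.I) N."""
    d = sum(i * n for i, n in zip(incident, normal))
    return tuple(i - 2.0 * d * n for i, n in zip(incident, normal))


def march(depth, origin, direction, steps=16, step_size=1.0):
    """Walk a ray across a 1D depth buffer (hypothetical simplification).

    origin and direction are (x, z) pairs: x in texels, z in view depth.
    Returns the texel index of the first sample where the ray lies at or
    behind the stored depth, or None if it leaves the screen or runs out
    of steps.
    """
    x, z = origin
    dx, dz = direction
    for _ in range(steps):
        x += dx * step_size
        z += dz * step_size
        xi = int(round(x))
        if xi < 0 or xi >= len(depth):
            return None  # ray left the screen: no screen space data
        if z >= depth[xi]:
            return xi    # ray 'intersects' the depth buffer here
    return None
```

The returned index plays the role of the hit coordinate: it is where the reflection color would be fetched from the (reprojected) previous frame.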
Rendering
▶ Tile classification
▶ Ray Marching
▶ Convolve Scene
▶ Resolve Reflections
▶ Deinterleave and Reproject
▶ Async Compute
Rendering - Overview
Rendering - Tile Classification
▶ Some texels are not contributing
▶ Other texels might require extra marching steps
▶ Divide screen into 16x16 texel tiles
▶ Fast ray march
▶ Sparse ray distribution [Wronski14]
▶ 64 rays per 16x16 texel tile
▶ Non-uniform jittered
▶ Different each frame to maximize coverage
▶ Estimate tile ray hit variance
▶ Discard non-contributing tiles
▶ Produce GPU job queue
▶ Encode tile data into uint32
▶ Append to GPU job queue
▶ Consume later on with DispatchIndirect
(0, 4) (0,5) (0,6) (0, 8) ...
GPU job queue
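A rough sketch of the job queue idea, written on the CPU for clarity. The bit layout (14 bits per tile coordinate plus 4 flag bits) and all function names are assumptions; the slides only state that tile data is encoded into a uint32 and consumed with DispatchIndirect.

```python
def encode_tile(tx, ty, flags=0):
    """Pack tile coordinates into one uint32.

    Assumed layout: bits 0-13 = tile x, bits 14-27 = tile y, bits 28-31 = flags.
    """
    assert 0 <= tx < (1 << 14) and 0 <= ty < (1 << 14) and 0 <= flags < 16
    return (flags << 28) | (ty << 14) | tx


def decode_tile(word):
    """Inverse of encode_tile: returns (tx, ty, flags)."""
    return word & 0x3FFF, (word >> 14) & 0x3FFF, word >> 28


def build_job_queue(contributing, tiles_x, tiles_y):
    """Append every contributing tile to the queue.

    `contributing(tx, ty)` is a hypothetical predicate standing in for the
    tile classification result. The second return value mimics the three
    thread group counts that an indirect dispatch reads.
    """
    queue = [encode_tile(tx, ty)
             for ty in range(tiles_y)
             for tx in range(tiles_x)
             if contributing(tx, ty)]
    indirect_args = (len(queue), 1, 1)
    return queue, indirect_args
```

With one thread group per queue entry, the later passes launch only as many groups as there are surviving tiles, and each group decodes its own tile coordinate from the queue.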
▶ Naive approach is simple but it is also slow
▶ Hi-Z is sexy but might have too much overhead
▶ Depth sample distribution is a serious thing [McGuire14]
▶ Don’t forget you’re bound to screen space data
▶ What about depth thickness?
▶ And sampling coherency?
▶ What else?
▶ (ノàȠ益àČ )ăƒŽćœĄâ”»â”â”»
Rendering - Ray Marching Overview
▶ Ray march at lower resolution (720p, 900p)
▶ Interleaved rendering
▶ Even/Odd checkerboard pattern [El Mansouri16]
▶ Successive passes work with interleaved data
▶ Use low resolution depth buffer
▶ Less bandwidth, better cache usage
▶ No big impact on quality
▶ Importance sampling (GGX distributed rays)
▶ Fixed ray step count
▶ Line segment intersection [Valient14][Timonen15]
▶ Jitter ray start time, reduce banding artifacts
▶ Noise filtered out with temporal reprojection
▶ Process 4 depth values at a time to hide VMEM latency (GCN)
▶ Output hit coordinate in a R10G10B10A2_UNORM target
Rendering - Ray Marching
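For the importance sampling bullet, the standard GGX half-vector sampling formula can be sketched as below. The roughness-to-alpha remap (alpha = roughness squared) and the function names are assumptions; the slides do not say which parameterization the engine uses.

```python
import math


def sample_ggx_half_vector(u1, u2, roughness):
    """Sample a GGX-distributed half vector in tangent space (normal = +Z).

    u1 and u2 are uniform random numbers in [0, 1).
    Assumption: alpha = roughness^2, a common remap.
    """
    alpha = roughness * roughness
    cos_theta = math.sqrt((1.0 - u1) / (1.0 + (alpha * alpha - 1.0) * u1))
    sin_theta = math.sqrt(max(0.0, 1.0 - cos_theta * cos_theta))
    phi = 2.0 * math.pi * u2
    return (sin_theta * math.cos(phi), sin_theta * math.sin(phi), cos_theta)


def reflect_about(view, half):
    """Ray direction obtained by reflecting the view vector about the half vector."""
    d = sum(v * h for v, h in zip(view, half))
    return tuple(2.0 * d * h - v for v, h in zip(view, half))
```

Smooth surfaces produce half vectors close to the normal, so rays stay near the mirror direction; rough surfaces spread them out, and the resulting noise is what the temporal reprojection pass is expected to filter.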
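Checkerboard Pattern
A B C D
E F G H
I J K L
M N O P
[Figure: the 4x4 grid is split into two interleaved halves, one traced on odd frames and one on even frames: (B D / E G / J L / M O) and (A C / F H / I K / N P)]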
Ray Hit Point (Interleaved) Attenuation mask (Interleaved)
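The checkerboard mapping between the half-width interleaved buffer and the full grid can be sketched as follows. The `parity` parameter and the function names are assumptions; which parity corresponds to odd or even frames is not specified here.

```python
def interleaved_to_full(ix, y, parity):
    """Full-resolution x of texel (ix, y) in the half-width interleaved buffer."""
    return 2 * ix + ((y + parity) & 1)


def checkerboard_set(grid, parity):
    """Pick the texels of `grid` that are traced for the given parity."""
    return [[row[interleaved_to_full(ix, y, parity)]
             for ix in range(len(row) // 2)]
            for y, row in enumerate(grid)]


grid = ["ABCD", "EFGH", "IJKL", "MNOP"]
first_half = checkerboard_set(grid, 0)   # rows: A C / F H / I K / N P
second_half = checkerboard_set(grid, 1)  # rows: B D / E G / J L / M O
```

Because each row of the interleaved buffer is contiguous, successive passes can work directly on the half-width data and only the final pass needs to scatter results back to full resolution.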
▶ Based on “Screen-Space Cone-Traced Reflections” [Uludag14]
▶ Create convolved scene buffer mip chain
▶ Use previous frame buffer
▶ Includes reflections
▶ Accumulate multiple bounces
▶ 7x7 separable blur in a single dispatch
▶ Derive cone angle from roughness
▶ Best fit to match IBL
▶ Accumulate samples
▶ Use roughness as weight factor
▶ On Consoles
▶ Compute mip chain on same resource
▶ Avoid unnecessary copies
▶ Saves ~0.1 ms
Rendering - Convolve Scene And Resolve Reflections
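A small CPU sketch of a convolved mip chain built with a separable 7-tap blur. The binomial weights, the blur-then-downsample order, and the 2x2 box downsample are assumptions; the slides only state that a 7x7 separable blur is used, and the single-dispatch GPU implementation is not reproduced here.

```python
KERNEL_7 = [1, 6, 15, 20, 15, 6, 1]  # assumed binomial weights, sum = 64


def blur_1d(values, kernel=KERNEL_7):
    """Blur one row or column, clamping reads at the borders."""
    r = len(kernel) // 2
    norm = float(sum(kernel))
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - r, 0), n - 1)
            acc += w * values[j]
        out.append(acc / norm)
    return out


def blur_separable(image):
    """Horizontal pass followed by vertical pass (equivalent to a 7x7 kernel)."""
    rows = [blur_1d(row) for row in image]
    cols = [blur_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def downsample_2x(image):
    """Average each 2x2 block; assumes even width and height."""
    return [[(image[y][x] + image[y][x + 1]
              + image[y + 1][x] + image[y + 1][x + 1]) * 0.25
             for x in range(0, len(image[0]), 2)]
            for y in range(0, len(image), 2)]


def build_mip_chain(image, levels):
    """Mip 0 is the input; each further mip is blurred, then downsampled."""
    chain = [image]
    for _ in range(levels - 1):
        chain.append(downsample_2x(blur_separable(chain[-1])))
    return chain
```

During the resolve, a rougher surface would sample a higher (blurrier) mip at the hit coordinate; the exact roughness-to-cone-angle fit is specific to the engine's IBL and is not given in the slides.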
MIP 0 MIP 1 MIP 2
MIP 3 MIP 4 MIP 5
Resolved Reflections (Interleaved)
▶ Deinterleave samples into LDS (Local Data Share)
▶ Load samples into LDS
▶ Extra samples required to reconstruct neighbor data
▶ Combine reads with gather
▶ Reconstruct missing samples using neighbors
▶ Temporal Reprojection
▶ Neighbor color data already available in LDS â˜ș
▶ Clamp history with 3x3 neighborhood AABB [Karis14]
▶ Use reversible tone map operator to reduce fireflies [Karis13]
▶ Local Data Storage (Grandma's Home Remedy)
▶ "Careful With That Axe, Eugene"
▶ Store separate RGB channels
▶ Pack two color channels into a single slot
Rendering - Deinterleave and Reproject
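The temporal reprojection bullets can be sketched as below. The luma-based tone map pair, the Rec. 709 luma weights, and the fixed blend factor are assumptions: they show one common reversible operator, not necessarily the exact one used in the game.

```python
def luma(c):
    """Rec. 709 luma weights (assumed)."""
    return 0.2126 * c[0] + 0.7152 * c[1] + 0.0722 * c[2]


def tonemap(c):
    """Reversible operator: compress bright values before blending."""
    s = 1.0 / (1.0 + luma(c))
    return tuple(x * s for x in c)


def tonemap_inverse(c):
    """Exact inverse of tonemap for inputs with luma < 1."""
    s = 1.0 / (1.0 - luma(c))
    return tuple(x * s for x in c)


def clamp_to_neighborhood(history, neighbors):
    """Clamp the history color to the per-channel AABB of the 3x3 neighborhood."""
    lo = tuple(min(n[i] for n in neighbors) for i in range(3))
    hi = tuple(max(n[i] for n in neighbors) for i in range(3))
    return tuple(min(max(history[i], lo[i]), hi[i]) for i in range(3))


def resolve(current, history, neighbors, blend=0.1):
    """Blend current and clamped history in tone mapped space.

    `neighbors` is the 3x3 neighborhood including the current sample;
    `blend` is a hypothetical weight for the current frame.
    """
    clamped = clamp_to_neighborhood(history, neighbors)
    a, b = tonemap(current), tonemap(clamped)
    mixed = tuple(b[i] + (a[i] - b[i]) * blend for i in range(3))
    return tonemap_inverse(mixed)
```

Blending in tone mapped space keeps a single very bright sample from dominating the average, which is what reduces fireflies; the clamp rejects history that no longer matches the local neighborhood.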
Loaded Samples into LDS
Final Reflections (Deinterleaved + Temporal Reprojection)
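One way to pack two color channels into a single 32-bit slot is to store each as a 16-bit half float. This is an assumption for illustration; the slides do not state the packed format. In a shader the equivalent would typically use the HLSL intrinsics `f32tof16` and `f16tof32`.

```python
import struct


def pack_half2(a, b):
    """Pack two floats as IEEE half floats into one uint32 (a in the low 16 bits)."""
    return struct.unpack("<I", struct.pack("<2e", a, b))[0]


def unpack_half2(word):
    """Inverse of pack_half2: returns the two values as Python floats."""
    return struct.unpack("<2e", struct.pack("<I", word))
```

Half floats top out at 65504, so values beyond that range would need to be clamped before packing; `struct.pack` raises `OverflowError` otherwise.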
Async Compute
Async Compute - Dependencies
[Figure: dependency graph linking Depth Buffer and Prev Frame Buffer to the Tile Classification, Ray Marching, Convolve Scene, Resolve Reflections, and Deinterleave And Reproject passes]
Main dependencies:
▶ Depth Buffer
▶ Available after GBuffer
▶ Previous Frame Buffer
▶ Available after scene combine
▶ Start computing data in previous frame directly â˜ș
▶ Async dispatch Convolve Scene right after scene is resolved
▶ Overlaps mostly SAT and Post Process
▶ Bandwidth intensive, limit occupancy
▶ Async dispatch Tile Classification right after GBuffer
▶ Overlaps Decal Rendering
▶ Helps filling the holes in the pipeline
▶ Async dispatch Ray Marching
▶ Remaining Passes
▶ Async Dispatch while Shadow Rendering
▶ Find the right balance with Compute Lighting
▶ Do not use CS if you can use PS instead!
▶ On PC D3D11, no async dispatch available
▶ On GCN, going through CB cache is generally faster [Persson14]
Async Compute - Dispatch
Conclusions
▶ Usually a few depth samples are enough
▶ Line segment intersection works great!
▶ Thin objects require more samples
▶ Use hybrid tracing algorithms [Stachowiak15]
▶ Interleaved rendering is awesome!
▶ Easy to use with other passes (e.g. SSAO)
▶ GPU work queues can be useful
▶ Dispatch only required threads
▶ Can overlap other Compute jobs (Console, D3D12, Vulkan, etc.)
▶ Reality check!
▶ Inherits the problems of screen space data
▶ Extremely easy to break
▶ Maybe invest GPU time in something else? [Pettineo11]
Conclusions - What we learnt
Conclusions - Performance Table
Pass                          Time
Tile Classification           0.07 ms
Ray Marching                  0.21 ms
Convolve Scene                0.43 ms
Resolve Reflections           0.41 ms
Deinterleave and Reproject    0.27 ms
Total                         1.39 ms
Xbox One, SSR @ 720p, (no ESRAM, No Async Compute)
References
[El Mansouri16] Jalal El Mansouri, “Rendering Rainbow Six Siege”, GDC, 2016
[Stachowiak15] Tomasz Stachowiak, “Stochastic Screen-Space Reflections”, SIGGRAPH, 2015
[Timonen15] Ari Silvennoinen and Ville Timonen, “Multi-Scale Global Illumination in Quantum Break”, SIGGRAPH, 2015
[McGuire14] Morgan McGuire and Michael Mara, “Efficient GPU Screen-Space Ray Tracing”, JCGT, 2014
[Uludag14] Yasin Uludag, “Hi-Z Screen-Space Cone-Traced Reflections”, In GPU Pro 5, 2014
[Valient14] Michal Valient, “Reflections and Volumetrics of Killzone: Shadow Fall”, SIGGRAPH, 2014
[Karis14] Brian Karis, “High-Quality Temporal Supersampling”, SIGGRAPH, 2014
[Wronski14] Bart Wronski, “Assassin’s Creed 4: Road to Next-gen Graphics”, GDC, 2014
[Persson14] Emil Persson, “Low-Level Shader Optimization for Next-Gen and DX11”, GDC, 2014
[Pettineo11] Matt Pettineo, “10 Things that need to die for Next-Gen”, https://mynameismjp.wordpress.com/2011/12/06/things-that-need-to-die/
[Karis13] Brian Karis, “Tone Mapping”, http://graphicrants.blogspot.de/2013/12/tone-mapping.html
Thank You!
Email: mgiacalone@deck13.com
Twitter: miccode

More Related Content

What's hot

Five Rendering Ideas from Battlefield 3 & Need For Speed: The Run
Five Rendering Ideas from Battlefield 3 & Need For Speed: The RunFive Rendering Ideas from Battlefield 3 & Need For Speed: The Run
Five Rendering Ideas from Battlefield 3 & Need For Speed: The RunElectronic Arts / DICE
 
Graphics Gems from CryENGINE 3 (Siggraph 2013)
Graphics Gems from CryENGINE 3 (Siggraph 2013)Graphics Gems from CryENGINE 3 (Siggraph 2013)
Graphics Gems from CryENGINE 3 (Siggraph 2013)Tiago Sousa
 
DirectX 11 Rendering in Battlefield 3
DirectX 11 Rendering in Battlefield 3DirectX 11 Rendering in Battlefield 3
DirectX 11 Rendering in Battlefield 3Electronic Arts / DICE
 
Physically Based and Unified Volumetric Rendering in Frostbite
Physically Based and Unified Volumetric Rendering in FrostbitePhysically Based and Unified Volumetric Rendering in Frostbite
Physically Based and Unified Volumetric Rendering in FrostbiteElectronic Arts / DICE
 
Moving Frostbite to Physically Based Rendering
Moving Frostbite to Physically Based RenderingMoving Frostbite to Physically Based Rendering
Moving Frostbite to Physically Based RenderingElectronic Arts / DICE
 
Lighting of Killzone: Shadow Fall
Lighting of Killzone: Shadow FallLighting of Killzone: Shadow Fall
Lighting of Killzone: Shadow FallGuerrilla
 
Shiny PC Graphics in Battlefield 3
Shiny PC Graphics in Battlefield 3Shiny PC Graphics in Battlefield 3
Shiny PC Graphics in Battlefield 3Electronic Arts / DICE
 
The Rendering Technology of Killzone 2
The Rendering Technology of Killzone 2The Rendering Technology of Killzone 2
The Rendering Technology of Killzone 2Guerrilla
 
A Certain Slant of Light - Past, Present and Future Challenges of Global Illu...
A Certain Slant of Light - Past, Present and Future Challenges of Global Illu...A Certain Slant of Light - Past, Present and Future Challenges of Global Illu...
A Certain Slant of Light - Past, Present and Future Challenges of Global Illu...Electronic Arts / DICE
 
Secrets of CryENGINE 3 Graphics Technology
Secrets of CryENGINE 3 Graphics TechnologySecrets of CryENGINE 3 Graphics Technology
Secrets of CryENGINE 3 Graphics TechnologyTiago Sousa
 
Siggraph2016 - The Devil is in the Details: idTech 666
Siggraph2016 - The Devil is in the Details: idTech 666Siggraph2016 - The Devil is in the Details: idTech 666
Siggraph2016 - The Devil is in the Details: idTech 666Tiago Sousa
 
4K Checkerboard in Battlefield 1 and Mass Effect Andromeda
4K Checkerboard in Battlefield 1 and Mass Effect Andromeda4K Checkerboard in Battlefield 1 and Mass Effect Andromeda
4K Checkerboard in Battlefield 1 and Mass Effect AndromedaElectronic Arts / DICE
 
Decima Engine: Visibility in Horizon Zero Dawn
Decima Engine: Visibility in Horizon Zero DawnDecima Engine: Visibility in Horizon Zero Dawn
Decima Engine: Visibility in Horizon Zero DawnGuerrilla
 
Physically Based Sky, Atmosphere and Cloud Rendering in Frostbite
Physically Based Sky, Atmosphere and Cloud Rendering in FrostbitePhysically Based Sky, Atmosphere and Cloud Rendering in Frostbite
Physically Based Sky, Atmosphere and Cloud Rendering in FrostbiteElectronic Arts / DICE
 
Rendering Tech of Space Marine
Rendering Tech of Space MarineRendering Tech of Space Marine
Rendering Tech of Space MarinePope Kim
 
Taking Killzone Shadow Fall Image Quality Into The Next Generation
Taking Killzone Shadow Fall Image Quality Into The Next GenerationTaking Killzone Shadow Fall Image Quality Into The Next Generation
Taking Killzone Shadow Fall Image Quality Into The Next GenerationGuerrilla
 
A Bizarre Way to do Real-Time Lighting
A Bizarre Way to do Real-Time LightingA Bizarre Way to do Real-Time Lighting
A Bizarre Way to do Real-Time LightingSteven Tovey
 
Rendering AAA-Quality Characters of Project A1
Rendering AAA-Quality Characters of Project A1Rendering AAA-Quality Characters of Project A1
Rendering AAA-Quality Characters of Project A1Ki Hyunwoo
 
Triangle Visibility buffer
Triangle Visibility bufferTriangle Visibility buffer
Triangle Visibility bufferWolfgang Engel
 

What's hot (20)

Five Rendering Ideas from Battlefield 3 & Need For Speed: The Run
Five Rendering Ideas from Battlefield 3 & Need For Speed: The RunFive Rendering Ideas from Battlefield 3 & Need For Speed: The Run
Five Rendering Ideas from Battlefield 3 & Need For Speed: The Run
 
Graphics Gems from CryENGINE 3 (Siggraph 2013)
Graphics Gems from CryENGINE 3 (Siggraph 2013)Graphics Gems from CryENGINE 3 (Siggraph 2013)
Graphics Gems from CryENGINE 3 (Siggraph 2013)
 
Frostbite on Mobile
Frostbite on MobileFrostbite on Mobile
Frostbite on Mobile
 
DirectX 11 Rendering in Battlefield 3
DirectX 11 Rendering in Battlefield 3DirectX 11 Rendering in Battlefield 3
DirectX 11 Rendering in Battlefield 3
 
Physically Based and Unified Volumetric Rendering in Frostbite
Physically Based and Unified Volumetric Rendering in FrostbitePhysically Based and Unified Volumetric Rendering in Frostbite
Physically Based and Unified Volumetric Rendering in Frostbite
 
Moving Frostbite to Physically Based Rendering
Moving Frostbite to Physically Based RenderingMoving Frostbite to Physically Based Rendering
Moving Frostbite to Physically Based Rendering
 
Lighting of Killzone: Shadow Fall
Lighting of Killzone: Shadow FallLighting of Killzone: Shadow Fall
Lighting of Killzone: Shadow Fall
 
Shiny PC Graphics in Battlefield 3
Shiny PC Graphics in Battlefield 3Shiny PC Graphics in Battlefield 3
Shiny PC Graphics in Battlefield 3
 
The Rendering Technology of Killzone 2
The Rendering Technology of Killzone 2The Rendering Technology of Killzone 2
The Rendering Technology of Killzone 2
 
A Certain Slant of Light - Past, Present and Future Challenges of Global Illu...
A Certain Slant of Light - Past, Present and Future Challenges of Global Illu...A Certain Slant of Light - Past, Present and Future Challenges of Global Illu...
A Certain Slant of Light - Past, Present and Future Challenges of Global Illu...
 
Secrets of CryENGINE 3 Graphics Technology
Secrets of CryENGINE 3 Graphics TechnologySecrets of CryENGINE 3 Graphics Technology
Secrets of CryENGINE 3 Graphics Technology
 
Siggraph2016 - The Devil is in the Details: idTech 666
Siggraph2016 - The Devil is in the Details: idTech 666Siggraph2016 - The Devil is in the Details: idTech 666
Siggraph2016 - The Devil is in the Details: idTech 666
 
4K Checkerboard in Battlefield 1 and Mass Effect Andromeda
4K Checkerboard in Battlefield 1 and Mass Effect Andromeda4K Checkerboard in Battlefield 1 and Mass Effect Andromeda
4K Checkerboard in Battlefield 1 and Mass Effect Andromeda
 
Decima Engine: Visibility in Horizon Zero Dawn
Decima Engine: Visibility in Horizon Zero DawnDecima Engine: Visibility in Horizon Zero Dawn
Decima Engine: Visibility in Horizon Zero Dawn
 
Physically Based Sky, Atmosphere and Cloud Rendering in Frostbite
Physically Based Sky, Atmosphere and Cloud Rendering in FrostbitePhysically Based Sky, Atmosphere and Cloud Rendering in Frostbite
Physically Based Sky, Atmosphere and Cloud Rendering in Frostbite
 
Rendering Tech of Space Marine
Rendering Tech of Space MarineRendering Tech of Space Marine
Rendering Tech of Space Marine
 
Taking Killzone Shadow Fall Image Quality Into The Next Generation
Taking Killzone Shadow Fall Image Quality Into The Next GenerationTaking Killzone Shadow Fall Image Quality Into The Next Generation
Taking Killzone Shadow Fall Image Quality Into The Next Generation
 
A Bizarre Way to do Real-Time Lighting
A Bizarre Way to do Real-Time LightingA Bizarre Way to do Real-Time Lighting
A Bizarre Way to do Real-Time Lighting
 
Rendering AAA-Quality Characters of Project A1
Rendering AAA-Quality Characters of Project A1Rendering AAA-Quality Characters of Project A1
Rendering AAA-Quality Characters of Project A1
 
Triangle Visibility buffer
Triangle Visibility bufferTriangle Visibility buffer
Triangle Visibility buffer
 

Similar to Screen Space Reflections in The Surge

Smedberg niklas bringing_aaa_graphics
Smedberg niklas bringing_aaa_graphicsSmedberg niklas bringing_aaa_graphics
Smedberg niklas bringing_aaa_graphicschangehee lee
 
Killzone Shadow Fall Demo Postmortem
Killzone Shadow Fall Demo PostmortemKillzone Shadow Fall Demo Postmortem
Killzone Shadow Fall Demo PostmortemGuerrilla
 
The Intersection of Game Engines & GPUs: Current & Future (Graphics Hardware ...
The Intersection of Game Engines & GPUs: Current & Future (Graphics Hardware ...The Intersection of Game Engines & GPUs: Current & Future (Graphics Hardware ...
The Intersection of Game Engines & GPUs: Current & Future (Graphics Hardware ...Johan Andersson
 
Unite Berlin 2018 - Book of the Dead Optimizing Performance for High End Cons...
Unite Berlin 2018 - Book of the Dead Optimizing Performance for High End Cons...Unite Berlin 2018 - Book of the Dead Optimizing Performance for High End Cons...
Unite Berlin 2018 - Book of the Dead Optimizing Performance for High End Cons...Unity Technologies
 
NVIDIA effects GDC09
NVIDIA effects GDC09NVIDIA effects GDC09
NVIDIA effects GDC09IGDA_London
 
Introduction To Massive Model Visualization
Introduction To Massive Model VisualizationIntroduction To Massive Model Visualization
Introduction To Massive Model Visualizationpjcozzi
 
Alex_Vlachos_Advanced_VR_Rendering_Performance_GDC2016
Alex_Vlachos_Advanced_VR_Rendering_Performance_GDC2016Alex_Vlachos_Advanced_VR_Rendering_Performance_GDC2016
Alex_Vlachos_Advanced_VR_Rendering_Performance_GDC2016Alex Vlachos
 
GDC 2012: Advanced Procedural Rendering in DX11
GDC 2012: Advanced Procedural Rendering in DX11GDC 2012: Advanced Procedural Rendering in DX11
GDC 2012: Advanced Procedural Rendering in DX11smashflt
 
D3 D10 Unleashed New Features And Effects
D3 D10 Unleashed   New Features And EffectsD3 D10 Unleashed   New Features And Effects
D3 D10 Unleashed New Features And EffectsThomas Goddard
 
Computer Graphics - Lecture 01 - 3D Programming I
Computer Graphics - Lecture 01 - 3D Programming IComputer Graphics - Lecture 01 - 3D Programming I
Computer Graphics - Lecture 01 - 3D Programming IđŸ’» Anton Gerdelan
 
Look Ma, No Jutter! Optimizing Performance Across Oculus Mobile
Look Ma, No Jutter! Optimizing Performance Across Oculus MobileLook Ma, No Jutter! Optimizing Performance Across Oculus Mobile
Look Ma, No Jutter! Optimizing Performance Across Oculus MobileUnity Technologies
 
Optimizing the Graphics Pipeline with Compute, GDC 2016
Optimizing the Graphics Pipeline with Compute, GDC 2016Optimizing the Graphics Pipeline with Compute, GDC 2016
Optimizing the Graphics Pipeline with Compute, GDC 2016Graham Wihlidal
 
Deferred shading
Deferred shadingDeferred shading
Deferred shadingFrank Chao
 
Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)
Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)
Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)Johan Andersson
 
Technologies Used In Graphics Rendering
Technologies Used In Graphics RenderingTechnologies Used In Graphics Rendering
Technologies Used In Graphics RenderingBhupinder Singh
 
Your Game Needs Direct3D 11, So Get Started Now!
Your Game Needs Direct3D 11, So Get Started Now!Your Game Needs Direct3D 11, So Get Started Now!
Your Game Needs Direct3D 11, So Get Started Now!Johan Andersson
 
Dx11 performancereloaded
Dx11 performancereloadedDx11 performancereloaded
Dx11 performancereloadedmistercteam
 
High Performance Rust UI.pdf
High Performance Rust UI.pdfHigh Performance Rust UI.pdf
High Performance Rust UI.pdfmraaaaa
 
OpenGL for 2015
OpenGL for 2015OpenGL for 2015
OpenGL for 2015Mark Kilgard
 
Android performance
Android performanceAndroid performance
Android performanceEugene Dubovik
 

Similar to Screen Space Reflections in The Surge (20)

Smedberg niklas bringing_aaa_graphics
Smedberg niklas bringing_aaa_graphicsSmedberg niklas bringing_aaa_graphics
Smedberg niklas bringing_aaa_graphics
 
Killzone Shadow Fall Demo Postmortem
Killzone Shadow Fall Demo PostmortemKillzone Shadow Fall Demo Postmortem
Killzone Shadow Fall Demo Postmortem
 
The Intersection of Game Engines & GPUs: Current & Future (Graphics Hardware ...
The Intersection of Game Engines & GPUs: Current & Future (Graphics Hardware ...The Intersection of Game Engines & GPUs: Current & Future (Graphics Hardware ...
The Intersection of Game Engines & GPUs: Current & Future (Graphics Hardware ...
 
Unite Berlin 2018 - Book of the Dead Optimizing Performance for High End Cons...
Unite Berlin 2018 - Book of the Dead Optimizing Performance for High End Cons...Unite Berlin 2018 - Book of the Dead Optimizing Performance for High End Cons...
Unite Berlin 2018 - Book of the Dead Optimizing Performance for High End Cons...
 
NVIDIA effects GDC09
NVIDIA effects GDC09NVIDIA effects GDC09
NVIDIA effects GDC09
 
Introduction To Massive Model Visualization
Introduction To Massive Model VisualizationIntroduction To Massive Model Visualization
Introduction To Massive Model Visualization
 
Alex_Vlachos_Advanced_VR_Rendering_Performance_GDC2016
Alex_Vlachos_Advanced_VR_Rendering_Performance_GDC2016Alex_Vlachos_Advanced_VR_Rendering_Performance_GDC2016
Alex_Vlachos_Advanced_VR_Rendering_Performance_GDC2016
 
GDC 2012: Advanced Procedural Rendering in DX11
GDC 2012: Advanced Procedural Rendering in DX11GDC 2012: Advanced Procedural Rendering in DX11
GDC 2012: Advanced Procedural Rendering in DX11
 
D3 D10 Unleashed New Features And Effects
D3 D10 Unleashed   New Features And EffectsD3 D10 Unleashed   New Features And Effects
D3 D10 Unleashed New Features And Effects
 
Computer Graphics - Lecture 01 - 3D Programming I
Computer Graphics - Lecture 01 - 3D Programming IComputer Graphics - Lecture 01 - 3D Programming I
Computer Graphics - Lecture 01 - 3D Programming I
 
Look Ma, No Jutter! Optimizing Performance Across Oculus Mobile
Look Ma, No Jutter! Optimizing Performance Across Oculus MobileLook Ma, No Jutter! Optimizing Performance Across Oculus Mobile
Look Ma, No Jutter! Optimizing Performance Across Oculus Mobile
 
Optimizing the Graphics Pipeline with Compute, GDC 2016
Optimizing the Graphics Pipeline with Compute, GDC 2016Optimizing the Graphics Pipeline with Compute, GDC 2016
Optimizing the Graphics Pipeline with Compute, GDC 2016
 
Deferred shading
Deferred shadingDeferred shading
Deferred shading
 
Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)
Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)
Parallel Graphics in Frostbite - Current & Future (Siggraph 2009)
 
Technologies Used In Graphics Rendering
Technologies Used In Graphics RenderingTechnologies Used In Graphics Rendering
Technologies Used In Graphics Rendering
 
Your Game Needs Direct3D 11, So Get Started Now!
Your Game Needs Direct3D 11, So Get Started Now!Your Game Needs Direct3D 11, So Get Started Now!
Your Game Needs Direct3D 11, So Get Started Now!
 
Dx11 performancereloaded
Dx11 performancereloadedDx11 performancereloaded
Dx11 performancereloaded
 
High Performance Rust UI.pdf
High Performance Rust UI.pdfHigh Performance Rust UI.pdf
High Performance Rust UI.pdf
 
OpenGL for 2015
OpenGL for 2015OpenGL for 2015
OpenGL for 2015
 
Android performance
Android performanceAndroid performance
Android performance
 

Recently uploaded

The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo GarcĂ­a Lavilla
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 

Recently uploaded (20)

The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
How AI, OpenAI, and ChatGPT impact business and software.

Screen Space Reflections in The Surge

  ‱ 1. Screen Space Reflections in The Surge. Michele Giacalone, Graphics Programmer @ Deck13
  • 2. ▶ Current working title ▶ Sci-Fi third person action RPG ▶ Exoskeletons! ▶ In-house engine “Fledge” ▶ Multiplatform ▶ PC (D3D11) ▶ Xbox One ▶ PS4 Deck13 - The Surge
  • 3. What this talk is NOT about: ▶ Novel rendering technique ▶ Accurate physically based approach ▶ Heavy math formulas Disclaimer
  • 4. What this talk IS about: ▶ What worked for us ▶ How we approached the problem ▶ Share ideas that can be used for other techniques Disclaimer
  • 8. ▶ Overview ▶ Rendering ▶ Async Compute ▶ Conclusions Agenda
  • 10. ▶ Performance ▶ Not that much frame time to spare ▶ Maximum budget allowed < 2 ms in worst case ▶ Still other features to implement ▶ Particularly true on Xbox One platform ▶ Quality ▶ Plausible BRDF match with our IBL ▶ Contact hardening reflections ▶ Ambient specular occlusion approximation ▶ No aggressive masking based on roughness Overview - What we wanted
  • 11. ▶ Compute reflection vector from view direction ▶ Use GBuffer normals ▶ Ray marching against depth buffer ▶ Iterate until ray ‘intersects’ the depth buffer ▶ Use hit coordinate to resolve reflection color ▶ Reproject from previous frame Overview - Screen Space Reflections Hit Point
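The steps on slide 11 can be sketched on the CPU as follows. This is a minimal Python illustration, not the shader from the talk: the names `reflect` and `march`, the fixed per-step increment, and the `thickness` tolerance are all assumptions made for the example.

```python
import math

def reflect(view_dir, normal):
    """Mirror the view direction about the normal: r = v - 2 (v . n) n."""
    d = sum(v * n for v, n in zip(view_dir, normal))
    return tuple(v - 2.0 * d * n for v, n in zip(view_dir, normal))

def march(depth, origin, step, max_steps=64, thickness=0.5):
    """Walk a ray through a 2D depth buffer (depth[y][x]).

    origin and step are (x, y, z) in screen space, with z being depth.
    Returns the (x, y) texel of the first 'intersection', or None if the
    ray leaves the screen or runs out of steps.
    thickness is an assumed tolerance behind the depth surface.
    """
    x, y, z = origin
    dx, dy, dz = step
    for _ in range(max_steps):
        x, y, z = x + dx, y + dy, z + dz
        ix, iy = math.floor(x), math.floor(y)
        if not (0 <= iy < len(depth) and 0 <= ix < len(depth[0])):
            return None  # screen space data ends here
        scene_z = depth[iy][ix]
        if scene_z <= z <= scene_z + thickness:
            return (ix, iy)  # hit coordinate used to fetch reflection color
    return None
```

The returned hit coordinate is what a later pass would use to look up the reflection color in the previous frame's buffer.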
  • 13. ▶ Tile classification ▶ Ray Marching ▶ Convolve Scene ▶ Resolve Reflections ▶ Deinterleave and Reproject ▶ Async Compute Rendering - Overview
  ‱ 14. Rendering - Tile Classification ▶ Some texels are not contributing ▶ Other texels might require extra marching steps ▶ Divide screen into 16x16 texel tiles ▶ Fast ray march ▶ Sparse ray distribution [Wronski14] ▶ 64 rays per 16x16 texel tile ▶ Non-uniform jittered ▶ Different each frame to maximize coverage ▶ Estimate tile ray hit variance ▶ Discard non-contributing tiles ▶ Produce GPU job queue ▶ Encode tile data into uint32 ▶ Append to GPU job queue ▶ Consume later on with DispatchIndirect ▶ GPU job queue: (0, 4) (0, 5) (0, 6) (0, 8) ...
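A rough CPU-side sketch of the job queue idea on slide 14. The slide does not give the bit layout of the uint32 or the exact discard criterion, so the 12/12/8-bit layout and the plain hit-count threshold below are assumptions for illustration (the talk estimates hit variance instead).

```python
def encode_tile(tile_x, tile_y, flags=0):
    """Pack tile coordinates plus a small flag field into one 32-bit word.

    Assumed layout: bits 0-11 = x, bits 12-23 = y, bits 24-31 = flags.
    """
    assert 0 <= tile_x < 4096 and 0 <= tile_y < 4096 and 0 <= flags < 256
    return tile_x | (tile_y << 12) | (flags << 24)

def decode_tile(word):
    """Inverse of encode_tile: returns (tile_x, tile_y, flags)."""
    return word & 0xFFF, (word >> 12) & 0xFFF, (word >> 24) & 0xFF

def build_job_queue(hit_counts, min_hits=1):
    """hit_counts[ty][tx] is how many of the tile's sparse rays found a hit.

    Tiles below min_hits are discarded. Returns the queue of encoded tiles
    and the thread-group counts an indirect dispatch would consume.
    """
    queue = [
        encode_tile(tx, ty)
        for ty, row in enumerate(hit_counts)
        for tx, hits in enumerate(row)
        if hits >= min_hits
    ]
    dispatch_args = (len(queue), 1, 1)  # one thread group per surviving tile
    return queue, dispatch_args
```

With 64 rays per 16x16 tile, classification traces a quarter of the tile's 256 texels, and only tiles that survive are processed by the later, more expensive passes.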
  ‱ 15. ▶ Naive approach is simple but it is also slow ▶ Hi-Z is sexy but might have too much overhead ▶ Depth sample distribution is a serious thing [McGuire14] ▶ Don’t forget you’re bound to screen space data ▶ What about depth thickness? ▶ And sampling coherency? ▶ What else? ▶ (ノàȠ益àȠ)ăƒŽćœĄâ”»â”â”» Rendering - Ray Marching Overview
  ‱ 16. ▶ Ray march at lower resolution (720p, 900p) ▶ Interleaved rendering ▶ Even/Odd checkerboard pattern [El Mansouri16] ▶ Successive passes work with interleaved data ▶ Use low resolution depth buffer ▶ Less bandwidth, better cache usage ▶ No big impact on quality ▶ Importance sampling (GGX distributed rays) ▶ Fixed ray step count ▶ Line segment intersection [Valient14][Timonen15] ▶ Jitter ray start time, reduce banding artifacts ▶ Noise filtered out with temporal reprojection ▶ Process 4 depth values at a time to hide VMEM latency (GCN) ▶ Output hit coordinate in a R10G10B10A2_UNORM target Rendering - Ray Marching [Figure: checkerboard pattern on a 4x4 texel block labelled A to P; the odd and even frames hold B D E G J L M O and A C F H I K N P respectively]
  • 17. Ray Hit Point (Interleaved) Attenuation mask (Interleaved)
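Two ingredients of the ray marching slide can be illustrated in a few lines of Python. The checkerboard rule below reproduces the texel labels shown on slide 16. The GGX sampler uses the widely used half-vector formula with alpha = roughness squared, which is an assumption here because the talk does not state its exact parameterization. All function names are hypothetical.

```python
import math

def traced_texels(frame_index, width=4, height=4):
    """Texels ray marched this frame: (x + y) parity matches frame parity."""
    return [
        (x, y)
        for y in range(height)
        for x in range(width)
        if (x + y) % 2 == frame_index % 2
    ]

def texel_label(x, y, width=4):
    """Label texels A, B, C, ... in row-major order, as on the slide."""
    return chr(ord("A") + y * width + x)

def sample_ggx_half_vector(xi1, xi2, roughness):
    """Importance sample a GGX half vector in tangent space (z = normal).

    xi1 and xi2 are uniform random numbers in [0, 1).
    """
    alpha = roughness * roughness  # assumed roughness remapping
    phi = 2.0 * math.pi * xi1
    cos_theta = math.sqrt((1.0 - xi2) / (1.0 + (alpha * alpha - 1.0) * xi2))
    sin_theta = math.sqrt(max(0.0, 1.0 - cos_theta * cos_theta))
    return (sin_theta * math.cos(phi), sin_theta * math.sin(phi), cos_theta)
```

Reflecting the view direction about the sampled half vector gives the ray direction. Low roughness keeps the samples close to the normal, so rays stay near the mirror direction.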
  • 18. ▶ Based on “Screen-Space Cone-Traced Reflections” [Uludag14] ▶ Create convolved scene buffer mip chain ▶ Use previous frame buffer ▶ Includes reflections ▶ Accumulate multiple bounces ▶ 7x7 separable blur in a single dispatch ▶ Derive cone angle from roughness ▶ Best fit to match IBL ▶ Accumulate samples ▶ Use roughness as weight factor ▶ On Consoles ▶ Compute mip chain on same resource ▶ Avoid unnecessary copies ▶ Saves ~0.1 ms Rendering - Convolve Scene And Resolve Reflections
  • 19. MIP 0 MIP 1 MIP 2 MIP 3 MIP 4 MIP 5
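The idea of deriving a cone from roughness and reading a blurrier mip for wider footprints can be sketched as below. The talk fits the cone angle to match its IBL but does not publish the fit, so the linear roughness-to-angle mapping here is a hypothetical stand-in, and the footprint is a simple cone width rather than the inscribed-circle construction of [Uludag14].

```python
import math

MAX_MIP = 5  # slide 19 shows MIP 0 through MIP 5

def cone_half_angle(roughness, max_half_angle=math.pi / 4):
    """Hypothetical mapping: the half angle grows linearly with roughness."""
    return max(0.0, min(1.0, roughness)) * max_half_angle

def select_mip(roughness, hit_distance_px):
    """Pick a mip of the convolved scene buffer from the cone footprint.

    hit_distance_px is the screen-space distance from the shaded texel
    to the ray hit point, in pixels.
    """
    width = 2.0 * math.tan(cone_half_angle(roughness)) * hit_distance_px
    if width <= 1.0:
        return 0.0  # footprint within one texel: sample the sharpest mip
    return min(math.log2(width), float(MAX_MIP))
```

Because the footprint scales with the distance to the hit, reflections stay sharp near the point of contact and blur further away, which is one way to obtain the contact hardening listed among the goals.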
  ‱ 22. ▶ Deinterleave samples into LDS (Local Data Share) ▶ Load samples into LDS ▶ Extra samples required to reconstruct neighbor data ▶ Combine reads with gather ▶ Reconstruct missing samples using neighbors ▶ Temporal Reprojection ▶ Neighbors' color data already available in LDS â˜ș ▶ Clamp history with 3x3 neighborhood AABB [Karis14] ▶ Use reversible tone map operator to reduce fireflies [Karis13] ▶ Local Data Storage (Grandma's Home Remedy) ▶ "Careful With That Axe, Eugene" ▶ Store separate RGB channels ▶ Pack two color channels into a single slot Rendering - Deinterleave and Reproject Loaded Samples into LDS
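Packing two color channels into one slot could look like the following. The slide does not say which encoding is used, so storing two 16-bit half floats in one 32-bit word is an assumption, and `pack_half2` and `unpack_half2` are hypothetical names.

```python
import struct

HALF_MAX = 65504.0  # largest finite half-float value

def pack_half2(a, b):
    """Store two channel values as half floats in one 32-bit word.

    Values are clamped to the half-float range first, because struct
    raises OverflowError for values it cannot represent.
    """
    a = max(-HALF_MAX, min(HALF_MAX, a))
    b = max(-HALF_MAX, min(HALF_MAX, b))
    return struct.unpack("<I", struct.pack("<ee", a, b))[0]

def unpack_half2(word):
    """Recover the two half-float channel values from a 32-bit word."""
    return struct.unpack("<ee", struct.pack("<I", word))
```

Halving the per-sample footprint this way trades some precision for lower local memory use, which matters when memory use per thread group limits how many groups can run at once.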
  • 23. Final Reflections (Deinterleaved + Temporal Reprojection)
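The temporal reprojection step on slide 22 combines two ideas: clamping the history color to the bounding box of the current 3x3 neighborhood [Karis14], and blending in a reversible tone-mapped space [Karis13]. A minimal sketch follows, assuming a luma-based operator and a fixed blend factor of 0.1. Both are assumptions, and the function names are hypothetical.

```python
def luma(c):
    """Rec. 709 luminance of a linear RGB triple."""
    return 0.2126 * c[0] + 0.7152 * c[1] + 0.0722 * c[2]

def tonemap(c):
    """Reversible operator: c / (1 + luma(c)). Compresses bright outliers."""
    s = 1.0 / (1.0 + luma(c))
    return tuple(v * s for v in c)

def tonemap_inverse(c):
    """Exact inverse of tonemap: c / (1 - luma(c))."""
    s = 1.0 / (1.0 - luma(c))
    return tuple(v * s for v in c)

def clamp_history(history, neighborhood):
    """Clamp the history color to the per-channel AABB of the neighborhood."""
    lo = tuple(min(n[i] for n in neighborhood) for i in range(3))
    hi = tuple(max(n[i] for n in neighborhood) for i in range(3))
    return tuple(min(max(h, l), u) for h, l, u in zip(history, lo, hi))

def resolve(current, history, neighborhood, blend=0.1):
    """Blend current and clamped history colors in tone-mapped space."""
    h = tonemap(clamp_history(history, neighborhood))
    c = tonemap(current)
    mixed = tuple(hv + (cv - hv) * blend for hv, cv in zip(h, c))
    return tonemap_inverse(mixed)
```

With a history of 0.1 and a current sample of 100, a plain linear blend at 0.1 would give about 10, while the tone-mapped blend stays below 1. This is how the operator suppresses fireflies.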
  • 25. Async Compute - Dependencies Tile Classification Convolve Scene Depth Buffer Prev Frame Buffer Deinterleave And Reproject Resolve Reflections Ray Marching Main dependencies: ▶ Depth Buffer ▶ Available after GBuffer ▶ Previous Frame Buffer ▶ Available after scene combine
  ‱ 26. ▶ Start computing data in previous frame directly â˜ș ▶ Async dispatch Convolve Scene right after scene is resolved ▶ Overlaps mostly SAT and Post Process ▶ Bandwidth intensive, limit occupancy ▶ Async dispatch Tile Classification right after GBuffer ▶ Overlaps Decal Rendering ▶ Helps filling the holes in the pipeline ▶ Async dispatch Ray Marching ▶ Remaining Passes ▶ Async Dispatch while Shadow Rendering ▶ Find the right balance with Compute Lighting ▶ Do not use CS if you can use PS instead! ▶ On PC D3D11, no async dispatch available ▶ On GCN, going through CB cache is generally faster [Persson14] Async Compute - Dispatch
  ‱ 28. ▶ Usually a few depth samples are enough ▶ Line segment intersection works great! ▶ Thin objects require more samples ▶ Use hybrid tracing algorithms [Stachowiak15] ▶ Interleaved rendering is awesome! ▶ Easy to use with other passes (e.g. SSAO) ▶ GPU work queues can be useful ▶ Dispatch only required threads ▶ Can overlap other Compute jobs (Console, D3D12, Vulkan, etc.) ▶ Reality check! ▶ Problems inherent to screen space data ▶ Extremely easy to break ▶ Maybe invest GPU time in something else? [Pettineo11] Conclusions - What we learnt
  ‱ 29. Conclusions - Performance Table (Xbox One, SSR @ 720p, no ESRAM, no Async Compute): Tile Classification 0.07 ms | Ray Marching 0.21 ms | Convolve Scene 0.43 ms | Resolve Reflections 0.41 ms | Deinterleave and Reproject 0.27 ms | Total 1.39 ms
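As a quick check of the table against the 2 ms worst-case budget stated in the overview, the per-pass timings can be summed. The numbers come from the slide; the variable names are illustrative only.

```python
# Per-pass timings from the performance table, in milliseconds.
pass_times_ms = {
    "Tile Classification": 0.07,
    "Ray Marching": 0.21,
    "Convolve Scene": 0.43,
    "Resolve Reflections": 0.41,
    "Deinterleave and Reproject": 0.27,
}

BUDGET_MS = 2.0  # maximum budget from the "What we wanted" slide

total_ms = round(sum(pass_times_ms.values()), 2)
headroom_ms = round(BUDGET_MS - total_ms, 2)
most_expensive = max(pass_times_ms, key=pass_times_ms.get)
```

The passes sum to the 1.39 ms total on the slide, with Convolve Scene and Resolve Reflections together accounting for well over half of it.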
  ‱ 30. References
      [El Mansouri16] Jalal El Mansouri, “Rendering Rainbow Six Siege”, GDC, 2016
      [Stachowiak15] Tomasz Stachowiak, “Stochastic Screen-Space Reflections”, SIGGRAPH, 2015
      [Timonen15] Ari Silvennoinen and Ville Timonen, “Multi-Scale Global Illumination in Quantum Break”, SIGGRAPH, 2015
      [McGuire14] Morgan McGuire and Michael Mara, “Efficient GPU Screen-Space Ray Tracing”, JCGT, 2014
      [Uludag14] Yasin Uludag, “Hi-Z Screen-Space Cone-Traced Reflections”, in GPU Pro 5, 2014
      [Valient14] Michal Valient, “Reflections and Volumetrics of Killzone: Shadow Fall”, SIGGRAPH, 2014
      [Karis14] Brian Karis, “High-Quality Temporal Supersampling”, SIGGRAPH, 2014
      [Wronski14] Bart Wronski, “Assassin’s Creed 4: Road to Next-gen Graphics”, GDC, 2014
      [Persson14] Emil Persson, “Low-Level Shader Optimization for Next-Gen and DX11”, GDC, 2014
      [Pettineo11] Matt Pettineo, “10 Things that need to die for Next-Gen”, https://mynameismjp.wordpress.com/2011/12/06/things-that-need-to-die/
      [Karis13] Brian Karis, “Tone Mapping”, http://graphicrants.blogspot.de/2013/12/tone-mapping.html