FrameGraph:
Extensible Rendering
Architecture in Frostbite
Yuriy O’Donnell
Rendering Engineer
Frostbite
Outline
 Introduction and history
 Frame Graph
 Transient Resource System
 Conclusions
Spoilers:
 Improved engine extensibility
 Simplified async compute
 Automated ESRAM aliasing
 Saved tons of GPU memory
Introduction
FROSTBITE EVOLUTION OVER THE LAST DECADE
Frostbite 2007 vs 2017
2007
 DICE next-gen engine
 Built from the ground up for
 Xbox 360
 PlayStation 3
 Multi-core PCs
 DirectX 9 SM3 & Direct3D 10
 To be used in future DICE games
2017
 The EA engine
 Evolved and scaled up for
 Xbox One
 PlayStation 4
 Multi-core PCs
 DirectX 12
 Used in ~15 current and future EA games
Rendering system overview '07
Game Renderer
World Renderer
UI
Shading system
Direct3D / libGCM
Meshes
Terrain
Sky
Undergrowth
Particles
Decals
Rendering system overview '17
Game Renderer
World Renderer
UI
Shading system
Direct3D 11 / Direct3D 12 / libGNM
(Metal / GLES / Mantle)
Meshes
Terrain
Sky
Undergrowth
Particles
Decals
Reflections
PBR
Shadows
GI
Post-processing
Volumetric FX
HDR
Game-specific rendering features
Rendering system overview (simplified)
World Renderer
Shading System
Features
Features
Render Context
GFX APIs
WorldRenderer
 Orchestrates all rendering
 Code-driven architecture
 Main world geometry (via Shading System)
 Lighting, Post-processing (via Render Context)
 Knows about all views and render passes
 Marshals settings and resources between systems
 Allocates resources (render targets, buffers)
Battlefield 4 rendering passes (Features)
 mainTransDecal
 fgOpaqueEmissive
 subsurfaceScattering
 skyAndFog
 hairCoverage
 mainTransDepth
 linerarizeZ
 mainTransparent
 halfResUpsample
 motionBlurDerive
 motionBlurVelocity
 motionBlurFilter
 filmicEffectsEdge
 spriteDof
 fgTransparent
 lensScope
 filmicEffects
 bloom
 luminanceAvg
 finalPost
 overlay
 fxaa
 smaa
 resample
 screenEffect
 hmdDistortion
 spotlightShadowmaps
 downsampleZ
 linearizeZ
 ssao
 hbaoHalfZ
 hbao
 ssr
 halfResZPass
 halfResTransp
 mainDistort
 lightPassEnd
 mainOpaque
 linearizeZ
 mainOpaqueEmissive
 reflectionCapture
 planarReflections
 dynamicEnvmap
 mainZPass
 mainGBuffer
 mainGBufferSimple
 mainGBufferDecal
 decalVolumes
 mainGBufferFixup
 msaaZDown
 msaaClassify
 lensFlareOcclusionQueries
 lightPassBegin
 cascadedShadowmaps
Features
WorldRenderer challenges
 Explicit immediate mode rendering
 Explicit resource management
 Bespoke, artisanal hand-crafted ESRAM management
 Multiple implementations by different game teams
 Tight coupling between rendering systems
 Limited extensibility
 Game teams must fork / diverge to customize
 Organically grew from 4k to 15k SLOC
 Single functions with over 2k SLOC
 Expensive to maintain, extend and merge/integrate
Modular WorldRenderer goals
 High-level knowledge of the full frame
 Improved extensibility
 Decoupled and composable code modules
 Automatic resource management
 Better visualizations and diagnostics
New architectural components
 Frame Graph
 High-level representation of render passes and resources
 Full knowledge of the frame
 Transient Resource System
 Resource allocation
 Memory aliasing
World Renderer
Shading System
Features
Features
GFX APIs
Render Context
Frame Graph
Transient Resources
Frame Graph
Frame Graph goals
 Build high-level knowledge of the entire frame
 Simplify resource management
 Simplify rendering pipeline configuration
 Simplify async compute and resource barriers
 Allow self-contained and efficient rendering modules
 Visualize and debug complex rendering pipelines
Frame Graph example
Gbuffer pass
Depth Buffer
Depth pass
Gbuffer 1
Gbuffer 2
Lighting buffer
Post
Backbuffer
Gbuffer 3
Depth Buffer
Present
Lighting
Render operations and resources for the entire frame expressed as a directed acyclic graph
Graph of a Battlefield 4 frame
Typically see a few hundred passes and resources
Frame Graph design
 Moving away from immediate mode rendering
 Rendering code split into passes
 Multi-phase retained mode rendering API
1. Setup phase
2. Compile phase
3. Execute phase
 Built from scratch every frame
 Code-driven architecture
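A minimal sketch of the three phases, assuming a hypothetical MiniFrameGraph class (none of these names come from Frostbite). Setup callbacks run while passes are declared, compile() is only a placeholder for culling and allocation, and the execute callbacks are deferred until execute() is called.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature of a retained-mode frame graph; illustrative only.
class MiniFrameGraph {
public:
    using Log = std::vector<std::string>;

    // Setup phase: the setup callback runs immediately,
    // the execute callback is stored for later.
    void addPass(const std::string& name,
                 const std::function<void(Log&)>& setup,
                 std::function<void(Log&)> execute) {
        setup(log_);
        passes_.push_back({name, std::move(execute)});
    }

    // Compile phase: a real implementation would cull passes and allocate
    // resources here; this sketch only records that it happened.
    void compile() {
        compiled_ = true;
        log_.push_back("compile");
    }

    // Execute phase: run the deferred callbacks in declaration order.
    void execute() {
        assert(compiled_ && "compile() must run before execute()");
        for (auto& pass : passes_) pass.execute(log_);
    }

    // The graph is rebuilt from scratch every frame.
    void reset() {
        passes_.clear();
        log_.clear();
        compiled_ = false;
    }

    const Log& log() const { return log_; }

private:
    struct Pass {
        std::string name;
        std::function<void(Log&)> execute;
    };
    std::vector<Pass> passes_;
    Log log_;
    bool compiled_ = false;
};

// Builds and runs one frame; returns the order in which events happened.
inline std::vector<std::string> runMiniFrame() {
    MiniFrameGraph graph;
    graph.addPass("depth",
        [](MiniFrameGraph::Log& log) { log.push_back("setup depth"); },
        [](MiniFrameGraph::Log& log) { log.push_back("execute depth"); });
    graph.addPass("lighting",
        [](MiniFrameGraph::Log& log) { log.push_back("setup lighting"); },
        [](MiniFrameGraph::Log& log) { log.push_back("execute lighting"); });
    graph.compile();
    graph.execute();
    return graph.log();
}
```

In this sketch runMiniFrame() records both setup steps first, then the compile step, then both execute steps, which is the ordering a retained-mode API implies.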
Frame Graph setup phase
 Define render / compute passes
 Define input and output resources for each pass
 Code flow is similar to immediate mode rendering
Setup
Compile
Execute
Frame Graph resources
 Render passes must declare all used resources
 Read
 Write
 Create
 External permanent resources are imported to Frame Graph
 History buffer for TAA
 Backbuffer
 etc.
Frame Graph resource example
RenderPass::RenderPass(FrameGraphBuilder& builder)
{
    // Declare new transient resource
    FrameGraphTextureDesc desc;
    desc.width = 1280;
    desc.height = 720;
    desc.format = RenderFormat_D32_FLOAT;
    desc.initialState = FrameGraphTextureDesc::Clear;
    m_renderTarget = builder.createTexture(desc);
}
RenderPass → Render Target
Frame Graph setup example
RenderPass::RenderPass(FrameGraphBuilder& builder,
                       FrameGraphResource input,
                       FrameGraphMutableResource renderTarget)
{
    // Declare resource dependencies
    m_input = builder.read(input, readFlags);
    m_renderTarget = builder.write(renderTarget, writeFlags);
}
Input, Render Target (version 1) → RenderPass → Render Target (version 2)
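The version labels above suggest that a write produces a renamed handle. A minimal sketch of that idea, assuming hypothetical ResourceHandle and VersionedBuilder types (the real Frostbite handle layout is not given in the slides):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical handle: which resource, and which version of it.
struct ResourceHandle {
    std::uint32_t index;
    std::uint32_t version;
};

class VersionedBuilder {
public:
    // "Create": a new transient resource starts at version 1.
    ResourceHandle create() {
        latest_.push_back(1);
        return {static_cast<std::uint32_t>(latest_.size() - 1), 1};
    }

    // "Read": only the newest version may be read (an assumption of this
    // sketch); the handle is returned unchanged.
    ResourceHandle read(ResourceHandle h) const {
        assert(isCurrent(h));
        return h;
    }

    // "Write": consumes the current version and returns the renamed one,
    // so later passes depend on the writer through the new handle.
    ResourceHandle write(ResourceHandle h) {
        assert(isCurrent(h));
        return {h.index, ++latest_[h.index]};
    }

    bool isCurrent(ResourceHandle h) const {
        return h.index < latest_.size() && latest_[h.index] == h.version;
    }

private:
    std::vector<std::uint32_t> latest_;  // newest version per resource
};
```

Passing the old handle to a later pass then fails the isCurrent check, which is one way stale dependencies could be caught during setup.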
Advanced FrameGraph operations
 Deferred-created resources
 Declare resource early, allocate on first actual use
 Automatic resource bind flags, based on usage
 Derived resource parameters
 Create render pass output based on input size / format
 Derive bind flags based on usage
 MoveSubresource
 Forward one resource to another
 Automatically creates sub-resource views / aliases
 Allows “time travel”
Reflection module
Deferred shading module
MoveSubresource example
Gbuffer pass
Depth Buffer
Depth pass
Gbuffer 1
Gbuffer 2
Lighting buffer
2D Render Target
Gbuffer 3
Depth Buffer
Lighting
Cubemap (X+) … Cubemap (Z+)
Move
Lighting buffer
2D Render Target
Subresource 5
Convolution
Reflection
probe
Frame Graph compilation phase
 Cull unreferenced resources and passes
 Can be a bit more sloppy during declaration phase
 Aim to reduce configuration complexity
 Simplifies conditional passes, debug rendering, etc.
 Calculate resource lifetimes
 Allocate concrete GPU resources based on usage
 Simple greedy allocation algorithm
 Acquire right before first use, release after last use
 Extend lifetimes for async compute
 Derive resource bind flags based on usage
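The lifetime and greedy allocation steps can be sketched as follows. This assumes a simplified model in which whole memory blocks are reused first-fit and sizes are in arbitrary units; TransientResource, computeLifetimes and allocateGreedy are hypothetical names, and a real back-end would place resources in heaps or pages instead.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct TransientResource {
    std::size_t size;  // units required
    int firstPass;     // filled in by computeLifetimes, -1 if unused
    int lastPass;
    int block;         // memory block chosen by allocateGreedy, -1 if none
};

inline TransientResource makeResource(std::size_t size) {
    return TransientResource{size, -1, -1, -1};
}

// passes[i] lists the indices of the resources that pass i reads or writes.
inline void computeLifetimes(const std::vector<std::vector<int>>& passes,
                             std::vector<TransientResource>& resources) {
    for (int p = 0; p < static_cast<int>(passes.size()); ++p) {
        for (int r : passes[p]) {
            if (resources[r].firstPass < 0) resources[r].firstPass = p;
            resources[r].lastPass = p;
        }
    }
}

// Greedy first-fit: acquire right before first use, release after last use.
// Returns the total size of all blocks that had to be created.
inline std::size_t allocateGreedy(std::size_t passCount,
                                  std::vector<TransientResource>& resources) {
    struct Block { std::size_t size; bool free; };
    std::vector<Block> blocks;
    std::size_t total = 0;
    for (int p = 0; p < static_cast<int>(passCount); ++p) {
        for (auto& res : resources) {  // acquire
            if (res.firstPass != p) continue;
            for (std::size_t b = 0; b < blocks.size() && res.block < 0; ++b) {
                if (blocks[b].free && blocks[b].size >= res.size) {
                    blocks[b].free = false;
                    res.block = static_cast<int>(b);
                }
            }
            if (res.block < 0) {  // nothing reusable: grow the pool
                blocks.push_back({res.size, false});
                res.block = static_cast<int>(blocks.size() - 1);
                total += res.size;
            }
        }
        for (auto& res : resources)  // release
            if (res.lastPass == p) blocks[res.block].free = true;
    }
    return total;
}
```

For a four-pass frame (depth, G-buffer, lighting, post) with resources of size 4, 8, 8 and 8, this sketch needs 20 units instead of 28, because the final target reuses the block released by the G-buffer.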
Sub-graph culling example
Gbuffer pass
Depth Buffer
Depth pass
Gbuffer 1
Gbuffer 2
Lighting buffer
Post
Final target
Gbuffer 3
Depth Buffer
Debug View
Debug output
Debug output texture is not consumed, therefore it and the render pass are culled
Present
Lighting
Debug visualization is switched on by connecting the debug output to the back buffer node
Sub-graph culling example
Gbuffer pass
Depth Buffer
Depth pass
Gbuffer 1
Gbuffer 2
Lighting buffer
Post
Final target
Depth Buffer
Debug View
Debug output
Present
Lighting
Move
Lighting and postprocessing parts of the pipeline are automatically disabled
Gbuffer 3
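One way such culling could work is reference counting with a work stack. This is a sketch under the assumption that every resource has a single producer; PassDecl, cullPasses and debugViewFrame are hypothetical names, and the G-buffer targets are collapsed into one resource for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct PassDecl {
    std::vector<int> reads;   // resource indices
    std::vector<int> writes;  // resource indices
};

// importedOutputs: resources consumed outside the graph (e.g. the back buffer).
// Returns, per pass, whether it survives culling.
inline std::vector<bool> cullPasses(const std::vector<PassDecl>& passes,
                                    int resourceCount,
                                    const std::vector<int>& importedOutputs) {
    std::vector<int> passRefs(passes.size(), 0);
    std::vector<int> resourceRefs(resourceCount, 0);
    std::vector<int> producer(resourceCount, -1);

    // A pass is referenced once per output, a resource once per reader.
    for (std::size_t p = 0; p < passes.size(); ++p) {
        passRefs[p] = static_cast<int>(passes[p].writes.size());
        for (int r : passes[p].reads) ++resourceRefs[r];
        for (int w : passes[p].writes) producer[w] = static_cast<int>(p);
    }
    for (int r : importedOutputs) ++resourceRefs[r];

    // Start from resources nobody reads and propagate upstream.
    std::vector<int> stack;
    for (int r = 0; r < resourceCount; ++r)
        if (resourceRefs[r] == 0) stack.push_back(r);

    while (!stack.empty()) {
        int r = stack.back();
        stack.pop_back();
        int p = producer[r];
        if (p < 0) continue;
        if (--passRefs[p] == 0)
            for (int in : passes[p].reads)
                if (--resourceRefs[in] == 0) stack.push_back(in);
    }

    std::vector<bool> alive(passes.size());
    for (std::size_t p = 0; p < passes.size(); ++p) alive[p] = passRefs[p] > 0;
    return alive;
}

// Resources: 0 depth, 1 G-buffer, 2 lighting buffer, 3 final target, 4 debug output.
inline std::vector<PassDecl> debugViewFrame() {
    return {
        {{},     {0}},  // 0: depth pass
        {{0},    {1}},  // 1: G-buffer pass
        {{0, 1}, {2}},  // 2: lighting
        {{2},    {3}},  // 3: post
        {{1},    {4}},  // 4: debug view
    };
}
```

With resource 3 as the presented output, only the debug view is culled; with resource 4 presented instead, lighting and post are culled while the debug view survives.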
Frame Graph execution phase
 Execute callback functions for each render pass
 Immediate mode rendering code
 Using familiar RenderContext API
 Set state, resources, shaders
 Draw, Dispatch
 Get real GPU resources from handles generated in setup phase
Async compute
 Could derive from dependency graph automatically
 Manual control desired
 Great potential for performance savings, but…
 Memory increase
 Can hurt performance if misused
 Opt-in per render pass
 Kicked off on main timeline
 Sync point at first use of output resource on another queue
 Resource lifetimes automatically extended to sync point
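A small sketch of how sync points and lifetime extension could be derived, assuming passes are listed in main-timeline submission order. QueuedPass, lastUseWithAsync and ssaoFrame are hypothetical names, and the shadow map resource is an illustrative addition.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct QueuedPass {
    bool async;               // true if the pass opted in to the async queue
    std::vector<int> reads;   // resource indices
    std::vector<int> writes;  // resource indices
};

// Returns, per resource, the index of the last pass that keeps it alive.
inline std::vector<int> lastUseWithAsync(const std::vector<QueuedPass>& passes,
                                         int resourceCount) {
    std::vector<int> lastUse(resourceCount, -1);
    std::vector<bool> writtenAsync(resourceCount, false);
    std::vector<bool> isSyncPoint(passes.size(), false);
    const int passCount = static_cast<int>(passes.size());

    for (int p = 0; p < passCount; ++p) {
        for (int r : passes[p].reads) {
            lastUse[r] = p;
            // A main-queue pass consuming async output is a sync point.
            if (!passes[p].async && writtenAsync[r]) isSyncPoint[p] = true;
        }
        for (int w : passes[p].writes) {
            lastUse[w] = p;
            writtenAsync[w] = passes[p].async;
        }
    }

    // Everything an async pass touches must stay alive until the next sync
    // point, because the async work may overlap the main-queue passes.
    for (int p = 0; p < passCount; ++p) {
        if (!passes[p].async) continue;
        int sync = p;
        while (sync < passCount && !isSyncPoint[sync]) ++sync;
        if (sync == passCount) continue;  // never consumed on the main queue
        for (int r : passes[p].reads) lastUse[r] = std::max(lastUse[r], sync);
        for (int w : passes[p].writes) lastUse[w] = std::max(lastUse[w], sync);
    }
    return lastUse;
}

// Resources: 0 Depth Buffer, 1 Raw AO, 2 Filtered AO, 3 shadow map (illustrative).
inline std::vector<QueuedPass> ssaoFrame(bool async) {
    return {
        {false, {},        {0}},  // 0: Depth pass
        {async, {0},       {1}},  // 1: SSAO
        {async, {1},       {2}},  // 2: SSAO Filter
        {false, {},        {3}},  // 3: Shadows
        {false, {0, 2, 3}, {}},   // 4: Lighting
    };
}
```

In this example Raw AO is released after the SSAO filter when everything runs on the main queue, but stays alive until Lighting once SSAO is moved to the async queue, which illustrates the memory increase listed above.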
Async compute
Main queue: Depth pass, SSAO, SSAO Filter, Shadows, Lighting
Depth Buffer
Raw AO
Filtered AO
Async compute
Main queue: Depth pass, Shadows, Lighting
Async queue: SSAO, SSAO Filter
Depth Buffer
Filtered AO
Raw AO
Sync point
Frame Graph async setup example
AmbientOcclusionPass::AmbientOcclusionPass(FrameGraphBuilder& builder)
{
    // The only change required to make this pass
    // and all its child passes run on async queue
    builder.asyncComputeEnable(true);
    // Rest of the setup code is unaffected
    // …
}
Pass declaration with C++
 Could just make a C++ class per RenderPass
 Breaks code flow
 Requires plenty of boilerplate
 Expensive to port existing code
 Settled on C++ lambdas
 Preserves code flow!
 Minimal changes to legacy code
 Wrap legacy code in a lambda
 Add resource usage declarations
Pass declaration with C++ lambdas
FrameGraphResource addMyPass(FrameGraph& frameGraph,
    FrameGraphResource input, FrameGraphMutableResource output)
{
    struct PassData
    {
        FrameGraphResource input;
        FrameGraphMutableResource output;
    };
    auto& renderPass = frameGraph.addCallbackPass<PassData>("MyRenderPass",
        [&](RenderPassBuilder& builder, PassData& data)
        {
            // Declare all resource accesses during setup phase
            data.input = builder.read(input);
            data.output = builder.useRenderTarget(output).targetTextures[0];
        },
        [=](const PassData& data, const RenderPassResources& resources, IRenderContext* renderContext)
        {
            // Render stuff during execution phase
            drawTexture2d(renderContext, resources.getTexture(data.input));
        });
    return renderPass.output;
}
Callouts: Resources (PassData), Setup (first lambda), Execute, deferred (second lambda)
Render modules
 Two types of render modules:
1. Free-standing stateless functions
 Inputs and outputs are Frame Graph resource handles
 May create nested render passes
 Most common module type in Frostbite
2. Persistent render modules
 May have some persistent resources (LUTs, history buffers, etc.)
 WorldRenderer still orchestrates high-level rendering
 Does not allocate any GPU resources
 Just kicks off rendering modules at the high level
 Much easier to extend
 Code size reduced from 15K to 5K SLOC
Communication between modules
 Modules may communicate through a blackboard
 Hash table of components
 Accessed via component Type ID
 Allows controlled coupling
void BlurModule::renderBlurPyramid(
    FrameGraph& frameGraph,
    FrameGraphBlackboard& blackboard)
{
    // Produce blur pyramid in the blur module
    auto& blurData = blackboard.add<BlurPyramidData>();
    addBlurPyramidPass(frameGraph, blurData);
}
#include "BlurModule.h"
void TonemapModule::createBlurPyramid(
    FrameGraph& frameGraph,
    const FrameGraphBlackboard& blackboard)
{
    // Consume blur pyramid in a different module
    const auto& blurData = blackboard.get<BlurPyramidData>();
    addTonemapPass(frameGraph, blurData);
}
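A blackboard of this kind could be sketched as a hash table keyed by type ID. This is an illustration built on std::type_index and std::any (C++17), not the Frostbite implementation; the Blackboard class and the mipCount field are hypothetical.

```cpp
#include <any>
#include <cassert>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>
#include <utility>

// Hypothetical blackboard: one component per type, looked up by type ID.
class Blackboard {
public:
    // Producer side: create the component and return a mutable reference.
    template <typename T, typename... Args>
    T& add(Args&&... args) {
        assert(!has<T>());
        std::any& slot = components_[std::type_index(typeid(T))];
        return slot.emplace<T>(std::forward<Args>(args)...);
    }

    // Consumer side: read-only access; the component must already exist.
    template <typename T>
    const T& get() const {
        auto it = components_.find(std::type_index(typeid(T)));
        assert(it != components_.end());
        return std::any_cast<const T&>(it->second);
    }

    template <typename T>
    bool has() const {
        return components_.count(std::type_index(typeid(T))) != 0;
    }

private:
    std::unordered_map<std::type_index, std::any> components_;
};

// Illustrative component; the real BlurPyramidData layout is not shown in the slides.
struct BlurPyramidData {
    int mipCount = 0;
};
```

The consumer only needs the component's header, so the coupling between modules is limited to the shared data type.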
Transient Resource System
Transient resource system
 Transient /ˈtranzɪənt/ adjective
Lasting only for a short time; impermanent.
 Resources that are alive for no longer than one frame
 Buffers, depth and color targets, UAVs
 Strive to minimize resource lifetimes within a frame
 Allocate resources where they are used
 Directly in leaf rendering systems
 Deallocate as soon as possible
 Make it easier to write self-contained features
 Critical component of Frame Graph
Transient resource system back-end
 Implementation depends on platform capabilities
 Aliasing in physical memory (XB1)
 Aliasing in virtual memory (DX12, PS4)
 Object pools (DX11)
 Atomic linear allocator for buffers
 No aliasing, just blast through memory
 Mostly used for sending data to GPU
 Memory pools for textures
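The atomic linear allocator for buffers might look like the sketch below. It hands out offsets into a fixed-size range with a compare-and-swap loop and is reset once per frame; the class name, the kFailed sentinel and the offset-based interface are assumptions made for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical lock-free bump allocator over a fixed range of bytes.
class LinearAllocator {
public:
    static constexpr std::size_t kFailed = static_cast<std::size_t>(-1);

    explicit LinearAllocator(std::size_t capacity) : capacity_(capacity) {}

    // Returns the byte offset of the allocation, or kFailed when out of space.
    // alignment must be a non-zero power of two.
    std::size_t allocate(std::size_t size, std::size_t alignment) {
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
        std::size_t current = head_.load(std::memory_order_relaxed);
        for (;;) {
            const std::size_t start = (current + alignment - 1) & ~(alignment - 1);
            if (start > capacity_ || size > capacity_ - start) return kFailed;
            // On failure, current is refreshed with the latest head and we retry.
            if (head_.compare_exchange_weak(current, start + size,
                                            std::memory_order_relaxed))
                return start;
        }
    }

    // No per-allocation free: the whole range is recycled at once.
    void reset() { head_.store(0, std::memory_order_relaxed); }

    std::size_t used() const { return head_.load(std::memory_order_relaxed); }

private:
    std::size_t capacity_;
    std::atomic<std::size_t> head_{0};
};
```

Because nothing is ever freed individually, there is no aliasing to track; the cost is that memory use only shrinks when the allocator is reset.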
Complexity / efficiency chart: object pools (DX11 PC), aliasing in virtual memory (DX12 PC, PS4), aliasing in physical memory (XB1)
Transient textures on PlayStation 4
Depth Buffer
Depth pass, Gbuffer pass, SSAO, Lighting
Gbuffer 1
Gbuffer 2
Gbuffer 3
AO
Lighting buffer
Post
Final output
Waste due to fragmentation
Time
Virtual address
Heap 6
Heap 5
Heap 4
Heap 3
Heap 2
Heap 1
Transient textures on DirectX 12 PC
Depth Buffer
Depth pass, Gbuffer pass, SSAO, Lighting
Gbuffer 1
Gbuffer 2
Gbuffer 3
AO
Lighting buffer
Post
Final output
Time
Virtual address
Many small heaps mean fragmented address space
Transient textures on Xbox One
Time
Physical address
Depth Buffer
Depth pass, Gbuffer pass, SSAO, Lighting
Gbuffer 1
Gbuffer 2
Gbuffer 3
AO
Lighting buffer
Post
Final output
Lighting buffer
Light buffer is disjoint in physical memory
Transient textures on Xbox One
Depth Buffer
Depth pass, Gbuffer pass, SSAO, Lighting
Gbuffer 1
Gbuffer 2
Gbuffer 3
AO
Post
Final output
Lighting buffer
Page 1
Page 2
Page 3
Page 4
Page 5
Page 0
Physical memory pool
Time
Virtual address
Memory aliasing considerations
 Must be very careful
 Ensure valid resource metadata state (FMASK, CMASK, DCC, etc.)
 Perform fast clears or discard / over-write resources or disable metadata
 Ensure resource lifetimes are correct
 Harder than it sounds
 Account for compute and graphics pipelining
 Account for async compute
 Ensure that physical pages are written to memory before reuse
DiscardResource & Clear
 Must be the first operation on a newly allocated resource
 Requires resource to be in the render target or depth write state
 Initializes resource metadata (HTILE, CMASK, FMASK, DCC, etc.)
 Similar to performing a fast-clear
 Resource contents remain undefined (not actually cleared)
 Prefer DiscardResource over Clear when possible
Aliasing barriers
 Add synchronization between work on GPU
 Add necessary cache flushes
 Use precise barriers to minimize performance cost
 Can use wildcard barriers for difficult cases (but expect IHV tears)
 Batch with all your other resource barriers in DirectX 12!
Aliasing barrier example
 Potential aliasing hazard due to pipelined CS and PS work
 CS and PS use different D3D resources, so transition barriers aren’t enough
 Must flush CS before PS or extend CS resource lifetimes
Aliasing barrier example
 Serialized compute work ensures correctness when memory aliasing
 May hurt performance in some cases
 Use explicit async compute when overlap is critical for performance
Transient resource allocation results
Non-aliasing memory layout (720p)
Time
147 MB total
DirectX 12 PC memory layout (720p)
Time
80 MB total
PlayStation 4 memory layout (720p)
Time
77 MB total
Xbox One memory layout (720p)
Time
76 MB total
ESRAM
DRAM
32 MB ESRAM
44 MB DRAM
What about 4K?
Non-aliasing memory layout (4K, DX12 PC)
Time
1042 MB total
Aliasing memory layout (4K, DX12 PC)
Time
472 MB total
570 MB saved
Conclusion
Summary
 Many benefits from full frame knowledge
 Huge memory savings from resource aliasing
 Semi-automatic async compute
 Simplified rendering pipeline configuration
 Nice visualization and diagnostic tools
 Graphs are an attractive representation of rendering pipelines
 Intuitive and familiar concept
 Similar to CPU job graphs or shader graphs
 Modern C++ features ease the pain of retained mode API
Future work
 Global optimization of resource barriers
 Async compute bookmarks
 Profile-guided optimization
 Async compute
 Memory allocation
 ESRAM allocation
Special thanks
 Johan Andersson (Frostbite Labs)
 Charles de Rousiers (Frostbite)
 Tomasz Stachowiak (Frostbite)
 Simon Taylor (Frostbite)
 Jon Valdes (Frostbite)
 Ivan Nevraev (Microsoft)
 Matt Lee (Microsoft)
 Matthäus G. Chajdas (AMD)
 Christina Coffin (Light & Dark Arts)
 Julien Merceron (Bandai Namco)
Questions?
YURIY@FROSTBITE.COM
@YURIYODONNELL

CEDEC 2018 - Functional Symbiosis of Art Direction and ProceduralismElectronic Arts / DICE
 
SIGGRAPH 2018 - PICA PICA and NVIDIA Turing
SIGGRAPH 2018 - PICA PICA and NVIDIA TuringSIGGRAPH 2018 - PICA PICA and NVIDIA Turing
SIGGRAPH 2018 - PICA PICA and NVIDIA TuringElectronic Arts / DICE
 
SIGGRAPH 2018 - Full Rays Ahead! From Raster to Real-Time Raytracing
SIGGRAPH 2018 - Full Rays Ahead! From Raster to Real-Time RaytracingSIGGRAPH 2018 - Full Rays Ahead! From Raster to Real-Time Raytracing
SIGGRAPH 2018 - Full Rays Ahead! From Raster to Real-Time RaytracingElectronic Arts / DICE
 
HPG 2018 - Game Ray Tracing: State-of-the-Art and Open Problems
HPG 2018 - Game Ray Tracing: State-of-the-Art and Open ProblemsHPG 2018 - Game Ray Tracing: State-of-the-Art and Open Problems
HPG 2018 - Game Ray Tracing: State-of-the-Art and Open ProblemsElectronic Arts / DICE
 
EPC 2018 - SEED - Exploring The Collaboration Between Proceduralism & Deep Le...
EPC 2018 - SEED - Exploring The Collaboration Between Proceduralism & Deep Le...EPC 2018 - SEED - Exploring The Collaboration Between Proceduralism & Deep Le...
EPC 2018 - SEED - Exploring The Collaboration Between Proceduralism & Deep Le...Electronic Arts / DICE
 
DD18 - SEED - Raytracing in Hybrid Real-Time Rendering
DD18 - SEED - Raytracing in Hybrid Real-Time RenderingDD18 - SEED - Raytracing in Hybrid Real-Time Rendering
DD18 - SEED - Raytracing in Hybrid Real-Time RenderingElectronic Arts / DICE
 
Creativity of Rules and Patterns: Designing Procedural Systems
Creativity of Rules and Patterns: Designing Procedural SystemsCreativity of Rules and Patterns: Designing Procedural Systems
Creativity of Rules and Patterns: Designing Procedural SystemsElectronic Arts / DICE
 
Shiny Pixels and Beyond: Real-Time Raytracing at SEED
Shiny Pixels and Beyond: Real-Time Raytracing at SEEDShiny Pixels and Beyond: Real-Time Raytracing at SEED
Shiny Pixels and Beyond: Real-Time Raytracing at SEEDElectronic Arts / DICE
 
Future Directions for Compute-for-Graphics
Future Directions for Compute-for-GraphicsFuture Directions for Compute-for-Graphics
Future Directions for Compute-for-GraphicsElectronic Arts / DICE
 
A Certain Slant of Light - Past, Present and Future Challenges of Global Illu...
A Certain Slant of Light - Past, Present and Future Challenges of Global Illu...A Certain Slant of Light - Past, Present and Future Challenges of Global Illu...
A Certain Slant of Light - Past, Present and Future Challenges of Global Illu...Electronic Arts / DICE
 
Photogrammetry and Star Wars Battlefront
Photogrammetry and Star Wars BattlefrontPhotogrammetry and Star Wars Battlefront
Photogrammetry and Star Wars BattlefrontElectronic Arts / DICE
 
Moving Frostbite to Physically Based Rendering
Moving Frostbite to Physically Based RenderingMoving Frostbite to Physically Based Rendering
Moving Frostbite to Physically Based RenderingElectronic Arts / DICE
 

More from Electronic Arts / DICE (20)

GDC2019 - SEED - Towards Deep Generative Models in Game Development
GDC2019 - SEED - Towards Deep Generative Models in Game DevelopmentGDC2019 - SEED - Towards Deep Generative Models in Game Development
GDC2019 - SEED - Towards Deep Generative Models in Game Development
 
SIGGRAPH 2010 - Style and Gameplay in the Mirror's Edge
SIGGRAPH 2010 - Style and Gameplay in the Mirror's EdgeSIGGRAPH 2010 - Style and Gameplay in the Mirror's Edge
SIGGRAPH 2010 - Style and Gameplay in the Mirror's Edge
 
SEED - Halcyon Architecture
SEED - Halcyon ArchitectureSEED - Halcyon Architecture
SEED - Halcyon Architecture
 
Syysgraph 2018 - Modern Graphics Abstractions & Real-Time Ray Tracing
Syysgraph 2018 - Modern Graphics Abstractions & Real-Time Ray TracingSyysgraph 2018 - Modern Graphics Abstractions & Real-Time Ray Tracing
Syysgraph 2018 - Modern Graphics Abstractions & Real-Time Ray Tracing
 
CEDEC 2018 - Towards Effortless Photorealism Through Real-Time Raytracing
CEDEC 2018 - Towards Effortless Photorealism Through Real-Time RaytracingCEDEC 2018 - Towards Effortless Photorealism Through Real-Time Raytracing
CEDEC 2018 - Towards Effortless Photorealism Through Real-Time Raytracing
 
CEDEC 2018 - Functional Symbiosis of Art Direction and Proceduralism
CEDEC 2018 - Functional Symbiosis of Art Direction and ProceduralismCEDEC 2018 - Functional Symbiosis of Art Direction and Proceduralism
CEDEC 2018 - Functional Symbiosis of Art Direction and Proceduralism
 
SIGGRAPH 2018 - PICA PICA and NVIDIA Turing
SIGGRAPH 2018 - PICA PICA and NVIDIA TuringSIGGRAPH 2018 - PICA PICA and NVIDIA Turing
SIGGRAPH 2018 - PICA PICA and NVIDIA Turing
 
SIGGRAPH 2018 - Full Rays Ahead! From Raster to Real-Time Raytracing
SIGGRAPH 2018 - Full Rays Ahead! From Raster to Real-Time RaytracingSIGGRAPH 2018 - Full Rays Ahead! From Raster to Real-Time Raytracing
SIGGRAPH 2018 - Full Rays Ahead! From Raster to Real-Time Raytracing
 
HPG 2018 - Game Ray Tracing: State-of-the-Art and Open Problems
HPG 2018 - Game Ray Tracing: State-of-the-Art and Open ProblemsHPG 2018 - Game Ray Tracing: State-of-the-Art and Open Problems
HPG 2018 - Game Ray Tracing: State-of-the-Art and Open Problems
 
EPC 2018 - SEED - Exploring The Collaboration Between Proceduralism & Deep Le...
EPC 2018 - SEED - Exploring The Collaboration Between Proceduralism & Deep Le...EPC 2018 - SEED - Exploring The Collaboration Between Proceduralism & Deep Le...
EPC 2018 - SEED - Exploring The Collaboration Between Proceduralism & Deep Le...
 
DD18 - SEED - Raytracing in Hybrid Real-Time Rendering
DD18 - SEED - Raytracing in Hybrid Real-Time RenderingDD18 - SEED - Raytracing in Hybrid Real-Time Rendering
DD18 - SEED - Raytracing in Hybrid Real-Time Rendering
 
Creativity of Rules and Patterns: Designing Procedural Systems
Creativity of Rules and Patterns: Designing Procedural SystemsCreativity of Rules and Patterns: Designing Procedural Systems
Creativity of Rules and Patterns: Designing Procedural Systems
 
Shiny Pixels and Beyond: Real-Time Raytracing at SEED
Shiny Pixels and Beyond: Real-Time Raytracing at SEEDShiny Pixels and Beyond: Real-Time Raytracing at SEED
Shiny Pixels and Beyond: Real-Time Raytracing at SEED
 
Future Directions for Compute-for-Graphics
Future Directions for Compute-for-GraphicsFuture Directions for Compute-for-Graphics
Future Directions for Compute-for-Graphics
 
A Certain Slant of Light - Past, Present and Future Challenges of Global Illu...
A Certain Slant of Light - Past, Present and Future Challenges of Global Illu...A Certain Slant of Light - Past, Present and Future Challenges of Global Illu...
A Certain Slant of Light - Past, Present and Future Challenges of Global Illu...
 
Lighting the City of Glass
Lighting the City of GlassLighting the City of Glass
Lighting the City of Glass
 
Photogrammetry and Star Wars Battlefront
Photogrammetry and Star Wars BattlefrontPhotogrammetry and Star Wars Battlefront
Photogrammetry and Star Wars Battlefront
 
Stochastic Screen-Space Reflections
Stochastic Screen-Space ReflectionsStochastic Screen-Space Reflections
Stochastic Screen-Space Reflections
 
Moving Frostbite to Physically Based Rendering
Moving Frostbite to Physically Based RenderingMoving Frostbite to Physically Based Rendering
Moving Frostbite to Physically Based Rendering
 
Rendering Battlefield 4 with Mantle
Rendering Battlefield 4 with MantleRendering Battlefield 4 with Mantle
Rendering Battlefield 4 with Mantle
 

Recently uploaded

UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxUI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxAndreas Kunz
 
Introduction to Firebase Workshop Slides
Introduction to Firebase Workshop SlidesIntroduction to Firebase Workshop Slides
Introduction to Firebase Workshop Slidesvaideheekore1
 
Keeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository worldKeeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository worldRoberto Pérez Alcolea
 
eSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration toolseSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration toolsosttopstonverter
 
Effectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorEffectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorTier1 app
 
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics
 
What’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesWhat’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesVictoriaMetrics
 
Data modeling 101 - Basics - Software Domain
Data modeling 101 - Basics - Software DomainData modeling 101 - Basics - Software Domain
Data modeling 101 - Basics - Software DomainAbdul Ahad
 
Amazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilitiesAmazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilitiesKrzysztofKkol1
 
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jGraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jNeo4j
 
Strategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero resultsStrategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero resultsJean Silva
 
SAM Training Session - How to use EXCEL ?
SAM Training Session - How to use EXCEL ?SAM Training Session - How to use EXCEL ?
SAM Training Session - How to use EXCEL ?Alexandre Beguel
 
Copilot para Microsoft 365 y Power Platform Copilot
Copilot para Microsoft 365 y Power Platform CopilotCopilot para Microsoft 365 y Power Platform Copilot
Copilot para Microsoft 365 y Power Platform CopilotEdgard Alejos
 
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + KobitonLeveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + KobitonApplitools
 
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...OnePlan Solutions
 
Ronisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited CatalogueRonisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited Catalogueitservices996
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLionel Briand
 
2024 DevNexus Patterns for Resiliency: Shuffle shards
2024 DevNexus Patterns for Resiliency: Shuffle shards2024 DevNexus Patterns for Resiliency: Shuffle shards
2024 DevNexus Patterns for Resiliency: Shuffle shardsChristopher Curtin
 
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full RecordingOpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full RecordingShane Coughlan
 
Zer0con 2024 final share short version.pdf
Zer0con 2024 final share short version.pdfZer0con 2024 final share short version.pdf
Zer0con 2024 final share short version.pdfmaor17
 

Recently uploaded (20)

UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxUI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
 
Introduction to Firebase Workshop Slides
Introduction to Firebase Workshop SlidesIntroduction to Firebase Workshop Slides
Introduction to Firebase Workshop Slides
 
Keeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository worldKeeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository world
 
eSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration toolseSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration tools
 
Effectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorEffectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryError
 
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
 
What’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesWhat’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 Updates
 
Data modeling 101 - Basics - Software Domain
Data modeling 101 - Basics - Software DomainData modeling 101 - Basics - Software Domain
Data modeling 101 - Basics - Software Domain
 
Amazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilitiesAmazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilities
 
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jGraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
 
Strategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero resultsStrategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero results
 
SAM Training Session - How to use EXCEL ?
SAM Training Session - How to use EXCEL ?SAM Training Session - How to use EXCEL ?
SAM Training Session - How to use EXCEL ?
 
Copilot para Microsoft 365 y Power Platform Copilot
Copilot para Microsoft 365 y Power Platform CopilotCopilot para Microsoft 365 y Power Platform Copilot
Copilot para Microsoft 365 y Power Platform Copilot
 
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + KobitonLeveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
 
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
 
Ronisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited CatalogueRonisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited Catalogue
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and Repair
 
2024 DevNexus Patterns for Resiliency: Shuffle shards
2024 DevNexus Patterns for Resiliency: Shuffle shards2024 DevNexus Patterns for Resiliency: Shuffle shards
2024 DevNexus Patterns for Resiliency: Shuffle shards
 
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full RecordingOpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
 
Zer0con 2024 final share short version.pdf
Zer0con 2024 final share short version.pdfZer0con 2024 final share short version.pdf
Zer0con 2024 final share short version.pdf
 

FrameGraph: Extensible Rendering Architecture in Frostbite

  • 1. FrameGraph: Extensible Rendering Architecture in Frostbite. Yuriy O’Donnell, Rendering Engineer, Frostbite
  • 2. Outline
      • Introduction and history
      • Frame Graph
      • Transient Resource System
      • Conclusions
    Spoilers:
      • Improved engine extensibility
      • Simplified async compute
      • Automated ESRAM aliasing
      • Saved tons of GPU memory
  • 4. Frostbite 2007 vs 2017
    2007
      • DICE next-gen engine
      • Built from the ground up for
          • Xbox 360
          • PlayStation 3
          • Multi-core PCs
          • DirectX 9 SM3 & Direct3D 10
      • To be used in future DICE games
    2017
      • The EA engine
      • Evolved and scaled up for
          • Xbox One
          • PlayStation 4
          • Multi-core PCs
          • DirectX 12
      • Used in ~15 current and future EA games
  • 6. Rendering system overview '07 (diagram labels): Game Renderer, World Renderer, UI, Shading system, Direct3D / libGCM, Meshes, Terrain, Sky, Undergrowth, Particles, Decals
  • 7. Rendering system overview '17 (diagram labels): Game Renderer, World Renderer, UI, Shading system, Direct3D 11 / Direct3D 12 / libGNM (Metal / GLES / Mantle), Meshes, Terrain, Sky, Undergrowth, Particles, Decals, Reflections, PBR, Shadows, GI, Post-processing, Volumetric FX, HDR, Game-specific rendering features
  • 8. Rendering system overview (simplified) (diagram labels): World Renderer, Shading System, Features, Render Context, GFX APIs
  • 9. WorldRenderer
      • Orchestrates all rendering
      • Code-driven architecture
      • Main world geometry (via Shading System)
      • Lighting, Post-processing (via Render Context)
      • Knows about all views and render passes
      • Marshals settings and resources between systems
      • Allocates resources (render targets, buffers)
  • 10. Battlefield 4 rendering passes (Features)
      mainTransDecal, fgOpaqueEmissive, subsurfaceScattering, skyAndFog, hairCoverage, mainTransDepth, linerarizeZ, mainTransparent, halfResUpsample, motionBlurDerive, motionBlurVelocity, motionBlurFilter, filmicEffectsEdge, spriteDof, fgTransparent, lensScope, filmicEffects, bloom, luminanceAvg, finalPost, overlay, fxaa, smaa, resample, screenEffect, hmdDistortion, spotlightShadowmaps, downsampleZ, linearizeZ, ssao, hbaoHalfZ, hbao, ssr, halfResZPass, halfResTransp, mainDistort, lightPassEnd, mainOpaque, linearizeZ, mainOpaqueEmissive, reflectionCapture, planarReflections, dynamicEnvmap, mainZPass, mainGBuffer, mainGBufferSimple, mainGBufferDecal, decalVolumes, mainGBufferFixup, msaaZDown, msaaClassify, lensFlareOcclusionQueries, lightPassBegin, cascadedShadowmaps
  • 11. WorldRenderer challenges
      • Explicit immediate mode rendering
      • Explicit resource management
          • Bespoke, artisanal hand-crafted ESRAM management
          • Multiple implementations by different game teams
      • Tight coupling between rendering systems
      • Limited extensibility
          • Game teams must fork / diverge to customize
      • Organically grew from 4k to 15k SLOC
          • Single functions with over 2k SLOC
          • Expensive to maintain, extend and merge/integrate
  • 12. Modular WorldRenderer goals
      • High-level knowledge of the full frame
      • Improved extensibility
          • Decoupled and composable code modules
      • Automatic resource management
      • Better visualizations and diagnostics
  • 13. New architectural components
      • Frame Graph
          • High-level representation of render passes and resources
          • Full knowledge of the frame
      • Transient Resource System
          • Resource allocation
          • Memory aliasing
    Diagram labels: World Renderer, Shading System, Features, GFX APIs, Render Context, Frame Graph, Transient Resources
  • 15. Frame Graph goals
      • Build high-level knowledge of the entire frame
      • Simplify resource management
      • Simplify rendering pipeline configuration
      • Simplify async compute and resource barriers
      • Allow self-contained and efficient rendering modules
      • Visualize and debug complex rendering pipelines
  • 16. Frame Graph example: render operations and resources for the entire frame expressed as a directed acyclic graph
    Diagram labels: passes Depth pass, Gbuffer pass, Lighting, Post, Present; resources Depth Buffer, Gbuffer 1-3, Lighting buffer, Backbuffer
  • 17. Graph of a Battlefield 4 frame: typically a few hundred passes and resources
  • 19. Frame Graph design
      • Moving away from immediate mode rendering
      • Rendering code split into passes
      • Multi-phase retained mode rendering API
          1. Setup phase
          2. Compile phase
          3. Execute phase
      • Built from scratch every frame
      • Code-driven architecture
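The three phases can be sketched in a few lines of C++. This is a minimal illustration only: `MiniFrameGraph`, `addPass` and `runExampleFrame` are hypothetical names, not Frostbite's API, and the compile step here does nothing beyond marking every pass as live.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical minimal frame graph; names are illustrative, not Frostbite's API.
class MiniFrameGraph {
public:
    using Setup = std::function<void()>;
    using Execute = std::function<void(std::vector<std::string>& log)>;

    // Setup phase: the setup callback runs immediately, while the execute
    // callback is only recorded for later.
    void addPass(std::string name, const Setup& setup, Execute execute) {
        setup();
        m_passes.push_back({std::move(name), std::move(execute), false});
    }

    // Compile phase: a real implementation would cull passes and allocate
    // resources here; this sketch simply marks every pass as live.
    void compile() {
        for (auto& pass : m_passes) pass.live = true;
    }

    // Execute phase: run the recorded callbacks in declaration order.
    void execute(std::vector<std::string>& log) {
        for (auto& pass : m_passes)
            if (pass.live) pass.execute(log);
        m_passes.clear();  // the graph is rebuilt from scratch next frame
    }

private:
    struct Pass {
        std::string name;
        Execute execute;
        bool live;
    };
    std::vector<Pass> m_passes;
};

// Builds and runs one frame; returns the order in which things happened.
inline std::vector<std::string> runExampleFrame() {
    std::vector<std::string> log;
    MiniFrameGraph graph;
    graph.addPass("depth", [&] { log.push_back("setup depth"); },
                  [](std::vector<std::string>& l) { l.push_back("execute depth"); });
    graph.addPass("lighting", [&] { log.push_back("setup lighting"); },
                  [](std::vector<std::string>& l) { l.push_back("execute lighting"); });
    graph.compile();
    graph.execute(log);
    return log;
}
```

In this sketch all setup code runs before any execute callback, which is what gives the compile step a view of the whole frame.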
  • 20. Frame Graph setup phase
      • Define render / compute passes
      • Define inputs and output resources for each pass
      • Code flow is similar to immediate mode rendering
  • 21. Frame Graph resources
      • Render passes must declare all used resources
          • Read
          • Write
          • Create
      • External permanent resources are imported to Frame Graph
          • History buffer for TAA
          • Backbuffer
          • etc.
  • 22. Frame Graph resource example
      RenderPass::RenderPass(FrameGraphBuilder& builder)
      {
          // Declare new transient resource
          FrameGraphTextureDesc desc;
          desc.width = 1280;
          desc.height = 720;
          desc.format = RenderFormat_D32_FLOAT;
          desc.initialState = FrameGraphTextureDesc::Clear;
          m_renderTarget = builder.createTexture(desc);
      }
    Diagram labels: RenderPass, Render Target
  • 23. Frame Graph setup example
      RenderPass::RenderPass(FrameGraphBuilder& builder,
          FrameGraphResource input,
          FrameGraphMutableResource renderTarget)
      {
          // Declare resource dependencies
          m_input = builder.read(input, readFlags);
          m_renderTarget = builder.write(renderTarget, writeFlags);
      }
    Diagram labels: Input, RenderPass, Render Target (version 1), Render Target (version 2)
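The diagram on this slide shows a write turning "Render Target (version 1)" into "Render Target (version 2)". One way to model that renaming is a handle that carries a version number. The sketch below is an assumption about how such handles could work; `ResourceHandle` and `VersionedResources` are hypothetical names, and the slides do not describe Frostbite's handle layout.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical handle: an index into the graph's resource table plus the
// version of the resource that this handle refers to.
struct ResourceHandle {
    std::uint32_t index;
    std::uint32_t version;
};

class VersionedResources {
public:
    // A newly created resource starts at version 1.
    ResourceHandle create() {
        m_currentVersion.push_back(1);
        return {static_cast<std::uint32_t>(m_currentVersion.size() - 1), 1};
    }

    // Reading does not modify the resource, so the handle is returned as-is.
    ResourceHandle read(ResourceHandle h) const { return h; }

    // Writing renames the resource: the returned handle refers to a new
    // version, and the handle passed in becomes stale.
    ResourceHandle write(ResourceHandle h) {
        return {h.index, ++m_currentVersion[h.index]};
    }

    // True if the handle refers to the latest version of its resource.
    bool isCurrent(ResourceHandle h) const {
        return h.index < m_currentVersion.size() &&
               m_currentVersion[h.index] == h.version;
    }

private:
    std::vector<std::uint32_t> m_currentVersion;
};
```

Because every write yields a distinct handle, each version has exactly one producer, which keeps the dependency graph acyclic even when a pass reads and writes the same texture.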
  • 24. Advanced FrameGraph operations
      • Deferred-created resources
          • Declare resource early, allocate on first actual use
          • Automatic resource bind flags, based on usage
      • Derived resource parameters
          • Create render pass output based on input size / format
          • Derive bind flags based on usage
      • MoveSubresource
          • Forward one resource to another
          • Automatically creates sub-resource views / aliases
          • Allows “time travel”
  • 25. MoveSubresource example
    Diagram labels: Deferred shading module (Depth pass, Depth Buffer, Gbuffer pass, Gbuffer 1-3, Lighting, Lighting buffer as 2D Render Target); Move (Lighting buffer, 2D Render Target, to Cubemap (Z+), Subresource 5); Reflection module (Convolution, Reflection probe)
  • 26. Frame Graph compilation phase
      • Cull unreferenced resources and passes
          • Can be a bit more sloppy during declaration phase
          • Aim to reduce configuration complexity
          • Simplifies conditional passes, debug rendering, etc.
      • Calculate resource lifetimes
      • Allocate concrete GPU resources based on usage
          • Simple greedy allocation algorithm
          • Acquire right before first use, release after last use
          • Extend lifetimes for async compute
      • Derive resource bind flags based on usage
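The "calculate resource lifetimes" step can be illustrated as a single sweep over the declared passes. This is a sketch under the assumption that passes execute in declaration order; `PassDecl`, `Lifetime` and `computeLifetimes` are hypothetical names introduced for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical pass declaration: indices of every resource the pass touches.
struct PassDecl {
    std::vector<std::size_t> resources;
};

// First and last pass index that uses a resource; 'used' stays false for
// resources that no pass references.
struct Lifetime {
    std::size_t firstUse = 0;
    std::size_t lastUse = 0;
    bool used = false;
};

// One sweep in declaration order: the first pass that touches a resource
// sets firstUse, and every later pass that touches it moves lastUse forward.
inline std::vector<Lifetime> computeLifetimes(const std::vector<PassDecl>& passes,
                                              std::size_t resourceCount) {
    std::vector<Lifetime> lifetimes(resourceCount);
    for (std::size_t passIndex = 0; passIndex < passes.size(); ++passIndex) {
        for (std::size_t r : passes[passIndex].resources) {
            Lifetime& lt = lifetimes.at(r);
            if (!lt.used) {
                lt.firstUse = passIndex;
                lt.used = true;
            }
            lt.lastUse = passIndex;
        }
    }
    return lifetimes;
}
```

An allocator can then acquire memory right before `firstUse` and release it after `lastUse`, as the slide describes.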
  • 27. Sub-graph culling example: the debug output texture is not consumed, therefore it and the Debug View render pass are culled
    Diagram labels: passes Depth pass, Gbuffer pass, Lighting, Post, Debug View, Present; resources Depth Buffer, Gbuffer 1-3, Lighting buffer, Final target, Debug output
  • 28. Sub-graph culling example: debug visualization is switched on by connecting the debug output to the back buffer node (Move); the lighting and postprocessing parts of the pipeline are automatically disabled
    Diagram labels: passes Depth pass, Gbuffer pass, Lighting, Post, Debug View, Present; resources Depth Buffer, Gbuffer 1-3, Lighting buffer, Final target, Debug output
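The culling behaviour in these two examples can be reproduced with a reverse sweep over the passes. The slides do not say which algorithm Frostbite uses, so this is only one plausible sketch. It assumes that passes are declared in dependency order and that each resource version is a separate index; `CullPass` and `cullUnreferenced` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical pass record; each resource version is a separate index.
struct CullPass {
    std::vector<std::size_t> reads;
    std::vector<std::size_t> writes;
    bool culled = false;
};

// Reverse sweep: a pass survives only if one of its outputs is needed, where
// "needed" means imported (for example the backbuffer) or read by a pass that
// itself survives. Assumes passes are declared in dependency order.
inline void cullUnreferenced(std::vector<CullPass>& passes,
                             const std::vector<bool>& importedResource) {
    std::vector<bool> needed = importedResource;
    for (std::size_t i = passes.size(); i-- > 0;) {
        CullPass& pass = passes[i];
        bool live = false;
        for (std::size_t w : pass.writes) live = live || needed[w];
        pass.culled = !live;
        if (live)
            for (std::size_t r : pass.reads) needed[r] = true;
    }
}
```

With the debug output left unconnected, only the Debug View pass is culled. If the Debug View pass writes to the imported backbuffer instead and Post writes to a non-imported target, Post and Lighting are culled while the depth and G-buffer passes survive.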
  • 29. Frame Graph execution phase
      • Execute callback functions for each render pass
      • Immediate mode rendering code
          • Using familiar RenderContext API
          • Set state, resources, shaders
          • Draw, Dispatch
      • Get real GPU resources from handles generated in setup phase
  • 30. Async compute
      • Could derive from dependency graph automatically
      • Manual control desired
          • Great potential for performance savings, but…
          • Memory increase
          • Can hurt performance if misused
      • Opt-in per render pass
          • Kicked off on main timeline
          • Sync point at first use of output resource on another queue
          • Resource lifetimes automatically extended to sync point
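  • 31. Async compute (diagram, single queue). Main queue: Depth pass, SSAO, SSAO Filter, Shadows, Lighting. Resources: Depth Buffer, Raw AO, Filtered AO
  • 32. Async compute (diagram, two queues). Main queue: Depth pass, Shadows, Lighting. Async queue: SSAO, SSAO Filter. Resources: Depth Buffer, Raw AO, Filtered AO. A sync point is marked on the main queue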
  • 33. Frame Graph async setup example
      AmbientOcclusionPass::AmbientOcclusionPass(FrameGraphBuilder& builder)
      {
          // The only change required to make this pass
          // and all its child passes run on async queue
          builder.asyncComputeEnable(true);

          // Rest of the setup code is unaffected
          // …
      }
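The rule that resource lifetimes are extended to the sync point can be sketched as a post-process on plain lifetimes. The code below is an assumption-laden illustration, not the Frostbite implementation: `QueuedPass` and `lastUseWithAsync` are hypothetical names, passes are indexed in declaration order, and the lifetime is conservatively extended to include the pass that waits on the async work.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical pass record; field names are illustrative.
struct QueuedPass {
    bool async = false;               // opted in to the async compute queue
    std::vector<std::size_t> reads;   // resource indices
    std::vector<std::size_t> writes;
};

// Returns, per resource, the index of the last pass during which its memory
// must stay reserved, with async work extended to its sync point.
inline std::vector<std::size_t> lastUseWithAsync(
        const std::vector<QueuedPass>& passes, std::size_t resourceCount) {
    std::vector<std::size_t> lastUse(resourceCount, 0);
    if (passes.empty()) return lastUse;
    const std::size_t endOfFrame = passes.size() - 1;

    // Plain lifetimes: last pass, in declaration order, touching the resource.
    for (std::size_t i = 0; i < passes.size(); ++i) {
        for (std::size_t r : passes[i].reads) lastUse[r] = i;
        for (std::size_t r : passes[i].writes) lastUse[r] = i;
    }

    // earliestSync[r]: earliest main-queue pass that (directly or through a
    // chain of async passes) waits for a consumer of resource r.
    std::vector<std::size_t> earliestSync(resourceCount, endOfFrame);
    for (std::size_t i = passes.size(); i-- > 0;) {
        const QueuedPass& pass = passes[i];
        std::size_t sync = i;  // a main-queue pass is its own sync point
        if (pass.async) {
            sync = endOfFrame;
            for (std::size_t w : pass.writes) sync = std::min(sync, earliestSync[w]);
            // Everything the async pass touches stays alive until the sync point.
            for (std::size_t r : pass.reads) lastUse[r] = std::max(lastUse[r], sync);
            for (std::size_t w : pass.writes) lastUse[w] = std::max(lastUse[w], sync);
        }
        for (std::size_t r : pass.reads) earliestSync[r] = std::min(earliestSync[r], sync);
    }
    return lastUse;
}
```

In the SSAO example, the raw AO texture would otherwise be released after the filter pass. With the extension it stays reserved until the lighting pass, so resources of the overlapping shadow pass cannot alias it.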
  • 34. Pass declaration with C++
      • Could just make a C++ class per RenderPass
          • Breaks code flow
          • Requires plenty of boilerplate
          • Expensive to port existing code
      • Settled on C++ lambdas
          • Preserves code flow!
          • Minimal changes to legacy code
          • Wrap legacy code in a lambda
          • Add resource usage declarations
  • 35. Pass declaration with C++ lambdas
      FrameGraphResource addMyPass(FrameGraph& frameGraph,
          FrameGraphResource input, FrameGraphMutableResource output)
      {
          struct PassData
          {
              FrameGraphResource input;
              FrameGraphMutableResource output;
          };

          auto& renderPass = frameGraph.addCallbackPass<PassData>("MyRenderPass",
              [&](RenderPassBuilder& builder, PassData& data)
              {
                  // Declare all resource accesses during setup phase
                  data.input = builder.read(input);
                  data.output = builder.useRenderTarget(output).targetTextures[0];
              },
              [=](const PassData& data, const RenderPassResources& resources, IRenderContext* renderContext)
              {
                  // Render stuff during execution phase
                  drawTexture2d(renderContext, resources.getTexture(data.input));
              });

          return renderPass.output;
      }
    Slide callouts: Resources, Setup, Execute (deferred)
  • 36. Render modules
      • Two types of render modules:
          1. Free-standing stateless functions
              • Inputs and outputs are Frame Graph resource handles
              • May create nested render passes
              • Most common module type in Frostbite
          2. Persistent render modules
              • May have some persistent resources (LUTs, history buffers, etc.)
      • WorldRenderer still orchestrates high-level rendering
          • Does not allocate any GPU resources
          • Just kicks off rendering modules at the high level
          • Much easier to extend
          • Code size reduced from 15K to 5K SLOC
  • 37. Communication between modules
      • Modules may communicate through a blackboard
          • Hash table of components
          • Accessed via component Type ID
          • Allows controlled coupling
      void BlurModule::renderBlurPyramid(
          FrameGraph& frameGraph,
          FrameGraphBlackboard& blackboard)
      {
          // Produce blur pyramid in the blur module
          auto& blurData = blackboard.add<BlurPyramidData>();
          addBlurPyramidPass(frameGraph, blurData);
      }

      #include "BlurModule.h"
      void TonemapModule::createBlurPyramid(
          FrameGraph& frameGraph,
          const FrameGraphBlackboard& blackboard)
      {
          // Consume blur pyramid in a different module
          const auto& blurData = blackboard.get<BlurPyramidData>();
          addTonemapPass(frameGraph, blurData);
      }
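A blackboard of this kind, a hash table of components keyed by type, can be sketched with the standard library alone. The class below is an illustration assuming C++17; `Blackboard` and its `has` method are hypothetical, and only the `add<T>()` / `get<T>()` call shapes are taken from the slide. `BlurPyramidData` is given a made-up field so the example has something to store.

```cpp
#include <any>
#include <cassert>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>

// Hypothetical blackboard: one component per C++ type, keyed by type id.
class Blackboard {
public:
    // Adds a default-constructed component of type T and returns it so the
    // producing module can fill it in.
    template <typename T>
    T& add() {
        auto result = m_components.try_emplace(std::type_index(typeid(T)), T{});
        assert(result.second && "component already on the blackboard");
        return std::any_cast<T&>(result.first->second);
    }

    // Read-only access for consuming modules; throws std::out_of_range if
    // no module has added a component of type T.
    template <typename T>
    const T& get() const {
        return std::any_cast<const T&>(m_components.at(std::type_index(typeid(T))));
    }

    template <typename T>
    bool has() const {
        return m_components.count(std::type_index(typeid(T))) != 0;
    }

private:
    std::unordered_map<std::type_index, std::any> m_components;
};

// Illustrative component; the field is an assumption made for the example.
struct BlurPyramidData {
    int mipCount = 0;
};
```

The consumer only needs the component's type declaration, not the producing module's internals, which is the "controlled coupling" the slide refers to.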
  • 39. Transient resource system
      • Transient /ˈtranzɪənt/ adjective: lasting only for a short time; impermanent.
      • Resources that are alive for no longer than one frame
          • Buffers, depth and color targets, UAVs
      • Strive to minimize resource lifetimes within a frame
      • Allocate resources where they are used
          • Directly in leaf rendering systems
          • Deallocate as soon as possible
          • Make it easier to write self-contained features
      • Critical component of Frame Graph
  • 40. Transient resource system back-end
      • Implementation depends on platform capabilities
          • Aliasing in physical memory (XB1)
          • Aliasing in virtual memory (DX12, PS4)
          • Object pools (DX11)
      • Atomic linear allocator for buffers
          • No aliasing, just blast through memory
          • Mostly used for sending data to GPU
      • Memory pools for textures
    Chart labels: axes Complexity and Efficiency; platforms DX11 PC, PS4, XB1, DX12 PC
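An atomic linear allocator for buffers can be as small as one atomic counter. The sketch below is a minimal illustration, not Frostbite's code: `LinearAllocator`, the 256-byte alignment and the `kInvalid` sentinel are assumptions chosen for the example, and it hands out offsets into a buffer rather than real GPU addresses.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical lock-free bump allocator handing out offsets into one buffer.
class LinearAllocator {
public:
    static constexpr std::size_t kInvalid = static_cast<std::size_t>(-1);
    static constexpr std::size_t kAlignment = 256;  // illustrative value

    explicit LinearAllocator(std::size_t capacity) : m_capacity(capacity) {}

    // Thread-safe: a single atomic add reserves a unique, aligned range.
    // Returns kInvalid when the buffer is exhausted (the counter still
    // advances, so later calls in the same frame fail as well).
    std::size_t allocate(std::size_t size) {
        const std::size_t padded = (size + kAlignment - 1) / kAlignment * kAlignment;
        const std::size_t offset = m_next.fetch_add(padded, std::memory_order_relaxed);
        return (offset + padded <= m_capacity) ? offset : kInvalid;
    }

    // Called once per frame, when no thread is allocating: nothing is freed
    // individually, the whole buffer is simply reused.
    void reset() { m_next.store(0, std::memory_order_relaxed); }

private:
    std::size_t m_capacity;
    std::atomic<std::size_t> m_next{0};
};
```

There is no aliasing and no per-allocation bookkeeping, which matches the slide's description of blasting through memory for data that is sent to the GPU once.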
  • 41. Transient textures on PlayStation 4 (diagram: virtual address over time). Passes: Depth pass, Gbuffer pass, SSAO, Lighting, Post. Resources: Depth Buffer, Gbuffer 1-3, AO, Lighting buffer, Final output. Callout: waste due to fragmentation
  • 42. Transient textures on DirectX 12 PC (diagram: virtual address over time, Heap 1 to Heap 6). Passes: Depth pass, Gbuffer pass, SSAO, Lighting, Post. Resources: Depth Buffer, Gbuffer 1-3, AO, Lighting buffer, Final output. Callout: many small heaps mean fragmented address space
  • 43. Transient textures on Xbox One (diagram: physical address over time). Passes: Depth pass, Gbuffer pass, SSAO, Lighting, Post. Resources: Depth Buffer, Gbuffer 1-3, AO, Lighting buffer, Final output. Callout: the lighting buffer is disjoint in physical memory
  • 44. Transient textures on Xbox One (diagram: virtual address over time, mapped onto a physical memory pool of Page 0 to Page 5). Passes: Depth pass, Gbuffer pass, SSAO, Lighting, Post. Resources: Depth Buffer, Gbuffer 1-3, AO, Lighting buffer, Final output
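The layouts in these diagrams come from placing resources with non-overlapping lifetimes at the same addresses. The slides only call the allocator a "simple greedy allocation algorithm", so the first-fit placement below is an assumed stand-in; `TransientResource` and `placeResources` are hypothetical names, and sizes are in arbitrary units.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical transient resource: size plus first and last pass index.
struct TransientResource {
    std::size_t size;
    std::size_t firstUse;
    std::size_t lastUse;
    std::size_t offset;  // filled in by placeResources
};

// Greedy first-fit placement in declaration order: each resource takes the
// lowest offset that does not collide with an already placed resource whose
// lifetime overlaps its own. Returns the pool size required.
inline std::size_t placeResources(std::vector<TransientResource>& resources) {
    std::size_t poolSize = 0;
    for (std::size_t i = 0; i < resources.size(); ++i) {
        TransientResource& res = resources[i];

        // Memory ranges that are busy while this resource is alive.
        std::vector<std::pair<std::size_t, std::size_t>> busy;
        for (std::size_t j = 0; j < i; ++j) {
            const TransientResource& other = resources[j];
            if (other.firstUse <= res.lastUse && res.firstUse <= other.lastUse)
                busy.emplace_back(other.offset, other.offset + other.size);
        }
        std::sort(busy.begin(), busy.end());

        // Slide the candidate past every busy range it collides with.
        std::size_t offset = 0;
        for (const auto& range : busy)
            if (offset + res.size > range.first && offset < range.second)
                offset = range.second;

        res.offset = offset;
        poolSize = std::max(poolSize, offset + res.size);
    }
    return poolSize;
}
```

For a depth buffer (4 units, passes 0 to 3), a G-buffer (12 units, passes 1 to 2), a lighting buffer (8 units, passes 2 to 3) and a final output (8 units, passes 3 to 4), the final output reuses the G-buffer's memory and the pool needs 24 units instead of 32.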
  • 45. Memory aliasing considerations  Must be very careful  Ensure valid resource metadata state (FMASK, CMASK, DCC, etc.)  Perform fast clears or discard / over-write resources or disable metadata  Ensure resource lifetimes are correct  Harder than it sounds  Account for compute and graphics pipelining  Account for async compute  Ensure that physical pages are written to memory before reuse
  • 46. DiscardResource & Clear  Must be the first operation on a newly allocated resource  Requires resource to be in the render target or depth write state  Initializes resource metadata (HTILE, CMASK, FMASK, DCC, etc.)  Similar to performing a fast-clear  Resource contents remain undefined (not actually cleared)  Prefer DiscardResource over Clear when possible
  • 48. Aliasing barriers  Add synchronization between work on GPU  Add necessary cache flushes  Use precise barriers to minimize performance cost  Can use wildcard barriers for difficult cases (but expect IHV tears)  Batch with all your other resource barriers in DirectX 12!
  • 49. Aliasing barrier example  Potential aliasing hazard due to pipelined CS and PS work  CS and PS use different D3D resources, so transition barriers aren’t enough  Must flush CS before PS or extend CS resource lifetimes
  • 50. Aliasing barrier example  Serialized compute work ensures correctness when memory aliasing  May hurt performance in some cases  Use explicit async compute when overlap is critical for performance
  • 52. Non-aliasing memory layout (720p) Time 147 MB total
  • 53. DirectX 12 PC memory layout (720p) Time 80 MB total
  • 54. PlayStation 4 memory layout (720p) Time 77 MB total
  • 55. Xbox One memory layout (720p) Time 76 MB total ESRAM DRAM 32 MB ESRAM 44 MB DRAM
  • 57. Non-aliasing memory layout (4K, DX12 PC) Time 1042 MB total
  • 58. Aliasing memory layout (4K, DX12 PC) Time 472 MB total 570 MB saved
  • 60. Summary  Many benefits from full frame knowledge  Huge memory savings from resource aliasing  Semi-automatic async compute  Simplified rendering pipeline configuration  Nice visualization and diagnostic tools  Graphs are an attractive representation of rendering pipelines  Intuitive and familiar concept  Similar to CPU job graphs or shader graphs  Modern C++ features ease the pain of retained mode API
  • 61. Future work  Global optimization of resource barriers  Async compute bookmarks  Profile-guided optimization  Async compute  Memory allocation  ESRAM allocation
  • 62. Special thanks  Johan Andersson (Frostbite Labs)  Charles de Rousiers (Frostbite)  Tomasz Stachowiak (Frostbite)  Simon Taylor (Frostbite)  Jon Valdes (Frostbite)  Ivan Nevraev (Microsoft)  Matt Lee (Microsoft)  Matthäus G. Chajdas (AMD)  Christina Coffin (Light & Dark Arts)  Julien Merceron (Bandai Namco)

Editor's Notes

  1. http://www.frostbite.com/2007/04/frostbite-rendering-architecture-and-real-time-procedural-shading-texturing-techniques
  2. More diverse games. Not just a Battlefield engine: RPG, Racing, Sports, Action.
  3. http://www.frostbite.com/2007/04/frostbite-rendering-architecture-and-real-time-procedural-shading-texturing-techniques slide 17
  4. Not meant to be a complete / representative graph, just illustrating the scaling challenges. Basically the same, except a larger number of systems with more complicated coupling. Mostly unchanged since 2007, until recently! Everything scaled up: more features, a much larger community, scaling and maintenance challenges. The shading system was described in detail by Johan in 2007. The rest of this talk will be about World Renderer and rendering features.
  5. World Renderer architecture is the focus of the presentation from this point.
  6. These are the rendering passes we had in Frostbite a few years ago for Battlefield 4. Our pipeline now with PBR has even more passes and complexity.
  7. World Renderer state as it evolved from 2007 to 2016.
  8. Major World Renderer re-architecture in 2016 to address accumulated technical debt and improve extensibility and maintainability. No micro-management of explicit passes and resources. No hacking of monolithic functions inside engine code. No baby-sitting of memory allocation and aliasing.
  9. Toy example of a frame graph that implements a deferred shading pipeline. The graph contains render passes and resources as nodes. Access declarations / dependencies are edges.
  10. Debug graph visualization using GraphViz. Output is a searchable PDF (not a static image). Graphs can be surprisingly large and complex. While it can be useful in some cases, it’s definitely not the primary visualization tool that we ended up using.
  11. We wrote a custom visualization script (HTML+Javascript and JSON data exported from runtime). JSON data contains information about all render passes and resources. For each render pass we know which resources were created, read or written. For each resource we know its complete memory layout and various metadata (debug name, size, format, etc.). The visualization is interactive and provides a much more useful overview of what’s going on in a frame, similar to what you’d find in PIX.
  12. FrameGraph is a step away from immediate mode rendering towards retained. We build the graph every frame from scratch, since rendering configuration may change dynamically based on player actions, cut-scenes, etc. The big assumption is that setup phase is relatively cheap, as we’re only dealing with a relatively small number of render passes and resources.
  13. Flow is similar to IM rendering, but we are not generating any GPU commands during this phase. Just building up the information about rendering operations for the frame. All resources are virtual during graph building. Render pass inputs and outputs are declared using virtual resource handles.
  14. Some render passes may have effects that are not visible to FrameGraph (for example data read-back from GPU). Such passes are explicitly marked as having side-effects. Some persistent render targets are still required (TAA, SSR, etc.). They can be imported into FrameGraph. Writing to imported resource counts as side-effect of a render pass, which ensures that it is not culled during the compilation phase.
  15. Simple dummy render pass that produces a render target resource. Very similar to creating a regular texture, except we also specify initial resource state (clear or discard/undefined)
  16. Render pass that reads from one texture and writes to another. Writing to a texture produces a renamed handle. This allows us to catch errors when resources are modified in undefined order (when same resource is written by different passes). Renaming resources enforces a specific execution order of the render passes.
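One way to picture the handle renaming in note 16 is a per-resource version counter. The sketch below is an assumption made for illustration: the talk does not say how renamed handles are represented, and `ResourceHandle`, `ResourceRegistry`, `write()` and `isCurrent()` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical handle: resource index plus the version it refers to.
struct ResourceHandle {
    std::uint32_t index;
    std::uint32_t version;
};

class ResourceRegistry {
public:
    ResourceHandle create() {
        versions_.push_back(0);
        return {static_cast<std::uint32_t>(versions_.size() - 1), 0};
    }

    // True if the handle names the latest version of its resource.
    bool isCurrent(ResourceHandle h) const {
        return h.index < versions_.size() && versions_[h.index] == h.version;
    }

    // Declaring a write invalidates the old handle and returns a renamed one.
    ResourceHandle write(ResourceHandle h) {
        assert(isCurrent(h) && "write through a stale handle: pass order is ambiguous");
        return {h.index, ++versions_[h.index]};
    }

private:
    std::vector<std::uint32_t> versions_;  // latest version per resource
};
```

With this scheme, two passes that both try to write through the same original handle trip the assertion, so the second writer has to take the handle returned by the first, which fixes the order between them.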
  17. Abstract / virtualized resources allow some convenient tricks. Resources may be declared early, but their memory will be allocated only on first use. An example use case is a depth buffer resource. We know that we will need one to do 3D rendering, but we don’t necessarily know (or care) if our rendering pipeline is using depth pre-pass. Depth pre-pass, gbuffer pass and forward-shaded geometry passes all simply write to the resource that they require to be declared early. FrameGraph resource handles have metadata attached to them that can be queried during setup phase. This allows some render passes to create derived resources. For example, a generic down-sample render pass can create an output resource that shares all properties of the input, but overrides width/height. Resource bind flags can also be computed automatically based on how the resource is used. The pass that creates a render target resource does not need to know that this resource is going to be used as a UAV, etc. One of the more magical operations that are possible on virtualized resources is MoveSubresource. It creates aliases of resources that will be created by future render passes.
  18. A generic rendering pipeline can be implemented that creates an output texture, which is a simple 2D image resource. It can be combined with a reflection probe filtering pipeline that takes a cubemap input. Move operation can be used to assign ”Lighting buffer” resource to one of the cubemap faces. This causes the deferred shading module to write directly to the cubemap face, instead of creating a separate render target. The same deferred shading module can be used in a different context, without move operation. In this case FrameGraph will allocate a transient render target for the output.
  19. FrameGraph data structures: a flat array of used resource handles per RenderPass; a flat array of RenderPasses in FrameGraph; a flat array of resources in ResourceRegistry (resource handles are just indices into this array). The compilation phase linearly walks through all RenderPasses and computes reference counts for resources, first and last users for resources, and async wait points and resource barriers. RenderPass execution order is defined by setup order; there is no re-ordering during compilation. The culling algorithm is a simple graph flood-fill from unreferenced resources. (1) Compute initial resource and pass reference counts: renderPass.refCount++ for every resource write, resource.refCount++ for every resource read. (2) Identify resources with refCount == 0 and push them on a stack. (3) While the stack is non-empty: pop a resource and decrement the ref count of its producer; if producer.refCount == 0, decrement the ref counts of the resources that it reads, and add them to the stack when their refCount == 0.
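The culling steps in note 19 can be sketched as follows. The `Pass` and `Resource` structs are hypothetical stand-ins for the flat arrays described above, and treating a side-effect pass as holding one extra reference is an assumption made here so that such passes are never culled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal graph types; not Frostbite's actual structures.
struct Pass {
    std::vector<int> reads;        // indices of resources this pass reads
    std::vector<int> writes;       // indices of resources this pass writes
    bool hasSideEffects = false;   // e.g. writes to an imported resource
    int refCount = 0;
};

struct Resource {
    int producer = -1;  // index of the pass that writes it, -1 if none
    int refCount = 0;
};

// After the call, every pass with refCount == 0 is culled.
void cull(std::vector<Pass>& passes, std::vector<Resource>& resources) {
    for (Resource& r : resources) r.refCount = 0;

    // Initial counts: one reference per resource write and per resource read.
    for (Pass& p : passes) {
        p.refCount = static_cast<int>(p.writes.size()) + (p.hasSideEffects ? 1 : 0);
        for (int r : p.reads) {
            assert(r >= 0 && static_cast<std::size_t>(r) < resources.size());
            resources[r].refCount++;
        }
    }

    // Seed the flood-fill with resources that nobody reads.
    std::vector<int> stack;
    for (std::size_t r = 0; r < resources.size(); ++r)
        if (resources[r].refCount == 0) stack.push_back(static_cast<int>(r));

    while (!stack.empty()) {
        const int r = stack.back();
        stack.pop_back();
        const int producer = resources[r].producer;
        if (producer < 0) continue;
        Pass& p = passes[producer];
        if (--p.refCount == 0) {
            // The producer is dead, so its inputs lose one reader each.
            for (int in : p.reads)
                if (--resources[in].refCount == 0) stack.push_back(in);
        }
    }
}
```

In a small graph where a debug pass writes a texture that nothing reads, only that pass ends up with `refCount == 0`; the passes feeding a side-effect pass stay alive.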
  20. It is sometimes convenient to add render passes and resources to the graph without checking if they are needed first. For example, we can always add certain debug visualizations or specialized passes, such as depth buffer linearization. This cuts down on the rendering pipeline configuration complexity a bit.
  21. The engine contains many features and deciding whether to execute a certain render pass can be a chore. It also introduces some coupling between passes. Lighting passes don't need to know anything about the debug output. When debug is enabled, it overrides the lighting output, which will cull it. This leads to more decoupled / modular code.
  22. Execution phase is quite simple. Iterate over render passes that are not culled and call their execution callback function. This phase is almost identical to how rendering was done before FrameGraph. Passes just use the RenderContext API, except that FrameGraph resources must be de-virtualized first.
  23. Efficient async compute requires some hand-holding today. While we do have all the render pass and resource dependencies in the graph, we don’t know what bottlenecks will exist on the GPU during execution. Don’t want bandwidth-heavy compute passes to run with bandwidth-heavy graphics work (shouldn’t be news to anyone). Async operations will increase memory water mark, because resource lifetimes are extended (more resources are alive simultaneously). Need to be a bit careful. Ended up with a manual opt-in mechanism for render passes. Async passes are kicked off on the main timeline at the point where they’d execute serially (we don’t re-order passes). Synchronization point is automatically added on the main pipeline before the first render pass that consumes the output of async pass.
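The automatic sync point from note 23 can be sketched as a search over the setup-ordered pass list. This is a simplified illustration under assumptions: `PassDesc` and `computeSyncPoints` are hypothetical names, resources are plain integer indices, and only the wait on the main queue is modelled.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical pass description in setup order.
struct PassDesc {
    std::vector<int> reads;
    std::vector<int> writes;
    bool async = false;  // manual opt-in, as described in the note
};

// For each async pass, returns the index of the first later main-queue pass
// that reads one of its outputs (the pass before which to wait), or -1.
std::vector<int> computeSyncPoints(const std::vector<PassDesc>& passes) {
    std::vector<int> syncPoint(passes.size(), -1);
    for (std::size_t i = 0; i < passes.size(); ++i) {
        if (!passes[i].async) continue;
        for (std::size_t j = i + 1; j < passes.size() && syncPoint[i] < 0; ++j) {
            if (passes[j].async) continue;  // consumer is on the async queue too
            for (int w : passes[i].writes) {
                const auto& r = passes[j].reads;
                if (std::find(r.begin(), r.end(), w) != r.end()) {
                    syncPoint[i] = static_cast<int>(j);
                    break;
                }
            }
        }
    }
    return syncPoint;
}
```

The lifetimes of resources used by an async pass would then be extended up to the returned index, which is where the higher memory water mark mentioned in the note comes from.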
  24. Example compute-based AO filtering pipeline.
  25. AO buffer generation and filtering can be moved to async queue, but resource lifetimes must be extended a bit (up to the sync point).
  26. This is all you have to do in high-level code. Super simple to answer questions like “what would happen if we ran this async?”. In the future we’d like to explore automatic render pass re-ordering, perhaps with profile-guided optimization step.
  27. Programmer convenience is very important. Don’t want to introduce too much boilerplate code or break the code flow. Started with a C++ class with virtual execute(), but quickly realized that such an approach requires moving quite a lot of code around. It also requires a bit of plumbing to pass data between setup and execution phases. Implemented a lambda-based API to improve the convenience. This also greatly simplified the effort of porting legacy rendering code to the new system. Started by simply wrapping huge chunks of code in lambdas. Gradually replaced raw resources with transients and sub-divided monolithic lambdas into smaller ones. Eventually moved out final small code blocks into stand-alone functions. Spaghetti is mostly untangled. The price that we have to pay is a template-heavy FrameGraph setup API.
  28. addCallbackPass() is a template function that creates a render pass class behind the scenes that’s parameterized by the PassData and the execution lambda. Setup lambda is inlined in addMyPass(), but execute lambda is deferred. Setup lambda may capture everything by reference, but execute must capture by value. Capturing data by value is a little bit dangerous since it’s possible to accidentally capture a pointer that’s released before execution phase. It’s also possible to accidentally capture huge structures by value. Luckily, we can enforce that the size of execution lambda is below a certain size at compile time (we settled on 1KB limit).
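A minimal sketch of the mechanism in note 28, assuming a much simplified signature: the real addCallbackPass() parameters are not shown in these notes, and `RenderPassBuilder`, `RenderContext`, `RenderPassBase` and `CallbackPass` are hypothetical placeholders.

```cpp
#include <memory>
#include <type_traits>
#include <utility>
#include <vector>

struct RenderPassBuilder {};  // would declare reads/writes during setup
struct RenderContext {};      // would record GPU commands during execution

class RenderPassBase {
public:
    virtual ~RenderPassBase() = default;
    virtual void execute(RenderContext& ctx) = 0;
};

// Pass class generated per (PassData, lambda) pair.
template <typename PassData, typename ExecuteFn>
class CallbackPass final : public RenderPassBase {
public:
    explicit CallbackPass(ExecuteFn fn) : execute_(std::move(fn)) {}
    void execute(RenderContext& ctx) override { execute_(data, ctx); }
    PassData data{};

private:
    ExecuteFn execute_;  // stored by value and invoked later
};

class FrameGraph {
public:
    template <typename PassData, typename SetupFn, typename ExecuteFn>
    const PassData& addCallbackPass(SetupFn&& setup, ExecuteFn&& exec) {
        using Fn = std::decay_t<ExecuteFn>;
        // Compile-time guard against capturing huge structures by value.
        static_assert(sizeof(Fn) < 1024, "execute lambda captures too much state");
        auto pass = std::make_unique<CallbackPass<PassData, Fn>>(std::forward<ExecuteFn>(exec));
        RenderPassBuilder builder;
        setup(builder, pass->data);  // setup runs immediately
        const PassData& data = pass->data;
        passes_.push_back(std::move(pass));
        return data;
    }

    void execute(RenderContext& ctx) {
        for (auto& pass : passes_) pass->execute(ctx);  // deferred execution
    }

private:
    std::vector<std::unique_ptr<RenderPassBase>> passes_;
};
```

The `static_assert` catches oversized captures at compile time, but it cannot catch a captured pointer that is freed before the execution phase; that hazard remains the caller's responsibility.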
  29. A single global blackboard is not necessarily needed. Modules may create their own blackboard within their setup scope, propagate some data from the parent into it and then copy results into parent blackboard at the end. While blackboard is great for decoupling, it does make the code harder to understand. If a module takes a blackboard as a parameter, it’s not possible to tell at the call site which resources will actually be accessed. The module code itself must be viewed to answer this. Invalid blackboard access can only be validated at run-time. On balance, we believe that the benefits outweigh the drawbacks.
  30. The back-bone of FrameGraph.
  31. Reserve a single large virtual memory pool. Allocate texture virtual memory block on first use. Use a general purpose non-local memory allocator. Patch or allocate GNM resource descriptors as needed. Return virtual memory block after last use. Commit physical memory to cover VA range used in current frame. Grow the physical memory pool on demand. Shrink down to the high water mark of last N frames. Resources overlap in virtual address space. Understood natively by PS4 graphics debugging tools (Razor).
  32. A bit similar to PS4, except many disjoint address ranges instead of just one. Can’t use a single range, as it’s impossible to shrink it without stalling the GPU or temporarily increasing memory usage. Despite these shortcomings, we’re still able to re-use memory sometimes and see a significant overall water mark reduction. Frostbite does not currently perform global memory allocation optimization, but it could theoretically be implemented. A global optimization pass would allow merging Heap 2 into Heap 6. This would bring down the overall number of heaps and the memory water mark. --- Concrete problems with resource heaps in current D3D12: Tier 1 heaps have restrictions on types of resources that can be placed in them. Only buffers or only textures or only render targets and depth buffers. Must create separate heaps for different resource types. Most transient resources that we alias are RT or DS, so it’s not too bad. We force the RT flag on a transient texture even if user did not specifically request it. Tier 2 heaps are better, as all types of resources can be aliased. They are still not ideal, as we must allocate many heaps and sub-allocate within them. This leads to more fragmentation compared to allocating from a single large address range. We can’t allocate a single huge heap, as we can’t shrink it. Compromise is to create one large-ish persistent transient resource heap and then create smaller overflow heaps. Once a resource is created, it can’t be moved. This means that if memory allocation “schedule” changes a bit, some objects will change their placements and will have to be re-created. It’s possible to work around this issue to some degree by caching placed D3D objects (resources and various views) and re-using them when possible (potentially many frames later, when allocation schedule changes again to a compatible one). Resource allocation schedule may change based on player actions, cut-scenes, UI, etc. However, there is typically only a handful of unique schedules, so it’s possible to use an LRU cache. These problems simply don’t exist on consoles. Tiled resources could be quite convenient in the future (almost the same level of efficiency as XB1 memory aliasing), however as of October 2016 there are significant CPU and GPU overheads to using them as RTVs / DSVs. Additionally, resource heap tier restrictions prevent efficient tile mapping updates via CopyTileMappings. We sometimes want to use multiple heaps as page sources to back a single resource. Current UpdateTileMappings API can only take a single heap pointer, therefore multiple API calls are required.
  33. Fragmentation-free dynamic memory allocation and aliasing. Close to optimal ESRAM utilization automatically. Don’t need contiguous memory blocks. Resources may be fully or partially in ESRAM. Overflow to DRAM when every ESRAM page is in use. Hand-tune memory allocation based on profiling. Deny ESRAM for some resources. Allocate ESRAM top-down or bottom-up. Restrict ESRAM to % of the resource, place rest in DRAM.
  34. Use a physical memory pool of ESRAM and DRAM pages. Allocate all resources at unique virtual addresses (VirtualAlloc, CreatePlacedResourceX). Allocate physical memory pages from pool on first use. ESRAM pages first, overflow to DRAM. Extend DRAM pool on demand. Shrink DRAM pool based on high water mark of last N frames. Return physical pages to the pool after last use. Update GPU page table before executing other commands. XB1-specific ID3D12CommandQueue API. Conceptually similar to CopyTileMappings. Page table update happens on GPU timeline.
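The page allocation policy in note 34 (ESRAM pages first, overflow to DRAM, extend DRAM on demand) can be sketched as below. `PagePool`, `Page` and `PageKind` are hypothetical names; the sketch leaves out the GPU page table update and the shrinking based on the high water mark.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

enum class PageKind { ESRAM, DRAM };

struct Page {
    PageKind kind;
    std::uint32_t index;  // page index within its memory type
};

// Hypothetical physical page pool with a fixed ESRAM part and a growable DRAM part.
class PagePool {
public:
    explicit PagePool(std::uint32_t esramPageCount) : esramPageCount_(esramPageCount) {
        for (std::uint32_t i = esramPageCount; i > 0; --i) freeEsram_.push_back(i - 1);
    }

    // Called on first use of a resource page.
    Page allocate() {
        if (!freeEsram_.empty()) return {PageKind::ESRAM, pop(freeEsram_)};
        if (!freeDram_.empty()) return {PageKind::DRAM, pop(freeDram_)};
        return {PageKind::DRAM, dramPageCount_++};  // extend DRAM pool on demand
    }

    // Called after last use of a resource page.
    void release(Page page) {
        assert(page.index < (page.kind == PageKind::ESRAM ? esramPageCount_ : dramPageCount_));
        (page.kind == PageKind::ESRAM ? freeEsram_ : freeDram_).push_back(page.index);
    }

    std::uint32_t dramPageCount() const { return dramPageCount_; }

private:
    static std::uint32_t pop(std::vector<std::uint32_t>& v) {
        const std::uint32_t i = v.back();
        v.pop_back();
        return i;
    }

    std::uint32_t esramPageCount_;
    std::uint32_t dramPageCount_ = 0;
    std::vector<std::uint32_t> freeEsram_;
    std::vector<std::uint32_t> freeDram_;
};
```

Because pages are handed out individually, a resource does not need a contiguous physical block, which is why the same resource may end up partly in ESRAM and partly in DRAM.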
  35. Since different rendering passes may use the same physical memory, we need to add a synchronization point between them to make sure that they don’t run in parallel and overwrite each other’s memory. We do this by using aliasing barriers, which will add the necessary pipeline and cache flushes.
  36. Graphics and compute passes don’t overlap logically, but run in parallel because they are independent (they don’t have any producer-consumer relationship or any shared logical resources).
  37. Using explicit async compute allows us to ensure that resource memory isn’t released until compute chain is done. In this particular example, overlapped and non-overlapped versions have exactly the same performance characteristics. Overlap does not always improve perf. In fact, it may sometimes hurt it.
  38. … now we finally have space for those 16k^2 eyeball textures!
  39. Full frame knowledge and visualization is an awesome tool that allowed us to spot inefficiencies in resource allocation, possibilities for async compute, etc.
  40. We’re only starting to scratch the surface of what’s possible with the modern rendering engine architecture and APIs. We expect to see more engines in the future moving to a similar design (high-level frame setup), since that appears to be the most optimal way to drive DX12 and Vulkan style renderers.
  41. The End