Windows to Reality:
Getting the Most out of
Direct3D 10 Graphics in
Your Games
Shanon Drone
Software Development Engineer
XNA Developer Connection
Microsoft
Key areas
 Debug Layer
 Draw Calls
 Constant Updates
 State Management
 Shader Linkage
 Resource Updates
 Dynamic Geometry
 Porting Tips
Debug Layer
Use it!
  The D3D10 layer can help find performance
  issues
    App controlled by passing
    D3D10_CREATE_DEVICE_DEBUG into
    D3D10CreateDevice.
  Use the D3DX10 Debug Runtime
    Link against D3DX10d.lib
  Only do this for debug builds!
  Look for performance warnings in the debug
  output
Draw Calls
 Draw calls are still “not free”
 Draw overhead is reduced in D3D10
   But not enough that you can be lazy
 Efficiency in the number of draw calls will
 still give a performance win
Draw Calls
Excess baggage
  An increase in the number of draw calls
  generally increases the number of API
  calls associated with those draws
    ConstantBuffer updates
    Resource changes (VBs, IBs, Textures)
    InputLayout changes
  These all have effects on performance
  that vary with draw call count
Constant Updates
 Updating shader constants was often a
 bottleneck in D3D9
 It can still be a bottleneck in D3D10
 The main difference between the two is
 the new Constant Buffer object in D3D10
 This is the largest section of this talk
Constant Updates
Constant Buffer Recap
  Constant Buffers are buffer objects that
  hold shader constant data
  They are updated using
  D3D10_MAP_WRITE_DISCARD or by calling
  UpdateSubresource
  There are 16 Constant Buffer slots
  available to each shader in the pipeline
    Try not to use all 16 to leave some headroom
Constant Updates
Porting Issues
  D3D9 constants were updated individually
  by calling SetXXXXXShaderConstantX
  In D3D10, you have to update the entire
  constant buffer all at once
  A naïve port from D3D9 to D3D10 can have
  crippling performance implications if
  Constant Buffers are not handled
  correctly!
  Rule of thumb: Do not update more data
  than you need to
Constant Updates
Naïve Port: AKA how to cripple perf
  Each shader uses one big constant buffer
  Submitting one value submits them all!
  If you have one 4096 byte Constant
  Buffer, and you only need to update your
  World matrix, you will still have to update
  4096 bytes of data and send it across the
  bus
  Don’t do this!
Constant Updates
Naïve Port: AKA how to cripple perf
  100 skinned meshes (100 materials), 900
  static meshes (400 materials), 1 shadow +
  1 lighting pass

  cbuffer VSGlobalsCB              // 6560 Bytes
  {
    matrix ViewProj;
    matrix Bones[100];
    matrix World;
    float SpecPower;
    float4 BDRFCoefficients;
    float AppTime;
    uint2 RenderTargetSize;
  };

  Shadow Pass
    Update VSGlobalCB    6560 Bytes x 100 = 656,000 Bytes
    Update VSGlobalCB    6560 Bytes x 900 = 5,904,000 Bytes
  Light Pass
    Update VSGlobalCB    6560 Bytes x 100 = 656,000 Bytes
    Update VSGlobalCB    6560 Bytes x 900 = 5,904,000 Bytes
                                        = 13,120,000 Bytes
Constant Updates
Organize Constants
  The first step is to organize constants by
  frequency of update
  One shader will generally be used to draw
  several objects
  Some data in this shader doesn’t need to
  be set for every draw
    For example: Time, ViewProj matrices
  Split these out into their own buffers
  cbuffer VSGlobalPerFrameCB       // 4 Bytes
  {
    float AppTime;
  };
  cbuffer VSPerSkinnedCB           // 6400 Bytes
  {
    matrix Bones[100];
  };
  cbuffer VSPerStaticCB            // 64 Bytes
  {
    matrix World;
  };
  cbuffer VSPerPassCB              // 72 Bytes
  {
    matrix ViewProj;
    uint2 RenderTargetSize;
  };
  cbuffer VSPerMaterialCB          // 20 Bytes
  {
    float SpecPower;
    float4 BDRFCoefficients;
  };

  Begin Frame
    Update VSGlobalPerFrameCB       4 Bytes x 1   = 4 Bytes
    Update VSPerSkinnedCBs       6400 Bytes x 100 = 640,000 Bytes
    Update VSPerStaticCBs          64 Bytes x 900 = 57,600 Bytes
  Shadow Pass
    Update VSPerPassCB             72 Bytes x 1   = 72 Bytes
  Light Pass
    Update VSPerPassCB             72 Bytes x 1   = 72 Bytes
    Update VSPerMaterialCBs        20 Bytes x 500 = 10,000 Bytes
                                                  = 707,748 Bytes
Constant Updates



  13,120,000 Bytes / 707,748 Bytes ≈ 18x
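
The byte counts on the preceding slides can be reproduced with a few lines of arithmetic. The sketch below is only a calculator for the example scene (100 skinned meshes, 900 static meshes, 500 materials, 2 passes); the function and constant names are made up for illustration, and the buffer sizes are the ones quoted on the slides. In a real D3D10 application, constant buffer sizes must be multiples of 16 bytes, so the small buffers would be rounded up slightly.

```cpp
#include <cassert>
#include <cstdint>

// Scene from the slides: 100 skinned meshes, 900 static meshes,
// 500 materials, one shadow pass and one lighting pass.
constexpr std::uint64_t kSkinned = 100, kStatic = 900, kMaterials = 500, kPasses = 2;

// Naive port: the whole 6560-byte global buffer is resent for every mesh in every pass.
constexpr std::uint64_t naiveBytesPerFrame() {
    constexpr std::uint64_t kGlobalCB = 6560;
    return kPasses * (kSkinned + kStatic) * kGlobalCB;
}

// Split by update frequency, using the per-buffer sizes quoted on the slide.
constexpr std::uint64_t splitBytesPerFrame() {
    return 4 * 1               // VSGlobalPerFrameCB, once per frame
         + 6400 * kSkinned     // VSPerSkinnedCB, once per skinned mesh
         + 64 * kStatic        // VSPerStaticCB, once per static mesh
         + 72 * kPasses        // VSPerPassCB, once per pass
         + 20 * kMaterials;    // VSPerMaterialCB, once per material
}
```

Dividing the two results gives roughly 18.5, which the slide rounds to 18x.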
Constant Updates
Managing Buffers
  Constant buffers need to be managed in
  the application
  Creating a few buffers that are used for
  all shader constants just won’t work
    We update more data than necessary due to
    large buffers
Constant Updates
Managing Buffers
  Solution 1 (Fastest)
    Create Constant Buffers that line up exactly
    with the number of elements of each
    frequency group
      Global CBs
      CBs per Mesh
      CBs per Material
      CBs per Pass
    This ensures that EVERY constant buffer is no
    larger than it absolutely needs to be
    This also ensures the most efficient update of
    CBs based upon frequency
Constant Updates
Managing Buffers
  Solution 2 (Second Best)
    If you cannot create CBs that line up exactly
    with elements, you can create a tiered constant
    buffer system
    Create arrays of 32-byte, 64-byte, 128-byte,
    256-byte, etc. constant buffers
    Keep a shadow copy of the constant data in
    system memory
    When it comes time to render, select the
    smallest CB from the array that will hold the
    necessary constant data
    May have to resubmit redundant data for
    separate passes
    Hybrid approach?
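
The "select the smallest CB that will hold the data" step of Solution 2 might look like the sketch below. The tier bounds and the name `selectTier` are assumptions for illustration: tiers double from 32 bytes up to 65536 bytes (4096 float4 constants). The function returns only the tier size; a real engine would then pick a buffer of that size from its array and copy the shadow data into it.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>

// Assumed tier sizes: powers of two from 32 bytes up to 64 KB.
constexpr std::size_t kMinTierBytes = 32;
constexpr std::size_t kMaxTierBytes = 65536;

// Returns the size of the smallest tier that can hold `dataBytes`,
// or std::nullopt if the data does not fit in the largest tier.
std::optional<std::size_t> selectTier(std::size_t dataBytes) {
    for (std::size_t tier = kMinTierBytes; tier <= kMaxTierBytes; tier *= 2) {
        if (dataBytes <= tier) return tier;
    }
    return std::nullopt;
}
```

With these tiers, a 72-byte per-pass block lands in a 128-byte buffer, so some padding is uploaded. That waste is why the slides rank this approach second.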
Constant Updates
Case Study: Skinning using Solution 1
  Skinning in D3D9 (or a bad D3D10 port)
    Multiple passes cause redundant bone data
    uploads to the GPU
  Skinning in D3D10
    Using Constant Buffers we only need to
    upload it once
Constant Updates
D3D9 Version / or Naïve D3D10 Version
   Pass1
     Set Mesh1 Bones
       Draw Mesh1
     Set Mesh2 Bones
       Draw Mesh2
   Pass2
     Set Mesh1 Bones
       Draw Mesh1
     Set Mesh2 Bones
       Draw Mesh2

   Constant Data (overwritten by every Set call):
     Bone0, Bone1, Bone2, Bone3, Bone4, …, BoneN
     of Mesh1 or Mesh2
Constant Updates
Preferred D3D10 Version
   Frame Start
     Update Mesh1 CB
     Update Mesh2 CB
   Pass1
     Bind Mesh1 CB
       Draw Mesh1
     Bind Mesh2 CB
       Draw Mesh2
   Pass2
     Bind Mesh1 CB
       Draw Mesh1
     Bind Mesh2 CB
       Draw Mesh2

   Mesh1 CB: Mesh1 Bone0, Bone1, Bone2, Bone3, Bone4, …, BoneN
   Mesh2 CB: Mesh2 Bone0, Bone1, Bone2, Bone3, Bone4, …, BoneN
   CB Slot 0: bound to Mesh1 CB or Mesh2 CB
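
The difference between the two orderings can be modeled by counting uploads. This is a simulation sketch, not D3D10 code: `MeshBoneCB` is a hypothetical stand-in for a per-mesh constant buffer, and the counter marks the spot where a real engine would call Map or UpdateSubresource.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for one mesh's bone constant buffer; `uploads` counts how many
// times its contents were sent to the GPU.
struct MeshBoneCB {
    std::size_t boneBytes = 0;
    std::size_t uploads = 0;
};

// Naive port: bone data is re-sent before every draw in every pass.
std::size_t naiveFrame(std::vector<MeshBoneCB>& meshes, std::size_t passes) {
    std::size_t bytes = 0;
    for (std::size_t p = 0; p < passes; ++p)
        for (auto& m : meshes) { ++m.uploads; bytes += m.boneBytes; /* draw */ }
    return bytes;
}

// Preferred: update each mesh's CB once at frame start, then only bind it per pass.
std::size_t preferredFrame(std::vector<MeshBoneCB>& meshes, std::size_t passes) {
    std::size_t bytes = 0;
    for (auto& m : meshes) { ++m.uploads; bytes += m.boneBytes; }
    for (std::size_t p = 0; p < passes; ++p)
        for (auto& m : meshes) { (void)m; /* bind CB to slot 0, then draw */ }
    return bytes;
}
```

With two meshes of 6400 bytes each and two passes, the naive order uploads 25,600 bytes and the preferred order 12,800. The saving grows with the number of passes.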
Constant Updates
Advanced D3D10 Version
 Why not store all of our characters’ bones in
 a 128-bit FP texture?
 We can upload bones for all visible
 characters at the start of a frame
 We can draw similar characters using
 instancing instead of individual draws
   Use SV_InstanceID to select the start of the
   character’s bone data in the texture
 Stream the skinned meshes to memory using
 Stream Output and render all subsequent
 passes from the post-skinned buffer
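
The slide does not say how the bone texture is laid out, so the following is only one possible addressing scheme, written as CPU-side C++ to show the index math a shader would do with SV_InstanceID. The assumptions: each bone is a 4x4 float matrix stored as 4 consecutive 128-bit (RGBA32F) texels, characters are stored back to back, and the texture width is a multiple of 4 texels so that no matrix straddles a row. `boneTexel` and `Texel` are illustrative names.

```cpp
#include <cassert>
#include <cstdint>

struct Texel { std::uint32_t x, y; };

// Assumed layout: 4 RGBA32F texels per bone matrix, characters back to back,
// linear texel index wrapped row by row across a texture `width` texels wide.
constexpr std::uint32_t kTexelsPerBone = 4;

// First texel of bone `bone` for the character drawn as instance `instanceId`.
Texel boneTexel(std::uint32_t instanceId, std::uint32_t bone,
                std::uint32_t bonesPerCharacter, std::uint32_t width) {
    assert(width > 0 && width % kTexelsPerBone == 0);
    assert(bone < bonesPerCharacter);
    const std::uint32_t linear =
        (instanceId * bonesPerCharacter + bone) * kTexelsPerBone;
    return Texel{linear % width, linear / width};
}
```

For example, with 100 bones per character and a 1024-texel-wide texture, bone 5 of instance 3 starts at linear texel 1220, which is column 196 of row 1.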
State Management
 Individual state setting is no longer
 possible in D3D10
 State in D3D10 is stored in state objects
 These state objects are immutable
 To change even one aspect of a state
 object requires that you create an
 entirely new state object with that one
 change
State Management
Managing State Objects
  Solution 1 (Fastest)
    If you have a known set of materials and
    required states, you can create all state
    objects at load time
    State objects are small and there is a finite
    set of permutations
    With all state objects created up front, all
    that needs to be done during rendering is to
    bind the object
State Management
Managing State Objects
  Solution 2 (Second Best)
    If your content is not finalized, or if you
    CANNOT get your engine to lump state
    together
    Create a state object hash table
    Hash off of the setting that has the most
    unique states
    Grab pre-created states from the hash-table
    Why not give your tools pipeline the ability to
    do this for a level and save out the results?
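
A state object hash table can be sketched as a cache keyed on the state description. Everything here is a stand-in: `RasterDesc` is a trimmed-down, hypothetical description, and `StateObject` stands for the immutable object that a real engine would get from the device. This sketch hashes the whole description, whereas the slide suggests hashing off of the setting with the most unique values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <memory>
#include <unordered_map>

// Hypothetical, trimmed-down stand-in for a rasterizer state description.
struct RasterDesc {
    std::uint8_t fillMode;
    std::uint8_t cullMode;
    bool depthClip;
    bool operator==(const RasterDesc& o) const {
        return fillMode == o.fillMode && cullMode == o.cullMode && depthClip == o.depthClip;
    }
};

struct RasterDescHash {
    std::size_t operator()(const RasterDesc& d) const {
        // Pack the fields into one integer, then hash it.
        const std::uint32_t packed = (std::uint32_t(d.fillMode) << 16) |
                                     (std::uint32_t(d.cullMode) << 8) |
                                     std::uint32_t(d.depthClip);
        return std::hash<std::uint32_t>{}(packed);
    }
};

// Stand-in for an immutable state object (a real one would come from the device).
struct StateObject { RasterDesc desc; };

class StateCache {
public:
    // Returns the cached object for `desc`, creating it on first request.
    std::shared_ptr<const StateObject> get(const RasterDesc& desc) {
        auto it = cache_.find(desc);
        if (it != cache_.end()) return it->second;
        ++created_;
        std::shared_ptr<const StateObject> obj =
            std::make_shared<StateObject>(StateObject{desc});
        cache_.emplace(desc, obj);
        return obj;
    }
    std::size_t created() const { return created_; }

private:
    std::unordered_map<RasterDesc, std::shared_ptr<const StateObject>, RasterDescHash> cache_;
    std::size_t created_ = 0;
};
```

Requesting the same description twice returns the same object, so creation cost is paid once per unique state. Saving the set of requested descriptions from a tool run would let the cache be pre-populated at load time.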
Shader Linkage
 D3D9 shader linkage was based off of
 semantics (POSITION, NORMAL,
 TEXCOORDN)
 D3D10 linkage is based off of offsets and
 sizes
 This means stricter linkage rules
 This also means that the driver doesn’t
 have to link shaders together at every
 draw call!
Shader Linkage
No Holes Allowed!
   Elements must be read in the order they
   are output from the previous stage
   Cannot have “holes” between linkages

struct VS_OUTPUT                           struct PS_INPUT
{                                          {
    float3 Norm : NORMAL;                      float3 Norm : NORMAL;
    float2 Tex  : TEXCOORD0;                   float2 Tex  : TEXCOORD0;
    float2 Tex2 : TEXCOORD1;                   float2 Tex2 : TEXCOORD1;
    float4 Pos  : SV_POSITION;             };
};

   Declaring Tex before Norm in PS_INPUT would not
   match the order of VS_OUTPUT
   Holes at the end are OK
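
The ordering rule can be expressed as a prefix check. This is a simplified model for illustration, not how the runtime validates linkage: it compares only semantic names and component counts, and the names `Element` and `linksWithoutHoles` are invented here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One element of a shader signature: its semantic and size in floats.
struct Element {
    std::string semantic;
    unsigned components;
    bool operator==(const Element& o) const {
        return semantic == o.semantic && components == o.components;
    }
};

// Simplified model of the rule on the slide: the consuming stage's inputs must
// match the producing stage's outputs element for element, in order, from the
// start. Trailing outputs may go unread ("holes at the end are OK").
bool linksWithoutHoles(const std::vector<Element>& outputs,
                       const std::vector<Element>& inputs) {
    if (inputs.size() > outputs.size()) return false;
    for (std::size_t i = 0; i < inputs.size(); ++i)
        if (!(inputs[i] == outputs[i])) return false;
    return true;
}
```

Under this model, a PS_INPUT that lists NORMAL, TEXCOORD0, TEXCOORD1 links against the VS_OUTPUT above, while one that starts with TEXCOORD0 does not.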
Shader Linkage
Input Assembler to Vertex Shader
  Input Layouts define the signature of the
  vertex stream data
  Input Layouts are similar to Vertex
  Declarations in D3D9
    Strict linkage rules are a big difference
  Creating Input Layouts on the fly is not
  recommended
  CreateInputLayout requires a shader
  signature to validate against
Shader Linkage
Input Assembler to Vertex Shader
  Solution 1 (Fastest)
    Create an Input Layout for each unique
    Vertex Stream / Vertex Shader combination
    up front
    Input Layouts are small
    This assumes that the shader input signature
    is available when you call CreateInputLayout
    Try to normalize Input Layouts across a level,
    or have them art directed
Shader Linkage
Input Assembler to Vertex Shader
  Solution 2 (Second Best)
    If you load meshes and create input layouts
    before loading shaders, you might have a
    problem
    You can use a similar hashing scheme as the
    one used for State Objects
    When the Input Layout is needed, search the
    hash for an Input Layout that matches the
    Vertex Stream and Vertex Shader signature
    Why not store this data to a file and pre-
    populate the Input Layouts after your content
    is tuned?
Shader Linkage
Aside: Instancing
  Instancing is a first class citizen on D3D10!
  Stream source frequency is now part of
  the Input Layout
  Multiple frequencies will mean multiple
  Input Layouts
Resource Updates
 Updating resources is different in D3D10
 Create / Lock / Fill / Unlock paradigm is
 no longer necessary (although you can
 still do it)
 Texture data can be passed into the
 texture at create time
Resource Updates
Resource Usage Types
 D3D10_USAGE_DEFAULT
 D3D10_USAGE_IMMUTABLE
 D3D10_USAGE_DYNAMIC
 D3D10_USAGE_STAGING
Resource Updates
D3D10_USAGE_DEFAULT
 Use for resources that need fast GPU read
 and write access
 Can only be updated using
 UpdateSubresource
 Render targets are good candidates
 Textures that are updated infrequently
 (less than once per frame) are good
 candidates
Resource Updates
D3D10_USAGE_IMMUTABLE
 Use for resources that need fast GPU read
 access only
 Once they are created, they cannot be
 updated... ever
 Initial data must be passed in during the
 creation call
 Resources that will never change (static
 textures, VBs / IBs) are good candidates
 Don’t bend over backwards trying to make
 everything D3D10_USAGE_IMMUTABLE
Resource Updates
D3D10_USAGE_DYNAMIC
 Use for resources that need fast CPU write
 access (at the expense of slower GPU read
 access)
 No CPU read access
 Can only be updated using Map with:
   D3D10_MAP_WRITE_DISCARD
   D3D10_MAP_WRITE_NO_OVERWRITE
 Dynamic Vertex Buffers are good candidates
 Dynamic (> once per frame) textures are
 good candidates
Resource Updates
D3D10_USAGE_STAGING
 This is the only way to read data back
 from the GPU
 Can only be updated using Map
 Cannot map with
 D3D10_MAP_WRITE_DISCARD or
 D3D10_MAP_WRITE_NO_OVERWRITE
 Might want to double buffer to keep from
 stalling GPU
 The GPU cannot directly use these
Resource Updates
Summary
 CPU updates the resource frequently
 (more than once per frame)
   Use D3D10_USAGE_DYNAMIC
 CPU updates the resource infrequently
 (once per frame or less)
   Use D3D10_USAGE_DEFAULT
 CPU doesn’t update the resource
   Use D3D10_USAGE_IMMUTABLE
 CPU needs to read the resource
   Use D3D10_USAGE_STAGING
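
The summary above amounts to a small decision function. The sketch below uses a local enum that mirrors the four D3D10_USAGE names; it is not the real D3D10 enumeration, and `AccessPattern` and `chooseUsage` are illustrative names. Constant buffers are the exception discussed a few slides later and are not covered by it.

```cpp
#include <cassert>

// Mirrors the four D3D10_USAGE names from the slides; this enum is a local
// stand-in, not the real D3D10 enumeration.
enum class Usage { Default, Immutable, Dynamic, Staging };

struct AccessPattern {
    bool cpuReads;              // CPU needs to read the contents back
    double cpuWritesPerFrame;   // average CPU updates per frame (0 = never)
};

Usage chooseUsage(const AccessPattern& p) {
    if (p.cpuReads) return Usage::Staging;
    if (p.cpuWritesPerFrame <= 0.0) return Usage::Immutable;
    if (p.cpuWritesPerFrame > 1.0) return Usage::Dynamic;
    return Usage::Default;      // once per frame or less
}
```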
Resource Updates
Example: Vertex Buffer
  The vertex buffer is touched by the CPU
  less than once per frame
    Create it with D3D10_USAGE_DEFAULT
    Update it with UpdateSubresource
  The vertex buffer is used for dynamic
  geometry and the CPU needs to update it
  multiple times per frame
    Create it with D3D10_USAGE_DYNAMIC
    Update it with Map
Resource Updates
The Exception: Constant Buffers
  CBs are always expected to be updated
  frequently
  Select CB usage based upon which one
  causes the least amount of system
  memory to be transferred
    Not just to the GPU, but system-to-system
    memory copies as well
Resource Updates
UpdateSubresource
 UpdateSubresource requires a system
 memory buffer and incurs an extra copy
 Use if you have system copies of your
 constant data already in one place
Resource Updates
Map
 Map requires no extra system memory but
 may hit driver renaming limits if abused
 Use if compositing values on the fly or
 collecting values from other places
Resource Updates
A note on overusing discard
  Use D3D10_MAP_WRITE_DISCARD carefully
  with buffers!
  D3D10_MAP_WRITE_DISCARD tells the driver to
  give us a new memory buffer if the current
  one is busy
  There are a LIMITED set of temporary buffers
  If these run out, then your app will stall until
  another buffer can be freed
  This can happen if you do dynamic geometry
  using one VB and D3D10_MAP_WRITE_DISCARD
Dynamic Geometry
 DrawIndexedPrimitiveUP is gone!
 DrawPrimitiveUP is gone!
 Your well-behaved D3D9 app isn’t using
 these anyway, right?
Dynamic Geometry
Solution: Same as in D3D9
  Use one large buffer, and map it with
  D3D10_MAP_WRITE_NO_OVERWRITE
  Advance the write position with every draw
    Wrap to the beginning
  Make sure your buffer is large enough that
  you’re not overwriting data that the GPU is
  reading
  This is what happens under the covers for
  D3D9 when using DIPUP or DUP in Windows
  Vista
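
The write-position bookkeeping for one large dynamic buffer can be sketched without any D3D calls. `DynamicRing` and `MapMode` are hypothetical names, and the enum is a local stand-in for the two Map flags. One detail goes beyond the slide: on wrap this sketch asks for a discard, a common safeguard so the CPU never overwrites data the GPU may still be reading. The slide instead relies on the buffer being large enough.

```cpp
#include <cassert>
#include <cstddef>

// Local stand-ins for the two Map modes named on the slides.
enum class MapMode { WriteNoOverwrite, WriteDiscard };

struct Allocation {
    std::size_t offset;   // byte offset to add to the pointer returned by Map
    MapMode mode;         // mode to map the buffer with for this write
};

class DynamicRing {
public:
    explicit DynamicRing(std::size_t capacity) : capacity_(capacity) {}

    // Reserves `bytes` for the next draw's geometry and advances the cursor.
    Allocation allocate(std::size_t bytes) {
        assert(bytes <= capacity_);
        MapMode mode = MapMode::WriteNoOverwrite;
        if (cursor_ + bytes > capacity_) {
            // Out of room: wrap to the beginning (assumed: discard on wrap).
            cursor_ = 0;
            mode = MapMode::WriteDiscard;
        }
        const Allocation a{cursor_, mode};
        cursor_ += bytes;
        return a;
    }

private:
    std::size_t capacity_;
    std::size_t cursor_ = 0;
};
```

Because discard is requested only once per trip around the buffer rather than once per draw, this pattern stays well clear of the driver's limited pool of renamed buffers.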
Porting Tips
 StretchRect is Gone
   Work around using render-to-texture
 A8R8G8B8 formats have been replaced with
 R8G8B8A8 formats
   Swizzle on texture load or swizzle in the
   shader
 Fixed Function AlphaTest is Gone
   Add logic to the shader and call discard
 Fixed Function Fog is Gone
   Add it to the shader
Porting Tips
Continued
 User Clip Planes usage has changed
   They’ve moved to the shader
   Experiment with the SV_ClipDistance semantic vs
   discard in the PS to determine which is faster for
   your shader
 Query data sizes might have changed
   Occlusion queries are UINT64 vs DWORD
 No Triangle Fan Support
   Work around in content pipeline or on load
 SetCursorProperties, ShowCursor are gone
   Use Win32 APIs to handle cursors now
Porting Tips
Continued
 No offsets on Map calls
   This was basically API clutter in D3D9
   Calculate the offset from the returned pointer
 Clears are no longer bound to pipeline state
   If you want a clear call to respect scissor,
   stencil, or other state, draw a full-screen quad
   This is closer to the HW
   The Driver/HW has been doing this for you for years
 OMSetBlendState
   Never set the SampleMask to 0 in
   OMSetBlendState
Porting Tips
Continued
 Input Layout conversions tightened up
   D3DDECLTYPE_UBYTE4 in the vertex stream
   could be converted to a float4 in the VS in D3D9
   i.e., 255u in the stream would show up as 255.0 in
   the VS
   In D3D10 you either get a normalized [0..1] value
   or 255 (u)int
 Register keyword
   It doesn’t mean the same thing in D3D10
   Use register to determine which CB slot a CB
   binds to
   Use packoffset to place a variable inside a CB
Porting Tips
Continued
 Sampler and Texture bindings
   Samplers can be bound independently of textures
   This is very flexible!
   Sampler and Texture slots are not always the
   same
 Register Packing
   In D3D9 all variables took up at least one float4
   register (even if you only used a single float!)
   In D3D10 variables are packed together
   This saves a lot of space
   Make sure your engine doesn’t do everything
   based upon register offsets or your variables
   might alias
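
To see why register-offset assumptions break, it helps to compute where packed variables land. The sketch below models only the basic HLSL packing rule for scalars and vectors: a member follows the previous one unless it would straddle a 16-byte register, in which case it moves to the next register. Arrays, matrices and structs have additional rules that are not modeled, and `packOffsets` is an illustrative name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kRegisterBytes = 16;   // one float4 register

// Computes byte offsets for scalar/vector members given their sizes in bytes
// (4, 8, 12 or 16). A member is placed right after the previous one unless it
// would straddle a 16-byte register, in which case it starts at the next one.
std::vector<std::size_t> packOffsets(const std::vector<std::size_t>& sizes) {
    std::vector<std::size_t> offsets;
    std::size_t cursor = 0;
    for (std::size_t size : sizes) {
        assert(size > 0 && size <= kRegisterBytes && size % 4 == 0);
        const std::size_t used = cursor % kRegisterBytes;
        if (used + size > kRegisterBytes) cursor += kRegisterBytes - used;
        offsets.push_back(cursor);
        cursor += size;
    }
    return offsets;
}
```

A float followed by a float2 and a float3 ends up at byte offsets 0, 4 and 16, whereas a one-register-per-variable D3D9 layout would place them at 0, 16 and 32.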
Porting Tips
Continued
 D3DSAMP_SRGBTEXTURE
   This sampler state setting does not exist on
   D3D10
   Instead it’s included in the texture format
   This is more like the Xbox 360
 Consider re-optimizing resource usage and
 upload for better D3D10 performance
   But use D3D10_USAGE_DEFAULT resources
   and UpdateSubresource as a baseline
Summary
 Use the debug runtime!
 More draw calls usually means more constant
 updating and state changing calls
 Be frugal with constant updates
   Avoid resubmitting redundant data!
 Create as much state and input layout
 information up front as possible
 Select D3D10_USAGE for resources based
 upon the CPU access patterns needed
 Use D3D10_MAP_WRITE_NO_OVERWRITE and a big
 buffer as a replacement for DIPUP and DUP
Call to Action
 Actually exploit D3D10!
 This talk tells you how to get performance
 gains from a straight port
 You can get a whole lot more by using
 D3D10’s advanced features!
   StreamOut to minimize skinning costs
   First class instancing support
   Store some vertex data in textures
   Move some systems to the GPU (Particles?)
   Aggressive use of Constant Buffers
http://www.xna.com




                                © 2007 Microsoft Corporation. All rights reserved.
This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.

 
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernelParsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernelchk49
 
Netflix at-disney-09-26-2014
Netflix at-disney-09-26-2014Netflix at-disney-09-26-2014
Netflix at-disney-09-26-2014Monal Daxini
 
Android RenderScript on LLVM
Android RenderScript on LLVMAndroid RenderScript on LLVM
Android RenderScript on LLVMJohn Lee
 
Renegotiating the boundary between database latency and consistency
Renegotiating the boundary between database latency  and consistencyRenegotiating the boundary between database latency  and consistency
Renegotiating the boundary between database latency and consistencyScyllaDB
 
WT-4065, Superconductor: GPU Web Programming for Big Data Visualization, by ...
WT-4065, Superconductor: GPU Web Programming for Big Data Visualization, by  ...WT-4065, Superconductor: GPU Web Programming for Big Data Visualization, by  ...
WT-4065, Superconductor: GPU Web Programming for Big Data Visualization, by ...AMD Developer Central
 

Similar to Windows to reality getting the most out of direct3 d 10 graphics in your games (20)

CIFAR-10 for DAWNBench: Wide ResNets, Mixup Augmentation and "Super Convergen...
CIFAR-10 for DAWNBench: Wide ResNets, Mixup Augmentation and "Super Convergen...CIFAR-10 for DAWNBench: Wide ResNets, Mixup Augmentation and "Super Convergen...
CIFAR-10 for DAWNBench: Wide ResNets, Mixup Augmentation and "Super Convergen...
 
The Technology behind Shadow Warrior, ZTG 2014
The Technology behind Shadow Warrior, ZTG 2014The Technology behind Shadow Warrior, ZTG 2014
The Technology behind Shadow Warrior, ZTG 2014
 
2020 icldla-updated
2020 icldla-updated2020 icldla-updated
2020 icldla-updated
 
Realtime Per Face Texture Mapping (PTEX)
Realtime Per Face Texture Mapping (PTEX)Realtime Per Face Texture Mapping (PTEX)
Realtime Per Face Texture Mapping (PTEX)
 
D3 D10 Unleashed New Features And Effects
D3 D10 Unleashed   New Features And EffectsD3 D10 Unleashed   New Features And Effects
D3 D10 Unleashed New Features And Effects
 
Video Compression Basics by sahil jain
Video Compression Basics by sahil jainVideo Compression Basics by sahil jain
Video Compression Basics by sahil jain
 
MongoDB: Optimising for Performance, Scale & Analytics
MongoDB: Optimising for Performance, Scale & AnalyticsMongoDB: Optimising for Performance, Scale & Analytics
MongoDB: Optimising for Performance, Scale & Analytics
 
Cassandra Fundamentals - C* 2.0
Cassandra Fundamentals - C* 2.0Cassandra Fundamentals - C* 2.0
Cassandra Fundamentals - C* 2.0
 
그래픽 최적화로 가...가버렷! (부제: 배치! 배칭을 보자!) , Batch! Let's take a look at Batching! -...
그래픽 최적화로 가...가버렷! (부제: 배치! 배칭을 보자!) , Batch! Let's take a look at Batching! -...그래픽 최적화로 가...가버렷! (부제: 배치! 배칭을 보자!) , Batch! Let's take a look at Batching! -...
그래픽 최적화로 가...가버렷! (부제: 배치! 배칭을 보자!) , Batch! Let's take a look at Batching! -...
 
Beginning direct3d gameprogramming09_shaderprogramming_20160505_jintaeks
Beginning direct3d gameprogramming09_shaderprogramming_20160505_jintaeksBeginning direct3d gameprogramming09_shaderprogramming_20160505_jintaeks
Beginning direct3d gameprogramming09_shaderprogramming_20160505_jintaeks
 
Spark Streaming with Cassandra
Spark Streaming with CassandraSpark Streaming with Cassandra
Spark Streaming with Cassandra
 
Building a Scalable Distributed Stats Infrastructure with Storm and KairosDB
Building a Scalable Distributed Stats Infrastructure with Storm and KairosDBBuilding a Scalable Distributed Stats Infrastructure with Storm and KairosDB
Building a Scalable Distributed Stats Infrastructure with Storm and KairosDB
 
Actors, a Unifying Pattern for Scalable Concurrency | C4 2006
Actors, a Unifying Pattern for Scalable Concurrency | C4 2006 Actors, a Unifying Pattern for Scalable Concurrency | C4 2006
Actors, a Unifying Pattern for Scalable Concurrency | C4 2006
 
Introduction to Direct 3D 12 by Ivan Nevraev
Introduction to Direct 3D 12 by Ivan NevraevIntroduction to Direct 3D 12 by Ivan Nevraev
Introduction to Direct 3D 12 by Ivan Nevraev
 
Masked Software Occlusion Culling
Masked Software Occlusion CullingMasked Software Occlusion Culling
Masked Software Occlusion Culling
 
Parsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernelParsing and Type checking all 2^10000 configurations of the Linux kernel
Parsing and Type checking all 2^10000 configurations of the Linux kernel
 
Netflix at-disney-09-26-2014
Netflix at-disney-09-26-2014Netflix at-disney-09-26-2014
Netflix at-disney-09-26-2014
 
Android RenderScript on LLVM
Android RenderScript on LLVMAndroid RenderScript on LLVM
Android RenderScript on LLVM
 
Renegotiating the boundary between database latency and consistency
Renegotiating the boundary between database latency  and consistencyRenegotiating the boundary between database latency  and consistency
Renegotiating the boundary between database latency and consistency
 
WT-4065, Superconductor: GPU Web Programming for Big Data Visualization, by ...
WT-4065, Superconductor: GPU Web Programming for Big Data Visualization, by  ...WT-4065, Superconductor: GPU Web Programming for Big Data Visualization, by  ...
WT-4065, Superconductor: GPU Web Programming for Big Data Visualization, by ...
 

More from changehee lee

Gdc 14 bringing unreal engine 4 to open_gl
Gdc 14 bringing unreal engine 4 to open_glGdc 14 bringing unreal engine 4 to open_gl
Gdc 14 bringing unreal engine 4 to open_glchangehee lee
 
Smedberg niklas bringing_aaa_graphics
Smedberg niklas bringing_aaa_graphicsSmedberg niklas bringing_aaa_graphics
Smedberg niklas bringing_aaa_graphicschangehee lee
 
Fortugno nick design_and_monetization
Fortugno nick design_and_monetizationFortugno nick design_and_monetization
Fortugno nick design_and_monetizationchangehee lee
 
[Kgc2013] 모바일 엔진 개발기
[Kgc2013] 모바일 엔진 개발기[Kgc2013] 모바일 엔진 개발기
[Kgc2013] 모바일 엔진 개발기changehee lee
 
모바일 엔진 개발기
모바일 엔진 개발기모바일 엔진 개발기
모바일 엔진 개발기changehee lee
 
Mobile crossplatformchallenges siggraph
Mobile crossplatformchallenges siggraphMobile crossplatformchallenges siggraph
Mobile crossplatformchallenges siggraphchangehee lee
 
개발 과정 최적화 하기 내부툴로 더욱 강력한 개발하기 Stephen kennedy _(11시40분_103호)
개발 과정 최적화 하기 내부툴로 더욱 강력한 개발하기 Stephen kennedy _(11시40분_103호)개발 과정 최적화 하기 내부툴로 더욱 강력한 개발하기 Stephen kennedy _(11시40분_103호)
개발 과정 최적화 하기 내부툴로 더욱 강력한 개발하기 Stephen kennedy _(11시40분_103호)changehee lee
 
개발자여! 스터디를 하자!
개발자여! 스터디를 하자!개발자여! 스터디를 하자!
개발자여! 스터디를 하자!changehee lee
 
Gamificated game developing
Gamificated game developingGamificated game developing
Gamificated game developingchangehee lee
 
Basic ofreflectance kor
Basic ofreflectance korBasic ofreflectance kor
Basic ofreflectance korchangehee lee
 
Valve handbook low_res
Valve handbook low_resValve handbook low_res
Valve handbook low_reschangehee lee
 
Ndc12 이창희 render_pipeline
Ndc12 이창희 render_pipelineNdc12 이창희 render_pipeline
Ndc12 이창희 render_pipelinechangehee lee
 
아이폰에 포팅해보기
아이폰에 포팅해보기아이폰에 포팅해보기
아이폰에 포팅해보기changehee lee
 

More from changehee lee (20)

Shader compilation
Shader compilationShader compilation
Shader compilation
 
Gdc 14 bringing unreal engine 4 to open_gl
Gdc 14 bringing unreal engine 4 to open_glGdc 14 bringing unreal engine 4 to open_gl
Gdc 14 bringing unreal engine 4 to open_gl
 
Smedberg niklas bringing_aaa_graphics
Smedberg niklas bringing_aaa_graphicsSmedberg niklas bringing_aaa_graphics
Smedberg niklas bringing_aaa_graphics
 
Fortugno nick design_and_monetization
Fortugno nick design_and_monetizationFortugno nick design_and_monetization
Fortugno nick design_and_monetization
 
카툰 렌더링
카툰 렌더링카툰 렌더링
카툰 렌더링
 
[Kgc2013] 모바일 엔진 개발기
[Kgc2013] 모바일 엔진 개발기[Kgc2013] 모바일 엔진 개발기
[Kgc2013] 모바일 엔진 개발기
 
Paper games 2013
Paper games 2013Paper games 2013
Paper games 2013
 
모바일 엔진 개발기
모바일 엔진 개발기모바일 엔진 개발기
모바일 엔진 개발기
 
V8
V8V8
V8
 
Wecanmakeengine
WecanmakeengineWecanmakeengine
Wecanmakeengine
 
Mobile crossplatformchallenges siggraph
Mobile crossplatformchallenges siggraphMobile crossplatformchallenges siggraph
Mobile crossplatformchallenges siggraph
 
개발 과정 최적화 하기 내부툴로 더욱 강력한 개발하기 Stephen kennedy _(11시40분_103호)
개발 과정 최적화 하기 내부툴로 더욱 강력한 개발하기 Stephen kennedy _(11시40분_103호)개발 과정 최적화 하기 내부툴로 더욱 강력한 개발하기 Stephen kennedy _(11시40분_103호)
개발 과정 최적화 하기 내부툴로 더욱 강력한 개발하기 Stephen kennedy _(11시40분_103호)
 
개발자여! 스터디를 하자!
개발자여! 스터디를 하자!개발자여! 스터디를 하자!
개발자여! 스터디를 하자!
 
Light prepass
Light prepassLight prepass
Light prepass
 
Gamificated game developing
Gamificated game developingGamificated game developing
Gamificated game developing
 
Basic ofreflectance kor
Basic ofreflectance korBasic ofreflectance kor
Basic ofreflectance kor
 
C++11(최지웅)
C++11(최지웅)C++11(최지웅)
C++11(최지웅)
 
Valve handbook low_res
Valve handbook low_resValve handbook low_res
Valve handbook low_res
 
Ndc12 이창희 render_pipeline
Ndc12 이창희 render_pipelineNdc12 이창희 render_pipeline
Ndc12 이창희 render_pipeline
 
아이폰에 포팅해보기
아이폰에 포팅해보기아이폰에 포팅해보기
아이폰에 포팅해보기
 

Recently uploaded

A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integrationmarketing932765
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesManik S Magar
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024TopCSSGallery
 
React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...Karmanjay Verma
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxfnnc6jmgwh
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...itnewsafrica
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesBernd Ruecker
 
Landscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfLandscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfAarwolf Industries LLC
 
Accelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with PlatformlessAccelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with PlatformlessWSO2
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...Nikki Chapple
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observabilityitnewsafrica
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
Digital Tools & AI in Career Development
Digital Tools & AI in Career DevelopmentDigital Tools & AI in Career Development
Digital Tools & AI in Career DevelopmentMahmoud Rabie
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
JET Technology Labs White Paper for Virtualized Security and Encryption Techn...
JET Technology Labs White Paper for Virtualized Security and Encryption Techn...JET Technology Labs White Paper for Virtualized Security and Encryption Techn...
JET Technology Labs White Paper for Virtualized Security and Encryption Techn...amber724300
 
Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...
Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...
Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...itnewsafrica
 

Recently uploaded (20)

A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024
 
React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architectures
 
Landscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfLandscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdf
 
Accelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with PlatformlessAccelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with Platformless
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Digital Tools & AI in Career Development
Digital Tools & AI in Career DevelopmentDigital Tools & AI in Career Development
Digital Tools & AI in Career Development
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
JET Technology Labs White Paper for Virtualized Security and Encryption Techn...
JET Technology Labs White Paper for Virtualized Security and Encryption Techn...JET Technology Labs White Paper for Virtualized Security and Encryption Techn...
JET Technology Labs White Paper for Virtualized Security and Encryption Techn...
 
Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...
Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...
Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...
 

Windows to Reality: Getting the Most out of Direct3D 10 Graphics in Your Games

  • 1.
  • 2. Windows to Reality: Getting the Most out of Direct3D 10 Graphics in Your Games Shanon Drone Software Development Engineer XNA Developer Connection Microsoft
  • 3. Key areas Debug Layer Draw Calls Constant Updates State Management Shader Linkage Resource Updates Dynamic Geometry Porting Tips
  • 4. Debug Layer Use it! The D3D10 layer can help find performance issues App controlled by passing D3D10_CREATE_DEVICE_DEBUG into D3D10CreateDevice. Use the D3DX10 Debug Runtime Link against D3DX10d.lib Only do this for debug builds! Look for performance warnings in the debug output
  • 5. Draw Calls Draw calls are still “not free” Draw overhead is reduced in D3D10 But not enough that you can be lazy Efficiency in the number of draw calls will still give a performance win
  • 6. Draw Calls Excess baggage An increase in the number of draw calls generally increases the number of API calls associated with those draws ConstantBuffer updates Resource changes (VBs, IBs, Textures) InputLayout changes These all have effects on performance that vary with draw call count
  • 7. Constant Updates Updating shader constants was often a bottleneck in D3D9 It can still be a bottleneck in D3D10 The main difference between the two is the new Constant Buffer object in D3D10 This is the largest section of this talk
  • 8. Constant Updates Constant Buffer Recap Constant Buffers are buffer objects that hold shader constant data They are updated by calling Map with D3D10_MAP_WRITE_DISCARD or by calling UpdateSubresource There are 16 Constant Buffer slots available to each shader in the pipeline Try not to use all 16 to leave some headroom
  • 9. Constant Updates Porting Issues D3D9 constants were updated individually by calling SetXXXXXShaderConstantX In D3D10, you have to update the entire constant buffer all at once A naïve port from D3D9 to D3D10 can have crippling performance implications if Constant Buffers are not handled correctly! Rule of thumb: Do not update more data than you need to
  • 10. Constant Updates Naïve Port: AKA how to cripple perf Each shader uses one big constant buffer Submitting one value submits them all! If you have one 4096 byte Constant Buffer, and you only need to update your World matrix, you will still have to update 4096 bytes of data and send it across the bus Don’t do this!
  • 11. Constant Updates Naïve Port: AKA how to cripple perf
        Scene: 100 skinned meshes (100 materials), 900 static meshes (400 materials), 1 shadow + 1 lighting pass

        cbuffer VSGlobalsCB    (6560 Bytes)
        {
            matrix ViewProj;
            matrix Bones[100];
            matrix World;
            float SpecPower;
            float4 BDRFCoefficients;
            float AppTime;
            uint2 RenderTargetSize;
        };

        Shadow Pass
            Update VSGlobalCB: 6560 Bytes x 100 = 656000 Bytes
            Update VSGlobalCB: 6560 Bytes x 900 = 5904000 Bytes
        Light Pass
            Update VSGlobalCB: 6560 Bytes x 100 = 656000 Bytes
            Update VSGlobalCB: 6560 Bytes x 900 = 5904000 Bytes
        Total = 13,120,000 Bytes
  • 12. Constant Updates Organize Constants The first step is to organize constants by frequency of update One shader will generally be used to draw several objects Some data in this shader doesn’t need to be set for every draw For example: Time, ViewProj matrices Split these out into their own buffers
  • 13. Constant buffers split by update frequency
        cbuffer VSGlobalPerFrameCB    (4 Bytes)
        {
            float AppTime;
        };
        cbuffer VSPerSkinnedCB    (6400 Bytes)
        {
            matrix Bones[100];
        };
        cbuffer VSPerStaticCB    (64 Bytes)
        {
            matrix World;
        };
        cbuffer VSPerPassCB    (72 Bytes)
        {
            matrix ViewProj;
            uint2 RenderTargetSize;
        };
        cbuffer VSPerMaterialCB    (20 Bytes)
        {
            float SpecPower;
            float4 BDRFCoefficients;
        };

        Begin Frame
            Update VSGlobalPerFrameCB: 4 Bytes x 1 = 4 Bytes
            Update VSPerSkinnedCBs: 6400 Bytes x 100 = 640000 Bytes
            Update VSPerStaticCBs: 64 Bytes x 900 = 57600 Bytes
        Shadow Pass
            Update VSPerPassCB: 72 Bytes x 1 = 72 Bytes
        Light Pass
            Update VSPerPassCB: 72 Bytes x 1 = 72 Bytes
            Update VSPerMaterialCBs: 20 Bytes x 500 = 10000 Bytes
        Total = 707,748 Bytes
  • 14. Constant Updates 13,120,000 Bytes / 707,748 Bytes = 18x
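The arithmetic behind the 18x figure can be reproduced in a few lines of C++. This is only a sketch of the byte counts from the two preceding slides: the constant and function names are made up for illustration, and the sizes ignore any HLSL packing or padding, as the slides do.

```cpp
#include <cstdint>

// Scene from the slides: 100 skinned meshes, 900 static meshes,
// 500 materials in total, and 2 passes (shadow + lighting).
constexpr std::uint64_t kSkinned = 100;
constexpr std::uint64_t kStatic = 900;
constexpr std::uint64_t kMaterials = 500;
constexpr std::uint64_t kPasses = 2;

// Sizes in bytes, counted the way the slides count them (no padding).
constexpr std::uint64_t kMatrix = 64;
constexpr std::uint64_t kBones = 100 * kMatrix;  // 6400
constexpr std::uint64_t kGlobalCB =
    kMatrix + kBones + kMatrix  // ViewProj, Bones[100], World
    + 4 + 16 + 4 + 8;           // SpecPower, BDRFCoefficients, AppTime, RenderTargetSize

// Naive port: the whole buffer is re-sent for every mesh in every pass.
constexpr std::uint64_t naiveBytesPerFrame() {
    return kPasses * (kSkinned + kStatic) * kGlobalCB;
}

// Split by update frequency: each buffer is sent only as often as it changes.
constexpr std::uint64_t splitBytesPerFrame() {
    return 4                          // per frame: AppTime
         + kSkinned * kBones          // per skinned mesh: Bones[100]
         + kStatic * kMatrix          // per static mesh: World
         + kPasses * (kMatrix + 8)    // per pass: ViewProj + RenderTargetSize
         + kMaterials * 20;           // per material: SpecPower + BDRFCoefficients
}
```

Dividing `naiveBytesPerFrame()` by `splitBytesPerFrame()` gives a ratio of about 18.5, which the slide rounds to 18x.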
  • 15. Constant Updates Managing Buffers Constant buffers need to be managed in the application Creating a few buffers that are used for all shader constants just won’t work We update more data than necessary due to large buffers
  • 16. Constant Updates Managing Buffers Solution 1 (Fastest) Create Constant Buffers that line up exactly with the number of elements of each frequency group Global CBs CBs per Mesh CBs per Material CBs per Pass This ensures that EVERY constant buffer is no larger than it absolutely needs to be This also ensures the most efficient update of CBs based upon frequency
  • 17. Constant Updates Managing Buffers Solution 2 (Second Best) If you cannot create CBs that line up exactly with elements, you can create a tiered constant buffer system Create arrays of 32-byte, 64-byte, 128-byte, 256-byte, etc. constant buffers Keep a shadow copy of the constant data in system memory When it comes time to render, select the smallest CB from the array that will hold the necessary constant data May have to resubmit redundant data for separate passes Hybrid approach?
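The "select the smallest CB that will hold the data" step of the tiered scheme can be sketched as follows. The tier count and the helper names are assumptions made for illustration; the slide only says that the sizes double from 32 bytes upward.

```cpp
#include <cstddef>

// Hypothetical tier layout: sizes double from 32 bytes upward (32, 64, 128, ...).
constexpr std::size_t kSmallestTier = 32;
constexpr std::size_t kTierCount = 8;  // assumed: largest tier is 32 << 7 = 4096 bytes

// Returns the index of the smallest tier that can hold `bytes`,
// or kTierCount if the data does not fit in any tier.
inline std::size_t selectTier(std::size_t bytes) {
    std::size_t size = kSmallestTier;
    for (std::size_t tier = 0; tier < kTierCount; ++tier, size *= 2) {
        if (bytes <= size) return tier;
    }
    return kTierCount;
}

// Size in bytes of the buffers stored in a given tier.
inline std::size_t tierSize(std::size_t tier) {
    return kSmallestTier << tier;
}
```

With this layout, the 72-byte per-pass data from the earlier example would land in the 128-byte tier, so 56 bytes of each update would be wasted. That overhead is the price of not sizing buffers exactly.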
  • 18. Constant Updates Case Study: Skinning using Solution 1 Skinning in D3D9 (or a bad D3D10 port) Multiple passes causes redundant bone data uploads to the GPU Skinning in D3D10 Using Constant Buffers we only need to upload it once
  • 19. Constant Updates D3D9 Version / or Naïve D3D10 Version
        Pass1
            Set Mesh1 Bones
            Draw Mesh1
            Set Mesh2 Bones
            Draw Mesh2
        Pass2
            Set Mesh1 Bones
            Draw Mesh1
            Set Mesh2 Bones
            Draw Mesh2
        Constant Data: holds Mesh1 Bone0 … BoneN or Mesh2 Bone0 … BoneN in turn
  • 20. Constant Updates Preferred D3D10 Version
        Frame Start
            Update Mesh1 CB
            Update Mesh2 CB
        Pass1
            Bind Mesh1 CB
            Draw Mesh1
            Bind Mesh2 CB
            Draw Mesh2
        Pass2
            Bind Mesh1 CB
            Draw Mesh1
            Bind Mesh2 CB
            Draw Mesh2
        Mesh1 CB: Mesh1 Bone0, Mesh1 Bone1, Mesh1 Bone2, Mesh1 Bone3, Mesh1 Bone4, … Mesh1 BoneN
        Mesh2 CB: Mesh2 Bone0, Mesh2 Bone1, Mesh2 Bone2, Mesh2 Bone3, Mesh2 Bone4, … Mesh2 BoneN
        CB Slot 0: bound to Mesh1 CB or Mesh2 CB
  • 21. Constant Updates Advanced D3D10 Version Why not store all of our characters’ bones in a 128-bit FP texture? We can upload bones for all visible characters at the start of a frame We can draw similar characters using instancing instead of individual draws Use SV_InstanceID to select the start of the character’s bone data in the texture Stream the skinned meshes to memory using Stream Output and render all subsequent passes from the post-skinned buffer
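One way to picture how an instance ID could select the start of a character's bone data is the index arithmetic below. The layout is an assumption and does not come from the slides: each bone is taken to be a 4x4 float matrix stored as four consecutive 128-bit texels, and every instance reserves a fixed-size block of bones. In a shader, the `instanceId` argument would come from SV_InstanceID.

```cpp
#include <cstdint>

struct Texel { std::uint32_t x, y; };

// Assumed layout (not from the slides): every bone is a 4x4 float matrix stored
// as 4 consecutive 128-bit texels, and every instance owns a fixed-size block
// of `bonesPerInstance` bones.
constexpr std::uint32_t kTexelsPerBone = 4;

// Linear texel index of the first row of `bone` for the given instance.
constexpr std::uint32_t boneTexelIndex(std::uint32_t instanceId,
                                       std::uint32_t bonesPerInstance,
                                       std::uint32_t bone) {
    return (instanceId * bonesPerInstance + bone) * kTexelsPerBone;
}

// Converts a linear texel index into 2D coordinates for a texture
// that is `width` texels wide.
inline Texel toTexel(std::uint32_t index, std::uint32_t width) {
    return Texel{ index % width, index / width };
}
```

A fixed block size per instance keeps the lookup to one multiply and one add; a per-instance offset table would save texture space for characters with fewer bones, at the cost of an extra fetch.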
  • 22. State Management Individual state setting is no longer possible in D3D10 State in D3D10 is stored in state objects These state objects are immutable To change even one aspect of a state object requires that you create an entirely new state object with that one change
  • 23. State Management Managing State Objects Solution 1 (Fastest) If you have a known set of materials and required states, you can create all state objects at load time State objects are small and there is a finite set of permutations With all state objects already created at load time, all that needs to be done during rendering is to bind the object
  • 24. State Management Managing State Objects Solution 2 (Second Best) If your content is not finalized, or if you CANNOT get your engine to lump state together Create a state object hash table Hash off of the setting that has the most unique states Grab pre-created states from the hash-table Why not give your tools pipeline the ability to do this for a level and save out the results?
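A state object hash table of this kind might look like the sketch below. `BlendKey` is a hypothetical, trimmed-down description rather than a D3D10 structure, and `Handle` stands in for a state object pointer, so that the example stays self-contained. It hashes the whole description, which is a simplification of the "most unique states" advice on the slide.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>

// Hypothetical, trimmed-down blend description; a real engine would key on
// the full state description instead.
struct BlendKey {
    bool enable;
    std::uint8_t srcFactor;
    std::uint8_t dstFactor;
    bool operator==(const BlendKey& o) const {
        return enable == o.enable && srcFactor == o.srcFactor && dstFactor == o.dstFactor;
    }
};

struct BlendKeyHash {
    std::size_t operator()(const BlendKey& k) const {
        // Pack the fields into one integer so distinct keys give distinct values.
        std::uint32_t packed = (k.enable ? 1u : 0u)
                             | (std::uint32_t(k.srcFactor) << 8)
                             | (std::uint32_t(k.dstFactor) << 16);
        return std::hash<std::uint32_t>()(packed);
    }
};

// Caches one immutable state object per unique description.
class StateCache {
public:
    using Handle = int;  // stand-in for a state object pointer

    // Returns the cached object, creating it only on first use.
    Handle get(const BlendKey& key) {
        auto it = cache_.find(key);
        if (it != cache_.end()) return it->second;
        Handle h = nextHandle_++;  // stand-in for the expensive create call
        cache_.emplace(key, h);
        return h;
    }

    // Number of distinct state objects created so far.
    std::size_t created() const { return cache_.size(); }

private:
    std::unordered_map<BlendKey, Handle, BlendKeyHash> cache_;
    Handle nextHandle_ = 0;
};
```

Recording the keys seen while a level runs would give a tools pipeline the list of states to pre-create, which is the idea the slide closes with.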
  • 25. Shader Linkage D3D9 shader linkage was based off of semantics (POSITION, NORMAL, TEXCOORDN) D3D10 linkage is based off of offsets and sizes This means stricter linkage rules This also means that the driver doesn’t have to link shaders together at every draw call!
  • 26. Shader Linkage No Holes Allowed! Elements must be read in the order they are output from the previous stage Cannot have “holes” between linkages
        struct VS_OUTPUT
        {
            float3 Norm : NORMAL;
            float2 Tex : TEXCOORD0;
            float2 Tex2 : TEXCOORD1;
            float4 Pos : SV_POSITION;
        };
        struct PS_INPUT
        {
            float3 Norm : NORMAL;
            float2 Tex : TEXCOORD0;
            float2 Tex2 : TEXCOORD1;
        };
        Holes at the end are OK
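The "no holes" rule can be modelled as a prefix check: the consuming stage must read the producer's elements in order from the first one, and may only omit elements at the end. The sketch below is a simplified model of the rule as stated on the slide, not the validation the runtime actually performs, and the `Element` type is made up for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct Element {
    std::string type;      // e.g. "float3"
    std::string semantic;  // e.g. "NORMAL"
    bool operator==(const Element& o) const {
        return type == o.type && semantic == o.semantic;
    }
};

// Returns true when `consumerInput` reads the producer's elements in order,
// starting from the first one, without skipping any. Elements left unread at
// the end of `producerOutput` are allowed.
inline bool linksWithoutHoles(const std::vector<Element>& producerOutput,
                              const std::vector<Element>& consumerInput) {
    if (consumerInput.size() > producerOutput.size()) return false;
    for (std::size_t i = 0; i < consumerInput.size(); ++i) {
        if (!(consumerInput[i] == producerOutput[i])) return false;
    }
    return true;
}
```

Under this model, a pixel shader input that drops a trailing SV_POSITION links cleanly, while one that skips TEXCOORD0 and reads TEXCOORD1 second does not.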
  • 27. Shader Linkage Input Assembler to Vertex Shader Input Layouts define the signature of the vertex stream data Input Layouts are similar to Vertex Declarations in D3D9 Strict linkage rules are a big difference Creating Input Layouts on the fly is not recommended CreateInputLayout requires a shader signature to validate against
  • 28. Shader Linkage Input Assembler to Vertex Shader Solution 1 (Fastest) Create an Input Layout for each unique Vertex Stream / Vertex Shader combination up front Input Layouts are small This assumes that the shader input signature is available when you call CreateInputLayout Try to normalize Input Layouts across level or be art directed
  • 29. Shader Linkage Input Assembler to Vertex Shader Solution 2 (Second Best) If you load meshes and create input layouts before loading shaders, you might have a problem You can use a similar hashing scheme as the one used for State Objects When the Input Layout is needed, search the hash for an Input Layout that matches the Vertex Stream and Vertex Shader signature Why not store this data to a file and pre- populate the Input Layouts after your content is tuned?
  • 30. Shader Linkage Aside: Instancing Instancing is a first class citizen on D3D10! Stream source frequency is now part of the Input Layout Multiple frequencies will mean multiple Input Layouts
  • 31. Resource Updates Updating resources is different in D3D10 Create / Lock / Fill / Unlock paradigm is no longer necessary (although you can still do it) Texture data can be passed into the texture at create time
  • 32. Resource Updates Resource Usage Types D3D10_USAGE_DEFAULT D3D10_USAGE_IMMUTABLE D3D10_USAGE_DYNAMIC D3D10_USAGE_STAGING
  • 33. Resource Updates D3D10_USAGE_DEFAULT Use for resources that need fast GPU read and write access Can only be updated using UpdateSubresource Render targets are good candidates Textures that are updated infrequently (less than once per frame) are good candidates
  • 34. Resource Updates: D3D10_USAGE_IMMUTABLE. Use for resources that need fast GPU read access only. Once they are created, they cannot be updated... ever. Initial data must be passed in during the creation call. Resources that will never change (static textures, VBs / IBs) are good candidates. Don’t bend over backwards trying to make everything D3D10_USAGE_IMMUTABLE.
  • 35. Resource Updates: D3D10_USAGE_DYNAMIC. Use for resources that need fast CPU write access (at the expense of slower GPU read access). No CPU read access. Can only be updated using Map with D3D10_MAP_WRITE_DISCARD or D3D10_MAP_WRITE_NO_OVERWRITE. Dynamic Vertex Buffers are good candidates. Textures updated more than once per frame are good candidates.
  • 36. Resource Updates: D3D10_USAGE_STAGING. This is the only way to read data back from the GPU. Can only be updated using Map. Cannot map with D3D10_MAP_WRITE_DISCARD or D3D10_MAP_WRITE_NO_OVERWRITE. You might want to double buffer to keep from stalling the GPU. The GPU cannot directly use these.
  • 37. Resource Updates: Summary. If the CPU updates the resource frequently (more than once per frame), use D3D10_USAGE_DYNAMIC. If the CPU updates the resource infrequently (once per frame or less), use D3D10_USAGE_DEFAULT. If the CPU doesn’t update the resource, use D3D10_USAGE_IMMUTABLE. If the CPU needs to read the resource, use D3D10_USAGE_STAGING.
  • 38. Resource Updates: Example, Vertex Buffer. If the vertex buffer is touched by the CPU less than once per frame, create it with D3D10_USAGE_DEFAULT and update it with UpdateSubresource. If the vertex buffer is used for dynamic geometry and the CPU needs to update it multiple times per frame, create it with D3D10_USAGE_DYNAMIC and update it with Map.
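The usage summary and the vertex buffer example can be condensed into a small decision function. The C++ sketch below is illustrative only: the `Usage` enum mirrors the D3D10_USAGE_* names without including the D3D10 headers, and `CpuAccessPattern` and `SelectUsage` are hypothetical names. It does not cover constant buffers, which the talk treats as an exception.

```cpp
// Mirrors D3D10_USAGE_DEFAULT / IMMUTABLE / DYNAMIC / STAGING (stand-in enum).
enum class Usage { Default, Immutable, Dynamic, Staging };

// Hypothetical description of how the CPU touches a resource.
struct CpuAccessPattern {
    bool cpuReads;             // the CPU needs to read the data back
    double cpuWritesPerFrame;  // average CPU updates per frame (0 = never)
};

// Applies the summary rules in order: read-back first, then write frequency.
Usage SelectUsage(const CpuAccessPattern& pattern) {
    if (pattern.cpuReads) {
        return Usage::Staging;    // CPU needs to read the resource
    }
    if (pattern.cpuWritesPerFrame <= 0.0) {
        return Usage::Immutable;  // CPU never updates the resource
    }
    if (pattern.cpuWritesPerFrame > 1.0) {
        return Usage::Dynamic;    // updated more than once per frame
    }
    return Usage::Default;        // updated once per frame or less
}
```

For example, a vertex buffer rewritten several times per frame maps to `Usage::Dynamic`, while one rebuilt only on level load maps to `Usage::Default`.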
  • 39. Resource Updates: The Exception, Constant Buffers. CBs are always expected to be updated frequently. Select CB usage based upon which one causes the least amount of system memory to be transferred. This means not just transfers to the GPU, but system-to-system memory copies as well.
  • 40. Resource Updates: UpdateSubresource. UpdateSubresource requires a system memory buffer and incurs an extra copy. Use it if you already have system copies of your constant data in one place.
  • 41. Resource Updates: Map. Map requires no extra system memory but may hit driver renaming limits if abused. Use it if you are compositing values on the fly or collecting values from other places.
  • 42. Resource Updates: A note on overusing discard. Use D3D10_MAP_WRITE_DISCARD carefully with buffers! D3D10_MAP_WRITE_DISCARD tells the driver to give us a new memory buffer if the current one is busy. There is a LIMITED set of temporary buffers. If these run out, your app will stall until another buffer can be freed. This can happen if you do dynamic geometry using one VB and D3D10_MAP_WRITE_DISCARD.
  • 43. Dynamic Geometry. DrawIndexedPrimitiveUP is gone! DrawPrimitiveUP is gone! Your well-behaved D3D9 app isn’t using these anyway, right?
  • 44. Dynamic Geometry: Solution, same as in D3D9. Use one large buffer, and map it with D3D10_MAP_WRITE_NO_OVERWRITE. Advance the write position with every draw. Wrap to the beginning when you reach the end. Make sure your buffer is large enough that you’re not overwriting data that the GPU is reading. This is what happens under the covers for D3D9 when using DIPUP or DUP in Windows Vista.
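The bookkeeping behind the "one large buffer" approach can be sketched in C++ without any D3D calls. The class below only tracks the write position; `DynamicBufferCursor`, `Allocation`, and `Reserve` are hypothetical names, and the caller is assumed to map the real vertex buffer with D3D10_MAP_WRITE_NO_OVERWRITE and write at the returned offset. It also assumes each request is no larger than the buffer.

```cpp
#include <cstddef>

// Result of reserving space for one draw's worth of vertex data.
struct Allocation {
    std::size_t offset;  // byte offset into the large buffer to write at
    bool wrapped;        // true when the cursor wrapped back to the beginning
};

class DynamicBufferCursor {
public:
    explicit DynamicBufferCursor(std::size_t capacityBytes)
        : capacity_(capacityBytes), writePos_(0) {}

    // Reserves `bytes` (assumed <= capacity) and advances the write position.
    // When the request does not fit in the remaining space, wrap to offset 0.
    Allocation Reserve(std::size_t bytes) {
        bool wrapped = false;
        if (writePos_ + bytes > capacity_) {
            writePos_ = 0;
            wrapped = true;
        }
        Allocation result{writePos_, wrapped};
        writePos_ += bytes;
        return result;
    }

private:
    std::size_t capacity_;
    std::size_t writePos_;
};
```

The `wrapped` flag is the point where overwriting in-flight data becomes a risk, so it is a natural place to verify that the buffer is large enough. Some engines map with D3D10_MAP_WRITE_DISCARD at that point instead; that is a design choice outside what the slide states, and the earlier caution about overusing discard applies.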
  • 45. Porting Tips. StretchRect is gone: work around it using render-to-texture. A8R8G8B8 formats have been replaced with R8G8B8A8 formats: swizzle on texture load or swizzle in the shader. Fixed-function AlphaTest is gone: add logic to the shader and call discard. Fixed-function Fog is gone: add it to the shader.
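Swizzling on texture load amounts to swapping the red and blue bytes of every pixel. The C++ sketch below assumes tightly packed 32-bit pixels with no row padding: D3DFMT_A8R8G8B8 data sits in memory as the bytes B, G, R, A, while an R8G8B8A8 format expects R, G, B, A. `SwizzleBgraToRgba` is a hypothetical helper name.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Converts tightly packed B,G,R,A pixels to R,G,B,A in place by swapping
// the first and third byte of each 4-byte pixel. Trailing bytes that do not
// form a whole pixel are left untouched.
void SwizzleBgraToRgba(std::vector<std::uint8_t>& pixels) {
    for (std::size_t i = 0; i + 3 < pixels.size(); i += 4) {
        std::swap(pixels[i], pixels[i + 2]);
    }
}
```

Doing this once in the content pipeline avoids paying for it at load time or in every shader that samples the texture.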
  • 46. Porting Tips, continued. User Clip Plane usage has changed: they’ve moved to the shader. Experiment with the SV_ClipDistance semantic vs. discard in the PS to determine which is faster for your shader. Query data sizes might have changed: occlusion queries are UINT64 vs. DWORD. There is no Triangle Fan support: work around it in the content pipeline or on load. SetCursorProperties and ShowCursor are gone: use Win32 APIs to handle cursors now.
  • 47. Porting Tips, continued. No offsets on Map calls: this was basically API clutter in D3D9. Calculate the offset from the returned pointer. Clears are no longer bound to pipeline state: if you want a clear call to respect scissor, stencil, or other state, draw a full-screen quad. This is closer to the HW; the driver/HW has been doing this for you for years. OMSetBlendState: never set the SampleMask to 0 in OMSetBlendState.
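Calculating the offset from the returned pointer is plain pointer arithmetic. In the C++ sketch below, `mapped` stands in for the pointer a buffer Map call hands back; `WriteAtOffset` is a hypothetical helper, and the caller is assumed to keep `offsetBytes + sizeBytes` within the mapped range.

```cpp
#include <cstddef>
#include <cstring>

// Copies `sizeBytes` from `src` into the mapped memory, starting
// `offsetBytes` past the base pointer returned by Map.
void WriteAtOffset(void* mapped, std::size_t offsetBytes,
                   const void* src, std::size_t sizeBytes) {
    unsigned char* dst = static_cast<unsigned char*>(mapped) + offsetBytes;
    std::memcpy(dst, src, sizeBytes);
}
```

This is the same arithmetic the D3D9 Lock offset parameter performed on the application's behalf.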
  • 48. Porting Tips, continued. Input Layout conversions have been tightened up. In D3D9, D3DDECLTYPE_UBYTE4 in the vertex stream could be converted to a float4 in the VS, i.e. 255u in the stream would show up as 255.0 in the VS. In D3D10 you get either a normalized [0..1] value or a 255 (u)int. The register keyword doesn’t mean the same thing in D3D10: use register to determine which CB slot a CB binds to, and use packoffset to place a variable inside a CB.
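The three interpretations of a stream byte can be compared on the CPU. The C++ sketch below is only an illustration of the arithmetic; the function names are hypothetical, and the mapping to specific formats (a UNORM format for the normalized case, a UINT format for the integer case) is stated as an assumption about how the input layout would be declared.

```cpp
#include <cstdint>

// D3D9-style behavior described on the slide: the byte arrives as a float
// with the same magnitude (255 -> 255.0).
float AsD3D9Float(std::uint8_t value) {
    return static_cast<float>(value);
}

// D3D10 with a normalized (UNORM) element format: the byte is scaled
// into [0..1] (255 -> 1.0).
float AsNormalized(std::uint8_t value) {
    return static_cast<float>(value) / 255.0f;
}

// D3D10 with an integer (UINT) element format: the byte arrives as an
// unsigned integer and the shader input must be declared as uint.
std::uint32_t AsUint(std::uint8_t value) {
    return value;
}
```

A shader ported from D3D9 that expects the 0..255 range would therefore either multiply a normalized input by 255 or switch its input type to uint.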
  • 49. Porting Tips, continued. Sampler and Texture bindings: samplers can be bound independently of textures. This is very flexible! Sampler and Texture slots are not always the same. Register packing: in D3D9 all variables took up at least one float4 register (even if you only used a single float!). In D3D10 variables are packed together, which saves a lot of space. Make sure your engine doesn’t do everything based upon register offsets, or your variables might alias.
  • 50. Porting Tips, continued. D3DSAMP_SRGBTEXTURE: this sampler state setting does not exist on D3D10. Instead it’s included in the texture format. This is more like the Xbox 360. Consider re-optimizing resource usage and upload for better D3D10 performance, but use D3D10_USAGE_DEFAULT resources and UpdateSubresource as a baseline.
  • 51. Summary. Use the debug runtime! More draw calls usually mean more constant-updating and state-changing calls. Be frugal with constant updates. Avoid resubmitting redundant data! Create as much state and input layout information up front as possible. Select D3D10_USAGE for resources based upon the CPU access patterns needed. Use D3D10_MAP_WRITE_NO_OVERWRITE and a big buffer as a replacement for DIPUP and DUP.
  • 52. Call to Action. Actually exploit D3D10! This talk tells you how to get performance gains from a straight port. You can get a whole lot more by using D3D10’s advanced features: StreamOut to minimize skinning costs; first-class instancing support; storing some vertex data in textures; moving some systems to the GPU (particles?); aggressive use of Constant Buffers.
  • 53. http://www.xna.com © 2007 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.