SlideShare a Scribd company logo
1 of 42
SIGGRAPH 2013
Shaping the Future
of Visual Computing
The Future of Visual Computing: OpenGL 4.4 on ARM
Cass Everitt, Principal Engineer
Major Influences
Rise of the SoC
Established path of desktop
Re-emergence of OpenGL
My 1st SIGGRAPH was 1993 in Anaheim
Took the OpenGL course!
Rise of the SoC
Dominant CPU architecture is ARM
Mobility devices define the market today
Embedded growing in relevance
Set-top-box, automotive, unusual form factors
Support for advanced 3D rendering ubiquitous
Platforms tend to be Linux-like
The price is right
Performance AND power draw matter
Dynamic range of power draw high
Established Path of Desktop
Desktop has continued forward march
Defining GL4-class (DX11-class)
Developer focus has lagged
Consoles remained GL2-class (DX9-class)
ES2 emerged as GL2-class
New console generation announced
Developer focus shifting toward GL4
Mobile still lags, but not for long
Re-Emergence of OpenGL
5 years ago GL was (still) ―almost dead‖
Now arguably the most important API for future development
Growth market
But GL is quite fragmented
Many incompatible versions, portability suffers
Regal can help (more on this later)
New in OpenGL 4.4
Immutable buffer storage
Persistent mapping
Texture clear
Without binding into FBO
Enhanced layouts in GLSL
Simpler CPU / GPU sharing
Multi-bind
All bindings in one fell swoop
Query result -> buffer
skip CPU
Texture mirror clamp
Stencil-only textures
Render-buffers obsolete
NEW!
New ARB Extensions
Dynamic compute group size
Argument to launch now
Indirect parameters
Further CPU skipping
Draw parameters in shader
Seamless cubemap
Per-texture toggling
Vote
Managing SIMD divergence
10/11/11 packed float
vertex format
Sparse texture
Partially resident textures
Bindless texture
Almost unlimited now
NEW!
SoC Porting Issues
Power and heat are a constant concerns in
mobile
Phones particularly sensitive
Both battery life and device temp are both
central to the user experience
ARM is less forgiving about unaligned reads
Result: unexpected segfaults
Screen density and resolutions quickly
outpaced desktop
Tricky for devices with a small fraction of
desktop power draw
SoC PC
Ubuntu 12.04 works great
Tegra 4 best so far for desktop user experience
Slower devices can have noticable lag
Distcc can really help building large source trees
Cross-compile with arm-linux-gnueabihf-g++-4.6
Linux on ARM has all the tools that Linux on x86 does
So nice to apt-get 3rd party libraries already compiled
Cross compilers also just an apt-get away on x86 systems
Porting Down
Develop high then port down Break into more pieces -
Small Steps
Ports of big apps have lots of moving parts
Take small steps when possible
Working system after each step
Skip too many steps, isolating bugs gets
hard
Sometimes steps can be done in parallel
After the port, still useful to have an
intermediate system
e.g. ARM – Linux – GL convenient for regression
systems that support Linux but not Android
x86 – Windows – D3D
x86 – Windows - GL
x86 – Linux - GL
ARM – Linux - GL
ARM – Android - GL
GL versions
Mobile has historically been OpenGL ES only
Full OpenGL 4.4 support is coming to mobile though
Key differences between OpenGL ES 2.0 and OpenGL 4.4
GL ES 2.0 is roughly at parity with GL 2.0 and Direct3D 9
GL 4.4 is at parity with Direct3D 11
Geometry, Tessellation, and Compute shaders in GL 4.4
Texture Arrays
Transform Feedback
Floating point textures and blending
Non-Power-of-Two textures
Multiple Draw Buffers
Techniques Enabled by GL 4.4 vs ES 2.0
Global illumination
Curved surfaces
Shadow maps
Deferred shading
Forward Plus
HDR rendering
Virtual texturing
Path rendering
Compute integration
PTEX – What is it?
The soul of Ptex:
Model with Quads instead of Triangles
You‘re doing this for your next-gen engine anyways, right?
Every Quad gets its own entire texture UV-space
UV orientation is implicit in surface
definition
No explicit UV parameterization
Resolution of each face is
independent of neighbors.
PTEX (2)
Invented by Brent Burley at Walt Disney Animation Studios
Used in every animated film at Disney since 2007
6 features and all shorts, plus everything in
production now and for the foreseeable
future
Used on ~100% of surfaces
Rapid adoption in DCC tools
Widespread usage throughout
the film industry
PTEX - Benefits
No UV unwraps
Allow artists to work at any resolution they want
Perform an offline pass on assets to decide what to ship for each platform
based on capabilities
Ship a texture pack later for tail revenue
Reduce your load times. And your memory
footprint. Improve your visual fidelity.
Reduce the cost of production‘s
long pole—art.
Advanced Blending
Various standards support “blend modes”
Compositing standards
2D standards
Path rendering standards
Example standards
PostScript, PDF, SVG, OpenVG, XRender, Cairo, Skia, Mac’s
Quartz 2D, Flash, Java 2D, Photoshop, Illustrator
Blend modes have a theory distinct from 3D’s
glBlendFunc, etc. functionality
glBlendFunc, etc. expose hardware operations
Blend modes based on sound compositing theory
Blend Mode
Examples
• Normal
• Multiply
• Screen
• Overlay
• Soft Light
• Hard Light
• Color Dodge
• Color Burn
• Darken
• Lighten
• Difference
• Exclusion
• Hue
• Saturation
• Color
• Luminosity
Conventional OpenGL
blend modes support
a small subset of
those used in other
environments.
Target for Advanced Blending
Market justification: 2D, compositing, and path rendering
standards key to smart phones, tablets, and similar devices
Motivation is primarily for low-end, power-constrained devices
Also part of content creation
Autodesk Mudbox, Adobe Illustrator, etc. all use blend modes as
their vocabulary for compositing
Power-efficient hardware support for “blend modes”
exposed in NV_blend_equation_advanced
Path Rendering
A rendering approach
Resolution-independent two-
dimensional graphics
Occlusion & transparency depend on
rendering order
So called ―Painter‘s Algorithm‖
Basic primitive is a path to be filled
or stroked
Path is a sequence of path
commands
Commands are
– moveto, lineto, curveto,
arcto, closepath, etc.
Not just zoomed & rotated,
also perspective
No tricks
Every glyph is
rendered from its
outline; no render-to-texture
Magnify & minify with
no transitional
pixelization
or tile popping
artifacts
synced
to refresh
rate; 60 Hz
updates
Live demo!
Web page Control points of
TrueType glyphs
visualized
Zoomed in
Projected
NV_path_rendering
Compared to Alternatives
Alternative APIs rendering same content
-
200.00
400.00
600.00
800.00
1,000.00
1,200.00
1,400.00
1,600.00
1,800.00
2,000.00
100x100
200x200
300x300
400x400
500x500
600x600
700x700
800x800
900x900
1000x1000
1100x1100
Window Resolution in Pixels
Framespersecond
Cairo
Qt
Skia Bitmap
Skia Ganesh FBO (16x)
Skia Ganesh Aliased (1x)
Direct2D GPU
Direct2D WARP
With Release 300 driver NV_path_rendering
-
200.00
400.00
600.00
800.00
1,000.00
1,200.00
1,400.00
1,600.00
1,800.00
2,000.00
100x100
200x200
300x300
400x400
500x500
600x600
700x700
800x800
900x900
1000x1000
1100x1100
Window Resolution in Pixels
Framespersecond
16x
8x
4x
2x
1x
Configuration
GPU: GeForce 480 GTX (GF100)
CPU: Core i7 950 @ 3.07 GHz
Alternative approaches
are all much slower
Path Rendering Standards
Document
Printing and
Exchange
Immersive
Web
Experience
2D Graphics
Programming
Interfaces
Office
Productivity
Applications
Resolution-
Independent
Fonts
OpenType
TrueType
Flash
Open XML
Paper (XPS)
Java 2D
API
Mac OS X
2D API
Khronos API
Adobe Illustrator
Inkscape
Open Source
Scalable
Vector
Graphics
QtGui
API
HTML 5
Ocean Simulation
Simulation based on Tessendorf‘s
algorithm (Phillips spectrum)
Used from Titanic (‗97) to Life of Pi
Models a fully developed sea state
Computed in the CPU using FFT/FFT
-1
into a heightmap
FFT on heights
FFT on chop
128x128 repeated
Ocean - Rendering
Dynamic water surface
Per pixel reflection
Per pixel refraction
Fresnel
Simple water light scattering
Shader math-gic
Simple caustics simulation
5x5 upward rays refracted and dotted
against sun. Per Vertex after tess.
Ocean - Tessellation
Tessellation
Used in terrain with displacement maps
On water surface using FFT
LOD control: dynamic tessellation factors base on k*1/distance to camera
Ocean – All together
Demo
API Futures Commentary
API needs to develop better support for
High efficiency
Multiple threads
Clarity
Modernize API with an eye toward preserving compatibilty
Prematurely deprecated
Fixed function – great for efficiency, simplicity
Display lists – efficiency and multi-thread
Immediate mode – actually better with display lists…
Display Lists
Display lists with client side vertex arrays were
pointless
Had to source all the vertex data at compile time
Advent of VBO should have changed this to keep vertex data
indirect
Without display lists, API cannot express coherence well
Particularly frame-to-frame coherence
Need a way to express a static rendering sequence
Display lists also natural for multi-threaded support
Compile lists on worker threads
Execute lists main render thread
Direct3D 11 attempted something similar, but it did not scale
well
Multi-Draw Indirect
Another overhead reduction strategy
Valuable, valid growth direction
But don‘t allow state changes between the Draw calls like
display lists do
Makes things less obvious
Not simply back references in the log
But the draw calls can be constructed by the GPU shaders
Stable Abstractions
Why do stable software abstractions exist?
Division of labor, allow competing implementations
Better for the ecosystem
Graphics APIs have trended toward lower-level
abstraction
Good to expose lower level abstractions
Can get better efficiency for some uses
However they are less portable
Higher level not bad
Allowed Reyes and Ray Tracing as back end for RenderMan
Both high and low level important going forward
Random Thoughts
Steal some good ideas from the web
State dump string in canonical form
Skip default state, alphabetize rest… (forward compatible)
Careful with ―variable‖ names
– Object names, slots
State setting via structured string – e.g. JSON
Differential rendering
API that minimizes chatter without sacrificing clarity and efficiency
Ecosystem needs better compiler tools
That run without a GL context, especially in mobile
Ecosystem
Community of things that depend on each other
Success of individual components only part of the story
Prosperity depends on mix of components
OpenGL ecosystem more bazaar than cathedral
Often difficult to monetize some vital components
Some success stories
NSight
Apitrace
Regal
NVIDIA® Nsight™ Visual Studio Edition
Visual Studio integrated development for GPU and CPU
ProfileDebugBuild
Frame profiler - OpenGL 4.2, Direct3D 9/11
• Automatic GPU bottleneck determination
• Draw call and Frame timings
• Direct3D Perf Markers and render state grouping/sorting
Application and system trace
• Inspect OpenGL and Direct3D activities / CPU and GPU
• Correlate threads, call stack, API calls, WDDM kernel
queues and resulting GPU workloads
• Concurrent draw call execution and memory transfer trace
Frame debugger - OpenGL 4.2, Direct3D 9/11
• Draw call and state inspection
• Frame capture and playback (source code gen D3D9/11)
• Nsight HUD for draw call scrubbing and inspection
HLSL and GLSL Shader debugger
• Native GPU shader debugging and GPU memory views
• Complex condition breakpoints and Pixel History
• Local Single GPU shader debugging
NVIDIA Nsight for Graphics Developers
OpenGL 4.2, Direct3D 9/11
apitrace
Community maintained project on GitHub
http://github.com/apitrace/apitrace
Means of sharing repro command sequence
Good for functional bugs (repro app is often a real pain)
Analyzing best-case perf
Enables easy editing and variation experiments
Playback via glretrace not currently optimized for speed
Trace -> code probably most reliable for perf study
Though glretrace can be made much faster
Regal
Community maintained project on GitHub
http://github.com/p3/regal
Defragmented GL – write one portable back end
Support compatibility in software
If driver does not support
Immediate mode & fixed function work again!
Large class of graphics apps for which this model is preferable for most
rendering
Broad support for emulated features – even on ES
DSA, VAO
Planned: SSO, path rendering, enhanced display lists
Regal (2)
Ecosystem anchor point
Integrated http server for debug
Inspect or alter context state or objects, pause rendering
API log dumps
Apitrace integration
Shader, texture dump and replacement
Open source – BSD license
On github (http://github.com/p3/regal)
Numerous contributors from all over
Platform support
Windows, OS X, Linux, Android, iOS, NaCl
Questions?
cass@nvidia.com
Thanks to
Seth Williams, Bastiaan Aarts, John McDonald, Mark Kilgard, Miguel Sainz,
Simon Green, Jeff Kiel, Jan Paul van Waveren
FIN

More Related Content

What's hot

EXT_window_rectangles
EXT_window_rectanglesEXT_window_rectangles
EXT_window_rectanglesMark Kilgard
 
Advanced Graphics Workshop - GFX2011
Advanced Graphics Workshop - GFX2011Advanced Graphics Workshop - GFX2011
Advanced Graphics Workshop - GFX2011Prabindh Sundareson
 
FlameWorks GTC 2014
FlameWorks GTC 2014FlameWorks GTC 2014
FlameWorks GTC 2014Simon Green
 
NVIDIA's OpenGL Functionality
NVIDIA's OpenGL FunctionalityNVIDIA's OpenGL Functionality
NVIDIA's OpenGL FunctionalityMark Kilgard
 
Avoiding 19 Common OpenGL Pitfalls
Avoiding 19 Common OpenGL PitfallsAvoiding 19 Common OpenGL Pitfalls
Avoiding 19 Common OpenGL PitfallsMark Kilgard
 
Siggraph 2016 - Vulkan and nvidia : the essentials
Siggraph 2016 - Vulkan and nvidia : the essentialsSiggraph 2016 - Vulkan and nvidia : the essentials
Siggraph 2016 - Vulkan and nvidia : the essentialsTristan Lorach
 
GFX Part 4 - Introduction to Texturing in OpenGL ES
GFX Part 4 - Introduction to Texturing in OpenGL ESGFX Part 4 - Introduction to Texturing in OpenGL ES
GFX Part 4 - Introduction to Texturing in OpenGL ESPrabindh Sundareson
 
GTC 2012: GPU-Accelerated Path Rendering
GTC 2012: GPU-Accelerated Path RenderingGTC 2012: GPU-Accelerated Path Rendering
GTC 2012: GPU-Accelerated Path Rendering Mark Kilgard
 
Future Directions for Compute-for-Graphics
Future Directions for Compute-for-GraphicsFuture Directions for Compute-for-Graphics
Future Directions for Compute-for-GraphicsElectronic Arts / DICE
 
Oit And Indirect Illumination Using Dx11 Linked Lists
Oit And Indirect Illumination Using Dx11 Linked ListsOit And Indirect Illumination Using Dx11 Linked Lists
Oit And Indirect Illumination Using Dx11 Linked ListsHolger Gruen
 
vkFX: Effect(ive) approach for Vulkan API
vkFX: Effect(ive) approach for Vulkan APIvkFX: Effect(ive) approach for Vulkan API
vkFX: Effect(ive) approach for Vulkan APITristan Lorach
 
SIGGRAPH 2012: NVIDIA OpenGL for 2012
SIGGRAPH 2012: NVIDIA OpenGL for 2012SIGGRAPH 2012: NVIDIA OpenGL for 2012
SIGGRAPH 2012: NVIDIA OpenGL for 2012Mark Kilgard
 
Slides: Accelerating Vector Graphics Rendering using the Graphics Hardware Pi...
Slides: Accelerating Vector Graphics Rendering using the Graphics Hardware Pi...Slides: Accelerating Vector Graphics Rendering using the Graphics Hardware Pi...
Slides: Accelerating Vector Graphics Rendering using the Graphics Hardware Pi...Mark Kilgard
 
Checkerboard Rendering in Dark Souls: Remastered by QLOC
Checkerboard Rendering in Dark Souls: Remastered by QLOCCheckerboard Rendering in Dark Souls: Remastered by QLOC
Checkerboard Rendering in Dark Souls: Remastered by QLOCQLOC
 
4K Checkerboard in Battlefield 1 and Mass Effect Andromeda
4K Checkerboard in Battlefield 1 and Mass Effect Andromeda4K Checkerboard in Battlefield 1 and Mass Effect Andromeda
4K Checkerboard in Battlefield 1 and Mass Effect AndromedaElectronic Arts / DICE
 
GS-4147, TressFX 2.0, by Bill-Bilodeau
GS-4147, TressFX 2.0, by Bill-BilodeauGS-4147, TressFX 2.0, by Bill-Bilodeau
GS-4147, TressFX 2.0, by Bill-BilodeauAMD Developer Central
 
GFX Part 1 - Introduction to GPU HW and OpenGL ES specifications
GFX Part 1 - Introduction to GPU HW and OpenGL ES specificationsGFX Part 1 - Introduction to GPU HW and OpenGL ES specifications
GFX Part 1 - Introduction to GPU HW and OpenGL ES specificationsPrabindh Sundareson
 
GDC 2012: Advanced Procedural Rendering in DX11
GDC 2012: Advanced Procedural Rendering in DX11GDC 2012: Advanced Procedural Rendering in DX11
GDC 2012: Advanced Procedural Rendering in DX11smashflt
 
GPU accelerated path rendering fastforward
GPU accelerated path rendering fastforwardGPU accelerated path rendering fastforward
GPU accelerated path rendering fastforwardMark Kilgard
 

What's hot (20)

EXT_window_rectangles
EXT_window_rectanglesEXT_window_rectangles
EXT_window_rectangles
 
Advanced Graphics Workshop - GFX2011
Advanced Graphics Workshop - GFX2011Advanced Graphics Workshop - GFX2011
Advanced Graphics Workshop - GFX2011
 
FlameWorks GTC 2014
FlameWorks GTC 2014FlameWorks GTC 2014
FlameWorks GTC 2014
 
NVIDIA's OpenGL Functionality
NVIDIA's OpenGL FunctionalityNVIDIA's OpenGL Functionality
NVIDIA's OpenGL Functionality
 
Avoiding 19 Common OpenGL Pitfalls
Avoiding 19 Common OpenGL PitfallsAvoiding 19 Common OpenGL Pitfalls
Avoiding 19 Common OpenGL Pitfalls
 
Siggraph 2016 - Vulkan and nvidia : the essentials
Siggraph 2016 - Vulkan and nvidia : the essentialsSiggraph 2016 - Vulkan and nvidia : the essentials
Siggraph 2016 - Vulkan and nvidia : the essentials
 
OpenGL 4 for 2010
OpenGL 4 for 2010OpenGL 4 for 2010
OpenGL 4 for 2010
 
GFX Part 4 - Introduction to Texturing in OpenGL ES
GFX Part 4 - Introduction to Texturing in OpenGL ESGFX Part 4 - Introduction to Texturing in OpenGL ES
GFX Part 4 - Introduction to Texturing in OpenGL ES
 
GTC 2012: GPU-Accelerated Path Rendering
GTC 2012: GPU-Accelerated Path RenderingGTC 2012: GPU-Accelerated Path Rendering
GTC 2012: GPU-Accelerated Path Rendering
 
Future Directions for Compute-for-Graphics
Future Directions for Compute-for-GraphicsFuture Directions for Compute-for-Graphics
Future Directions for Compute-for-Graphics
 
Oit And Indirect Illumination Using Dx11 Linked Lists
Oit And Indirect Illumination Using Dx11 Linked ListsOit And Indirect Illumination Using Dx11 Linked Lists
Oit And Indirect Illumination Using Dx11 Linked Lists
 
vkFX: Effect(ive) approach for Vulkan API
vkFX: Effect(ive) approach for Vulkan APIvkFX: Effect(ive) approach for Vulkan API
vkFX: Effect(ive) approach for Vulkan API
 
SIGGRAPH 2012: NVIDIA OpenGL for 2012
SIGGRAPH 2012: NVIDIA OpenGL for 2012SIGGRAPH 2012: NVIDIA OpenGL for 2012
SIGGRAPH 2012: NVIDIA OpenGL for 2012
 
Slides: Accelerating Vector Graphics Rendering using the Graphics Hardware Pi...
Slides: Accelerating Vector Graphics Rendering using the Graphics Hardware Pi...Slides: Accelerating Vector Graphics Rendering using the Graphics Hardware Pi...
Slides: Accelerating Vector Graphics Rendering using the Graphics Hardware Pi...
 
Checkerboard Rendering in Dark Souls: Remastered by QLOC
Checkerboard Rendering in Dark Souls: Remastered by QLOCCheckerboard Rendering in Dark Souls: Remastered by QLOC
Checkerboard Rendering in Dark Souls: Remastered by QLOC
 
4K Checkerboard in Battlefield 1 and Mass Effect Andromeda
4K Checkerboard in Battlefield 1 and Mass Effect Andromeda4K Checkerboard in Battlefield 1 and Mass Effect Andromeda
4K Checkerboard in Battlefield 1 and Mass Effect Andromeda
 
GS-4147, TressFX 2.0, by Bill-Bilodeau
GS-4147, TressFX 2.0, by Bill-BilodeauGS-4147, TressFX 2.0, by Bill-Bilodeau
GS-4147, TressFX 2.0, by Bill-Bilodeau
 
GFX Part 1 - Introduction to GPU HW and OpenGL ES specifications
GFX Part 1 - Introduction to GPU HW and OpenGL ES specificationsGFX Part 1 - Introduction to GPU HW and OpenGL ES specifications
GFX Part 1 - Introduction to GPU HW and OpenGL ES specifications
 
GDC 2012: Advanced Procedural Rendering in DX11
GDC 2012: Advanced Procedural Rendering in DX11GDC 2012: Advanced Procedural Rendering in DX11
GDC 2012: Advanced Procedural Rendering in DX11
 
GPU accelerated path rendering fastforward
GPU accelerated path rendering fastforwardGPU accelerated path rendering fastforward
GPU accelerated path rendering fastforward
 

Similar to SIGGRAPH 2013: The Future of Visual Computing on ARM

The Rendering Pipeline - Challenges & Next Steps
The Rendering Pipeline - Challenges & Next StepsThe Rendering Pipeline - Challenges & Next Steps
The Rendering Pipeline - Challenges & Next StepsJohan Andersson
 
Minko - Targeting Flash/Stage3D with C++ and GLSL
Minko - Targeting Flash/Stage3D with C++ and GLSLMinko - Targeting Flash/Stage3D with C++ and GLSL
Minko - Targeting Flash/Stage3D with C++ and GLSLMinko3D
 
Keynote (Johan Andersson) - Mantle for Developers - by Johan Andersson, Techn...
Keynote (Johan Andersson) - Mantle for Developers - by Johan Andersson, Techn...Keynote (Johan Andersson) - Mantle for Developers - by Johan Andersson, Techn...
Keynote (Johan Andersson) - Mantle for Developers - by Johan Andersson, Techn...AMD Developer Central
 
Keynote (Phil Rogers) - The Programmers Guide to Reaching for the Cloud - by ...
Keynote (Phil Rogers) - The Programmers Guide to Reaching for the Cloud - by ...Keynote (Phil Rogers) - The Programmers Guide to Reaching for the Cloud - by ...
Keynote (Phil Rogers) - The Programmers Guide to Reaching for the Cloud - by ...AMD Developer Central
 
“Open Standards: Powering the Future of Embedded Vision,” a Presentation from...
“Open Standards: Powering the Future of Embedded Vision,” a Presentation from...“Open Standards: Powering the Future of Embedded Vision,” a Presentation from...
“Open Standards: Powering the Future of Embedded Vision,” a Presentation from...Edge AI and Vision Alliance
 
Introduction to Software Defined Visualization (SDVis)
Introduction to Software Defined Visualization (SDVis)Introduction to Software Defined Visualization (SDVis)
Introduction to Software Defined Visualization (SDVis)Intel® Software
 
High DPI for desktop applications
High DPI for desktop applicationsHigh DPI for desktop applications
High DPI for desktop applicationsKirill Grouchnikov
 
Minko - Flash Conference #5
Minko - Flash Conference #5Minko - Flash Conference #5
Minko - Flash Conference #5Minko3D
 
Petapath HP Cast 12 - Programming for High Performance Accelerated Systems
Petapath HP Cast 12 - Programming for High Performance Accelerated SystemsPetapath HP Cast 12 - Programming for High Performance Accelerated Systems
Petapath HP Cast 12 - Programming for High Performance Accelerated Systemsdairsie
 
TopMod3d - Texas Open Source Symposium
TopMod3d - Texas Open Source SymposiumTopMod3d - Texas Open Source Symposium
TopMod3d - Texas Open Source SymposiumDavid Morris
 
EFL: Scaling From the Embedded World to the Desktop
EFL: Scaling From the Embedded World to the DesktopEFL: Scaling From the Embedded World to the Desktop
EFL: Scaling From the Embedded World to the DesktopSamsung Open Source Group
 
Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...
Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...
Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...Johan Andersson
 
[Unite Seoul 2019] Mali GPU Architecture and Mobile Studio
[Unite Seoul 2019] Mali GPU Architecture and Mobile Studio [Unite Seoul 2019] Mali GPU Architecture and Mobile Studio
[Unite Seoul 2019] Mali GPU Architecture and Mobile Studio Owen Wu
 
Is Multicore Hardware For General-Purpose Parallel Processing Broken? : Notes
Is Multicore Hardware For General-Purpose Parallel Processing Broken? : NotesIs Multicore Hardware For General-Purpose Parallel Processing Broken? : Notes
Is Multicore Hardware For General-Purpose Parallel Processing Broken? : NotesSubhajit Sahu
 
Stream SQL eventflow visual programming for real programmers presentation
Stream SQL eventflow visual programming for real programmers presentationStream SQL eventflow visual programming for real programmers presentation
Stream SQL eventflow visual programming for real programmers presentationstreambase
 
CS 354 Programmable Shading
CS 354 Programmable ShadingCS 354 Programmable Shading
CS 354 Programmable ShadingMark Kilgard
 
Learn more about the tremendous value Open Data Plane brings to NFV
Learn more about the tremendous value Open Data Plane brings to NFVLearn more about the tremendous value Open Data Plane brings to NFV
Learn more about the tremendous value Open Data Plane brings to NFVGhodhbane Mohamed Amine
 
Final apu13 phil-rogers-keynote-21
Final apu13 phil-rogers-keynote-21Final apu13 phil-rogers-keynote-21
Final apu13 phil-rogers-keynote-21r Skip
 
"The Vision API Maze: Options and Trade-offs," a Presentation from the Khrono...
"The Vision API Maze: Options and Trade-offs," a Presentation from the Khrono..."The Vision API Maze: Options and Trade-offs," a Presentation from the Khrono...
"The Vision API Maze: Options and Trade-offs," a Presentation from the Khrono...Edge AI and Vision Alliance
 

Similar to SIGGRAPH 2013: The Future of Visual Computing on ARM (20)

The Rendering Pipeline - Challenges & Next Steps
The Rendering Pipeline - Challenges & Next StepsThe Rendering Pipeline - Challenges & Next Steps
The Rendering Pipeline - Challenges & Next Steps
 
Minko - Targeting Flash/Stage3D with C++ and GLSL
Minko - Targeting Flash/Stage3D with C++ and GLSLMinko - Targeting Flash/Stage3D with C++ and GLSL
Minko - Targeting Flash/Stage3D with C++ and GLSL
 
Keynote (Johan Andersson) - Mantle for Developers - by Johan Andersson, Techn...
Keynote (Johan Andersson) - Mantle for Developers - by Johan Andersson, Techn...Keynote (Johan Andersson) - Mantle for Developers - by Johan Andersson, Techn...
Keynote (Johan Andersson) - Mantle for Developers - by Johan Andersson, Techn...
 
Keynote (Phil Rogers) - The Programmers Guide to Reaching for the Cloud - by ...
Keynote (Phil Rogers) - The Programmers Guide to Reaching for the Cloud - by ...Keynote (Phil Rogers) - The Programmers Guide to Reaching for the Cloud - by ...
Keynote (Phil Rogers) - The Programmers Guide to Reaching for the Cloud - by ...
 
“Open Standards: Powering the Future of Embedded Vision,” a Presentation from...
“Open Standards: Powering the Future of Embedded Vision,” a Presentation from...“Open Standards: Powering the Future of Embedded Vision,” a Presentation from...
“Open Standards: Powering the Future of Embedded Vision,” a Presentation from...
 
Mantle for Developers
Mantle for DevelopersMantle for Developers
Mantle for Developers
 
Introduction to Software Defined Visualization (SDVis)
Introduction to Software Defined Visualization (SDVis)Introduction to Software Defined Visualization (SDVis)
Introduction to Software Defined Visualization (SDVis)
 
High DPI for desktop applications
High DPI for desktop applicationsHigh DPI for desktop applications
High DPI for desktop applications
 
Minko - Flash Conference #5
Minko - Flash Conference #5Minko - Flash Conference #5
Minko - Flash Conference #5
 
Petapath HP Cast 12 - Programming for High Performance Accelerated Systems
Petapath HP Cast 12 - Programming for High Performance Accelerated SystemsPetapath HP Cast 12 - Programming for High Performance Accelerated Systems
Petapath HP Cast 12 - Programming for High Performance Accelerated Systems
 
TopMod3d - Texas Open Source Symposium
TopMod3d - Texas Open Source SymposiumTopMod3d - Texas Open Source Symposium
TopMod3d - Texas Open Source Symposium
 
EFL: Scaling From the Embedded World to the Desktop
EFL: Scaling From the Embedded World to the DesktopEFL: Scaling From the Embedded World to the Desktop
EFL: Scaling From the Embedded World to the Desktop
 
Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...
Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...
Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing...
 
[Unite Seoul 2019] Mali GPU Architecture and Mobile Studio
[Unite Seoul 2019] Mali GPU Architecture and Mobile Studio [Unite Seoul 2019] Mali GPU Architecture and Mobile Studio
[Unite Seoul 2019] Mali GPU Architecture and Mobile Studio
 
Is Multicore Hardware For General-Purpose Parallel Processing Broken? : Notes
Is Multicore Hardware For General-Purpose Parallel Processing Broken? : NotesIs Multicore Hardware For General-Purpose Parallel Processing Broken? : Notes
Is Multicore Hardware For General-Purpose Parallel Processing Broken? : Notes
 
Stream SQL eventflow visual programming for real programmers presentation
Stream SQL eventflow visual programming for real programmers presentationStream SQL eventflow visual programming for real programmers presentation
Stream SQL eventflow visual programming for real programmers presentation
 
CS 354 Programmable Shading
CS 354 Programmable ShadingCS 354 Programmable Shading
CS 354 Programmable Shading
 
Learn more about the tremendous value Open Data Plane brings to NFV
Learn more about the tremendous value Open Data Plane brings to NFVLearn more about the tremendous value Open Data Plane brings to NFV
Learn more about the tremendous value Open Data Plane brings to NFV
 
Final apu13 phil-rogers-keynote-21
Final apu13 phil-rogers-keynote-21Final apu13 phil-rogers-keynote-21
Final apu13 phil-rogers-keynote-21
 
"The Vision API Maze: Options and Trade-offs," a Presentation from the Khrono...
"The Vision API Maze: Options and Trade-offs," a Presentation from the Khrono..."The Vision API Maze: Options and Trade-offs," a Presentation from the Khrono...
"The Vision API Maze: Options and Trade-offs," a Presentation from the Khrono...
 

Recently uploaded

08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 

Recently uploaded (20)

08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 

SIGGRAPH 2013: The Future of Visual Computing on ARM

  • 1. SIGGRAPH 2013 Shaping the Future of Visual Computing The Future of Visual Computing: OpenGL 4.4 on ARM Cass Everitt, Principal Engineer
  • 2. Major Influences Rise of the SoC Established path of desktop Re-emergence of OpenGL My 1st SIGGRAPH was 1993 in Anaheim Took the OpenGL course!
  • 3. Rise of the SoC Dominant CPU architecture is ARM Mobility devices define the market today Embedded growing in relevance Set-top-box, automotive, unusual form factors Support for advanced 3D rendering ubiquitous Platforms tend to be Linux-like The price is right Performance AND power draw matter Dynamic range of power draw high
  • 4. Established Path of Desktop Desktop has continued forward march Defining GL4-class (DX11-class) Developer focus has lagged Consoles remained GL2-class (DX9-class) ES2 emerged as GL2-class New console generation announced Developer focus shifting toward GL4 Mobile still lags, but not for long
  • 5. Re-Emergence of OpenGL 5 years ago GL was (still) ―almost dead‖ Now arguably the most important API for future development Growth market But GL is quite fragmented Many incompatible versions, portability suffers Regal can help (more on this later)
  • 6. New in OpenGL 4.4 Immutable buffer storage Persistent mapping Texture clear Without binding into FBO Enhanced layouts in GLSL Simpler CPU / GPU sharing Multi-bind All bindings in one fell swoop Query result -> buffer skip CPU Texture mirror clamp Stencil-only textures Render-buffers obsolete NEW!
  • 7. New ARB Extensions Dynamic compute group size Argument to launch now Indirect parameters Further CPU skipping Draw parameters in shader Seamless cubemap Per-texture toggling Vote Managing SIMD divergence 10/11/11 packed float vertex format Sparse texture Partially resident textures Bindless texture Almost unlimited now NEW!
  • 8. SoC Porting Issues Power and heat are a constant concerns in mobile Phones particularly sensitive Both battery life and device temp are both central to the user experience ARM is less forgiving about unaligned reads Result: unexpected segfaults Screen density and resolutions quickly outpaced desktop Tricky for devices with a small fraction of desktop power draw
  • 9. SoC PC Ubuntu 12.04 works great Tegra 4 best so far for desktop user experience Slower devices can have noticable lag Distcc can really help building large source trees Cross-compile with arm-linux-gnueabihf-g++-4.6 Linux on ARM has all the tools that Linux on x86 does So nice to apt-get 3rd party libraries already compiled Cross compilers also just an apt-get away on x86 systems
  • 10. Porting Down Develop high then port down Break into more pieces -
  • 11. Small Steps Ports of big apps have lots of moving parts Take small steps when possible Working system after each step Skip too many steps, isolating bugs gets hard Sometimes steps can be done in parallel After the port, still useful to have an intermediate system e.g. ARM – Linux – GL convenient for regression systems that support Linux but not Android x86 – Windows – D3D x86 – Windows - GL x86 – Linux - GL ARM – Linux - GL ARM – Android - GL
  • 12. GL versions Mobile has historically been OpenGL ES only Full OpenGL 4.4 support is coming to mobile though Key differences between OpenGL ES 2.0 and OpenGL 4.4 GL ES 2.0 is roughly at parity with GL 2.0 and Direct3D 9 GL 4.4 is at parity with Direct3D 11 Geometry, Tessellation, and Compute shaders in GL 4.4 Texture Arrays Transform Feedback Floating point textures and blending Non-Power-of-Two textures Multiple Draw Buffers
  • 13. Techniques Enabled by GL 4.4 vs ES 2.0 Global illumination Curved surfaces Shadow maps Deferred shading Forward Plus HDR rendering Virtual texturing Path rendering Compute integration
  • 14. PTEX – What is it? The soul of Ptex: Model with Quads instead of Triangles You‘re doing this for your next-gen engine anyways, right? Every Quad gets its own entire texture UV-space UV orientation is implicit in surface definition No explicit UV parameterization Resolution of each face is independent of neighbors.
  • 15. PTEX (2) Invented by Brent Burley at Walt Disney Animation Studios Used in every animated film at Disney since 2007 6 features and all shorts, plus everything in production now and for the foreseeable future Used on ~100% of surfaces Rapid adoption in DCC tools Widespread usage throughout the film industry
  • 16. PTEX - Benefits No UV unwraps Allow artists to work at any resolution they want Perform an offline pass on assets to decide what to ship for each platform based on capabilities Ship a texture pack later for tail revenue Reduce your load times. And your memory footprint. Improve your visual fidelity. Reduce the cost of production‘s long pole—art.
  • 17. Advanced Blending Various standards support “blend modes” Compositing standards 2D standards Path rendering standards Example standards PostScript, PDF, SVG, OpenVG, XRender, Cairo, Skia, Mac’s Quartz 2D, Flash, Java 2D, Photoshop, Illustrator Blend modes have a theory distinct from 3D’s glBlendFunc, etc. functionality glBlendFunc, etc. expose hardware operations Blend modes based on sound compositing theory
  • 18. Blend Mode Examples • Normal • Multiply • Screen • Overlay • Soft Light • Hard Light • Color Dodge • Color Burn • Darken • Lighten • Difference • Exclusion • Hue • Saturation • Color • Luminosity Conventional OpenGL blend modes support a small subset of those used in other environments.
  • 19. Target for Advanced Blending Market justification: 2D, compositing, and path rendering standards key to smart phones, tablets, and similar devices Motivation is primarily for low-end, power-constrained devices Also part of content creation Autodesk Mudbox, Adobe Illustrator, etc. all use blend modes as their vocabulary for compositing Power-efficient hardware support for “blend modes” exposed in NV_blend_equation_advanced
  • 20. Path Rendering A rendering approach Resolution-independent two- dimensional graphics Occlusion & transparency depend on rendering order So called ―Painter‘s Algorithm‖ Basic primitive is a path to be filled or stroked Path is a sequence of path commands Commands are – moveto, lineto, curveto, arcto, closepath, etc.
  • 21. Not just zoomed & rotated, also perspective No tricks Every glyph is rendered from its outline; no render-to-texture Magnify & minify with no transitional pixelization or tile popping artifacts synced to refresh rate; 60 Hz updates
  • 22. Live demo! Web page Control points of TrueType glyphs visualized Zoomed in Projected
  • 23. NV_path_rendering Compared to Alternatives Alternative APIs rendering same content - 200.00 400.00 600.00 800.00 1,000.00 1,200.00 1,400.00 1,600.00 1,800.00 2,000.00 100x100 200x200 300x300 400x400 500x500 600x600 700x700 800x800 900x900 1000x1000 1100x1100 Window Resolution in Pixels Framespersecond Cairo Qt Skia Bitmap Skia Ganesh FBO (16x) Skia Ganesh Aliased (1x) Direct2D GPU Direct2D WARP With Release 300 driver NV_path_rendering - 200.00 400.00 600.00 800.00 1,000.00 1,200.00 1,400.00 1,600.00 1,800.00 2,000.00 100x100 200x200 300x300 400x400 500x500 600x600 700x700 800x800 900x900 1000x1000 1100x1100 Window Resolution in Pixels Framespersecond 16x 8x 4x 2x 1x Configuration GPU: GeForce 480 GTX (GF100) CPU: Core i7 950 @ 3.07 GHz Alternative approaches are all much slower
  • 24. Path Rendering Standards Document Printing and Exchange Immersive Web Experience 2D Graphics Programming Interfaces Office Productivity Applications Resolution- Independent Fonts OpenType TrueType Flash Open XML Paper (XPS) Java 2D API Mac OS X 2D API Khronos API Adobe Illustrator Inkscape Open Source Scalable Vector Graphics QtGui API HTML 5
  • 25. Ocean Simulation Simulation based on Tessendorf‘s algorithm (Phillips spectrum) Used from Titanic (‗97) to Life of Pi Models a fully developed sea state Computed in the CPU using FFT/FFT -1 into a heightmap FFT on heights FFT on chop 128x128 repeated
  • 26. Ocean - Rendering Dynamic water surface Per pixel reflection Per pixel refraction Fresnel Simple water light scattering Shader math-gic Simple caustics simulation 5x5 upward rays refracted and dotted against sun. Per Vertex after tess.
  • 27. Ocean - Tessellation Tessellation Used in terrain with displacement maps On water surface using FFT LOD control: dynamic tessellation factors base on k*1/distance to camera
  • 28. Ocean – All together
  • 29. Demo
  • 30. API Futures Commentary API needs to develop better support for High efficiency Multiple threads Clarity Modernize API with an eye toward preserving compatibilty Prematurely deprecated Fixed function – great for efficiency, simplicity Display lists – efficiency and multi-thread Immediate mode – actually better with display lists…
  • 31. Display Lists Display lists with client side vertex arrays were pointless Had to source all the vertex data at compile time Advent of VBO should have changed this to keep vertex data indirect Without display lists, API cannot express coherence well Particularly frame-to-frame coherence Need a way to express a static rendering sequence Display lists also natural for multi-threaded support Compile lists on worker threads Execute lists main render thread Direct3D 11 attempted something similar, but it did not scale well
  • 32. Multi-Draw Indirect Another overhead reduction strategy Valuable, valid growth direction But don‘t allow state changes between the Draw calls like display lists do Makes things less obvious Not simply back references in the log But the draw calls can be constructed by the GPU shaders
  • 33. Stable Abstractions Why do stable software abstractions exist? Division of labor, allow competing implementations Better for the ecosystem Graphics APIs have trended toward lower-level abstraction Good to expose lower level abstractions Can get better efficiency for some uses However they are less portable Higher level not bad Allowed Reyes and Ray Tracing as back end for RenderMan Both high and low level important going forward
  • 34. Random Thoughts Steal some good ideas from the web State dump string in canonical form Skip default state, alphabetize rest… (forward compatible) Careful with ―variable‖ names – Object names, slots State setting via structured string – e.g. JSON Differential rendering API that minimizes chatter without sacrificing clarity and efficiency Ecosystem needs better compiler tools That run without a GL context, especially in mobile
  • 35. Ecosystem Community of things that depend on each other Success of individual components only part of the story Prosperity depends on mix of components OpenGL ecosystem more bazaar than cathedral Often difficult to monetize some vital components Some success stories NSight Apitrace Regal
  • 36. NVIDIA® Nsight™ Visual Studio Edition Visual Studio integrated development for GPU and CPU ProfileDebugBuild
  • 37. Frame profiler - OpenGL 4.2, Direct3D 9/11 • Automatic GPU bottleneck determination • Draw call and Frame timings • Direct3D Perf Markers and render state grouping/sorting Application and system trace • Inspect OpenGL and Direct3D activities / CPU and GPU • Correlate threads, call stack, API calls, WDDM kernel queues and resulting GPU workloads • Concurrent draw call execution and memory transfer trace Frame debugger - OpenGL 4.2, Direct3D 9/11 • Draw call and state inspection • Frame capture and playback (source code gen D3D9/11) • Nsight HUD for draw call scrubbing and inspection HLSL and GLSL Shader debugger • Native GPU shader debugging and GPU memory views • Complex condition breakpoints and Pixel History • Local Single GPU shader debugging NVIDIA Nsight for Graphics Developers OpenGL 4.2, Direct3D 9/11
  • 38. apitrace Community maintained project on GitHub http://github.com/apitrace/apitrace Means of sharing repro command sequence Good for functional bugs (repro app is often a real pain) Analyzing best-case perf Enables easy editing and variation experiments Playback via glretrace not currently optimized for speed Trace -> code probably most reliable for perf study Though glretrace can be made much faster
  • 39. Regal Community maintained project on GitHub http://github.com/p3/regal Defragmented GL – write one portable back end Support compatibility in software If driver does not support Immediate mode & fixed function work again! Large class of graphics apps for which this model is preferable for most rendering Broad support for emulated features – even on ES DSA, VAO Planned: SSO, path rendering, enhanced display lists
  • 40. Regal (2) Ecosystem anchor point Integrated http server for debug Inspect or alter context state or objects, pause rendering API log dumps Apitrace integration Shader, texture dump and replacement Open source – BSD license On github (http://github.com/p3/regal) Numerous contributors from all over Platform support Windows, OS X, Linux, Android, iOS, NaCl
  • 41. Questions? cass@nvidia.com Thanks to Seth Williams, Bastiaan Aarts, John McDonald, Mark Kilgard, Miguel Sainz, Simon Green, Jeff Kiel, Jan Paul van Waveren
  • 42. FIN