| © 2013 Aptina Imaging Corporation | Aptina Confidential1
© 2013 Aptina Imaging Corporation. All rights reserved. Products are warranted only to meet Aptina’s production data sheet specifications. Information, products, and/or
specifications are subject to change without notice. All information is provided on an “AS IS” basis without warranties of any kind. Dates are estimates only. Drawings not to
scale. Aptina and the Aptina logo are trademarks of Aptina Imaging Corporation. All other trademarks are the property of their respective owners.
Imaging on Embedded GPUs
Investigating flexible imaging pipelines using
embedded GPUs
Mikaël Bourges-Sévenier (msevenier at aptina dot com)
Director, High-Performance Imaging
December 19, 2013
Bay Area Multimedia
Agenda
•  Overview: the need for computational imaging
•  What is imaging?
•  Architecture of some embedded GPUs
•  8MP MobileHDR pipeline on ARM Mali T604
•  Khronos Camera: a standard API for computational imaging
•  Q&A
Computational Imaging evolution
Spatial
(Volumetric)
Gesture
AR
Face Detect
Face Track
Presence
Colorimetry
Brightness
Web Cam
Smart
Camera
True Color, Brightness
Compensation, Exposure control
User Identity
Access Control
Augmented Information
3D Imaging
Interactive
Services
Mobile Compute driving Imaging use cases
•  Requires significant computing over large data sets
Augmented
Reality
Face, Body and
Gesture Tracking
Computational
Photography
3D Scene/Object
Reconstruction
Time
Increasing Use of Imaging Sensors
Differentiation Opportunity
Time
Photography
Input = 2D Camera
Processors = ISP + CPU
Product = Static Images
Computational Photography
Input = MEMS + 2D Camera
Processors = ISP + CPU + GPU
Product = Real Time Images
We are here
Perceptual Imaging
Input = MEMS + Depth Camera
Processors = ISP + CPU + GPU + DSP
Product = Real Time Extracted Information
Perceptual Imaging
1. Uses the full array of mobile sensors
2. to extract information in real-time
3. about the user and environment
4. to generate enhanced user interactions
Hardware Saves Power, e.g. Camera Sensor ISP
•  CPU
‣  Single processor or Neon SIMD - running fast
‣  Makes heavy use of general memory
‣  Non-optimal performance and power
•  GPU
‣  Programmable and flexible
‣  Many-way parallelism: runs at lower frequency
‣  Efficient image caching close to processors
‣  BUT cycles frames in and out of memory
•  Camera ISP (Image Signal Processor)
‣  Little or no programmability
‣  Data flows through a compact hardware pipe
‣  Scan-line-based - no global memory
‣  Best perf/watt
Evolution of Embedded GPUs
[Figure: GFLOPS (0 to 450) versus time (Sep-2011 to Jun-2014) with a trend line; data points for PowerVR 5XT, Mali T604, Adreno 320, Adreno 330, Mali T628, PowerVR 6 and Tegra 5]
•  Trend: 40% more GFLOPS/quarter
•  Estimated at sustained peak performance. Likely to be much less in practice.
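To get a feel for what a 40% per-quarter trend implies, the growth can be compounded over several quarters. This is a minimal sketch: the function name and the 100 GFLOPS starting value are illustrative assumptions, not data read from the chart.

```python
def projected_gflops(start_gflops, quarters, growth_per_quarter=0.40):
    """Compound a fixed per-quarter growth rate over a number of quarters."""
    return start_gflops * (1.0 + growth_per_quarter) ** quarters


if __name__ == "__main__":
    base = 100.0  # hypothetical starting point, in GFLOPS
    for q in (0, 2, 4, 8):
        print(f"after {q} quarters: {projected_gflops(base, q):.1f} GFLOPS")
```

At this rate the peak figure nearly quadruples in one year (1.4 to the power 4 is about 3.84), which is why the trend line is steep even though sustained performance is likely lower.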
Computational Imaging pipeline
•  Pre-processing: for non-standard Bayer pixels (e.g. iHDR)
•  ISP: for fast demosaic, lens shading, denoising, 3A, statistics …
•  Post-processing: for special reconstruction of colors (e.g. Clarity+)
•  Processing requires control of metadata aligned with data
[Figure: Lens, Color Filter Array, CMOS sensor -> Pre-processing -> Image Signal Processor (ISP) -> Post-processing -> App. Data labels: Bayer, RGB, YUV, Metadata. Feedback: 3A stats; lens, sensor, aperture control.]
Different Computing Devices
•  DSPs are similar to CPUs
‣  Typically integer optimized (some have rudimentary floating point support)
‣  With signal processing intrinsics
•  FPGA
‣  Can be tailored to a cross between CPU/DSP and GPU
Latency-Optimized CPU
Fast serial
Processing
lots of big on-chip caches
sophisticated control
Throughput-Optimized GPU
Scalable parallel
Processing
multithreading can hide latency
simpler control, cost amortized over ALUs via SIMD
SISD (scalar ALU): c = a + b
SIMD (vector ALU): (c1, c2, c3, c4) = (a1, a2, a3, a4) + (b1, b2, b3, b4)
OpenCL works on
all devices but
performance
isn’t guaranteed
Stream-based vs. Frame-based
•  Stream-based (ISP)
‣  For low-memory devices
‣  Set of lines processed by kernels
‣  Delay: #lines a kernel needs
[Figure: a continuous stream of pixels enters a queue (Q) that accumulates lines; kernels consume the lines to produce the final image]
•  Frame-based (GPU)
‣  For fast data-parallel devices
‣  Full image frame processed
‣  Delay: whole frame(s)
[Figure: Frame -> Kernel -> Frame -> Kernel -> Frame -> Kernel -> Frame]
Completely different kernels
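The difference between the two models can be sketched with one filter written both ways. This is an illustrative example only: a 3-line vertical box filter stands in for a real ISP or GPU kernel, and both function names are hypothetical. The stream version holds three lines at a time, so its delay is the number of lines the kernel needs. The frame version needs the whole frame in memory before it starts.

```python
from collections import deque


def vertical_box3_stream(lines):
    """Stream-based: consume one line at a time, buffering only 3 lines."""
    window = deque(maxlen=3)
    for line in lines:
        window.append(line)
        if len(window) == 3:
            # Average the three buffered lines column by column.
            yield [(a + b + c) // 3 for a, b, c in zip(*window)]


def vertical_box3_frame(frame):
    """Frame-based: the full frame is available and indexed directly."""
    width = len(frame[0])
    return [
        [(frame[r - 1][c] + frame[r][c] + frame[r + 1][c]) // 3
         for c in range(width)]
        for r in range(1, len(frame) - 1)
    ]


if __name__ == "__main__":
    frame = [[r * 10 + c for c in range(4)] for r in range(5)]
    print(list(vertical_box3_stream(iter(frame))))
    print(vertical_box3_frame(frame))
```

Both functions produce the same rows, but the control flow and memory footprint differ, which is the sense in which the two models need different kernels.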
What is Imaging?
Capture image from a camera sensor and process it to get
a render-able image.
How Imaging Sensors work
http://www.photoaxe.com
Bayer GRBG pattern
•  50% green
•  25% red and 25% blue
Bayer CFA is one type
of pattern
Bayer Demosaicing
•  Twice as many G samples as R or B, since the eye is more sensitive to luminance than chrominance
•  Convert pixel colors from Bayer space to Full RGB color
•  Complex interpolation to avoid artifacts (e.g. on edges)
[Figure: Bayer mosaic interpolated to a grid of full RGB pixels]
0 1
2 3
0 GRBG
1 RGGB
2 GBRG
3 BGGR
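The four pattern names can be read as the colors of a 2x2 tile that repeats across the sensor. The sketch below assumes the common convention that the name lists the tile in row-major order (top-left, top-right, bottom-left, bottom-right); `bayer_color` and `color_fractions` are hypothetical helper names.

```python
from collections import Counter


def bayer_color(pattern, row, col):
    """Return 'R', 'G' or 'B' for the pixel at (row, col).

    Assumes `pattern` names the repeating 2x2 tile in row-major order.
    """
    return pattern[(row % 2) * 2 + (col % 2)]


def color_fractions(pattern, height, width):
    """Fraction of pixels of each color over a height x width mosaic."""
    counts = Counter(
        bayer_color(pattern, r, c)
        for r in range(height)
        for c in range(width)
    )
    total = height * width
    return {color: n / total for color, n in counts.items()}


if __name__ == "__main__":
    for name in ("GRBG", "RGGB", "GBRG", "BGGR"):
        print(name, color_fractions(name, 4, 4))
```

Every variant gives the same split of 50% green, 25% red and 25% blue. Only the position of each color within the tile changes, which is what a demosaicing kernel has to account for.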
OpenCL (memory system)
Desktop: non-uniform memory
•  Data is physically copied between GPU and CPU memory
Embedded: uniform memory
•  __local memory may be in __global
•  Cheap data exchange between CPU and GPU
A tour of some embedded GPUs
ARM Mali T604, Qualcomm Adreno 330
ARM Mali T604, T628
•  Found in Samsung Exynos 5 Dual (T604)/Octa
(T628) Application Processors
‣  Chromebook, Nexus 10, Samsung S4…
•  32nm process for T604, 28nm for T628
•  T604 has 4 shader cores, T628 has 8 cores
•  Tri-pipe architecture: each GPU core has 3 types
of instruction pipelines
‣  1x load/store
‣  1x texture
‣  2x ALU (T604) / 4x ALU (T628)
•  64-bit integers and IEEE 754 floating-point ALUs
OpenCL and OpenGL ES
The Vithar Architecture:
[Figure: shader core shared by OpenGL ES and OpenCL: Thread Issue feeds two Arithmetic Pipelines, one Load/Store Pipeline and one Texturing Pipeline, followed by Thread Completion]
•  3 kinds of pipelines
‣  Arithmetic
‣  Load/Store
‣  Texture
•  Barrel-threaded (like AMD/NVIDIA)
•  No SIMT execution (unlike AMD/NVIDIA)
•  SIMD (like AMD)
‣  Use vectors for best performance!
•  256 threads max (64 in practice)
•  Automatic hardware load
balancing
•  Seamless concurrent
execution
•  Integrated seamless power
manager
Midgard Job Execution and Load-balancing
Qualcomm MSM8974
•  Process: 28nm
•  CPU: 4x Krait 2.3 GHz,
‣  ARMv7A Neon instruction set
‣  Power and performance efficiencies over ARM
‣  4KB+4KB L0, 16KB+16KB L1, 2MB L2 cache
‣  No 64b support
•  GPU: Adreno 330 450 MHz
‣  32x 32b scalar ALUs/pipeline, 8 pipelines, 129.6 GFLOPS
•  16b kernels provide 2x performance
‣  128b registers
‣  8 KB local memory per shader core
‣  8 KB constant memory
‣  12 reads, 4 writes simultaneous per clock
‣  512 work-items max
‣  1.5 MB on-chip SRAM
‣  Tiled renderer max 3.6 GPix/s
•  Hexagon DSP
‣  3x core, 600 MHz, 16 KB L1, 256 KB L2, integrated MMU
‣  Limited floating-point support (no division, no log/
exp…)
•  RAM: 2GB 2x LP-DDR3 800 MHz (12.8 GB/s)
MSM8974 Adreno 330 vs Adreno 320
Adreno 330 has better performance
450 MHz GPU clock (up from 400 MHz in Adreno 320)
2x better shader performance than A320 – 2x more ALU blocks
Dedicated GPU power rail
Will allow GPU to be at a lower frequency and voltage than the FABR
Adreno 330 Shader Processor “SP” Block
Total of 32 (32-bit)
scalar ALUs
16-bit ALUs are used only if the entire kernel is 16-bit; otherwise the 32-bit ALUs are used
MobileHDR pipeline
Arndale Samsung Exynos 5 Dual board
•  Arndale Samsung Exynos 5 board
‣  CPU: ARM Cortex-A15 (2-core) 1.7 GHz 32nm
•  32KB L1 cache, 1MB L2 cache
‣  GPU: ARM Mali T604
•  64 concurrent threads
•  Vector ALUs
•  128b registers
•  OpenCL 1.1 Full Profile
‣  RAM: 2GB LP-DDR3 800 MHz (12.8 GB/s)
‣  Truly unified cached memory
•  CPU and GPU memory is shared – NO COPY!
•  128b wide L1 and L2 access
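The 12.8 GB/s figure quoted for the LP-DDR3 memory is consistent with simple peak-bandwidth arithmetic. This is a minimal sketch, assuming a dual-channel interface with 32-bit channels; the channel count and width are assumptions, not stated on the slide, and `ddr_bandwidth_gb_s` is a hypothetical helper.

```python
def ddr_bandwidth_gb_s(clock_mhz, bus_bits, channels):
    """Theoretical peak DDR bandwidth in GB/s (decimal gigabytes)."""
    transfers_per_s = clock_mhz * 1e6 * 2      # DDR: two transfers per clock
    bytes_per_transfer = bus_bits / 8          # bus width in bytes
    return transfers_per_s * bytes_per_transfer * channels / 1e9


if __name__ == "__main__":
    # Assumed configuration: 800 MHz clock, 2 channels of 32 bits each.
    print(f"{ddr_bandwidth_gb_s(800, 32, 2):.1f} GB/s")
```

This is a peak number shared by the CPU, the GPU and every other bus master, so the bandwidth an imaging kernel actually sees is lower.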
ARM Mali T604 GPUs
In Samsung Exynos 5 Dual
Type: Vector GPU
Process: 32nm
OpenCL: 1.1 Full Profile
Unified memory: Yes
Rendering: Tile
Work-items: 256
Clock: 533 MHz
L2 cache: 1MB
Register width: 128b
Global memory: 2GB LP-DDR3 800 MHz (12.8 GB/s)
ALUs: 8 (2 ALUs/core)
Throughput: 100 GFLOPS
Local memory: 32KB/core (global)
Constant memory: 64KB
Texture cache: yes
Compute devices (shader cores): 4
Cacheline: 64 bytes
16/32/64b floats: No/yes/yes
Avoid buffer copy
•  Mali/Adreno have unified memory
‣  Use CL_MEM_ALLOC_HOST_PTR to avoid copy between CPU and GPU
•  Mali has no local memory
•  Adreno has local memory (1.5MB SRAM 115GB/s)
Host data pointers
•  malloc(): buffers created by the user (malloc) are not mapped into the GPU memory space
•  clCreateBuffer(CL_MEM_USE_HOST_PTR): creates a new buffer and copies the data over (but the copy operations are expensive)
•  clCreateBuffer(CL_MEM_ALLOC_HOST_PTR): the buffer is created by clCreateBuffer() in global memory shared by the CPU (host) and the GPU (compute device)
Where possible don’t use CL_…
– Create buffers at the start of your app
– Use CL_MEM_ALLOC_HOST_PTR instead of m…
– Then you can use the buffer on both …
Aptina Sensor with MobileHDR™ Turned off
Aptina Sensor with MobileHDR™ Turned on
AR0833 8MP Camera sensor
•  Frame is inscribed in a 1/3.2” circle
‣  4:3 for images e.g. 8MP 3264 x 2448
‣  16:9 for video e.g. 6MP 3264 x 1836
•  10-bit per pixel (framed in 16 bits)
•  At 30 fps, 16:9 video (3264 x 1836) is about 180 MPix/s, or about 343 MiB/s at 16 bits per pixel
•  Interlaced HDR feature
•  Interface with ISP
‣  Data over MIPI CSI-2 (serial)
‣  Control over I2C
[Figure: 1/3.2" image circle with an inscribed 4:3 frame (3264 x 2448) and an inscribed 16:9 frame (3264 x 1836)]
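The data-rate figures for the sensor follow directly from the frame geometry. A minimal sketch of the arithmetic, using the 16:9 video mode and the 16-bit framing of each 10-bit pixel given above; `sensor_data_rate` is a hypothetical helper name.

```python
def sensor_data_rate(width, height, fps, bytes_per_pixel=2):
    """Return (pixels per second, bytes per second) for a raw stream."""
    pixels_per_s = width * height * fps
    return pixels_per_s, pixels_per_s * bytes_per_pixel


if __name__ == "__main__":
    pixels, nbytes = sensor_data_rate(3264, 1836, 30)
    print(f"{pixels / 1e6:.1f} MPix/s")    # pixel rate in megapixels/s
    print(f"{nbytes / 2**20:.1f} MiB/s")   # byte rate in binary megabytes/s
    print(f"{nbytes / 1e6:.1f} MB/s")      # byte rate in decimal megabytes/s
```

The 343 figure matches the rate expressed in binary megabytes (MiB); in decimal megabytes the same stream is closer to 360 MB/s.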
Feature: Interlaced HDR
•  1 frame contains 2 exposures
interlaced
•  Ratio between odd and even pairs
‣  User controlled: 1x, 2x, 4x, 8x
Interlaced HDR Readout
The sensor enables HDR by outputting frames where even and odd row pairs within a single frame are captured at different integration times. This output is then matched with an algorithm designed to reconstruct this output into an HDR still image or video.
The sensor HDR is controlled by two shutter pointers (Shutter pointer 1, Shutter pointer 2) that control the integration of the odd (Shutter pointer 1) and even (Shutter pointer 2) row pairs.
Figure 16: HDR Integration Time
[Figure: Shutter pointer 1 and Shutter pointer 2 lead the Sample pointer by Tint 1 and Tint 2; the exposures of I-FRAME 1 and I-FRAME 2 are interleaved in the output frame from the sensor]
Source: AR0833_DS - Rev. F Pub. 4/13 EN 30
Exposure 1
Exposure 2
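The row-pair interleaving can be sketched as a simple split of one sensor frame into its two exposures. This is an illustration only, not the reconstruction algorithm used in MobileHDR. It assumes rows are numbered from 0 and that the first row pair belongs to exposure 1; both function names are hypothetical.

```python
def exposure_of_row(row):
    """Return 1 or 2: row pairs alternate between the two exposures.

    Assumption: rows 0-1 form the first (odd) pair and use exposure 1.
    """
    return 1 if (row // 2) % 2 == 0 else 2


def split_exposures(frame):
    """Separate an interlaced frame into its two half-height exposures."""
    exp1 = [line for r, line in enumerate(frame) if exposure_of_row(r) == 1]
    exp2 = [line for r, line in enumerate(frame) if exposure_of_row(r) == 2]
    return exp1, exp2


if __name__ == "__main__":
    frame = [[r] * 4 for r in range(8)]   # each line is filled with its row index
    exp1, exp2 = split_exposures(frame)
    print([line[0] for line in exp1])
    print([line[0] for line in exp2])
```

Each exposure ends up with half the vertical resolution, so a reconstruction step has to interpolate the missing row pairs before the two exposures can be combined.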
mobileHDR demo
•  Zero-copy between sensor/OpenCL and OpenCL/OpenGL
•  On Arndale board (Samsung Exynos 5 Dual with Mali T604 GPU)
[Figure: OpenCL pipeline stages: Noise Reduction, iHDR Reconstruction, Bayer scaler, Tone Mapping, Color Correction. Data labels: 10b iHDR 3264x1836, 14b, RGB888, 1080p. The CL Image output is shared through an EGLImage as a GL Texture for OpenGL ES.]
Summary
•  Embedded GPUs are ideal candidates for computational imaging
‣  Performance at reasonable image size is now available
‣  Power efficiency is being addressed
•  OpenCL 1.1 is available on all recent application processors
‣  But access may be reserved for OEMs
‣  Performance portability isn’t guaranteed (but that is true of any high-performance application)
•  Opening the camera image-processing “black box” is now feasible, enabling incredible new applications
Khronos Camera
A standard to control image acquisition and
processing.
Typical Imaging Pipeline
•  Pre- and Post-processing can be done on CPU, GPU, DSP…
•  ISP controls camera via 3A algorithms: Auto Exposure (AE), Auto White Balance (AWB), Auto Focus (AF)
•  ISP may be a separate chip or within Application Processor
[Figure: Lens, Color Filter Array, CMOS sensor -> Pre-processing -> Image Signal Processor (ISP) -> Post-processing -> App. Data labels: Bayer, RGB/YUV. Feedback from the ISP: 3A; lens, sensor, aperture control.]
Need for advanced camera control API:
- to drive more flexible app camera control
- over more types of camera sensors
- with tighter integration with the rest of the system
Advanced Camera Control Use Cases
•  High-dynamic range (HDR) and computational flash photography
‣  High-speed burst with individual frame control over exposure and flash
•  Rolling shutter elimination
‣  High-precision intra-frame synchronization between camera and motion sensor
•  HDR Panorama, photo-spheres
‣  Continuous frame capture with constant exposure and white balance
•  Subject isolation and depth detection
‣  High-speed burst with individual frame control over focus
•  Time-of-flight or structured light depth camera processing
‣  Aligned stacking of data from multiple sensors
•  Augmented Reality
‣  60Hz, low-latency capture with motion sensor synchronization
‣  Multiple Region of Interest (ROI) capture
‣  Multiple sensors for scene scaling
‣  Detailed feedback on camera operation per frame
Camera API Architecture (FCAM based)
•  No global state
‣  State travels with image requests
‣  Every stage in the pipeline may have different state
•  -> allows fast, deterministic state changes
•  Synchronize devices
‣  Lens, flash, sound capture, gyro…
‣  Devices can schedule Actions
•  E.g. to be triggered on exposure change
•  Enables device synchronization
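The "no global state" idea can be sketched as requests that carry their own settings through the pipeline, so each returned frame is tagged with exactly the state that produced it. The classes below are hypothetical and only loosely inspired by the FCAM design. They are not the FCAM or Khronos Camera API.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Shot:
    """Immutable capture request: all state travels with the request."""
    exposure_us: int
    gain: float
    white_balance_k: int = 5000


@dataclass
class Frame:
    """Result of a capture, tagged with the Shot that produced it."""
    shot: Shot
    pixels: list = field(default_factory=list)


class FakeSensor:
    """Stand-in sensor: queues requests and returns frames in order."""

    def __init__(self):
        self._pending = deque()

    def capture(self, shot):
        self._pending.append(shot)

    def get_frame(self):
        shot = self._pending.popleft()
        return Frame(shot=shot)   # no pixel data in this sketch


if __name__ == "__main__":
    sensor = FakeSensor()
    sensor.capture(Shot(exposure_us=1000, gain=1.0))
    sensor.capture(Shot(exposure_us=8000, gain=2.0))
    print(sensor.get_frame().shot)
    print(sensor.get_frame().shot)
```

Because no setting lives in the sensor object itself, two back-to-back requests with different exposures cannot interfere with each other, which is what makes per-frame state changes deterministic.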
Visual Sensor Revolution
•  Single sensor RGB cameras are just the start of the mobile visual revolution
‣  IR sensors – LEAP Motion, eye-trackers
•  Multi-sensors: Stereo pairs -> Plenoptic array -> Depth cameras
‣  Stereo pair can enable object scaling and enhanced depth extraction
‣  Plenoptic Field processing needs FFTs and ray-casting
•  Hybrid visual sensing solutions
‣  Different sensors mixed for different distances and lighting conditions
•  GPUs today – more dedicated ISPs tomorrow?
Dual Camera (LG Electronics)
Plenoptic Array (Pelican Imaging)
Capri Structured Light 3D Camera (PrimeSense)
Khronos APIs for Augmented Reality
[Figure: the Application (on CPUs, GPUs and DSPs) connects Advanced Camera Control and stream generation (Camera Control API), Vision Processing, Sensor Fusion (MEMS Sensors), 3D Rendering and Video Composition on GPU, and Audio Rendering. EGLStream streams data between APIs. Precision timestamps on all sensor samples.]
AR needs not just advanced sensor processing, vision
acceleration, computation and rendering - but also for all
these subsystems to work efficiently together
Khronos Camera API
•  Catalyze camera functionality not available on any current platform
‣  Open API that aligns with future platform directions for easy adoption
‣  E.g. could be used to implement future versions of Android Camera HAL
•  Control multiple sensors with synch and alignment
‣  E.g. Stereo pairs, Plenoptic arrays, TOF or structured light depth cameras
•  More detailed control per frame
‣  Format flexibility, Region of Interest (ROI) selection
•  Global Timing & Synchronization
‣  E.g. Between cameras and MEMS sensors
•  Application control over ISP processing (including 3A)
‣  Including multiple, re-entrant ISPs
•  Flexible processing/streaming
‣  Multiple output streams and streaming rows (not just frames)
‣  RAW, Bayer and YUV Processing
Camera API Design Milestones and Philosophy
•  C-language API starting from proven designs
‣  e.g. FCAM, Android camera HAL V3
•  Design alignment with widely used hardware standards
‣  e.g. MIPI CSI
•  Focus on mobile, power-limited devices
‣  But do not preclude other use cases such as automotive, surveillance, DSLR…
•  Minimize overlap and maximize interoperability with other Khronos APIs
‣  But other Khronos APIs are not required
•  Provide support for vendor-specific extensions
[Timeline, in slide order: Apr13; Jul13; Group charter approved; 4Q13; Provisional specification; 1Q14; First draft specification; 2Q14; Sample implementation and tests; 3Q14; Specification ratification]
Questions & Answers
Thank you!
Imaging on embedded GPUs

More Related Content

What's hot

Rsyslog vs Systemd Journal (Paper)
Rsyslog vs Systemd Journal (Paper)Rsyslog vs Systemd Journal (Paper)
Rsyslog vs Systemd Journal (Paper)Rainer Gerhards
 
BPF Internals (eBPF)
BPF Internals (eBPF)BPF Internals (eBPF)
BPF Internals (eBPF)Brendan Gregg
 
QEMU - Binary Translation
QEMU - Binary Translation QEMU - Binary Translation
QEMU - Binary Translation Jiann-Fuh Liaw
 
Cgroupあれこれ-第4回コンテナ型仮想化の情報交換会資料
Cgroupあれこれ-第4回コンテナ型仮想化の情報交換会資料Cgroupあれこれ-第4回コンテナ型仮想化の情報交換会資料
Cgroupあれこれ-第4回コンテナ型仮想化の情報交換会資料KamezawaHiroyuki
 
BPF - in-kernel virtual machine
BPF - in-kernel virtual machineBPF - in-kernel virtual machine
BPF - in-kernel virtual machineAlexei Starovoitov
 
Writing the Container Network Interface(CNI) plugin in golang
Writing the Container Network Interface(CNI) plugin in golangWriting the Container Network Interface(CNI) plugin in golang
Writing the Container Network Interface(CNI) plugin in golangHungWei Chiu
 
CERN OpenStack Cloud Control Plane - From VMs to K8s
CERN OpenStack Cloud Control Plane - From VMs to K8sCERN OpenStack Cloud Control Plane - From VMs to K8s
CERN OpenStack Cloud Control Plane - From VMs to K8sBelmiro Moreira
 
Kernel Module Programming
Kernel Module ProgrammingKernel Module Programming
Kernel Module ProgrammingSaurabh Bangad
 
RISC-V : Berkeley Boot Loader & Proxy Kernelのソースコード解析
RISC-V : Berkeley Boot Loader & Proxy Kernelのソースコード解析RISC-V : Berkeley Boot Loader & Proxy Kernelのソースコード解析
RISC-V : Berkeley Boot Loader & Proxy Kernelのソースコード解析Mr. Vengineer
 
Linux Systems Performance 2016
Linux Systems Performance 2016Linux Systems Performance 2016
Linux Systems Performance 2016Brendan Gregg
 
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven RostedtKernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven RostedtAnne Nicolas
 
New Ways to Find Latency in Linux Using Tracing
New Ways to Find Latency in Linux Using TracingNew Ways to Find Latency in Linux Using Tracing
New Ways to Find Latency in Linux Using TracingScyllaDB
 
Achieving the Ultimate Performance with KVM
Achieving the Ultimate Performance with KVMAchieving the Ultimate Performance with KVM
Achieving the Ultimate Performance with KVMDevOps.com
 

What's hot (20)

Rsyslog vs Systemd Journal (Paper)
Rsyslog vs Systemd Journal (Paper)Rsyslog vs Systemd Journal (Paper)
Rsyslog vs Systemd Journal (Paper)
 
Qemu Introduction
Qemu IntroductionQemu Introduction
Qemu Introduction
 
BPF Internals (eBPF)
BPF Internals (eBPF)BPF Internals (eBPF)
BPF Internals (eBPF)
 
GPU Acceleration for Containers on Intel Processor Graphics
GPU Acceleration for Containers on Intel Processor GraphicsGPU Acceleration for Containers on Intel Processor Graphics
GPU Acceleration for Containers on Intel Processor Graphics
 
DPDK & Cloud Native
DPDK & Cloud NativeDPDK & Cloud Native
DPDK & Cloud Native
 
QEMU - Binary Translation
QEMU - Binary Translation QEMU - Binary Translation
QEMU - Binary Translation
 
Cgroupあれこれ-第4回コンテナ型仮想化の情報交換会資料
Cgroupあれこれ-第4回コンテナ型仮想化の情報交換会資料Cgroupあれこれ-第4回コンテナ型仮想化の情報交換会資料
Cgroupあれこれ-第4回コンテナ型仮想化の情報交換会資料
 
BPF - in-kernel virtual machine
BPF - in-kernel virtual machineBPF - in-kernel virtual machine
BPF - in-kernel virtual machine
 
Writing the Container Network Interface(CNI) plugin in golang
Writing the Container Network Interface(CNI) plugin in golangWriting the Container Network Interface(CNI) plugin in golang
Writing the Container Network Interface(CNI) plugin in golang
 
eBPFを用いたトレーシングについて
eBPFを用いたトレーシングについてeBPFを用いたトレーシングについて
eBPFを用いたトレーシングについて
 
CERN OpenStack Cloud Control Plane - From VMs to K8s
CERN OpenStack Cloud Control Plane - From VMs to K8sCERN OpenStack Cloud Control Plane - From VMs to K8s
CERN OpenStack Cloud Control Plane - From VMs to K8s
 
Kernel Module Programming
Kernel Module ProgrammingKernel Module Programming
Kernel Module Programming
 
RISC-V : Berkeley Boot Loader & Proxy Kernelのソースコード解析
RISC-V : Berkeley Boot Loader & Proxy Kernelのソースコード解析RISC-V : Berkeley Boot Loader & Proxy Kernelのソースコード解析
RISC-V : Berkeley Boot Loader & Proxy Kernelのソースコード解析
 
Linux Systems Performance 2016
Linux Systems Performance 2016Linux Systems Performance 2016
Linux Systems Performance 2016
 
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven RostedtKernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
 
Ansible Automation - Enterprise Use Cases | Juncheng Anthony Lin
Ansible Automation - Enterprise Use Cases | Juncheng Anthony LinAnsible Automation - Enterprise Use Cases | Juncheng Anthony Lin
Ansible Automation - Enterprise Use Cases | Juncheng Anthony Lin
 
New Ways to Find Latency in Linux Using Tracing
New Ways to Find Latency in Linux Using TracingNew Ways to Find Latency in Linux Using Tracing
New Ways to Find Latency in Linux Using Tracing
 
Achieving the Ultimate Performance with KVM
Achieving the Ultimate Performance with KVMAchieving the Ultimate Performance with KVM
Achieving the Ultimate Performance with KVM
 
FreeRTOS introduction
FreeRTOS introductionFreeRTOS introduction
FreeRTOS introduction
 
Linux Slab Allocator
Linux Slab AllocatorLinux Slab Allocator
Linux Slab Allocator
 

Viewers also liked

Qualcomm SnapDragon 800 Mobile Device
Qualcomm SnapDragon 800 Mobile DeviceQualcomm SnapDragon 800 Mobile Device
Qualcomm SnapDragon 800 Mobile DeviceJJ Wu
 
Svn에서 git으로 이주하기
Svn에서 git으로 이주하기Svn에서 git으로 이주하기
Svn에서 git으로 이주하기Seunghwa Song
 
ROS Hands-On Intro/Tutorial (Robotic Vision Summer School 2015) #RVSS #ACRV
ROS Hands-On Intro/Tutorial (Robotic Vision Summer School 2015) #RVSS #ACRVROS Hands-On Intro/Tutorial (Robotic Vision Summer School 2015) #RVSS #ACRV
ROS Hands-On Intro/Tutorial (Robotic Vision Summer School 2015) #RVSS #ACRVJuxi Leitner
 
ABS 2014 - Android Kit Kat Internals
ABS 2014 - Android Kit Kat InternalsABS 2014 - Android Kit Kat Internals
ABS 2014 - Android Kit Kat InternalsBenjamin Zores
 
"The OpenVX Hardware Acceleration API for Embedded Vision Applications and Li...
"The OpenVX Hardware Acceleration API for Embedded Vision Applications and Li..."The OpenVX Hardware Acceleration API for Embedded Vision Applications and Li...
"The OpenVX Hardware Acceleration API for Embedded Vision Applications and Li...Edge AI and Vision Alliance
 
OpenCV 에서 OpenCL 살짝 써보기
OpenCV 에서 OpenCL 살짝 써보기OpenCV 에서 OpenCL 살짝 써보기
OpenCV 에서 OpenCL 살짝 써보기Seunghwa Song
 

Viewers also liked (7)

Qualcomm SnapDragon 800 Mobile Device
Qualcomm SnapDragon 800 Mobile DeviceQualcomm SnapDragon 800 Mobile Device
Qualcomm SnapDragon 800 Mobile Device
 
CS-ISP Overview
CS-ISP OverviewCS-ISP Overview
CS-ISP Overview
 
Svn에서 git으로 이주하기
Svn에서 git으로 이주하기Svn에서 git으로 이주하기
Svn에서 git으로 이주하기
 
ROS Hands-On Intro/Tutorial (Robotic Vision Summer School 2015) #RVSS #ACRV
ROS Hands-On Intro/Tutorial (Robotic Vision Summer School 2015) #RVSS #ACRVROS Hands-On Intro/Tutorial (Robotic Vision Summer School 2015) #RVSS #ACRV
ROS Hands-On Intro/Tutorial (Robotic Vision Summer School 2015) #RVSS #ACRV
 
ABS 2014 - Android Kit Kat Internals
ABS 2014 - Android Kit Kat InternalsABS 2014 - Android Kit Kat Internals
ABS 2014 - Android Kit Kat Internals
 
"The OpenVX Hardware Acceleration API for Embedded Vision Applications and Li...
"The OpenVX Hardware Acceleration API for Embedded Vision Applications and Li..."The OpenVX Hardware Acceleration API for Embedded Vision Applications and Li...
"The OpenVX Hardware Acceleration API for Embedded Vision Applications and Li...
 
OpenCV 에서 OpenCL 살짝 써보기
OpenCV 에서 OpenCL 살짝 써보기OpenCV 에서 OpenCL 살짝 써보기
OpenCV 에서 OpenCL 살짝 써보기
 

Similar to Imaging on embedded GPUs

“Jumpstart Your Edge AI Vision Application with New Development Kits from Avn...
“Jumpstart Your Edge AI Vision Application with New Development Kits from Avn...“Jumpstart Your Edge AI Vision Application with New Development Kits from Avn...
“Jumpstart Your Edge AI Vision Application with New Development Kits from Avn...Edge AI and Vision Alliance
 
Droidcon2013 triangles gangolells_imagination
Droidcon2013 triangles gangolells_imaginationDroidcon2013 triangles gangolells_imagination
Droidcon2013 triangles gangolells_imaginationDroidcon Berlin
 
GPU Architecture NVIDIA (GTX GeForce 480)
GPU Architecture NVIDIA (GTX GeForce 480)GPU Architecture NVIDIA (GTX GeForce 480)
GPU Architecture NVIDIA (GTX GeForce 480)Fatima Qayyum
 
Gpu with cuda architecture
Gpu with cuda architectureGpu with cuda architecture
Gpu with cuda architectureDhaval Kaneria
 
Hai Tao at AI Frontiers: Deep Learning For Embedded Vision System
Hai Tao at AI Frontiers: Deep Learning For Embedded Vision SystemHai Tao at AI Frontiers: Deep Learning For Embedded Vision System
Hai Tao at AI Frontiers: Deep Learning For Embedded Vision SystemAI Frontiers
 
Arm Neoverse market update_05122020.pdf
Arm Neoverse market update_05122020.pdfArm Neoverse market update_05122020.pdf
Arm Neoverse market update_05122020.pdfPaul Yang
 
“Enabling Ultra-low Power Edge Inference and On-device Learning with Akida,” ...
“Enabling Ultra-low Power Edge Inference and On-device Learning with Akida,” ...“Enabling Ultra-low Power Edge Inference and On-device Learning with Akida,” ...
“Enabling Ultra-low Power Edge Inference and On-device Learning with Akida,” ...Edge AI and Vision Alliance
 
Infrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsInfrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsYoss Cohen
 
"An Ultra-low-power Multi-core Engine for Inference on Encrypted DNNs," a Pre...
"An Ultra-low-power Multi-core Engine for Inference on Encrypted DNNs," a Pre..."An Ultra-low-power Multi-core Engine for Inference on Encrypted DNNs," a Pre...
"An Ultra-low-power Multi-core Engine for Inference on Encrypted DNNs," a Pre...Edge AI and Vision Alliance
 
Machine Learning Developers - Know your GPUs
Machine Learning Developers - Know your GPUsMachine Learning Developers - Know your GPUs
Machine Learning Developers - Know your GPUsAmazon Web Services
 
Hot Chips: AMD Next Gen 7nm Ryzen 4000 APU
Hot Chips: AMD Next Gen 7nm Ryzen 4000 APUHot Chips: AMD Next Gen 7nm Ryzen 4000 APU
Hot Chips: AMD Next Gen 7nm Ryzen 4000 APUAMD
 
Accelerating Data Science With GPUs
Accelerating Data Science With GPUsAccelerating Data Science With GPUs
Accelerating Data Science With GPUsiguazio
 
Large-Scale Optimization Strategies for Typical HPC Workloads
Large-Scale Optimization Strategies for Typical HPC WorkloadsLarge-Scale Optimization Strategies for Typical HPC Workloads
Large-Scale Optimization Strategies for Typical HPC Workloadsinside-BigData.com
 
Lecture 15 ryuzo okada - vision processors for embedded computer vision
Lecture 15   ryuzo okada - vision processors for embedded computer visionLecture 15   ryuzo okada - vision processors for embedded computer vision
Lecture 15 ryuzo okada - vision processors for embedded computer visionmustafa sarac
 
Ximea - the pc camera, 90 gflps smart camera
Ximea  - the pc camera, 90 gflps smart cameraXimea  - the pc camera, 90 gflps smart camera
Ximea - the pc camera, 90 gflps smart cameraXIMEA
 
Nervana and the Future of Computing
Nervana and the Future of ComputingNervana and the Future of Computing
Nervana and the Future of ComputingIntel Nervana
 
Hot Chips: AMD Next Gen 7nm Ryzen 4000 APU
Hot Chips: AMD Next Gen 7nm Ryzen 4000 APUHot Chips: AMD Next Gen 7nm Ryzen 4000 APU
Hot Chips: AMD Next Gen 7nm Ryzen 4000 APUAMD
 
19564926 graphics-processing-unit
19564926 graphics-processing-unit19564926 graphics-processing-unit
19564926 graphics-processing-unitDayakar Siddula
 
"Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese...
"Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese..."Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese...
"Making Computer Vision Software Run Fast on Your Embedded Platform," a Prese...Edge AI and Vision Alliance
 

Imaging on embedded GPUs

  • 1. Imaging on Embedded GPUs: Investigating flexible imaging pipelines using embedded GPUs. Mikaël Bourges-Sévenier (msevenier at aptina dot com), Director, High-Performance Imaging. December 19, 2013, Bay Area Multimedia. © 2013 Aptina Imaging Corporation.
  • 2. Agenda
      • Overview: the need for computational imaging
      • What is imaging?
      • Architecture of some embedded GPUs
      • 8MP MobileHDR pipeline on ARM Mali T604
      • Khronos Camera: a standard API for computational imaging
      • Q&A
  • 3. Computational Imaging evolution (diagram). Labels: Web Cam; Smart Camera; Brightness; Colorimetry; Presence; Face Detect; Face Track; AR; Gesture; Spatial (Volumetric); True Color, Brightness Compensation, Exposure control; User Identity; Access Control; Augmented Information; 3D Imaging; Interactive Services.
  • 4. Mobile Compute driving Imaging use cases
      • Requires significant computing over large data sets
      • Use cases plotted against time: Augmented Reality; Face, Body and Gesture Tracking; Computational Photography; 3D Scene/Object Reconstruction
  • 5. Increasing Use of Imaging Sensors (chart axes: Differentiation Opportunity vs. Time)
      • Photography: Input = 2D Camera; Processors = ISP + CPU; Product = Static Images
      • Computational Photography: Input = MEMS + 2D Camera; Processors = ISP + CPU + GPU; Product = Real Time Images
      • "We are here" (marker on the chart)
      • Perceptual Imaging: Input = MEMS + Depth Camera; Processors = ISP + CPU + GPU + DSP; Product = Real Time Extracted Information
      • Perceptual Imaging: 1. uses the full array of mobile sensors 2. to extract information in real-time 3. about the user and environment 4. to generate enhanced user interactions
  • 6. Hardware (diagram labels: Save Power, e.g. Camera Sensor ISP)
      • CPU
          ‣ Single processor or Neon SIMD, running fast
          ‣ Makes heavy use of general memory
          ‣ Non-optimal performance and power
      • GPU
          ‣ Programmable and flexible
          ‣ Many-way parallelism, runs at lower frequency
          ‣ Efficient image caching close to processors
          ‣ BUT cycles frames in and out of memory
      • Camera ISP (Image Signal Processor)
          ‣ Little or no programmability
          ‣ Data flows through a compact hardware pipe
          ‣ Scan-line-based, no global memory
          ‣ Best perf/watt
  • 7. Evolution of Embedded GPUs: GFLOPS Trend (chart; y axis 0 to 450 GFLOPS, x axis Sep-2011 to Jun-2014)
      • GPUs plotted: PowerVR 5XT, Mali T604, Adreno 320, Adreno 330, Mali T628, PowerVR 6, Tegra 5
      • Trend: 40% more GFLOPS/quarter
      • Estimated at sustained peak performance. Likely to be much less in practice.
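To put the quoted trend in perspective, the short sketch below compounds "40% more GFLOPS/quarter" over several quarters. The function name and the starting value are illustrative assumptions, not figures from the chart.

```python
def projected_gflops(start_gflops: float, quarters: int,
                     growth_per_quarter: float = 0.40) -> float:
    """Compound a per-quarter growth rate over a number of quarters."""
    return start_gflops * (1.0 + growth_per_quarter) ** quarters


# Growth factor over one year (4 quarters) at 40% per quarter.
annual_factor = projected_gflops(1.0, 4)
print(f"{annual_factor:.2f}x per year")
```

At that rate peak throughput would roughly quadruple every year, which is why the slide's caveat about sustained versus peak performance matters.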
  • 8. Computational Imaging pipeline
      • Pre-processing: for non-standard Bayer pixels (e.g. iHDR)
      • ISP: for fast demosaic, lens shading, denoising, 3A, statistics…
      • Post-processing: for special reconstruction of colors (e.g. Clarity+)
      • Processing requires control of metadata aligned with data
      • Diagram labels: Lens; Color Filter Array; CMOS sensor; Bayer; Pre-processing; Image Signal Processor (ISP); RGB; YUV; Post-processing; App; Lens, sensor, aperture control; Metadata; 3A stats
  • 9. Different Computing Devices
      • Latency-optimized CPU: fast serial processing; lots of big on-chip caches; sophisticated control
      • Throughput-optimized GPU: scalable parallel processing; multithreading can hide latency; simpler control, cost amortized over ALUs via SIMD
      • DSPs are similar to CPUs
          ‣ Typically integer optimized (some have rudimentary floating point support)
          ‣ With signal processing intrinsics
      • FPGA
          ‣ Can be tailored to a cross between CPU/DSP and GPU
      • Diagram: SISD (scalar ALU) computes a + b = c; SIMD (vector ALU) computes (a1, a2, a3, a4) + (b1, b2, b3, b4) = (c1, c2, c3, c4)
      • OpenCL works on all devices but performance isn’t guaranteed
  • 10. Stream-based vs. Frame-based
      • Stream-based (ISP)
          ‣ For low-memory devices
          ‣ Set of lines processed by kernels
          ‣ Delay: number of lines a kernel needs
      • Frame-based (GPU)
          ‣ For fast data-parallel devices
          ‣ Full image frame processed
          ‣ Delay: whole frame(s)
      • Diagram labels: continuous stream of pixels; Q accumulates lines; Kernel; final image; chain of Frame and Kernel blocks
      • Completely different kernels
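The difference between the two models can be sketched with one toy kernel written both ways. The 3-line vertical mean filter below is an assumption chosen for illustration (it is not a stage of the pipeline in the talk), and both function names are hypothetical.

```python
from collections import deque
from typing import Iterable, Iterator, List

Row = List[float]


def vertical_mean3_frame(frame: List[Row]) -> List[Row]:
    """Frame-based: the whole frame must be in memory before any output."""
    out = []
    for y in range(1, len(frame) - 1):
        out.append([(a + b + c) / 3.0
                    for a, b, c in zip(frame[y - 1], frame[y], frame[y + 1])])
    return out


def vertical_mean3_stream(rows: Iterable[Row]) -> Iterator[Row]:
    """Stream-based: only 3 lines are buffered; output starts after 3 lines."""
    window = deque(maxlen=3)  # line buffer, sized to what the kernel needs
    for row in rows:
        window.append(row)
        if len(window) == 3:
            yield [(a + b + c) / 3.0 for a, b, c in zip(*window)]


frame = [[float(y * 10 + x) for x in range(4)] for y in range(5)]
streamed = list(vertical_mean3_stream(iter(frame)))
assert streamed == vertical_mean3_frame(frame)
```

Both versions produce the same pixels; what changes is the memory footprint (three lines versus a full frame) and the latency before the first output line.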
  • 11. What is Imaging? Capture an image from a camera sensor and process it to get a renderable image.
  • 12. How Imaging Sensors work (image source: http://www.photoaxe.com)
      • Bayer GRBG pattern
          ‣ 50% green
          ‣ 25% red and 25% blue
      • Bayer CFA is one type of pattern
  • 13. Bayer Demosaicing
      • Twice as many G samples as R or B (G is 50% of the pixels), since the eye is more sensitive to luminance than chrominance
      • Convert pixel colors from Bayer space to full RGB color
      • Complex interpolation to avoid artifacts (e.g. on edges)
      • Diagram: grid of full-RGB output pixels
      • Bayer pattern variants: 0 = GRBG, 1 = RGGB, 2 = GBRG, 3 = BGGR
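As a minimal sketch of what "convert from Bayer space to full RGB" means, the code below fills in the two missing channels of each pixel by averaging same-colour samples in its 3x3 neighbourhood. This is the simplest possible interpolation, assumed here for illustration; it is not the edge-aware interpolation the slide alludes to, and the function names are hypothetical.

```python
from typing import List, Tuple


def cfa_color(y: int, x: int) -> str:
    """Colour sampled at (y, x) in a GRBG mosaic: rows alternate G R / B G."""
    if y % 2 == 0:
        return "G" if x % 2 == 0 else "R"
    return "B" if x % 2 == 0 else "G"


def demosaic_average(raw: List[List[float]]) -> List[List[Tuple[float, ...]]]:
    """Keep each pixel's own sample; average 3x3 neighbours for the other two."""
    h, w = len(raw), len(raw[0])
    out = []
    for y in range(h):
        out_row = []
        for x in range(w):
            sums = {"R": 0.0, "G": 0.0, "B": 0.0}
            counts = {"R": 0, "G": 0, "B": 0}
            for ny in range(max(0, y - 1), min(h, y + 2)):
                for nx in range(max(0, x - 1), min(w, x + 2)):
                    c = cfa_color(ny, nx)
                    sums[c] += raw[ny][nx]
                    counts[c] += 1
            own = cfa_color(y, x)
            out_row.append(tuple(
                float(raw[y][x]) if c == own else sums[c] / counts[c]
                for c in "RGB"))
        out.append(out_row)
    return out


# A uniformly coloured scene: every R site reads 100, G reads 50, B reads 20.
flat = {"R": 100, "G": 50, "B": 20}
raw = [[flat[cfa_color(y, x)] for x in range(4)] for y in range(4)]
rgb = demosaic_average(raw)
green_share = sum(cfa_color(y, x) == "G"
                  for y in range(4) for x in range(4)) / 16
```

On a flat scene every output pixel recovers the same RGB triple; on real images this naive averaging produces colour fringes at edges, which is the artifact the slide warns about.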
  • 14. OpenCL (memory system)
      • Desktop: non-uniform memory
          ‣ Data is physically copied between GPU and CPU memory
      • Embedded: uniform memory
          ‣ __local memory may be in __global
          ‣ Cheap data exchange between CPU and GPU
  • 15. A tour of some embedded GPUs: ARM Mali T604, Qualcomm Adreno 330
  • 16. ARM Mali T604, T628
      • Found in Samsung Exynos 5 Dual (T604)/Octa (T628) Application Processors
          ‣ Chromebook, Nexus 10, Samsung S4…
      • 32nm process for T604, 28nm for T628
      • T604 has 4 shader cores, T628 has 8 cores
      • Tri-pipe architecture: each GPU core has 3 types of instruction pipelines
          ‣ 1x load/store
          ‣ 1x texture
          ‣ 2x ALU (T604) / 4x ALU (T628)
      • 64-bit integers and IEEE 754 floating-point ALUs
  • 17. OpenCL and OpenGL ES (the Vithar Architecture)
      • Diagram labels: Thread Issue; Load/Store Pipeline; Arithmetic Pipeline (x2); Texturing Pipeline; Thread Completion
      • 3 kinds of pipelines
          ‣ Arithmetic
          ‣ Load/Store
          ‣ Texture
      • Barrel-threaded (like AMD/NVIDIA)
      • No SIMT execution (unlike AMD/NVIDIA)
      • SIMD (like AMD)
          ‣ Use vectors for best performance!
      • 256 threads max (64 in practice)
  • 18. Midgard Job Execution and Load-balancing
      • Automatic hardware load balancing
      • Seamless concurrent execution
      • Integrated seamless power manager
  • 19. Qualcomm MSM8974
      • Process: 28nm
      • CPU: 4x Krait 2.3 GHz
          ‣ ARMv7A Neon instruction set
          ‣ Power and performance efficiencies over ARM
          ‣ 4KB+4KB L0, 16KB+16KB L1, 2MB L2 cache
          ‣ No 64b support
      • GPU: Adreno 330 450 MHz
          ‣ 32x 32b scalar ALUs/pipeline, 8 pipelines, 129.6 GFLOPS (16b kernels provide 2x performance)
          ‣ 128b registers
          ‣ 8 KB local memory per shader core
          ‣ 8 KB constant memory
          ‣ 12 reads, 4 writes simultaneous per clock
          ‣ 512 work-items max
          ‣ 1.5 MB on-chip SRAM
          ‣ Tiled renderer, max 3.6 GPix/s
      • Hexagon DSP
          ‣ 3x core, 600 MHz, 16 KB L1, 256 KB L2, integrated MMU
          ‣ Limited floating-point support (no division, no log/exp…)
      • RAM: 2GB 2x LP-DDR3 800 MHz (12.8 GB/s)
      • Inset, "MSM8974: Adreno 330 vs Adreno 320"
          ‣ Adreno 330 has better performance: 450 MHz GPU clock (up from 400 MHz in Adreno 320)
          ‣ 2x better shader performance than A320: 2x more ALU blocks
          ‣ Dedicated GPU power rail: will allow GPU to be at a lower frequency and voltage than the FABR
      • Inset, "Adreno 330 Shader Processor “SP” Block"
          ‣ Total of 32 (32-bit) scalar ALUs
          ‣ 16-bit ALUs are used if the whole kernel is 16-bit; otherwise the 32b ALUs are used
  • 20. MobileHDR pipeline
  • 21. Arndale Samsung Exynos 5 Dual board
      • CPU: ARM Cortex-A15 (2-core) 1.7 GHz 32nm
          ‣ 32KB L1 cache, 1MB L2 cache
      • GPU: ARM Mali T604
          ‣ 64 concurrent threads
          ‣ Vector ALUs
          ‣ 128b registers
          ‣ OpenCL 1.1 Full Profile
      • RAM: 2GB LP-DDR3 800 MHz (12.8 GB/s)
      • Truly unified cached memory
          ‣ CPU and GPU memory is shared: NO COPY!
          ‣ 128b wide L1 and L2 access
  • 22. ARM Mali T604 GPUs in Samsung Exynos 5 Dual
      • Type: Vector GPU
      • Process: 32nm
      • OpenCL: 1.1 Full Profile
      • Unified memory: Yes
      • Rendering: Tile
      • Work-items: 256
      • Clock: 533MHz
      • L2 cache: 1MB
      • Register width: 128b
      • Global memory: 2GB LP-DDR3 800MHz (12.8 GB/s)
      • ALUs: 8 (2 ALUs/core)
      • Throughput: 100 GFLOPS
      • Local memory: 32KB/core (global)
      • Constant memory: 64KB
      • Texture cache: yes
      • Compute devices (shader cores): 4
      • Cacheline: 64 bytes
      • 16/32/64b floats: No/yes/yes
  • 23. Avoid buffer copy
      • Mali/Adreno have unified memory
          ‣ Use CL_MEM_ALLOC_HOST_PTR to avoid copy between CPU and GPU
      • Mali has no local memory
      • Adreno has local memory (1.5MB SRAM, 115GB/s)
      • Diagram (host data pointers; Global Memory shared by CPU (Host) and GPU (Compute Device))
          ‣ malloc(): buffers created by user (malloc) are not mapped into the GPU memory space
          ‣ clCreateBuffer(CL_MEM_USE_HOST_PTR): creates a new buffer and copies the data over (but the copy operations are expensive)
          ‣ clCreateBuffer(CL_MEM_ALLOC_HOST_PTR): clCre… create…
      • Where possible don’t use CL_…
          ‣ Create buffers at the start of your app
          ‣ Use CL_MEM_ALLOC_HOST_PTR instead of m…
          ‣ Then you can use the buffer on both…
  • 24. Aptina Sensor with MobileHDR™ turned off
  • 25. Aptina Sensor with MobileHDR™ turned on
  • 26. AR0833 8MP Camera sensor
      • Frame is inscribed in a 1/3.2" circle
          ‣ 4:3 for images, e.g. 8MP 3264 x 2448
          ‣ 16:9 for video, e.g. 6MP 3264 x 1836
      • 10 bits per pixel (framed in 16 bits)
      • At 30fps, we need 343 MB/s for 180 MPix/s
      • Interlaced HDR feature
      • Interface with ISP
          ‣ Data over MIPI CSI-2 (serial)
          ‣ Control over I2C
      • Diagram: 4:3 (3264 x 2448) and 16:9 (3264 x 1836) frames inside the 1/3.2" image circle
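The data-rate figures on this slide can be reproduced with a few lines of arithmetic. The sketch below assumes the 16:9 video mode, 2 bytes per pixel (10 bits framed in 16), and "MB" read as 2^20 bytes; those readings are inferred from the numbers, not stated on the slide, and the function name is hypothetical.

```python
def sensor_data_rate(width: int, height: int, fps: int,
                     bytes_per_pixel: int = 2):
    """Return (megapixels per second, MiB per second) for a raw sensor stream."""
    pixels_per_second = width * height * fps
    bytes_per_second = pixels_per_second * bytes_per_pixel
    return pixels_per_second / 1e6, bytes_per_second / 2**20


# 16:9 video mode of the AR0833 at 30 fps, 10-bit pixels stored in 16 bits.
mpix_per_s, mib_per_s = sensor_data_rate(3264, 1836, 30)
print(f"{mpix_per_s:.1f} MPix/s, {mib_per_s:.1f} MiB/s")
```

Under these assumptions the result rounds to 180 MPix/s and 343 MB/s. The full 4:3 frame (3264 x 2448) at the same frame rate would need about a third more bandwidth.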
  • 27. Feature: Interlaced HDR
      • 1 frame contains 2 exposures interlaced
      • Ratio between odd and even pairs
          ‣ User controlled: 1x, 2x, 4x, 8x
      • Datasheet excerpt (AR0833_DS, Rev. F, Pub. 4/13 EN), "Interlaced HDR Readout": The sensor enables HDR by outputting frames where even and odd row pairs within a single frame are captured at different integration times. This output is then matched with an algorithm designed to reconstruct this output into an HDR still image or video. The sensor HDR is controlled by two shutter pointers (Shutter pointer1, Shutter pointer2) that control the integration of the odd (Shutter pointer1) and even (Shutter pointer 2) row pairs.
      • Figure 16: HDR Integration Time (labels: Tint 1; Tint 2; Sample pointer; Shutter pointer 1; Shutter pointer 2; I-FRAME 1; I-FRAME 2; Exposure I-FRAME 1 (Exposure 1); Exposure I-FRAME 2 (Exposure 2); Output Frame from Sensor: Output I-FRAME 1 and 2)
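To make the row-pair layout concrete, here is a deliberately naive reconstruction sketch. It is not Aptina's reconstruction algorithm. It assumes that even-numbered row pairs (rows 0-1, 4-5, …) carry the long exposure, that a 10-bit code of 1023 means saturation, and that a saturated long-exposure pixel can borrow the short-exposure pixel two rows away (same Bayer phase) scaled by the exposure ratio. All names are hypothetical.

```python
from typing import List

SATURATION = 1023  # assumed clipping level for 10-bit samples


def row_is_long(y: int) -> bool:
    """Assumption: row pairs 0, 2, 4, ... are long exposure, the others short."""
    return (y // 2) % 2 == 0


def reconstruct_ihdr(raw: List[List[int]], ratio: int) -> List[List[int]]:
    """Bring both exposures onto the long-exposure scale, row by row."""
    h = len(raw)
    if h < 4:
        raise ValueError("need at least one long and one short row pair")
    out = []
    for y, row in enumerate(raw):
        if not row_is_long(y):
            # Short exposure: scale up by the exposure ratio (1x, 2x, 4x, 8x).
            out.append([v * ratio for v in row])
            continue
        # Long exposure: the row two lines away is short and has the same
        # Bayer phase, so use it to replace clipped pixels.
        donor = raw[y + 2] if y + 2 < h else raw[y - 2]
        out.append([d * ratio if v >= SATURATION else v
                    for v, d in zip(row, donor)])
    return out


raw = [
    [100, 1023],  # long exposure, second pixel clipped
    [200, 300],   # long exposure
    [25, 400],    # short exposure
    [50, 75],     # short exposure
]
hdr = reconstruct_ihdr(raw, ratio=4)
```

With an 8x ratio the scaled values need up to 13 bits, which is why the output of a reconstruction stage has to be wider than the 10-bit sensor samples.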
  • 28. mobileHDR demo
      • Zero-copy between sensor/OpenCL and OpenCL/OpenGL
      • On Arndale board (Samsung Exynos 5 Dual with Mali T604 GPU)
      • Pipeline stages: Noise Reduction; iHDR Reconstruction; Bayer scaler; Tone Mapping; Color Correction
      • Diagram labels: 10b iHDR 3264x1836; 14b; RGB888; EGLImage; CL Image; 1080p; OpenCL; GL Texture; OpenGL ES
  • 29. Summary
      • Embedded GPUs are ideal candidates for computational imaging
          ‣ Performance at reasonable image size is now available
          ‣ Power efficiency is being addressed
      • OpenCL 1.1 is available on all recent application processors
          ‣ But may be reserved for OEMs
          ‣ Performance portability isn’t guaranteed (but that is true of any high-performance application)
      • Opening the camera image-processing “black box” is now feasible, enabling incredible new applications
  • 30. Khronos Camera: a standard to control image acquisition and processing.
  • 31. Typical Imaging Pipeline
      • Pre- and Post-processing can be done on CPU, GPU, DSP…
      • ISP controls camera via 3A algorithms: Auto Exposure (AE), Auto White Balance (AWB), Auto Focus (AF)
      • ISP may be a separate chip or within the Application Processor
      • Diagram labels: Lens; Color Filter Array; CMOS sensor; Bayer; Pre-processing; Image Signal Processor (ISP); RGB/YUV; Post-processing; App; Lens, sensor, aperture control; 3A
      • Need for advanced camera control API:
          ‣ to drive more flexible app camera control
          ‣ over more types of camera sensors
          ‣ with tighter integration with the rest of the system
  • 32. Advanced Camera Control Use Cases
      • High-dynamic range (HDR) and computational flash photography
          ‣ High-speed burst with individual frame control over exposure and flash
      • Rolling shutter elimination
          ‣ High-precision intra-frame synchronization between camera and motion sensor
      • HDR Panorama, photo-spheres
          ‣ Continuous frame capture with constant exposure and white balance
      • Subject isolation and depth detection
          ‣ High-speed burst with individual frame control over focus
      • Time-of-flight or structured light depth camera processing
          ‣ Aligned stacking of data from multiple sensors
      • Augmented Reality
          ‣ 60Hz, low-latency capture with motion sensor synchronization
          ‣ Multiple Region of Interest (ROI) capture
          ‣ Multiple sensors for scene scaling
          ‣ Detailed feedback on camera operation per frame
  • 33. Camera API Architecture (FCAM based)
      • No global state
          ‣ State travels with image requests
          ‣ Every stage in the pipeline may have different state
          ‣ This allows fast, deterministic state changes
      • Synchronize devices
          ‣ Lens, flash, sound capture, gyro…
          ‣ Devices can schedule Actions, e.g. to be triggered on exposure change
          ‣ Enables device synchronization
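The "state travels with image requests" idea can be sketched in a few lines. The classes below are hypothetical illustrations of the concept. They are not the FCam API and not the Khronos Camera API: each request (`Shot`) carries its own exposure, gain and scheduled actions, and each returned `Frame` reports the request that produced it.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, List, Tuple


@dataclass(frozen=True)
class Action:
    """Something a device does at an offset from the start of exposure."""
    offset_us: int
    name: str


@dataclass(frozen=True)
class Shot:
    """A capture request: all sensor state is carried by the request itself."""
    exposure_us: int
    gain: float
    actions: Tuple[Action, ...] = ()


@dataclass
class Frame:
    """A captured frame, tagged with the exact request that produced it."""
    shot: Shot
    events: List[str] = field(default_factory=list)


class SimulatedSensor:
    """Toy pipeline: requests go in, frames come out in the same order."""

    def __init__(self) -> None:
        self._pending: Deque[Shot] = deque()

    def capture(self, shot: Shot) -> None:
        self._pending.append(shot)

    def get_frame(self) -> Frame:
        shot = self._pending.popleft()
        ordered = sorted(shot.actions, key=lambda a: a.offset_us)
        return Frame(shot, [f"{a.name}@{a.offset_us}us" for a in ordered])


sensor = SimulatedSensor()
sensor.capture(Shot(exposure_us=1000, gain=1.0))
sensor.capture(Shot(exposure_us=8000, gain=2.0,
                    actions=(Action(500, "flash"), Action(0, "lens_move"))))
first, second = sensor.get_frame(), sensor.get_frame()
```

Because no setting lives in global state, two back-to-back requests with different exposures cannot interfere with each other, and the application always knows which parameters a given frame was captured with.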
  • 34. Visual Sensor Revolution
      • Single sensor RGB cameras are just the start of the mobile visual revolution
          ‣ IR sensors: LEAP Motion, eye-trackers
      • Multi-sensors: Stereo pairs -> Plenoptic array -> Depth cameras
          ‣ Stereo pair can enable object scaling and enhanced depth extraction
          ‣ Plenoptic Field processing needs FFTs and ray-casting
      • Hybrid visual sensing solutions
          ‣ Different sensors mixed for different distances and lighting conditions
      • GPUs today, more dedicated ISPs tomorrow?
      • Pictured: Dual Camera (LG Electronics); Plenoptic Array (Pelican Imaging); Capri Structured Light 3D Camera (PrimeSense)
  • 35. Khronos APIs for Augmented Reality
      • Diagram labels: MEMS Sensors; Sensor Fusion; Camera Control API; Advanced Camera Control and stream generation; Vision Processing; 3D Rendering and Video Composition on GPU; Audio Rendering; Application on CPUs, GPUs and DSPs
      • EGLStream: stream data between APIs
      • Precision timestamps on all sensor samples
      • AR needs not just advanced sensor processing, vision acceleration, computation and rendering, but also for all these subsystems to work efficiently together
  • 36. Khronos Camera API
      • Catalyze camera functionality not available on any current platform
          ‣ Open API that aligns with future platform directions for easy adoption
          ‣ E.g. could be used to implement future versions of Android Camera HAL
      • Control multiple sensors with synch and alignment
          ‣ E.g. Stereo pairs, Plenoptic arrays, TOF or structured light depth cameras
      • More detailed control per frame
          ‣ Format flexibility, Region of Interest (ROI) selection
      • Global Timing & Synchronization
          ‣ E.g. Between cameras and MEMS sensors
      • Application control over ISP processing (including 3A)
          ‣ Including multiple, re-entrant ISPs
      • Flexible processing/streaming
          ‣ Multiple output streams and streaming rows (not just frames)
          ‣ RAW, Bayer and YUV Processing
  • 37. Camera API Design Milestones and Philosophy
      • C-language API starting from proven designs
          ‣ e.g. FCAM, Android camera HAL V3
      • Design alignment with widely used hardware standards
          ‣ e.g. MIPI CSI
      • Focus on mobile, power-limited devices
          ‣ But do not preclude other use cases such as automotive, surveillance, DSLR…
      • Minimize overlap and maximize interoperability with other Khronos APIs
          ‣ But other Khronos APIs are not required
      • Provide support for vendor-specific extensions
      • Timeline labels, in slide order: Apr13; Jul13; Group charter approved; 4Q13; Provisional specification; 1Q14; First draft specification; 2Q14; Sample implementation and tests; 3Q14; Specification ratification
  • 38. Questions & Answers. Thank you!