try! Swift Tokyo 2018
Shuichi Tsutsumi
@shu223
UIImageView vs Metal
Shuichi Tsutsumi @shu223
• iOS Developer
- @Fyusion Inc.
- @Freelance
Today’s Goal
• Learn “how to use Metal”
• Become conscious of the GPU layer through Metal
Agenda
• Compare the graphics rendering performance of Metal with UIImageView
→ Learn a lot about the GPU
1. UIKit is well optimized for the GPU.
2. Consider the GPU, too, when measuring performance.
3. Pay attention to the processing flow between the CPU and the GPU.
4. Be careful about where the resource is.
imageView.image = image
What’s happening?
[Diagram: Processor → Frame Buffer → Screen]
• The processor writes the pixel data for one frame into the frame buffer (resolution × 60 times/sec)
• The screen draws the pixels
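To get a feel for "60 times/sec × resolution", here is a back-of-the-envelope sketch. The 750 × 1334 resolution is the iPhone 6s screen; the 4 bytes per pixel (8-bit BGRA) and the function name are assumptions made for illustration, not figures from the talk.

```swift
/// Bytes that have to be written to the frame buffer every second.
func frameBufferBytesPerSecond(width: Int,
                               height: Int,
                               bytesPerPixel: Int = 4,      // assumption: 8-bit BGRA
                               framesPerSecond: Int = 60) -> Int {
    return width * height * bytesPerPixel * framesPerSecond
}

// iPhone 6s screen: 750 x 1334 pixels
let bytes = frameBufferBytesPerSecond(width: 750, height: 1334)
print("\(bytes / 1_000_000) MB/sec")  // prints "240 MB/sec"
```

Every one of those pixels is computed the same way, which is why this kind of work suits a processor built for parallel tasks.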
Difference Between CPU and GPU
CPU is a Sports Car
• Very fast
• Can’t process many tasks in parallel
GPU is a Bus
• Not as fast as the CPU
• Can process many “same” tasks in parallel
• The CPU is very fast and good for any task (it is a general-purpose processor)
- However, if it is used to process everything, it will easily reach 100% load.
→ Utilize the GPU as much as possible if the task suits the GPU (= can be computed in parallel)
[Diagram: GPU (as the processor) → Frame Buffer → Screen]
• The GPU writes the pixel data for 1 frame into the frame buffer (resolution × 60 times/sec)
What’s Metal?
• Provides access to the GPU
[Diagram: Your app / ??? / GPU]
What’s the difference from OpenGL?
OpenGL
• Cross-platform
• Supports many vendors’ GPUs
Metal
• Developed by Apple
• Optimized for Apple’s hardware
• Up to 10x faster than OpenGL (Apple’s claim, made for draw call rates)
Sounds great!
Metal Implementation
imageView.image = image
To achieve this with Metal…
// Called for every frame: copy the image texture into the drawable and present it.
func draw(in view: MTKView) {
    guard let drawable = view.currentDrawable else { return }
    guard let commandBuffer = commandQueue.makeCommandBuffer() else { fatalError() }
    guard let blitEncoder = commandBuffer.makeBlitCommandEncoder() else { fatalError() }

    // Copy only the region that fits in both textures.
    let w = min(texture.width, drawable.texture.width)
    let h = min(texture.height, drawable.texture.height)
    blitEncoder.copy(from: texture,
                     sourceSlice: 0,
                     sourceLevel: 0,
                     sourceOrigin: MTLOrigin(x: 0, y: 0, z: 0),
                     sourceSize: MTLSizeMake(w, h, texture.depth),
                     to: drawable.texture,
                     destinationSlice: 0,
                     destinationLevel: 0,
                     destinationOrigin: MTLOrigin(x: 0, y: 0, z: 0))
    blitEncoder.endEncoding()

    // Schedule presentation, submit to the GPU, and block until the GPU is done.
    commandBuffer.present(drawable)
    commandBuffer.commit()
    commandBuffer.waitUntilCompleted()
}

private let device = MTLCreateSystemDefaultDevice()!

// One-time setup: command queue, texture, and MTKView configuration.
private func setup() {
    commandQueue = device.makeCommandQueue()
    let textureLoader = MTKTextureLoader(device: device)
    texture = try! textureLoader.newTexture(
        name: "highsierra",
        scaleFactor: view.contentScaleFactor,
        bundle: nil)
    mtkView.device = device
    mtkView.delegate = self
    mtkView.colorPixelFormat = texture.pixelFormat
}
This is the minimum implementation, compared with:
imageView.image = image
💡 My idea: a Metal wrapper class to draw an image
“MetalImageView”
✓ As easy to use as UIImageView
✓ Metal-accelerated
metalImageView.texture = texture
Powered by
Performance comparison with UIImageView
Sample app for the comparison
• Render large images in table cells.
- 5120 x 3200 (elcapitan.jpg)
- 1245 x 1245 (sierra.png)
Measuring Code
let time1 = CACurrentMediaTime()
if isMetal {
    // Render with Metal
    let metalCell = cell as! MetalTableViewCell
    metalCell.metalImageView.textureName = name
} else {
    // Render with UIImageView
    let uikitCell = cell as! TableViewCell
    uikitCell.uiImageView.image = UIImage(named: name)
}
let time2 = CACurrentMediaTime()
// Time interval between the two timestamps
print("time: \(time2 - time1)")
Results
• Metal is 10x - 20x faster!
Time to render an image (iPhone 6s)
UIImageView: 0.4 - 0.6 msec
Metal: 0.02 - 0.05 msec
Something weird
Metal is laggier and choppier
[Side-by-side videos: UIImageView, Metal]
Basic Concept
1. Load image data to memory for GPU (& CPU)
2. CPU creates GPU commands as a command buffer
3. Push it to GPU
4. GPU processes the commands
What the measuring code covers:
1. Load image data to memory for GPU (& CPU)
2. CPU creates GPU commands as a command buffer
3. Push it to GPU
4. GPU processes the commands ← NOT considered!
Fixed measuring code
• Measure the time until the GPU processing is completed
func draw(in view: MTKView) {
    // Prepare the command buffer
    ...
    // Submit the commands to the GPU
    commandBuffer.commit()
    // Wait until the GPU processing is completed
    commandBuffer.waitUntilCompleted()
    // Calculate the total time (startTime is recorded earlier; not shown on the slide)
    let endTime = CACurrentMediaTime()
    print("Time: \(endTime - startTime)")
}
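As a possible alternative to blocking with waitUntilCompleted(), a completion handler can report the timing without stalling the CPU. This is a minimal sketch, not the code from the talk: the function name commitAndMeasure and the label parameter are hypothetical, and it assumes an OS version where MTLCommandBuffer exposes gpuStartTime and gpuEndTime.

```swift
import Metal
import QuartzCore

/// Commits `commandBuffer` and prints timings once the GPU has finished.
/// Assumption: the caller has finished encoding and has not committed yet.
func commitAndMeasure(_ commandBuffer: MTLCommandBuffer, label: String) {
    let cpuStart = CACurrentMediaTime()

    // The handler has to be registered before commit().
    commandBuffer.addCompletedHandler { buffer in
        // Wall-clock time from submission until the GPU finished.
        let total = CACurrentMediaTime() - cpuStart
        // Time the GPU itself spent executing this command buffer.
        let gpuOnly = buffer.gpuEndTime - buffer.gpuStartTime
        print("\(label): total \(total * 1000) msec, GPU \(gpuOnly * 1000) msec")
    }

    commandBuffer.commit()
}
```

The handler runs on a thread chosen by Metal, so anything that touches UIKit from inside it would need to hop back to the main queue.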
Results
• Metal is SLOWER!?
- Less than 30 fps even in the best case
→ My implementation must have problems
• UIImageView is fast enough anyway.
Time to render an image
UIImageView: 0.4 - 0.6 msec
Metal: 40 - 200 msec
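The "less than 30 fps" figure follows directly from the measured times. A small worked example (the function name is just for illustration): at 60 fps the budget is 1000 / 60 ≈ 16.7 msec per frame, and the Metal measurements are well over that.

```swift
/// Frame rate you get if every frame takes `msec` milliseconds to render.
func framesPerSecond(frameTimeMsec msec: Double) -> Double {
    return 1000.0 / msec
}

print(framesPerSecond(frameTimeMsec: 40))   // prints "25.0" (best case)
print(framesPerSecond(frameTimeMsec: 200))  // prints "5.0" (worst case)
```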
Why is UIImageView so fast?
※ From the WWDC17 Platforms State of the Union
UIKit internally uses Metal
• UIKit has been updated and is well optimized.
• Use UIKit rather than building a custom UI component with low-level APIs (e.g. Metal), unless there is a particular reason the custom one would be better.
Point 1:
UIKit is well optimized for the GPU
Point 2:
Consider the GPU, too, when measuring performance
Why was MetalImageView (my Metal wrapper) so slow?
What was the problem?
Profile using Instruments: Metal System Trace
[Screenshot: Metal System Trace timeline with CPU and GPU rows]
• On CPU: create command buffers, etc.
• On CPU: submit command buffers, etc.
• On GPU: process shaders
Problem 1
[Trace timeline over the measured time: Resize (MPSImageLanczosScale) → unexpected interval → Render (MTLBlitCommandEncoder)]
Current processing flow
1. Resize with MPSImageLanczosScale
2. After 1 is completed, call setNeedsDisplay()
3. draw(in:) of MTKViewDelegate is called
4. Render to screen in the draw(in:)
Problem: The CPU is waiting for the GPU
[Trace timeline: CPU and GPU rows]
Resize and Render are each submitted separately, so each one goes through:
2. CPU creates GPU commands as a command buffer
3. Push it to GPU
4. GPU processes the commands
FIX: Combine the commands
• Encode both commands into one command buffer
- Resize
- Render
• Push the command buffer to the GPU
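The fix above could look roughly like the sketch below. It is not the implementation from the talk: the function name and the resized parameter are hypothetical, and it assumes resized is an intermediate texture created with shader-write usage and the same pixel format as the drawable.

```swift
import MetalKit
import MetalPerformanceShaders

/// Encodes resize and render into ONE command buffer and commits it once.
func encodeResizeAndRender(commandQueue: MTLCommandQueue,
                           source: MTLTexture,
                           resized: MTLTexture,      // assumption: intermediate texture
                           drawable: CAMetalDrawable) {
    guard let commandBuffer = commandQueue.makeCommandBuffer() else { return }

    // 1. Resize: Lanczos scaling from the source into the intermediate texture.
    let scale = MPSImageLanczosScale(device: commandQueue.device)
    scale.encode(commandBuffer: commandBuffer,
                 sourceTexture: source,
                 destinationTexture: resized)

    // 2. Render: copy the resized texture into the drawable.
    guard let blitEncoder = commandBuffer.makeBlitCommandEncoder() else { return }
    let w = min(resized.width, drawable.texture.width)
    let h = min(resized.height, drawable.texture.height)
    blitEncoder.copy(from: resized,
                     sourceSlice: 0,
                     sourceLevel: 0,
                     sourceOrigin: MTLOrigin(x: 0, y: 0, z: 0),
                     sourceSize: MTLSize(width: w, height: h, depth: 1),
                     to: drawable.texture,
                     destinationSlice: 0,
                     destinationLevel: 0,
                     destinationOrigin: MTLOrigin(x: 0, y: 0, z: 0))
    blitEncoder.endEncoding()

    // One submission for both steps: the CPU never waits between them.
    commandBuffer.present(drawable)
    commandBuffer.commit()
}
```

Commands in a single command buffer run in the order they were encoded, so the render step still sees the finished resize without a CPU round trip in between.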
Resize and Render now share one pass through the steps:
2. CPU creates GPU commands as a command buffer (Resize + Render)
3. Push it to GPU
4. GPU processes the commands
[Trace timeline, before: Resize → unexpected interval → Render. After combining: Resize+Render]
Point 3:
Pay attention to the processing flow between the CPU and the GPU
Problem 2
Speculation:
Loading textures is the bottleneck(?)
1. Load image data to memory for GPU (& CPU)
Measure the time to load textures
let startTime = CACurrentMediaTime()
textureLoader.newTexture(name: name, scaleFactor: scaleFactor, bundle: nil) { (texture, error) in
    let endTime = CACurrentMediaTime()
    print("Time to load \(name): \(endTime - startTime)")
}
• Results: 20 - 500 msec
→ It’s the bottleneck!
Fix: Cache the loaded textures
• UIImage(named:) caches internally, too
• “Caching loaded image data” is NOT a Metal/GPU-specific idea.
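Metal/GPU-specific point:
“Where is the resource?”
→ It should be in memory for the GPU (& CPU)
private var cachedTextures: [String: MTLTexture] = [:]  // OK: already a texture the GPU can use
private var cachedImages: [String: UIImage] = [:]       // NG (no good): still has to be turned into a texture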
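A cache along these lines can be sketched in a few lines. The ResourceCache type and its load closure are hypothetical names for illustration; in the Metal case, Value would be MTLTexture and the closure would call the texture loader. The sketch is not thread-safe.

```swift
/// A minimal name-keyed cache. In the Metal case, `Value` would be `MTLTexture`.
final class ResourceCache<Value> {
    private var storage: [String: Value] = [:]
    private let load: (String) throws -> Value

    init(load: @escaping (String) throws -> Value) {
        self.load = load
    }

    /// Returns the cached value, loading it only on the first request.
    func value(named name: String) throws -> Value {
        if let cached = storage[name] { return cached }
        let loaded = try load(name)
        storage[name] = loaded
        return loaded
    }
}

// Usage with a stand-in loader that counts how often it runs.
var loadCount = 0
let cache = ResourceCache<[UInt8]> { name in
    loadCount += 1
    return Array(name.utf8)
}
_ = try cache.value(named: "sierra")
_ = try cache.value(named: "sierra")
print(loadCount)  // prints "1"
```

The point of keeping textures rather than UIImage values in the cache is that the expensive step, getting the data into memory the GPU can use, happens only once per image.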
After adopting the cache
Point 4:
Be careful about where the resource is.
Wrap up
Today’s Goal
• Learn “how to use Metal”
• Become conscious of the GPU layer through Metal
• Compared the graphics rendering performance of Metal with UIImageView
→ Learned a lot about the GPU
1. UIKit is well optimized for the GPU.
2. Consider the GPU, too, when measuring performance.
3. Pay attention to the processing flow between the CPU and the GPU.
4. Be careful about where the resource is.
Thank you!
https://github.com/shu223
