SlideShare a Scribd company logo
1 of 20
Download to read offline
What I Know about
TensorFlow Lite
Koan-Sin Tan

freedom@computer.org

Hsinchu Coding Serfs Meeting

Dec 7th, 2017
• We heard Android NN and TensorFlow Lite back in Google I/
O 2017

• My COSCUP 2017 slide deck “TensorFlow on Android”

• https://www.slideshare.net/kstan2/tensorflow-on-
android

• People knew a bit about Android NN API before it was
announced and released

• No information about TensorFlow Lite, at least to me,
before it was released in Nov
tf-lite and android NN in
Google I/O
• New TensorFlow runtime
• Optimized for mobile and
embedded apps

• Runs TensorFlow models on
device

• Leverage Android NN API

• Soon to be open sourced
from Google I/O 2017 video
Actual Android NN API
• Announced with Android 8.1 Preview 1

• Available to developer in NDK

• yes, NDK

• The Android Neural Networks API (NNAPI)
is an Android C API designed for running
computationally intensive operations for
machine learning on mobile devices

• NNAPI is designed to provide a base layer
of functionality for higher-level machine
learning frameworks (such as TensorFlow
Lite, Caffe2, or others) that build and train
neural networks

• The API is available on all devices running
Android 8.1 (API level 27) or higher.
https://developer.android.com/ndk/images/nnapi/nnapi_architecture.png
Android NN on Pixel 2
• Only the CPU fallback is available

• Actually, you can see Android NN API related in AOSP
after Oreo MR1 (8.1) release already

• user level code, see https://android.googlesource.com/
platform/frameworks/ml/+/oreo-mr1-release

• HAL, see https://android.googlesource.com/platform/
hardware/interfaces/+/oreo-mr1-release/
neuralnetworks/
TensorFlow Lite
• TensorFlow Lite is TensorFlow’s lightweight solution for
mobile and embedded devices

• It enables on-device machine learning inference with low
latency and a small binary size

• Low latency techniques: optimizing the kernels for mobile
apps, pre-fused activations, and quantized kernels that
allow smaller and faster (fixed-point math) models

• TensorFlow Lite also supports hardware acceleration with
the Android Neural Networks API
https://www.tensorflow.org/mobile/tflite/
What does TensorFlow Lite
contain?
• a set of core operators, both quantized and float, which have been tuned for mobile platforms

• pre-fused activations and biases to further enhance performance and quantized accuracy

• using custom operations in models also supported

• a new model file format, based on FlatBuffers

• the primary difference is that FlatBuffers does not need a parsing/unpacking step to a secondary
representation before you can access data

• the code footprint of FlatBuffers is an order of magnitude smaller than protocol buffers

• a new mobile-optimized interpreter, 

• key goals: keeping apps lean and fast. 

• a static graph ordering and a custom (less-dynamic) memory allocator to ensure minimal load,
initialization, and execution latency

• an interface to Android NN API if available
https://www.tensorflow.org/mobile/tflite/
why a new mobile-specific
library?
• Innovation at the silicon layer is enabling new possibilities for hardware
acceleration, and frameworks such as the Android Neural Networks API
make it easy to leverage these

• Recent advances in real-time computer-vision and spoken language
understanding have led to mobile-optimized benchmark models being open
sourced (e.g. MobileNets, SqueezeNet)

• Widely-available smart appliances create new possibilities for on-device
intelligence

• Interest in stronger user data privacy paradigms where user data does not
need to leave the mobile device

• Ability to serve ‘offline’ use cases, where the device does not need to be
connected to a network
https://www.tensorflow.org/mobile/tflite/
• A set of core operators, both quantized and float, many of which have been tuned for mobile platforms. These
can be used to create and run custom models. Developers can also write their own custom operators and
use them in models

• A new FlatBuffers-based model file format

• On-device interpreter with kernels optimized for faster execution on mobile

• TensorFlow converter to convert TensorFlow-trained models to the TensorFlow Lite format.

• Smaller in size: TensorFlow Lite is smaller than 300KB when all supported operators are linked and less than
200KB when using only the operators needed for supporting InceptionV3 and Mobilenet

• Pre-tested models

• Inception V3, MobileNet, On Device Smart Reply

• Quantized versions of the MobileNet model, which runs faster than the non-quantized (float) version on CPU.

• New Android demo app to illustrate the use of TensorFlow Lite with a quantized MobileNet model for object
classification

• Java and C++ API support
https://www.tensorflow.org/mobile/tflite/
• Java API: A convenience
wrapper around the C++ API
on Android

• C++ API: Loads the
TensorFlow Lite Model File
and invokes the Interpreter.
The same library is available
on both Android and iOS
https://www.tensorflow.org/mobile/tflite/
• Let $TF_ROOT be root of tensorflow

• source of tf-lite: ${TF_ROOT}/tensorflow/contrib/lite/

• https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/README.md

• examples

• two for Android, two for iOS

• APIs: ${TF_ROOT}/tensorflow/contrib/lite/g3doc/apis.md, https://github.com/tensorflow/
tensorflow/blob/master/tensorflow/contrib/lite/g3doc/apis.md

• no benchmark_model: well there is one, https://github.com/tensorflow/tensorflow/blob/master/
tensorflow/contrib/lite/tools/benchmark_model.cc

• it’s incomplete

• no command line label_image (https://github.com/tensorflow/tensorflow/tree/master/tensorflow/
examples/label_image)
• model: .tflite model

• resolver: if no custom ops, builtin
op resolve is enough

• interpreter: we need it to compute
the graph

• interpreter->AllocateTensor():
allocate stuff for you, e.g., input
tensor(s)

• fill the input

• interpreter->Invoke(): run the graph

• process the output
tflite::FlatBufferModel	model(path_to_model);	
tflite::ops::builtin::BuiltinOpResolver	resolver;	
std::unique_ptr<tflite::Interpreter>	interpreter;	
tflite::InterpreterBuilder(*model,	resolver)(&interpreter);	
//	Resize	input	tensors,	if	desired.	
interpreter->AllocateTensors();	
float*	input	=	interpreter->typed_input_tensor<float>(0);	
//	Fill	`input`.	
interpreter->Invoke();	
float*	output	=	interpreter->type_output_tensor<float>(0);
beyond basic stuff
• https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/interpreter.h

• const char* GetInputName(int index): https://github.com/tensorflow/tensorflow/blob/master/
tensorflow/contrib/lite/interpreter.h#L157

• const char* GetOutputName(int index): https://github.com/tensorflow/tensorflow/blob/
master/tensorflow/contrib/lite/interpreter.h#L166

• int tensors_size(): https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/
lite/interpreter.h#L171

• TfLiteTensor* tensor(int tensor_index)

• int nodes_size(): https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/
lite/interpreter.h#L174

• const std::pair<TfLiteNode, TfLiteRegistration>* node_and_registration(int node_index) 

• Yes, we should be able to enumerate/traverse tensors and nodes
beyond basic stuff
• void UseNNAPI(bool enable)

• void SetNumThreads(int num_threads)

• TfLiteTensor: https://github.com/freedomtan/tensorflow/blob/
label_image_tflite_pr/tensorflow/contrib/lite/context.h#L163

• my label_image for tflite

• https://github.com/freedomtan/tensorflow/blob/
label_image_tflite_pr/tensorflow/contrib/lite/examples/
label_image/label_image.cc#L103
• logs of running label_image

• https://drive.google.com/file/d/11LAI_b1fVOM2GxOT_gOpOMpASaqe5m_U/view?
usp=sharing

• builtin state dump function

• void PrintInterpreterState(Interpreter* interpreter): https://github.com/tensorflow/tensorflow/
blob/master/tensorflow/contrib/lite/optional_debug_tools.h#L25

• https://github.com/freedomtan/tensorflow/blob/label_image_tflite_pr/tensorflow/contrib/lite/
examples/label_image/label_image.cc#L147

• TF operations --> TF Lite operations is not trivial

• https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/g3doc/
tf_ops_compatibility.md

• https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/
nnapi_delegate.cc
speed of quantized one
• It seems it's much better than naive quantization as we saw before

• On Nexus 9 (MobileNet 1.0/224)

• Quantized

• ./label_image -t 2: ~ 160 ms

• ./label_image -t 2 -c 100: ~ 60 ms

• Floating point

• ./label_image -t 2 -f 1 -m ./mobilenet_v1_1.0_224.tflite: ~ 300 ms

• ./label_image -t 2 -c 100 -f 1 -m ./mobilenet_v1_1.0_224.tflite: ~ 82 ms

• TFLiteCameraDemo: 130 - 180 ms

• Pixel 2

• TFLiteCameraDemo: ~ 100 ms

• didn’t see significant difference, w/ or w/o Android NN runtime
Custom Operators
• https://github.com/tensorflow/
tensorflow/blob/master/
tensorflow/contrib/lite/g3doc/
custom_operators.md

• OpInit(), OpFree(),
OpPrepare(), and OpInvoke() in
interpreter.cc
typedef	struct	{	
		void*	(*init)(TfLiteContext*	context,	const	char*	buffer,	size_t	length);	
		void	(*free)(TfLiteContext*	context,	void*	buffer);	
		TfLiteStatus	(*prepare)(TfLiteContext*	context,	TfLiteNode*	node);	
		TfLiteStatus	(*invoke)(TfLiteContext*	context,	TfLiteNode*	node);	
}	TfLiteRegistration;
Fake Quantiztion
• How hard can it be? How much time is needed?

• Several pre-tested models are available

• https://github.com/tensorflow/tensorflow/blob/master/
tensorflow/contrib/lite/g3doc/models.md

• but only one of them (https://storage.googleapis.com/
download.tensorflow.org/models/tflite/
mobilenet_v1_224_android_quant_2017_11_08.zip) is quantized
one

• as we can guess from related docs, retrain is kinda required to
get accuracy back
• BLAS part: eigen (http://eigen.tuxfamily.org/) and
gemmlowp (https://github.com/google/gemmlowp)
The End

More Related Content

What's hot

Tensorflow presentation
Tensorflow presentationTensorflow presentation
Tensorflow presentationAhmed rebai
 
Introduction to TensorFlow
Introduction to TensorFlowIntroduction to TensorFlow
Introduction to TensorFlowMatthias Feys
 
Getting started with TensorFlow
Getting started with TensorFlowGetting started with TensorFlow
Getting started with TensorFlowElifTech
 
第 1 回 Jetson ユーザー勉強会
第 1 回 Jetson ユーザー勉強会第 1 回 Jetson ユーザー勉強会
第 1 回 Jetson ユーザー勉強会NVIDIA Japan
 
Graphics processing unit ppt
Graphics processing unit pptGraphics processing unit ppt
Graphics processing unit pptSandeep Singh
 
PyTorch Python Tutorial | Deep Learning Using PyTorch | Image Classifier Usin...
PyTorch Python Tutorial | Deep Learning Using PyTorch | Image Classifier Usin...PyTorch Python Tutorial | Deep Learning Using PyTorch | Image Classifier Usin...
PyTorch Python Tutorial | Deep Learning Using PyTorch | Image Classifier Usin...Edureka!
 
Accelerated Training of Transformer Models
Accelerated Training of Transformer ModelsAccelerated Training of Transformer Models
Accelerated Training of Transformer ModelsDatabricks
 
GPU - An Introduction
GPU - An IntroductionGPU - An Introduction
GPU - An IntroductionDhan V Sagar
 
Les réseaux de neurones
Les réseaux de neuronesLes réseaux de neurones
Les réseaux de neuronesMariam Amchayd
 
Deep learning: Hardware Landscape
Deep learning: Hardware LandscapeDeep learning: Hardware Landscape
Deep learning: Hardware LandscapeGrigory Sapunov
 
Deep learning with tensorflow
Deep learning with tensorflowDeep learning with tensorflow
Deep learning with tensorflowCharmi Chokshi
 
“Making Edge AI Inference Programming Easier and Flexible,” a Presentation fr...
“Making Edge AI Inference Programming Easier and Flexible,” a Presentation fr...“Making Edge AI Inference Programming Easier and Flexible,” a Presentation fr...
“Making Edge AI Inference Programming Easier and Flexible,” a Presentation fr...Edge AI and Vision Alliance
 
HYPER-THREADING TECHNOLOGY
HYPER-THREADING TECHNOLOGYHYPER-THREADING TECHNOLOGY
HYPER-THREADING TECHNOLOGYSHASHI SHAW
 
Introduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga PetrovaIntroduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga PetrovaAlexey Grigorev
 
graphics processing unit ppt
graphics processing unit pptgraphics processing unit ppt
graphics processing unit pptNitesh Dubey
 
TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...
TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...
TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...Simplilearn
 
Hardware Acceleration for Machine Learning
Hardware Acceleration for Machine LearningHardware Acceleration for Machine Learning
Hardware Acceleration for Machine LearningCastLabKAIST
 

What's hot (20)

Tensorflow presentation
Tensorflow presentationTensorflow presentation
Tensorflow presentation
 
Introduction to TensorFlow
Introduction to TensorFlowIntroduction to TensorFlow
Introduction to TensorFlow
 
Getting started with TensorFlow
Getting started with TensorFlowGetting started with TensorFlow
Getting started with TensorFlow
 
Pytorch
PytorchPytorch
Pytorch
 
第 1 回 Jetson ユーザー勉強会
第 1 回 Jetson ユーザー勉強会第 1 回 Jetson ユーザー勉強会
第 1 回 Jetson ユーザー勉強会
 
Graphics processing unit ppt
Graphics processing unit pptGraphics processing unit ppt
Graphics processing unit ppt
 
PyTorch Python Tutorial | Deep Learning Using PyTorch | Image Classifier Usin...
PyTorch Python Tutorial | Deep Learning Using PyTorch | Image Classifier Usin...PyTorch Python Tutorial | Deep Learning Using PyTorch | Image Classifier Usin...
PyTorch Python Tutorial | Deep Learning Using PyTorch | Image Classifier Usin...
 
Accelerated Training of Transformer Models
Accelerated Training of Transformer ModelsAccelerated Training of Transformer Models
Accelerated Training of Transformer Models
 
GPU - An Introduction
GPU - An IntroductionGPU - An Introduction
GPU - An Introduction
 
Les réseaux de neurones
Les réseaux de neuronesLes réseaux de neurones
Les réseaux de neurones
 
Deep learning: Hardware Landscape
Deep learning: Hardware LandscapeDeep learning: Hardware Landscape
Deep learning: Hardware Landscape
 
Deep learning with tensorflow
Deep learning with tensorflowDeep learning with tensorflow
Deep learning with tensorflow
 
“Making Edge AI Inference Programming Easier and Flexible,” a Presentation fr...
“Making Edge AI Inference Programming Easier and Flexible,” a Presentation fr...“Making Edge AI Inference Programming Easier and Flexible,” a Presentation fr...
“Making Edge AI Inference Programming Easier and Flexible,” a Presentation fr...
 
HYPER-THREADING TECHNOLOGY
HYPER-THREADING TECHNOLOGYHYPER-THREADING TECHNOLOGY
HYPER-THREADING TECHNOLOGY
 
Introduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga PetrovaIntroduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga Petrova
 
graphics processing unit ppt
graphics processing unit pptgraphics processing unit ppt
graphics processing unit ppt
 
TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...
TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...
TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...
 
Hardware Acceleration for Machine Learning
Hardware Acceleration for Machine LearningHardware Acceleration for Machine Learning
Hardware Acceleration for Machine Learning
 
GPU Programming
GPU ProgrammingGPU Programming
GPU Programming
 
OpenVINO introduction
OpenVINO introductionOpenVINO introduction
OpenVINO introduction
 

Similar to Introduction to TensorFlow Lite

open source nn frameworks on cellphones
open source nn frameworks on cellphonesopen source nn frameworks on cellphones
open source nn frameworks on cellphonesKoan-Sin Tan
 
Introduction to Tensor Flow-v1.pptx
Introduction to Tensor Flow-v1.pptxIntroduction to Tensor Flow-v1.pptx
Introduction to Tensor Flow-v1.pptxJanagi Raman S
 
Tensorflow on Android
Tensorflow on AndroidTensorflow on Android
Tensorflow on AndroidKoan-Sin Tan
 
Tensor flow intro and summit info feb 2017
Tensor flow intro and summit info feb 2017Tensor flow intro and summit info feb 2017
Tensor flow intro and summit info feb 2017Sam Witteveen
 
Alan Pope [InfluxData] | Data Collectors | InfluxDays 2022
Alan Pope [InfluxData] | Data Collectors | InfluxDays 2022Alan Pope [InfluxData] | Data Collectors | InfluxDays 2022
Alan Pope [InfluxData] | Data Collectors | InfluxDays 2022InfluxData
 
running Tensorflow in Production
running Tensorflow in Productionrunning Tensorflow in Production
running Tensorflow in ProductionMatthias Feys
 
The Fn Project: A Quick Introduction (December 2017)
The Fn Project: A Quick Introduction (December 2017)The Fn Project: A Quick Introduction (December 2017)
The Fn Project: A Quick Introduction (December 2017)Oracle Developers
 
TensorFlow Technology
TensorFlow TechnologyTensorFlow Technology
TensorFlow Technologynarayan dudhe
 
Bringing TensorFlow to Android: a war story - Yoni Tsafir, JoyTunes
Bringing TensorFlow to Android: a war story - Yoni Tsafir, JoyTunesBringing TensorFlow to Android: a war story - Yoni Tsafir, JoyTunes
Bringing TensorFlow to Android: a war story - Yoni Tsafir, JoyTunesDroidConTLV
 
Bringing TensorFlow to Android - a War Story
Bringing TensorFlow to Android - a War StoryBringing TensorFlow to Android - a War Story
Bringing TensorFlow to Android - a War StoryYoni Tsafir
 
Intro to Telegraf
Intro to TelegrafIntro to Telegraf
Intro to TelegrafInfluxData
 
OpenSAF Symposium_Python Bindings_9.21.11
OpenSAF Symposium_Python Bindings_9.21.11OpenSAF Symposium_Python Bindings_9.21.11
OpenSAF Symposium_Python Bindings_9.21.11OpenSAF Foundation
 
Hot to build continuously processing for 24/7 real-time data streaming platform?
Hot to build continuously processing for 24/7 real-time data streaming platform?Hot to build continuously processing for 24/7 real-time data streaming platform?
Hot to build continuously processing for 24/7 real-time data streaming platform?GetInData
 
Tensor flow 2.0 what's new
Tensor flow 2.0  what's newTensor flow 2.0  what's new
Tensor flow 2.0 what's newPoo Kuan Hoong
 
Deep learning with TensorFlow
Deep learning with TensorFlowDeep learning with TensorFlow
Deep learning with TensorFlowNdjido Ardo BAR
 
Samantha Wang [InfluxData] | Data Collection Overview | InfluxDays 2022
Samantha Wang [InfluxData] | Data Collection Overview | InfluxDays 2022Samantha Wang [InfluxData] | Data Collection Overview | InfluxDays 2022
Samantha Wang [InfluxData] | Data Collection Overview | InfluxDays 2022InfluxData
 
TensorFlow 2.0 Autographs - For TFUG - Vik Pant
TensorFlow 2.0 Autographs - For TFUG - Vik PantTensorFlow 2.0 Autographs - For TFUG - Vik Pant
TensorFlow 2.0 Autographs - For TFUG - Vik PantDevatanu Banerjee
 
Onnx and onnx runtime
Onnx and onnx runtimeOnnx and onnx runtime
Onnx and onnx runtimeVishwas N
 

Similar to Introduction to TensorFlow Lite (20)

open source nn frameworks on cellphones
open source nn frameworks on cellphonesopen source nn frameworks on cellphones
open source nn frameworks on cellphones
 
Introduction to Tensor Flow-v1.pptx
Introduction to Tensor Flow-v1.pptxIntroduction to Tensor Flow-v1.pptx
Introduction to Tensor Flow-v1.pptx
 
Tensorflow on Android
Tensorflow on AndroidTensorflow on Android
Tensorflow on Android
 
Tensor flow intro and summit info feb 2017
Tensor flow intro and summit info feb 2017Tensor flow intro and summit info feb 2017
Tensor flow intro and summit info feb 2017
 
hpcpp.pptx
hpcpp.pptxhpcpp.pptx
hpcpp.pptx
 
Alan Pope [InfluxData] | Data Collectors | InfluxDays 2022
Alan Pope [InfluxData] | Data Collectors | InfluxDays 2022Alan Pope [InfluxData] | Data Collectors | InfluxDays 2022
Alan Pope [InfluxData] | Data Collectors | InfluxDays 2022
 
running Tensorflow in Production
running Tensorflow in Productionrunning Tensorflow in Production
running Tensorflow in Production
 
The Fn Project: A Quick Introduction (December 2017)
The Fn Project: A Quick Introduction (December 2017)The Fn Project: A Quick Introduction (December 2017)
The Fn Project: A Quick Introduction (December 2017)
 
TensorFlow Technology
TensorFlow TechnologyTensorFlow Technology
TensorFlow Technology
 
Bringing TensorFlow to Android: a war story - Yoni Tsafir, JoyTunes
Bringing TensorFlow to Android: a war story - Yoni Tsafir, JoyTunesBringing TensorFlow to Android: a war story - Yoni Tsafir, JoyTunes
Bringing TensorFlow to Android: a war story - Yoni Tsafir, JoyTunes
 
Bringing TensorFlow to Android - a War Story
Bringing TensorFlow to Android - a War StoryBringing TensorFlow to Android - a War Story
Bringing TensorFlow to Android - a War Story
 
Intro to Telegraf
Intro to TelegrafIntro to Telegraf
Intro to Telegraf
 
OpenSAF Symposium_Python Bindings_9.21.11
OpenSAF Symposium_Python Bindings_9.21.11OpenSAF Symposium_Python Bindings_9.21.11
OpenSAF Symposium_Python Bindings_9.21.11
 
C#: Past, Present and Future
C#: Past, Present and FutureC#: Past, Present and Future
C#: Past, Present and Future
 
Hot to build continuously processing for 24/7 real-time data streaming platform?
Hot to build continuously processing for 24/7 real-time data streaming platform?Hot to build continuously processing for 24/7 real-time data streaming platform?
Hot to build continuously processing for 24/7 real-time data streaming platform?
 
Tensor flow 2.0 what's new
Tensor flow 2.0  what's newTensor flow 2.0  what's new
Tensor flow 2.0 what's new
 
Deep learning with TensorFlow
Deep learning with TensorFlowDeep learning with TensorFlow
Deep learning with TensorFlow
 
Samantha Wang [InfluxData] | Data Collection Overview | InfluxDays 2022
Samantha Wang [InfluxData] | Data Collection Overview | InfluxDays 2022Samantha Wang [InfluxData] | Data Collection Overview | InfluxDays 2022
Samantha Wang [InfluxData] | Data Collection Overview | InfluxDays 2022
 
TensorFlow 2.0 Autographs - For TFUG - Vik Pant
TensorFlow 2.0 Autographs - For TFUG - Vik PantTensorFlow 2.0 Autographs - For TFUG - Vik Pant
TensorFlow 2.0 Autographs - For TFUG - Vik Pant
 
Onnx and onnx runtime
Onnx and onnx runtimeOnnx and onnx runtime
Onnx and onnx runtime
 

More from Koan-Sin Tan

running stable diffusion on android
running stable diffusion on androidrunning stable diffusion on android
running stable diffusion on androidKoan-Sin Tan
 
Exploring Your Apple M1 devices with Open Source Tools
Exploring Your Apple M1 devices with Open Source ToolsExploring Your Apple M1 devices with Open Source Tools
Exploring Your Apple M1 devices with Open Source ToolsKoan-Sin Tan
 
Exploring Thermal Related Stuff in iDevices using Open-Source Tool
Exploring Thermal Related Stuff in iDevices using Open-Source ToolExploring Thermal Related Stuff in iDevices using Open-Source Tool
Exploring Thermal Related Stuff in iDevices using Open-Source ToolKoan-Sin Tan
 
A Sneak Peek of MLIR in TensorFlow
A Sneak Peek of MLIR in TensorFlowA Sneak Peek of MLIR in TensorFlow
A Sneak Peek of MLIR in TensorFlowKoan-Sin Tan
 
A Peek into Google's Edge TPU
A Peek into Google's Edge TPUA Peek into Google's Edge TPU
A Peek into Google's Edge TPUKoan-Sin Tan
 
Why You Cannot Use Neural Engine to Run Your NN Models on A11 Devices?
Why You Cannot Use Neural Engine to Run Your NN Models on A11 Devices?Why You Cannot Use Neural Engine to Run Your NN Models on A11 Devices?
Why You Cannot Use Neural Engine to Run Your NN Models on A11 Devices?Koan-Sin Tan
 
SoC Idling for unconf COSCUP 2016
SoC Idling for unconf COSCUP 2016SoC Idling for unconf COSCUP 2016
SoC Idling for unconf COSCUP 2016Koan-Sin Tan
 
A peek into Python's Metaclass and Bytecode from a Smalltalk User
A peek into Python's Metaclass and Bytecode from a Smalltalk UserA peek into Python's Metaclass and Bytecode from a Smalltalk User
A peek into Python's Metaclass and Bytecode from a Smalltalk UserKoan-Sin Tan
 
Android Wear and the Future of Smartwatch
Android Wear and the Future of SmartwatchAndroid Wear and the Future of Smartwatch
Android Wear and the Future of SmartwatchKoan-Sin Tan
 
Understanding Android Benchmarks
Understanding Android BenchmarksUnderstanding Android Benchmarks
Understanding Android BenchmarksKoan-Sin Tan
 
Dark Silicon, Mobile Devices, and Possible Open-Source Solutions
Dark Silicon, Mobile Devices, and Possible Open-Source SolutionsDark Silicon, Mobile Devices, and Possible Open-Source Solutions
Dark Silicon, Mobile Devices, and Possible Open-Source SolutionsKoan-Sin Tan
 
Smalltalk and ruby - 2012-12-08
Smalltalk and ruby  - 2012-12-08Smalltalk and ruby  - 2012-12-08
Smalltalk and ruby - 2012-12-08Koan-Sin Tan
 

More from Koan-Sin Tan (14)

running stable diffusion on android
running stable diffusion on androidrunning stable diffusion on android
running stable diffusion on android
 
Exploring Your Apple M1 devices with Open Source Tools
Exploring Your Apple M1 devices with Open Source ToolsExploring Your Apple M1 devices with Open Source Tools
Exploring Your Apple M1 devices with Open Source Tools
 
A Peek into TFRT
A Peek into TFRTA Peek into TFRT
A Peek into TFRT
 
Exploring Thermal Related Stuff in iDevices using Open-Source Tool
Exploring Thermal Related Stuff in iDevices using Open-Source ToolExploring Thermal Related Stuff in iDevices using Open-Source Tool
Exploring Thermal Related Stuff in iDevices using Open-Source Tool
 
A Sneak Peek of MLIR in TensorFlow
A Sneak Peek of MLIR in TensorFlowA Sneak Peek of MLIR in TensorFlow
A Sneak Peek of MLIR in TensorFlow
 
A Peek into Google's Edge TPU
A Peek into Google's Edge TPUA Peek into Google's Edge TPU
A Peek into Google's Edge TPU
 
Why You Cannot Use Neural Engine to Run Your NN Models on A11 Devices?
Why You Cannot Use Neural Engine to Run Your NN Models on A11 Devices?Why You Cannot Use Neural Engine to Run Your NN Models on A11 Devices?
Why You Cannot Use Neural Engine to Run Your NN Models on A11 Devices?
 
Caffe2 on Android
Caffe2 on AndroidCaffe2 on Android
Caffe2 on Android
 
SoC Idling for unconf COSCUP 2016
SoC Idling for unconf COSCUP 2016SoC Idling for unconf COSCUP 2016
SoC Idling for unconf COSCUP 2016
 
A peek into Python's Metaclass and Bytecode from a Smalltalk User
A peek into Python's Metaclass and Bytecode from a Smalltalk UserA peek into Python's Metaclass and Bytecode from a Smalltalk User
A peek into Python's Metaclass and Bytecode from a Smalltalk User
 
Android Wear and the Future of Smartwatch
Android Wear and the Future of SmartwatchAndroid Wear and the Future of Smartwatch
Android Wear and the Future of Smartwatch
 
Understanding Android Benchmarks
Understanding Android BenchmarksUnderstanding Android Benchmarks
Understanding Android Benchmarks
 
Dark Silicon, Mobile Devices, and Possible Open-Source Solutions
Dark Silicon, Mobile Devices, and Possible Open-Source SolutionsDark Silicon, Mobile Devices, and Possible Open-Source Solutions
Dark Silicon, Mobile Devices, and Possible Open-Source Solutions
 
Smalltalk and ruby - 2012-12-08
Smalltalk and ruby  - 2012-12-08Smalltalk and ruby  - 2012-12-08
Smalltalk and ruby - 2012-12-08
 

Recently uploaded

KCD Costa Rica 2024 - Nephio para parvulitos
KCD Costa Rica 2024 - Nephio para parvulitosKCD Costa Rica 2024 - Nephio para parvulitos
KCD Costa Rica 2024 - Nephio para parvulitosVictor Morales
 
Immutable Image-Based Operating Systems - EW2024.pdf
Immutable Image-Based Operating Systems - EW2024.pdfImmutable Image-Based Operating Systems - EW2024.pdf
Immutable Image-Based Operating Systems - EW2024.pdfDrew Moseley
 
High Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMS
High Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMSHigh Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMS
High Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMSsandhya757531
 
CME 397 - SURFACE ENGINEERING - UNIT 1 FULL NOTES
CME 397 - SURFACE ENGINEERING - UNIT 1 FULL NOTESCME 397 - SURFACE ENGINEERING - UNIT 1 FULL NOTES
CME 397 - SURFACE ENGINEERING - UNIT 1 FULL NOTESkarthi keyan
 
Module-1-(Building Acoustics) Noise Control (Unit-3). pdf
Module-1-(Building Acoustics) Noise Control (Unit-3). pdfModule-1-(Building Acoustics) Noise Control (Unit-3). pdf
Module-1-(Building Acoustics) Noise Control (Unit-3). pdfManish Kumar
 
Comparative study of High-rise Building Using ETABS,SAP200 and SAFE., SAFE an...
Comparative study of High-rise Building Using ETABS,SAP200 and SAFE., SAFE an...Comparative study of High-rise Building Using ETABS,SAP200 and SAFE., SAFE an...
Comparative study of High-rise Building Using ETABS,SAP200 and SAFE., SAFE an...Erbil Polytechnic University
 
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.ppt
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.pptROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.ppt
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.pptJohnWilliam111370
 
Virtual memory management in Operating System
Virtual memory management in Operating SystemVirtual memory management in Operating System
Virtual memory management in Operating SystemRashmi Bhat
 
Gravity concentration_MI20612MI_________
Gravity concentration_MI20612MI_________Gravity concentration_MI20612MI_________
Gravity concentration_MI20612MI_________Romil Mishra
 
Paper Tube : Shigeru Ban projects and Case Study of Cardboard Cathedral .pdf
Paper Tube : Shigeru Ban projects and Case Study of Cardboard Cathedral .pdfPaper Tube : Shigeru Ban projects and Case Study of Cardboard Cathedral .pdf
Paper Tube : Shigeru Ban projects and Case Study of Cardboard Cathedral .pdfNainaShrivastava14
 
Comprehensive energy systems.pdf Comprehensive energy systems.pdf
Comprehensive energy systems.pdf Comprehensive energy systems.pdfComprehensive energy systems.pdf Comprehensive energy systems.pdf
Comprehensive energy systems.pdf Comprehensive energy systems.pdfalene1
 
OOP concepts -in-Python programming language
OOP concepts -in-Python programming languageOOP concepts -in-Python programming language
OOP concepts -in-Python programming languageSmritiSharma901052
 
System Simulation and Modelling with types and Event Scheduling
System Simulation and Modelling with types and Event SchedulingSystem Simulation and Modelling with types and Event Scheduling
System Simulation and Modelling with types and Event SchedulingBootNeck1
 
Katarzyna Lipka-Sidor - BIM School Course
Katarzyna Lipka-Sidor - BIM School CourseKatarzyna Lipka-Sidor - BIM School Course
Katarzyna Lipka-Sidor - BIM School Coursebim.edu.pl
 
Engineering Drawing section of solid
Engineering Drawing     section of solidEngineering Drawing     section of solid
Engineering Drawing section of solidnamansinghjarodiya
 
Stork Webinar | APM Transformational planning, Tool Selection & Performance T...
Stork Webinar | APM Transformational planning, Tool Selection & Performance T...Stork Webinar | APM Transformational planning, Tool Selection & Performance T...
Stork Webinar | APM Transformational planning, Tool Selection & Performance T...Stork
 
Turn leadership mistakes into a better future.pptx
Turn leadership mistakes into a better future.pptxTurn leadership mistakes into a better future.pptx
Turn leadership mistakes into a better future.pptxStephen Sitton
 
"Exploring the Essential Functions and Design Considerations of Spillways in ...
"Exploring the Essential Functions and Design Considerations of Spillways in ..."Exploring the Essential Functions and Design Considerations of Spillways in ...
"Exploring the Essential Functions and Design Considerations of Spillways in ...Erbil Polytechnic University
 
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor CatchersTechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catcherssdickerson1
 

Recently uploaded (20)

KCD Costa Rica 2024 - Nephio para parvulitos
KCD Costa Rica 2024 - Nephio para parvulitosKCD Costa Rica 2024 - Nephio para parvulitos
KCD Costa Rica 2024 - Nephio para parvulitos
 
Immutable Image-Based Operating Systems - EW2024.pdf
Immutable Image-Based Operating Systems - EW2024.pdfImmutable Image-Based Operating Systems - EW2024.pdf
Immutable Image-Based Operating Systems - EW2024.pdf
 
High Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMS
High Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMSHigh Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMS
High Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMS
 
CME 397 - SURFACE ENGINEERING - UNIT 1 FULL NOTES
CME 397 - SURFACE ENGINEERING - UNIT 1 FULL NOTESCME 397 - SURFACE ENGINEERING - UNIT 1 FULL NOTES
CME 397 - SURFACE ENGINEERING - UNIT 1 FULL NOTES
 
Module-1-(Building Acoustics) Noise Control (Unit-3). pdf
Module-1-(Building Acoustics) Noise Control (Unit-3). pdfModule-1-(Building Acoustics) Noise Control (Unit-3). pdf
Module-1-(Building Acoustics) Noise Control (Unit-3). pdf
 
Comparative study of High-rise Building Using ETABS,SAP200 and SAFE., SAFE an...
Comparative study of High-rise Building Using ETABS,SAP200 and SAFE., SAFE an...Comparative study of High-rise Building Using ETABS,SAP200 and SAFE., SAFE an...
Comparative study of High-rise Building Using ETABS,SAP200 and SAFE., SAFE an...
 
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.ppt
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.pptROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.ppt
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.ppt
 
Virtual memory management in Operating System
Virtual memory management in Operating SystemVirtual memory management in Operating System
Virtual memory management in Operating System
 
Designing pile caps according to ACI 318-19.pptx
Designing pile caps according to ACI 318-19.pptxDesigning pile caps according to ACI 318-19.pptx
Designing pile caps according to ACI 318-19.pptx
 
Gravity concentration_MI20612MI_________
Gravity concentration_MI20612MI_________Gravity concentration_MI20612MI_________
Gravity concentration_MI20612MI_________
 
Paper Tube : Shigeru Ban projects and Case Study of Cardboard Cathedral .pdf
Paper Tube : Shigeru Ban projects and Case Study of Cardboard Cathedral .pdfPaper Tube : Shigeru Ban projects and Case Study of Cardboard Cathedral .pdf
Paper Tube : Shigeru Ban projects and Case Study of Cardboard Cathedral .pdf
 
Comprehensive energy systems.pdf Comprehensive energy systems.pdf
Comprehensive energy systems.pdf Comprehensive energy systems.pdfComprehensive energy systems.pdf Comprehensive energy systems.pdf
Comprehensive energy systems.pdf Comprehensive energy systems.pdf
 
OOP concepts -in-Python programming language
OOP concepts -in-Python programming languageOOP concepts -in-Python programming language
OOP concepts -in-Python programming language
 
System Simulation and Modelling with types and Event Scheduling
System Simulation and Modelling with types and Event SchedulingSystem Simulation and Modelling with types and Event Scheduling
System Simulation and Modelling with types and Event Scheduling
 
Katarzyna Lipka-Sidor - BIM School Course
Katarzyna Lipka-Sidor - BIM School CourseKatarzyna Lipka-Sidor - BIM School Course
Katarzyna Lipka-Sidor - BIM School Course
 
Engineering Drawing section of solid
Engineering Drawing     section of solidEngineering Drawing     section of solid
Engineering Drawing section of solid
 
Stork Webinar | APM Transformational planning, Tool Selection & Performance T...
Stork Webinar | APM Transformational planning, Tool Selection & Performance T...Stork Webinar | APM Transformational planning, Tool Selection & Performance T...
Stork Webinar | APM Transformational planning, Tool Selection & Performance T...
 
Turn leadership mistakes into a better future.pptx
Turn leadership mistakes into a better future.pptxTurn leadership mistakes into a better future.pptx
Turn leadership mistakes into a better future.pptx
 
"Exploring the Essential Functions and Design Considerations of Spillways in ...
"Exploring the Essential Functions and Design Considerations of Spillways in ..."Exploring the Essential Functions and Design Considerations of Spillways in ...
"Exploring the Essential Functions and Design Considerations of Spillways in ...
 
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor CatchersTechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
 

Introduction to TensorFlow Lite

  • 1. What I Know about TensorFlow Lite Koan-Sin Tan freedom@computer.org Hsinchu Coding Serfs Meeting Dec 7th, 2017
  • 2. • We heard Android NN and TensorFlow Lite back in Google I/ O 2017 • My COSCUP 2017 slide deck “TensorFlow on Android” • https://www.slideshare.net/kstan2/tensorflow-on- android • People knew a bit about Android NN API before it was announced and released • No information about TensorFlow Lite, at least to me, before it was released in Nov
  • 3. tf-lite and android NN in Google I/O • New TensorFlow runtime • Optimized for mobile and embedded apps • Runs TensorFlow models on device • Leverage Android NN API • Soon to be open sourced from Google I/O 2017 video
  • 4. Actual Android NN API • Announced with Android 8.1 Preview 1 • Available to developer in NDK • yes, NDK • The Android Neural Networks API (NNAPI) is an Android C API designed for running computationally intensive operations for machine learning on mobile devices • NNAPI is designed to provide a base layer of functionality for higher-level machine learning frameworks (such as TensorFlow Lite, Caffe2, or others) that build and train neural networks • The API is available on all devices running Android 8.1 (API level 27) or higher. https://developer.android.com/ndk/images/nnapi/nnapi_architecture.png
  • 5. Android NN on Pixel 2 • Only the CPU fallback is available • Actually, you can see Android NN API related in AOSP after Oreo MR1 (8.1) release already • user level code, see https://android.googlesource.com/ platform/frameworks/ml/+/oreo-mr1-release • HAL, see https://android.googlesource.com/platform/ hardware/interfaces/+/oreo-mr1-release/ neuralnetworks/
  • 6. TensorFlow Lite • TensorFlow Lite is TensorFlow’s lightweight solution for mobile and embedded devices • It enables on-device machine learning inference with low latency and a small binary size • Low latency techniques: optimizing the kernels for mobile apps, pre-fused activations, and quantized kernels that allow smaller and faster (fixed-point math) models • TensorFlow Lite also supports hardware acceleration with the Android Neural Networks API https://www.tensorflow.org/mobile/tflite/
  • 7. What does TensorFlow Lite contain? • a set of core operators, both quantized and float, which have been tuned for mobile platforms • pre-fused activations and biases to further enhance performance and quantized accuracy • using custom operations in models also supported • a new model file format, based on FlatBuffers • the primary difference is that FlatBuffers does not need a parsing/unpacking step to a secondary representation before you can access data • the code footprint of FlatBuffers is an order of magnitude smaller than protocol buffers • a new mobile-optimized interpreter, • key goals: keeping apps lean and fast. • a static graph ordering and a custom (less-dynamic) memory allocator to ensure minimal load, initialization, and execution latency • an interface to Android NN API if available https://www.tensorflow.org/mobile/tflite/
  • 8. why a new mobile-specific library? • Innovation at the silicon layer is enabling new possibilities for hardware acceleration, and frameworks such as the Android Neural Networks API make it easy to leverage these • Recent advances in real-time computer-vision and spoken language understanding have led to mobile-optimized benchmark models being open sourced (e.g. MobileNets, SqueezeNet) • Widely-available smart appliances create new possibilities for on-device intelligence • Interest in stronger user data privacy paradigms where user data does not need to leave the mobile device • Ability to serve ‘offline’ use cases, where the device does not need to be connected to a network https://www.tensorflow.org/mobile/tflite/
  • 9. • A set of core operators, both quantized and float, many of which have been tuned for mobile platforms. These can be used to create and run custom models. Developers can also write their own custom operators and use them in models • A new FlatBuffers-based model file format • On-device interpreter with kernels optimized for faster execution on mobile • TensorFlow converter to convert TensorFlow-trained models to the TensorFlow Lite format. • Smaller in size: TensorFlow Lite is smaller than 300KB when all supported operators are linked and less than 200KB when using only the operators needed for supporting InceptionV3 and Mobilenet • Pre-tested models • Inception V3, MobileNet, On Device Smart Reply • Quantized versions of the MobileNet model, which runs faster than the non-quantized (float) version on CPU. • New Android demo app to illustrate the use of TensorFlow Lite with a quantized MobileNet model for object classification • Java and C++ API support https://www.tensorflow.org/mobile/tflite/
  • 10. • Java API: A convenience wrapper around the C++ API on Android • C++ API: Loads the TensorFlow Lite Model File and invokes the Interpreter. The same library is available on both Android and iOS https://www.tensorflow.org/mobile/tflite/
  • 11. • Let $TF_ROOT be root of tensorflow • source of tf-lite: ${TF_ROOT}/tensorflow/contrib/lite/ • https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/README.md • examples • two for Android, two for iOS • APIs: ${TF_ROOT}/tensorflow/contrib/lite/g3doc/apis.md, https://github.com/tensorflow/ tensorflow/blob/master/tensorflow/contrib/lite/g3doc/apis.md • no benchmark_model: well there is one, https://github.com/tensorflow/tensorflow/blob/master/ tensorflow/contrib/lite/tools/benchmark_model.cc • it’s incomplete • no command line label_image (https://github.com/tensorflow/tensorflow/tree/master/tensorflow/ examples/label_image)
  • 12. • model: .tflite model • resolver: if no custom ops, builtin op resolve is enough • interpreter: we need it to compute the graph • interpreter->AllocateTensor(): allocate stuff for you, e.g., input tensor(s) • fill the input • interpreter->Invoke(): run the graph • process the output tflite::FlatBufferModel model(path_to_model); tflite::ops::builtin::BuiltinOpResolver resolver; std::unique_ptr<tflite::Interpreter> interpreter; tflite::InterpreterBuilder(*model, resolver)(&interpreter); // Resize input tensors, if desired. interpreter->AllocateTensors(); float* input = interpreter->typed_input_tensor<float>(0); // Fill `input`. interpreter->Invoke(); float* output = interpreter->type_output_tensor<float>(0);
  • 13. beyond basic stuff • https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/interpreter.h • const char* GetInputName(int index): https://github.com/tensorflow/tensorflow/blob/master/ tensorflow/contrib/lite/interpreter.h#L157 • const char* GetOutputName(int index): https://github.com/tensorflow/tensorflow/blob/ master/tensorflow/contrib/lite/interpreter.h#L166 • int tensors_size(): https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/ lite/interpreter.h#L171 • TfLiteTensor* tensor(int tensor_index) • int nodes_size(): https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/ lite/interpreter.h#L174 • const std::pair<TfLiteNode, TfLiteRegistration>* node_and_registration(int node_index) • Yes, we should be able to enumerate/traverse tensors and nodes
  • 14. beyond basic stuff • void UseNNAPI(bool enable) • void SetNumThreads(int num_threads) • TfLiteTensor: https://github.com/freedomtan/tensorflow/blob/ label_image_tflite_pr/tensorflow/contrib/lite/context.h#L163 • my label_image for tflite • https://github.com/freedomtan/tensorflow/blob/ label_image_tflite_pr/tensorflow/contrib/lite/examples/ label_image/label_image.cc#L103
  • 15. • logs of running label_image • https://drive.google.com/file/d/11LAI_b1fVOM2GxOT_gOpOMpASaqe5m_U/view? usp=sharing • builtin state dump function • void PrintInterpreterState(Interpreter* interpreter): https://github.com/tensorflow/tensorflow/ blob/master/tensorflow/contrib/lite/optional_debug_tools.h#L25 • https://github.com/freedomtan/tensorflow/blob/label_image_tflite_pr/tensorflow/contrib/lite/ examples/label_image/label_image.cc#L147 • TF operations --> TF Lite operations is not trivial • https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/g3doc/ tf_ops_compatibility.md • https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/ nnapi_delegate.cc
  • 16. speed of quantized one • It seems it's much better than naive quantization as we saw before • On Nexus 9 (MobileNet 1.0/224) • Quantized • ./label_image -t 2: ~ 160 ms • ./label_image -t 2 -c 100: ~ 60 ms • Floating point • ./label_image -t 2 -f 1 -m ./mobilenet_v1_1.0_224.tflite: ~ 300 ms • ./label_image -t 2 -c 100 -f 1 -m ./mobilenet_v1_1.0_224.tflite: ~ 82 ms • TFLiteCameraDemo: 130 - 180 ms • Pixel 2 • TFLiteCameraDemo: ~ 100 ms • didn’t see significant difference, w/ or w/o Android NN runtime
  • 17. Custom Operators • https://github.com/tensorflow/ tensorflow/blob/master/ tensorflow/contrib/lite/g3doc/ custom_operators.md • OpInit(), OpFree(), OpPrepare(), and OpInvoke() in interpreter.cc typedef struct { void* (*init)(TfLiteContext* context, const char* buffer, size_t length); void (*free)(TfLiteContext* context, void* buffer); TfLiteStatus (*prepare)(TfLiteContext* context, TfLiteNode* node); TfLiteStatus (*invoke)(TfLiteContext* context, TfLiteNode* node); } TfLiteRegistration;
  • 18. Fake Quantiztion • How hard can it be? How much time is needed? • Several pre-tested models are available • https://github.com/tensorflow/tensorflow/blob/master/ tensorflow/contrib/lite/g3doc/models.md • but only one of them (https://storage.googleapis.com/ download.tensorflow.org/models/tflite/ mobilenet_v1_224_android_quant_2017_11_08.zip) is quantized one • as we can guess from related docs, retrain is kinda required to get accuracy back
  • 19. • BLAS part: eigen (http://eigen.tuxfamily.org/) and gemmlowp (https://github.com/google/gemmlowp)