Boosting machine learning workflow with TensorFlow 2.0

TensorFlow 2.0 is the latest release, aimed at user convenience, API simplicity, and scalability across multiple platforms. Together with a variety of new projects in the TensorFlow ecosystem, such as TFX, TF-Agents, and TensorFlow Federated, TensorFlow 2.0 can help you quickly and easily create a wide variety of machine learning models in more environments. This talk introduces TensorFlow 2.0 and discusses how to develop and optimize machine learning workflows based on TensorFlow 2.0 and the projects within the wider TensorFlow ecosystem.

This slide deck was presented at GDG DevFest Songdo on November 30, 2019.

  1. Boosting machine learning workflow with TensorFlow 2.0. Jeongkyu Shin, Lablup Inc., Google Developers Expert (ML/DL)
  2. Jeongkyu Shin, Lablup Inc. Making an open-source machine learning cluster platform: Backend.AI (https://www.backend.ai). Google Developer Expert: ML/DL GDE, Sprint Master
  3. Elements: Framework ● TensorFlow 2.0 ● TensorFlow Extended ● TensorFlow Lite ● ML Kit ● TF-Agents ● TF Model Analysis ● TensorFlow Federated ● TF-NSL ● TensorFlow.js ● Swift for TensorFlow ● MLIR
  4. TensorFlow: an end-to-end open source machine learning platform. Easy model building; robust machine learning in production; powerful experimentation for research.
  5. TensorFlow: summary. Statistics: more than 30,000 commits since Dec. 2015; more than 1,400 contributors; more than 7,000 pull requests; more than 24,000 forks (last 12 months); more than 6,400 TensorFlow-related GitHub repositories. Current features: complete ML model prototyping; distributed training; CPU / GPU / TPU / mobile support; TensorFlow Serving for easier inference services; XLA compiler (1.0~) for various target platforms and performance tuning; Keras API support (1.2~) as a high-level, Keras-compatible programming API; Eager Execution (1.4~) for an interactive mode; tf.data (1.4~) as a pipeline for data sources (a minimal pipeline sketch follows).
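  A minimal sketch of the tf.data pipeline style mentioned above, assuming in-memory NumPy arrays as a stand-in data source (real pipelines would read files or TFRecords; the shapes are illustrative only):

     import numpy as np
     import tensorflow as tf

     # Placeholder in-memory data; real pipelines read files or TFRecords.
     features = np.random.rand(1000, 28, 28).astype("float32")
     labels = np.random.randint(0, 10, size=(1000,))

     dataset = (
         tf.data.Dataset.from_tensor_slices((features, labels))
         .shuffle(buffer_size=1024)                # bounded shuffle buffer
         .batch(32)                                # mini-batches for training
         .prefetch(tf.data.experimental.AUTOTUNE)  # overlap input prep with training
     )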
  6. TensorFlow 2.0: now live! Keras as the default grammar; Eager Execution as the default runtime mode; deprecates session-based execution (tf.compat.v1 remains for legacy code). 2.1 soon: TPU support is back; mixed-precision training.
  7. TensorFlow 2.0: consolidated APIs. Keep one API per behavior; increase developer convenience; debug with eager execution while keeping performance through AutoGraph; keep namespace consistency; remove global variable references; easier large-scale training by merging distributed TensorFlow features.
  8. TensorFlow 2.0: optimization.

     # TensorFlow 1.X
     outputs = session.run(f(placeholder), feed_dict={placeholder: input})
     # TensorFlow 2.0
     outputs = f(input)

     The problem with Session.run(): it is not a function, yet it works and is used like one, and session code is difficult to optimize. The change: code is written like regular Python, and a decorator converts it into optimized TensorFlow graph code at execution time (AutoGraph); see the sketch below.
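  A minimal sketch of that decorator idiom, assuming TensorFlow 2.0: tf.function traces the Python function into a graph, and AutoGraph rewrites its control flow into graph ops (the function body here is purely illustrative):

     import tensorflow as tf

     @tf.function  # traced into a graph; AutoGraph rewrites the control flow
     def positive_sum(x):
         total = tf.constant(0.0)
         for v in x:        # converted into a graph-level loop
             if v > 0:      # converted into tf.cond
                 total += v
         return total

     print(positive_sum(tf.constant([1.0, -2.0, 3.0])))  # tf.Tensor(4.0, ...)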
  9. AutoGraph example: the hand-written dynamic RNN (top) and the graph code AutoGraph generates from it (bottom).

     @autograph.convert()
     def my_dynamic_rnn(rnn_cell, input_data, initial_state, sequence_length):
         outputs = tf.TensorArray(tf.float32, input_data.shape[0])
         state = initial_state
         max_sequence_length = tf.reduce_max(sequence_length)
         for i in tf.range(max_sequence_length):
             new_output, new_state = rnn_cell(input_data[i], state)
             output = tf.where(i < sequence_length, new_output,
                               tf.zeros_like(new_output))
             state = tf.where(i < sequence_length, new_state, state)
             outputs = outputs.write(i, output)
         return tf.transpose(outputs.stack(), [1, 0, 2]), state

     def tf__my_dynamic_rnn(rnn_cell, input_data, initial_state, sequence_length):
         try:
             with tf.name_scope('my_dynamic_rnn'):
                 outputs = tf.TensorArray(tf.float32, ag__.get_item(
                     input_data.shape, 0, opts=ag__.GetItemOpts(element_dtype=None)))
                 state = initial_state
                 max_sequence_length = tf.reduce_max(sequence_length)

                 def extra_test(state_1, outputs_1):
                     with tf.name_scope('extra_test'):
                         return True

                 def loop_body(loop_vars, state_1, outputs_1):
                     with tf.name_scope('loop_body'):
                         i = loop_vars
                         new_output, new_state = ag__.converted_call(
                             rnn_cell, True, False, False, {},
                             ag__.get_item(input_data, i,
                                           opts=ag__.GetItemOpts(element_dtype=None)),
                             state_1)
                         output = tf.where(tf.less(i, sequence_length), new_output,
                                           tf.zeros(new_output.shape))
                         state_1 = tf.where(tf.less(i, sequence_length),
                                            new_state, state_1)
                         outputs_1 = outputs_1.write(i, output)
                         return state_1, outputs_1

                 state, outputs = ag__.for_stmt(tf.range(max_sequence_length),
                                                extra_test, loop_body,
                                                (state, outputs))
                 return tf.transpose(outputs.stack(), ag__.new_list([1, 0, 2])), state
         except:
             ag__.rewrite_graph_construction_error(ag_source_map__)
  10. TensorFlow 2.0: Distribution Strategy. Supported strategies: MirroredStrategy (1.11), all-reduce synchronized training; TPUStrategy (1.12), strategies on Google Cloud; CollectiveAllReduceStrategy (2.0), a multi-node MirroredStrategy; ParameterServerStrategy (2.0); TPUStrategy (2.1).
  11. Example: MirroredStrategy

     strategy = tf.distribute.MirroredStrategy()
     config = tf.estimator.RunConfig(
         train_distribute=strategy, eval_distribute=strategy)
     regressor = tf.estimator.LinearRegressor(
         feature_columns=[tf.feature_column.numeric_column('feats')],
         optimizer='SGD',
         config=config)

     def input_fn():
         return tf.data.Dataset.from_tensors(
             ({"feats": [1.]}, [1.])).repeat(10000).batch(10)

     regressor.train(input_fn=input_fn, steps=10)
     regressor.evaluate(input_fn=input_fn, steps=10)
  12. Example: MirroredStrategy. ResNet50 training code:

     train_dataset = tf.data.Dataset(...)
     eval_dataset = tf.data.Dataset(...)
     model = tf.keras.applications.ResNet50()
     optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1)
     model.compile(loss="categorical_crossentropy", optimizer=optimizer)
     model.fit(train_dataset, epochs=10)
     model.evaluate(eval_dataset)
  13. Example: MirroredStrategy. ResNet50 training code with MirroredStrategy added; the other lines are the same.

     train_dataset = tf.data.Dataset(...)
     eval_dataset = tf.data.Dataset(...)
     model = tf.keras.applications.ResNet50()
     optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1)
     strategy = tf.contrib.distribute.MirroredStrategy()
     model.compile(loss="categorical_crossentropy", optimizer=optimizer,
                   distribute=strategy)
     model.fit(train_dataset, epochs=10)
     model.evaluate(eval_dataset)
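  The distribute= argument to compile() shown above is the pre-2.0 (tf.contrib) API. In TensorFlow 2.0 the equivalent idiom is to build and compile the model inside strategy.scope(); a minimal sketch, with the datasets elided as in the slide:

     import tensorflow as tf

     strategy = tf.distribute.MirroredStrategy()  # one replica per local GPU
     with strategy.scope():                       # variables here are mirrored
         model = tf.keras.applications.ResNet50(weights=None)
         model.compile(loss="categorical_crossentropy",
                       optimizer=tf.keras.optimizers.SGD(learning_rate=0.1))

     # model.fit(train_dataset, epochs=10)   # the training calls are unchanged
     # model.evaluate(eval_dataset)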
  14. Machine Learning Pipeline: training is only a small part, and the production environment is a different story. Pipeline stages: Data Ingestion, Data Analysis + Validation, Data Transformation, Trainer, Model Evaluation and Validation, Serving, Logging; plus shared utilities for garbage collection and data access controls, pipeline storage, a tuner, a shared configuration framework with job orchestration, and an integrated frontend for job management, monitoring, debugging, and data/model/evaluation visualization.
  15. Machine Learning Pipeline: training is only a small part, the production environment is a different story, and diverse environments and situations make it difficult. Of all the pipeline stages listed above, TensorFlow Core covers only the training step.
  16. TensorFlow Extended: before. An end-to-end machine learning platform, built by making a pipeline out of TensorFlow components and orchestrating various computation resources. Early public releases covered only parts of the pipeline: TensorFlow Core for training, TensorFlow Serving for serving, and TFDV for data validation; data ingestion, data transformation, model evaluation and validation, logging, shared utilities, pipeline storage, the tuner, orchestration, and the frontend were not yet public.
  17. TensorFlow Extended: now. The same end-to-end platform, with components covering the whole pipeline: Data Ingestion, TFDV, TensorFlow Transform, TensorFlow Core, TensorFlow Model Analysis, and TensorFlow Serving, alongside logging, shared utilities for garbage collection and data access controls, pipeline storage, and a tuner, orchestrated by Airflow / Kubeflow together with various other tools.
  18. TensorFlow Extended: overview. TFX Config and metadata storage drive the component graph: ExampleGen, StatisticsGen, SchemaGen, ExampleValidator, Transform, Trainer, Evaluator, ModelValidator, and Pusher, running on the Airflow or Kubeflow runtime. Training + validation data flow in, and models are pushed out to TensorFlow Serving, TensorFlow Hub, TensorFlow Lite, and TensorFlow.js.
  19. TFX: pipeline orchestration with Airflow or Kubeflow (a minimal pipeline definition follows).
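  A minimal sketch of a TFX pipeline definition, assuming the TFX 0.15-era Python DSL (the keyword arguments shifted across early releases); the paths and pipeline name are placeholders, and a real pipeline would add Transform, Trainer, Evaluator, ModelValidator, and Pusher:

     from tfx.components import CsvExampleGen, StatisticsGen, SchemaGen
     from tfx.orchestration import metadata, pipeline
     from tfx.orchestration.beam.beam_dag_runner import BeamDagRunner
     from tfx.utils.dsl_utils import external_input

     # The first three components of the graph shown on the previous slide.
     example_gen = CsvExampleGen(input=external_input('/path/to/data'))
     statistics_gen = StatisticsGen(examples=example_gen.outputs['examples'])
     schema_gen = SchemaGen(statistics=statistics_gen.outputs['statistics'])

     # Swap BeamDagRunner for the Airflow or Kubeflow runner to change
     # orchestrators without touching the component definitions.
     BeamDagRunner().run(pipeline.Pipeline(
         pipeline_name='devfest_demo',
         pipeline_root='/path/to/pipeline_root',
         components=[example_gen, statistics_gen, schema_gen],
         metadata_connection_config=metadata.sqlite_metadata_connection_config(
             '/path/to/metadata.db')))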
  20. TensorFlow Lite: TensorFlow for on-device environments. Simplifies on-device production challenges: limited resources (CPU, memory, power consumption) and heterogeneous accelerators (various ASICs). Use cases: Android devices, Google Assistant, mobile ML platforms.
  21. TensorFlow Lite with TensorFlow 2.0: model compatibility with both TensorFlow 1.X and 2.0, GPU / NPU support, and a less buggy TFLite model converter built with MLIR (a conversion sketch follows).
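  A minimal sketch of the TF 2.0 conversion path, assuming a Keras model as input; the model choice and the optional quantization flag are illustrative:

     import tensorflow as tf

     model = tf.keras.applications.MobileNetV2(weights=None)  # any Keras model

     converter = tf.lite.TFLiteConverter.from_keras_model(model)
     converter.optimizations = [tf.lite.Optimize.DEFAULT]  # optional quantization
     tflite_model = converter.convert()

     with open('model.tflite', 'wb') as f:
         f.write(tflite_model)  # deploy this flatbuffer to the device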
  22. ML Kit: ML SDK for diverse environments
  23. ML Kit: ML SDK for diverse environments
  24. ML Kit: ML SDK for diverse environments. A turn-key solution for specific tasks and a component to help with your own models, plus powerful server-based features through Firebase (pros: higher accuracy and more categories; con: a data network is needed).
  25. ML Kit: ML SDK for diverse environments. It will become a more network-independent solution: an on-device, offline-ready SDK. What happens to the Firebase features?
  26. TF-Agents: a library for reinforcement learning. Jupyter notebook examples, integration with TensorFlow / PyBullet, examples and docs; supports both TensorFlow 1.14 and 2.0.
  27. TF-Agents: a library for reinforcement learning. Jupyter notebook examples, integration with TensorFlow / PyBullet, examples and docs; supports both TensorFlow 1.14 and 2.0. Gfootball is an open-source training environment: https://github.com/google-research/football (an agent-setup sketch follows).
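  A minimal agent-setup sketch following the TF-Agents DQN tutorial of this era; the environment name, layer sizes, and learning rate are illustrative:

     import tensorflow as tf
     from tf_agents.agents.dqn import dqn_agent
     from tf_agents.environments import suite_gym, tf_py_environment
     from tf_agents.networks import q_network

     # Wrap a Gym environment so it yields TF tensors.
     env = tf_py_environment.TFPyEnvironment(suite_gym.load('CartPole-v0'))

     q_net = q_network.QNetwork(env.observation_spec(), env.action_spec(),
                                fc_layer_params=(100,))
     agent = dqn_agent.DqnAgent(
         env.time_step_spec(), env.action_spec(), q_network=q_net,
         optimizer=tf.compat.v1.train.AdamOptimizer(learning_rate=1e-3))
     agent.initialize()  # ready for a driver / replay-buffer training loop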
  28. Machine Learning Fairness: a transparency framework, data cards, and statistics for data bias.
  29. What-If Tool: trace results back and simulate the difference.
  30. ML Fairness, as part of TensorFlow Model Analysis: automatic bias monitoring, evaluating the performance impact of adjustments, case studies, and benchmarks (a sliced-evaluation sketch follows).
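  A minimal sliced-evaluation sketch, assuming the TFMA 0.15-era API (later versions moved to an EvalConfig-based interface); the paths and the slicing column are placeholders:

     import tensorflow_model_analysis as tfma

     eval_shared_model = tfma.default_eval_shared_model(
         eval_saved_model_path='/path/to/eval_saved_model')

     # Compute metrics per slice (e.g. per value of a sensitive feature).
     eval_result = tfma.run_model_analysis(
         eval_shared_model=eval_shared_model,
         data_location='/path/to/eval_examples.tfrecord',
         slice_spec=[tfma.slicer.SingleSliceSpec(columns=['gender'])])

     # In a notebook, compare per-slice metrics to surface bias.
     tfma.view.render_slicing_metrics(eval_result, slicing_column='gender')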
  31. Federated Learning: distributed training with minimal privacy violations; data islands + edge computing.
  32. Federated Learning: saves resources and traffic, and training without exposing local data keeps privacy. Demo: action / emoticon prediction.
  33. TensorFlow Federated: a Federated Learning API, a unified Core API, and a local runtime for simulation (a simulation sketch follows). *Source: Federated Learning: Machine Learning on Decentralized Data (Google I/O '19)
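  A minimal local-simulation sketch, assuming the late-2019 tff.learning API (the exact signatures shifted across early releases); the synthetic client datasets and the tiny model are placeholders:

     import collections
     import numpy as np
     import tensorflow as tf
     import tensorflow_federated as tff

     # Synthetic per-client datasets; real clients would hold local data.
     def make_client_data():
         x = np.random.rand(20, 784).astype('float32')
         y = np.random.randint(0, 10, size=(20, 1))
         return tf.data.Dataset.from_tensor_slices(
             collections.OrderedDict(x=x, y=y)).batch(5)

     federated_train_data = [make_client_data() for _ in range(3)]

     def model_fn():
         model = tf.keras.Sequential([
             tf.keras.layers.Dense(10, activation='softmax', input_shape=(784,))])
         return tff.learning.from_keras_model(
             model,
             input_spec=federated_train_data[0].element_spec,
             loss=tf.keras.losses.SparseCategoricalCrossentropy())

     process = tff.learning.build_federated_averaging_process(
         model_fn, client_optimizer_fn=lambda: tf.keras.optimizers.SGD(0.02))
     state = process.initialize()
     state, metrics = process.next(state, federated_train_data)  # one round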
  34. TensorFlow.js: Node.js-based server-side training; supports online prebuilt models and the new TF Hub.
  35. TensorFlow.js: Node.js-based server-side training; NVIDIA GPU support on desktop (driver needed); supports online prebuilt models and the new TF Hub.
  36. TensorFlow Neural Structured Learning: my recent interest! Are you familiar with taxonomies or semantics? How about the Semantic Web or knowledge graphs?
  37. TensorFlow Neural Structured Learning: a new learning paradigm that trains on the 'relationships' between inputs. NSL is a generalized training method covering neural graph learning and adversarial learning.
  38. TensorFlow Neural Structured Learning

     import tensorflow as tf
     import neural_structured_learning as nsl

     # Prepare data.
     (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
     x_train, x_test = x_train / 255.0, x_test / 255.0

     # Create a base model -- sequential, functional, or subclass.
     model = tf.keras.Sequential([
         tf.keras.Input((28, 28), name='feature'),
         tf.keras.layers.Flatten(),
         tf.keras.layers.Dense(128, activation=tf.nn.relu),
         tf.keras.layers.Dense(10, activation=tf.nn.softmax)
     ])

     # Wrap the model with adversarial regularization.
     adv_config = nsl.configs.make_adv_reg_config(multiplier=0.2,
                                                  adv_step_size=0.05)
     adv_model = nsl.keras.AdversarialRegularization(model,
                                                     adv_config=adv_config)

     # Compile, train, and evaluate.
     adv_model.compile(optimizer='adam',
                       loss='sparse_categorical_crossentropy',
                       metrics=['accuracy'])
     adv_model.fit({'feature': x_train, 'label': y_train},
                   batch_size=32, epochs=5)
     adv_model.evaluate({'feature': x_test, 'label': y_test})
  39. Swift for TensorFlow: every component in one language. Only Swift from backend to frontend, instead of C++ + Python + (TF Lite) + ...; LLVM-based platform independence; a low hurdle (more Pythonic, plus an interpreter mode). Future: MLIR, framework-independent ML acceleration with LLVM.
  40. MLIR: "Multi-Level Intermediate Representation" compiler infrastructure. Global improvements to TensorFlow infrastructure; an abstraction layer for accelerators; supports heterogeneous, distributed, mobile, and custom ASICs; its urgency is driven by the "end of Moore's law"; and, as a part of LLVM, other frameworks can benefit.
  41. MLIR: the TensorFlow to TensorFlow Lite converter is the first MLIR-based improvement, enhancing edge-device support by simplifying transforms and expressibility (Nov. 2019).
  42. Elements: Devices ● Coral, Edge TPU ● TPU Pods v3 ● Cloud TPU Pods (beta)
  43. Coral: Edge TPU based hardware
  44. Coral: the Edge TPU hardware brand. Form factors: Dev Board and USB, with PCI-E and SOM also available. Software stack: Mendel OS (a Debian fork), the Edge TPU compiler for TF Lite model compilation, and a Python SDK (an inference sketch follows).
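  A minimal on-device inference sketch, assuming the TFLite Python runtime with the Edge TPU delegate; the model path is a placeholder, and the model must already be compiled with the Edge TPU compiler:

     import numpy as np
     import tflite_runtime.interpreter as tflite

     interpreter = tflite.Interpreter(
         model_path='model_edgetpu.tflite',
         experimental_delegates=[tflite.load_delegate('libedgetpu.so.1')])
     interpreter.allocate_tensors()

     # Feed a dummy uint8 input of the model's expected shape.
     input_details = interpreter.get_input_details()
     interpreter.set_tensor(input_details[0]['index'],
                            np.zeros(input_details[0]['shape'], dtype=np.uint8))
     interpreter.invoke()
     output = interpreter.get_tensor(
         interpreter.get_output_details()[0]['index'])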
  45. Coral: Dev Board
     CPU: i.MX 8M SoC with quad-core Cortex-A53
     GPU: integrated GC7000 Lite
     TPU: Google Edge TPU
     RAM: 1 GB LPDDR4
     Storage: 8 GB eMMC
     Security/Crypto: eMMC secure block for TrustZone; MCHP ATECC608A crypto chip
     Power: 5 V 3 A via Type-C connector
     Connectors: USB-C, RJ45, 3.5 mm TRRS, HDMI
     OS: Mendel Linux (Debian derivative), Android
     ML: TensorFlow Lite
  46. Coral: USB Accelerator
     TPU: Google Edge TPU
     Power: 5 V 3 A via Type-C connector
     Connectors: USB 3.1 (Gen 1) via Type-C
     Supported OS: Debian 6.0 or higher / other Debian derivatives
     Supported architectures: x86-64, ARMv8
     Supported ML: TensorFlow Lite
  47. Coral: SDK and market. I/O '19 brought a new SDK with a model compiler for serving custom models (limited op support). Competition in the edge AI ASIC market: Neural Compute Stick, Jetson Nano, Raspberry Pi 4 with add-ons, and more to come!
  48. Cloud TPU Pods: a full TPU pod on Google Cloud. What topics need resources this big? XLNet (June 2019): outperforms the BERT language model; trained on a Cloud TPU Pod in under 2.5 days, at roughly 260,000 US dollars (~312,000,000 KRW) for the 2.5 days of use. https://arxiv.org/abs/1906.08237
  49. Cloud TPU Pods: a full TPU pod on Google Cloud. What topics need resources this big? Google T5 (Oct. 2019): a unified text-to-text transformer that outperforms the XLNet language model (?!); trained on a Cloud TPU Pod for about two weeks. Guess the cost! https://arxiv.org/abs/1910.10683
  50. Elements: Framework ● TensorFlow 2.0 ● TensorFlow Extended ● TensorFlow Lite ● TF-Agents ● ML Kit ● TF Model Analysis ● TensorFlow Federated ● TF-NSL ● TensorFlow.js ● Swift for TensorFlow ● MLIR. Elements: Devices ● Coral, Edge TPU ● TPU Pods v3 ● Cloud TPU Pods (beta)
  51. Thank You! facebook/jeongkyu.shin | inureyes@gmail.com | github/inureyes
