Overview of Chainer
and Its Features
Deep Learning Tokyo 2016 at Yahoo! JAPAN
Seiya Tokui, Preferred Networks, Inc.
Mar. 20, 2016
This talk aims at providing
 The basics of deep learning frameworks
 The concept and characteristics of Chainer among them
 What you can do with Chainer
2
Typical flow of using DL frameworks
3
[Diagram: training data and parameters feed a stack of functions whose output is the objective; the objective is passed to a numerical optimizer]
1. Build a neural network (as a computational graph)
2. Feed it to a gradient-based numerical optimizer
3. The optimizer runs iterations over the training dataset
4. Extract the resulting parameters for some applications
Elements of Neural Network Implementations
 Multi-dimensional array
 Differentiable functions
– Called by various names (layers, modules, operators, primitives, etc.)
 Computational graphs
– DAG structure with executors (compiler or interpreter)
– Should support backpropagation
– May be optimized after construction
 Gradient-based numerical optimizers (SGD, Adam, etc.)
 Data loaders, training loops, etc.
4
Common goals of deep learning frameworks
 Making it easy to write code involving neural networks, and to run it
efficiently
 Four perspectives of DL frameworks:
– API to let users concentrate on the essential parts of NN models
 Automatic differentiation (backprop)
 Intuitive coding
– Extensibility to write a wide range of NN models
– Performance of executing the computational flow
 GPU support, parallelization
 Automatic optimization
– Portability of the network implementation (training and deployment phases)
5
Goals of Chainer
 Making it easy to write a wide range of code involving neural networks,
and to run it efficiently enough for most research
 What Chainer provides:
– API to let users concentrate on the essential parts of NN models
 Automatic differentiation (backprop)
 Intuitive coding: allow any Python control flows to appear in NNs
– Extensibility to write a wide range of NN models
– Performance of executing the computational flow
 GPU support, parallelization (multi-GPU support)
 Automatic optimization of computation (future work)
– Portability of the network implementation (training and deployment phases)
(Future work: current Chainer depends heavily on CPython, and deployment
to environments without CPython might have to be done by other frameworks)
6
Basic information
7
Chainer
 Python-based framework for neural nets
 Open sourced: June 2015
 Core development:
Preferred Networks / Preferred Infrastructure
 Current version: v1.7.1
 Mainly designed for fast research and prototyping
Important URLs
 http://chainer.org/
 https://github.com/pfnet/chainer
Overall structure of Chainer
8
[Diagram: Chainer sits on top of NumPy (CPU, backed by BLAS) and CuPy (NVIDIA GPU, backed by CUDA and cuDNN)]
Backpropagation in Chainer
 Consider an objective L = f(x * w + b)
 This code computes the value of L (i.e. forward prop), and
simultaneously builds the following “backward graph”
– (In the diagram, each node is either a Variable or a Function)
 Using this graph, one can compute the gradient of L with respect to any
variable by backpropagation
 The Optimizer then updates the parameters using these gradients
9
[Graph: x and w feed *, its output and b feed +, the sum feeds f, which yields L]
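A minimal runnable sketch of this example (f is taken here to be sigmoid, and the shapes are illustrative assumptions, not from the slide):

import numpy as np
import chainer.functions as F
from chainer import Variable

# Illustrative shapes; in a real model, w and b would be parameters of a Link
x = Variable(np.random.randn(1, 3).astype(np.float32))
w = Variable(np.random.randn(1, 3).astype(np.float32))
b = Variable(np.random.randn(1, 3).astype(np.float32))

L = F.sum(F.sigmoid(x * w + b))  # forward prop; the backward graph is recorded
L.backward()                     # backprop from the scalar objective
print(w.grad)                    # dL/dw, same shape as w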
Paradigms of BP: Define and Run vs Define by Run
 Define and Run (most DL frameworks)
– Computational graphs are constructed before any forward/backward
propagation (i.e. it defines graphs AND then runs them)
– Pros: easy to optimize, high portability (the definition of forward/backward prop
can be serialized to a static data structure)
– Cons: hard to write graphs whose shapes depend on data; control flows in the
graphs require special treatment
 Define by Run (Chainer and autograd)
– Graphs are constructed during the forward computation (i.e. it defines graphs
BY running forward computations)
– Pros: the shape of the graph can change between iterations, and any control
flow of the host language can be used to define the forward computation
– Cons: hard to optimize the forward computation
10
Control flows in writing NNs: a case of RNN
rnn = RNN()
xs = [list of arrays]   # the length can be changed for every iteration
ys = [list of arrays]
loss = 0
for x, y in zip(xs, ys):      # you can use a for loop with arbitrary
    x_var = Variable(x)       # loop conditions (you can even use the
    y_var = Variable(y)       # results of forward computations here)
    y_pred = rnn(x_var)
    loss += L(y_pred, y_var)
loss.backward()     # backward through the dynamically constructed graph
optimizer.update()
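The snippet above assumes rnn and optimizer are already set up; a hypothetical sketch of that setup in the v1 style (RNN and L are placeholders from the slide, not Chainer names):

from chainer import optimizers

rnn = RNN()             # some user-defined Chain computing one RNN step
optimizer = optimizers.SGD()
optimizer.setup(rnn)    # bind the optimizer to the model's parameters
rnn.zerograds()         # clear accumulated gradients before each backward()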
11
Debug NNs just like programs
 In Chainer, a NN is just a fragment of a Python program
– Functions applied to variables are used for later backprop
 Errors in forward computation occur right at the execution of user code
– They can be debugged just like usual Python programs
(using ordinary stack traces, pdb, etc.)
– Easy to print-debug (no need to add an auxiliary function)
– Easy to execute a part of NN in debug mode
 Just by switching the mode before and after the execution of the part
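For instance, a hypothetical print-debug inside a model's __call__ (the layer names are assumptions, echoing the MLP example later in this deck):

def __call__(self, x):
    h = relu(self.l1(x))
    print(h.data.mean(), h.data.std())  # plain print on the intermediate array
    return self.l2(h)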
12
Extensibility – built-in Functions (differentiable!)
 Mathematics
Arithmetic, common elementwise maths, matrix product and inversion, sum
along axes
 Activation functions
Most popular activations (sigmoid, tanh, relu family, maxout, lstm family)
 Array routines
Useful routines, most of which are borrowed from the NumPy API
(reshape, broadcast, concat/split_axis, transpose, where, etc.)
 Neural net connections
To implement trainable layers (linear, 2d convolution, word embedding, etc.)
 Loss functions
Typical loss functions over minibatch (softmax cross entropy, elementwise
sigmoid cross entropy, hinge loss, MSE, Negative Sampling, Hierarchical SoftMax,
CTC, etc.)
 Many others (dropout, batch_normalization, pooling, SPP, unpooling, LRN, etc.)
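A small sketch combining a few of these built-ins (shapes and labels are illustrative assumptions):

import numpy as np
import chainer.functions as F
from chainer import Variable

x = Variable(np.random.randn(4, 3).astype(np.float32))
h = F.reshape(F.relu(x), (2, 6))        # activation + array routine
t = Variable(np.array([0, 3], dtype=np.int32))
loss = F.softmax_cross_entropy(h, t)    # loss over a minibatch of 2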
13
Extensibility – writing custom Functions (1)
 Function consists of two methods: forward and backward
class MulAdd(Function):
    def forward(self, inputs):
        x, y, z = inputs
        w = x * y + z
        return w,

    def backward(self, inputs, grad_outputs):
        x, y, z = inputs
        gw = grad_outputs[0]
        gx = y * gw
        gy = x * gw
        gz = gw
        return gx, gy, gz
 This Function implements an elementwise expression x * y + z
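In the v1 API, the Function instance is applied directly to Variables; a minimal usage sketch:

# x, y, z are Variables holding arrays of the same shape
w = MulAdd()(x, y, z)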
14
Extensibility – writing custom Functions (2)
 Using NumPy/CuPy, you can write “device-agnostic codes” to implement
Functions
 Suppose x and y are arrays either on the CPU or on the GPU:
xp = cuda.get_array_module(x, y)
z = xp.exp(x) + xp.exp(y)
 This code executes exp(x) + exp(y) regardless of the type of x and y
(numpy.ndarray or cupy.ndarray)
– xp refers to either numpy or cupy
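Putting the two ideas together, a hypothetical device-agnostic Function (ExpAdd is an invented name, not a Chainer built-in):

from chainer import Function, cuda

class ExpAdd(Function):
    """Elementwise exp(x) + exp(y), runnable on CPU and GPU arrays."""
    def forward(self, inputs):
        x, y = inputs
        xp = cuda.get_array_module(x, y)   # numpy or cupy
        return xp.exp(x) + xp.exp(y),

    def backward(self, inputs, grad_outputs):
        x, y = inputs
        xp = cuda.get_array_module(x, y)
        g = grad_outputs[0]
        return xp.exp(x) * g, xp.exp(y) * g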
15
CuPy – NumPy-like GPU array
 CuPy is a multi-dimensional array library for CUDA
 It implements many interfaces compatible with NumPy
– ndarray type
– Elementwise operations (including ufuncs) and reduction operations
– Full support of basic indexing
 It also supports multiple GPUs
– copy and copyto can be applied to arrays on different devices
 Chainer uses a memory pool to avoid calling cudaMalloc during iterations
(cudaMalloc synchronizes the whole device, which stops hiding the Python overhead!)
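A tiny sketch of the NumPy-compatible surface (values are illustrative):

import cupy as cp

x = cp.arange(6, dtype=cp.float32).reshape(2, 3)   # ndarray on the GPU
y = (x * x).sum(axis=1)        # elementwise op + reduction, NumPy-style
print(cp.asnumpy(y))           # copy back to the host: [ 5. 50.]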
16
CuPy – customized kernels
 It also supports easy-to-write custom kernels
 Example: muladd in one kernel
w = cuda.elementwise(
    'T x, T y, T z',     # input argument list (T: type placeholder)
    'T w',               # output
    'w = x * y + z',     # code applied to every element
    'muladd_forward'     # kernel name
)(x, y, z)               # invocation
 Kernels are compiled on-the-fly
– Compiled kernels are cached to the disk and reused in later uses
– It also caches the kernels sent to each device and reuses them in the same
process
17
Extensibility – Link for binding params to Functions
 You can think of it as a “layer” in classic NN definitions
 Example: a simple fully-connected layer
class FullyConnected(Link):
    def __init__(self, n_in, n_out):
        super(FullyConnected, self).__init__()
        self.add_param('W', (n_out, n_in))
        self.add_param('b', n_out)

    def __call__(self, x):
        a = dot(x, transpose(self.W))
        a, b = broadcast(a, self.b)
        return a + b
 Note that equivalent (and more feature-rich) Link is also provided as
chainer.links.Linear
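A hedged usage sketch (shapes are illustrative; the slide assumes dot, transpose, and broadcast are already in scope):

import numpy as np
from chainer import Variable

fc = FullyConnected(784, 100)
x = Variable(np.random.randn(32, 784).astype(np.float32))  # minibatch of 32
y = fc(x)   # y.data has shape (32, 100)
# W and b are created uninitialized by add_param, so the values in y
# are meaningless until the parameters are initialized or trained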
18
Extensibility – Chain as a reusable NN component
 Chain is a kind of Link having ability to combine one or more child links
 Examples: Multi-Layer Perceptron and AutoEncoder
19
class MLP(Chain):
    def __init__(self):
        super(MLP, self).__init__(
            l1=Linear(784, 100),
            l2=Linear(100, 10),
        )

    def __call__(self, x):
        h = relu(self.l1(x))
        return self.l2(h)

class AE(Chain):
    def __init__(self, enc, dec):
        super(AE, self).__init__(
            encoder=enc,   # child chain
            decoder=dec,   # child chain
        )

    def __call__(self, x):
        h = self.encoder(x)
        x_hat = self.decoder(h)
        return mean_squared_error(x, x_hat)
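A hypothetical composition and optimizer setup (the decoder here is a single Linear, just to keep the shapes consistent; a Chain accepts bare Links as children too):

from chainer import optimizers
from chainer.links import Linear

model = AE(enc=MLP(), dec=Linear(10, 784))  # decoder mirrors MLP's 784 -> 10
optimizer = optimizers.Adam()
optimizer.setup(model)    # collects every parameter in the chain tree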
Features of Link and Chain
 You can collect parameters from Link/Chain
 Link/Chain are easy to serialize
– Just pass them to a Serializer
– Chainer currently supports serialization to NPZ (NumPy) and HDF5
– It only serializes parameters (and specifically registered “persistent values”)
 There is another kind of chain, ChainList, for defining a chain with an
arbitrary number of child links
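A minimal serialization sketch (using the MLP chain from the previous slide; the filename is arbitrary):

from chainer import serializers

model = MLP()
serializers.save_npz('mlp.npz', model)   # writes parameters and persistent values
serializers.load_npz('mlp.npz', model)   # restores them into a compatible chain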
20
Summary
 Chainer is a deep learning framework for researchers, with high flexibility
and ease of writing NNs
– Computational graphs are constructed only for backprop, and are built
on-the-fly during the forward computations
– This lets us build a different graph for every iteration
– It also makes the NNs easy to debug
 You can write device-agnostic code using NumPy and CuPy
– Not only that, CuPy also makes it easy to write custom kernels without
writing boilerplate code
 Link/Chain is a convenient way to write fragments of NNs as reusable
components, with support for serialization etc.
21