Kenta Oono (github: delta2323)
Kosuke Nakago (github: corochann)
Deep learning for molecules
Introduction to Chainer Chemistry
Table of contents
1. What is machine learning?
a. Data driven approach
b. Primer of deep learning (MLP/ CNN)
2. Prediction of chemical characteristics
a. Rule-based approach vs. Learning-based approach
b. Neural Message passing (NFP / GGNN etc.)
3. Chainer Chemistry
a. Primer of Chainer
b. Coding examples
4. Other topics
a. Generation of chemical compounds
b. Automatic chemical synthesis
Why machine learning?
Example: Prediction of age from pictures
● What criteria can we use?
○ height, hair, clothes, physique etc. ?
○ Not all criteria are perfect.
● Even if we have good criteria, how could we extract them?
○ People in pictures can have different positions, scale, postures.
○ How can we detect each part (face, hair etc.) within a body?
=> It is very difficult to enumerate the rules manually.
Picture: irastoya
Approach by machine learning
Provide machines with a vast number of images with age information and have
them discover the trends characteristic of each age group.
Humans do not explicitly tell machines where in the images to look.
Photo: Flickr
Application of machine learning
Task | Input | Output
Chemical prediction | Molecule | Chemical characteristics (HOMO etc.)
Mail classification | E-mail (sentences, header) | Spam or Normal or Important
Data center electricity | Packets of each | Estimated electricity demand
Web marketing | Access history, ad contents | Click or not
Surveillance camera | Movie | Suspicious behavior or not
Categorization of machine learning algorithms
● By dataset types
● Supervised learning (with ground truth labels)
● Unsupervised learning (without ground truth labels)
● Semi-supervised learning (A part of samples has ground truth labels)
● Reinforcement learning (Reward instead of labels)
● By methods
● Classification, Regression, Clustering, Nearest neighbor
● Others
● Discriminative model vs. generative model / Bayesian vs. frequentist etc.
Deep Learning
A general term for the subfield of machine learning that uses models
consisting of (typically many) simple and differentiable transformations.
Multi Layer Perceptron (MLP)
[Figure: Input → f1 → f2 → f3, compared with Ground truth]
Learnable parameters
• $W_1, W_2$: parameter matrices
• $b_1, b_2$: bias vectors
Forward propagation
• $h = f_1(x) = \mathrm{Sigmoid}(W_1 x + b_1)$
• $k = f_2(h) = \mathrm{Sigmoid}(W_2 h + b_2)$
• $y = f_3(k) = \mathrm{SoftMax}(k)$
(equivalently, $y_i = \exp(k_i) / \sum_j \exp(k_j)$)
Training dataset
• Feature vectors: $x_1, x_2, \ldots, x_N$
• Ground truth labels: $t_1, t_2, \ldots, t_N$
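The forward propagation above can be sketched in plain NumPy. This is a minimal illustration rather than the Chainer implementation; the layer sizes (4, 5, 3) and the random parameter values are assumptions made only for the example.

```python
import numpy as np


def sigmoid(z):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def softmax(k):
    """y_i = exp(k_i) / sum_j exp(k_j); subtracting the max only improves stability."""
    e = np.exp(k - np.max(k))
    return e / e.sum()


def mlp_forward(x, W1, b1, W2, b2):
    h = sigmoid(W1 @ x + b1)  # h = f1(x)
    k = sigmoid(W2 @ h + b2)  # k = f2(h)
    return softmax(k)         # y = f3(k)


rng = np.random.default_rng(0)
x = rng.normal(size=4)                          # one feature vector (size is hypothetical)
W1, b1 = rng.normal(size=(5, 4)), np.zeros(5)   # first layer: 4 -> 5
W2, b2 = rng.normal(size=(3, 5)), np.zeros(3)   # second layer: 5 -> 3
y = mlp_forward(x, W1, b1, W2, b2)              # positive entries that sum to 1
```

Because the last transform is a softmax, y can be read as a probability distribution over 3 classes and compared with the ground truth label t during training.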
Each transform consists of a fully-connected layer and an activation function
Fully connected layer
● Learnable parameters:
● W (weight matrix of size M x N)
● b (bias vector of size M)
● Input: vector x of size N
● Output: vector y = Wx + b of size M (affine transformation)
Activation function
● Function (usually) without learnable parameters for introducing non-linearity
● Input: vector (or tensor) $x = (x_1, \ldots, x_n)$
● Output: vector (or tensor) $y = (y_1, \ldots, y_n)$ with $y_i = \sigma(x_i)$ $(i = 1, \ldots, n)$
Examples of σ
● $\mathrm{Sigmoid}(x) = 1 / (1 + \exp(-x))$
● $\tanh(x)$
● $\mathrm{ReLU}(x) = \max(0, x)$
● $\mathrm{LeakyReLU}(x) = x$ $(x > 0)$, $ax$ $(x \le 0)$
○ $a > 0$ is a fixed constant
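As a rough sketch, the example activation functions can be written element-wise in NumPy. The LeakyReLU slope a = 0.2 is an arbitrary small positive value chosen for the illustration, not one given on the slide.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def relu(x):
    return np.maximum(0.0, x)


def leaky_relu(x, a=0.2):
    # Identity for positive inputs, slope a (assumed small and positive) otherwise.
    return np.where(x > 0, x, a * x)


x = np.array([-2.0, 0.0, 3.0])
y_sigmoid = sigmoid(x)
y_tanh = np.tanh(x)
y_relu = relu(x)          # negative entries are clipped to 0
y_leaky = leaky_relu(x)   # negative entries are scaled by a
```

Each function is applied independently to every element, so the output has the same shape as the input.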
Convolutional Neural Network (CNN)[LeCun+98]
• A neural network consisting of convolutional layers and pooling layers
• Many variants: AlexNet, VGG, Inception, GoogLeNet, ResNet etc.
• Widely used in image recognition and recently applied to biology and chemistry
LeCun, Yann, et al. "Gradient-based learning applied to
document recognition." Proceedings of the IEEE 86.11
(1998): 2278-2324.
Convolution operation (stride = 1 case)
input (6 x 6):
1 1 1 0 0 0
0 1 1 1 0 0
0 0 1 1 1 0
0 0 1 1 0 0
0 1 1 0 0 0
0 0 0 0 0 0
filter (3 x 3):
1 0 1
0 1 0
1 0 1
input * filter = output (4 x 4):
4 3 4 1
2 4 3 3
2 3 4 1
2 2 1 1
Convolution operation (stride = 3 case)
input (6 x 6):
1 1 1 0 0 0
0 1 1 1 0 0
0 0 1 1 1 0
0 0 1 1 0 0
0 1 1 0 0 0
0 0 0 0 0 0
filter (3 x 3):
1 0 1
0 1 0
1 0 1
input * filter = output (2 x 2):
4 1
2 1
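The two convolution examples can be reproduced with a short NumPy sketch. The helper name conv2d is hypothetical, and the loop-based implementation is written for clarity rather than speed; it computes the sliding-window sum of products without padding, as is usual in deep learning.

```python
import numpy as np


def conv2d(image, kernel, stride=1):
    """Slide the kernel over the image (no padding) and sum the element-wise products."""
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w), dtype=image.dtype)
    for i in range(out_h):
        for j in range(out_w):
            top, left = i * stride, j * stride
            patch = image[top:top + kh, left:left + kw]
            out[i, j] = np.sum(patch * kernel)
    return out


image = np.array([
    [1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
])
kernel = np.array([
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
])

out_stride1 = conv2d(image, kernel, stride=1)  # 4 x 4 output
out_stride3 = conv2d(image, kernel, stride=3)  # 2 x 2 output
```

The output size follows (input size - filter size) // stride + 1, which is why stride 3 shrinks the 6 x 6 input to 2 x 2.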
Feature extraction by filters
Convolutional layer
Stack several filters whose parameters are learnable
Stacking convolutional layers
A convolutional layer with stride k produces an output whose height and width
are approximately k times smaller than the input's.
Pooling layers
How can we generalize convolution operations to arbitrary graphs?
Images: grid graph / Molecules: arbitrary graph
Chemical prediction - Two approaches
Quantum simulation
Theory-based approach.
DFT (Density Functional Theory)
→ Pros: Precision is guaranteed
Cons: High calculation cost
Machine learning
Data-based approach.
Learn known compounds' properties, predict new compounds' properties.
→ Pros: Low cost, high speed calculation
Cons: No precision guaranteed
“Neural message passing for quantum chemistry”, Gilmer et al.
Extended Connectivity Fingerprint (ECFP)
Convert a molecule into a fixed-length bit representation
- Calculation is fast
- Shows the presence of particular substructures
- Bit collision: two (or more) different substructural features could be
represented by the same bit position
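Bit collision can be illustrated with a toy hashed fingerprint. This is not the ECFP algorithm itself: the function toy_fingerprint, the substructure strings, and the use of MD5 as the hash are all assumptions made for the illustration.

```python
import hashlib


def toy_fingerprint(substructures, n_bits):
    """Hash each substructure identifier to a bit position in a fixed-length vector."""
    bits = [0] * n_bits
    for s in substructures:
        digest = hashlib.md5(s.encode("utf-8")).hexdigest()
        bits[int(digest, 16) % n_bits] = 1
    return bits


# Five distinct (hypothetical) substructure identifiers.
features = ["C-C", "C=O", "C-N", "c1ccccc1", "C#C"]

# With only 4 bits, at least two of the 5 features must share a bit position.
fp_small = toy_fingerprint(features, n_bits=4)

# A longer vector makes collisions less likely but never impossible.
fp_large = toy_fingerprint(features, n_bits=1024)
```

With 5 features and 4 bits a collision is unavoidable by the pigeonhole principle, so the fingerprint cannot tell which of the colliding substructures is present.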
Problems of conventional methods
1. Input representation is not unique;
the result depends on the representation of the input
e.g. SMILES representation
  CC#C and C#CC are the same molecule.
2. Order invariance is not guaranteed
– the representation is not guaranteed to be invariant to relabeling (i.e.
permutation of indices) of the atoms in a molecule.
How graph convolution works
[Figure: CNN on image (output: class label) vs. Graph convolution]
Atom feature embedding: 1. Man-made features
[Figure: each atom is described by a man-made feature vector that includes its atom type, e.g.]
1.0 0.0 0.0 6.0 1.0
0.0 1.0 0.0 7.0 1.0
0.0 0.0 1.0 8.0 1.0
Molecular Graph Convolutions: Moving Beyond Fingerprints
Steven Kearnes, Kevin McCloskey, Marc Berndl, Vijay Pande, Patrick Riley arXiv:1603.00856
Atom feature embedding: 2. Embed in vector space
[Figure: each atom is mapped to a vector, e.g.]
0.5 1.2 1.0 1.0 1.8
0.8 1.0 1.3 0.1 1.5
0.5 1.0 0.5 2.0 0.0
Each atom is randomly assigned to some position in vector space
(learnable parameter)
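Embedding atoms in a vector space amounts to a table lookup. The sketch below assumes atoms are identified by atomic number and uses a hypothetical table of 10 rows and 5 columns; in a real model the table would be updated by gradient descent along with the other parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

max_atomic_num = 10  # hypothetical: number of rows in the embedding table
embed_dim = 5        # hypothetical: size of each atom vector

# Randomly initialized table; this is the learnable parameter.
embedding = rng.normal(size=(max_atomic_num, embed_dim))

# A molecule given as a list of atomic numbers: C, N, O, C.
atoms = np.array([6, 7, 8, 6])

# Row lookup: one vector per atom, shape (4, 5).
atom_features = embedding[atoms]
```

Atoms of the same type share the same row, so both carbon atoms start from an identical vector and are only distinguished later by their neighborhoods.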
Graph Convolution: update each node's (atom's) feature
The feature of each node is updated (several times) by the
Graph Convolution operation.
Han Altae-Tran, Bharath Ramsundar, Aneesh S. Pappu, & Vijay Pande (2017). Low Data Drug
Discovery with One-Shot Learning. ACS Cent. Sci., 3 (4)
Graph Gather: Extract the feature of the whole graph (molecule)
The updated features of the nodes are finally combined to form the
graph's (molecule's) feature by the Graph Gather operation.
Han Altae-Tran, Bharath Ramsundar, Aneesh S. Pappu, & Vijay Pande (2017). Low Data Drug
Discovery with One-Shot Learning. ACS Cent. Sci., 3 (4)
Unified view of graph convolution
Many message-passing algorithms (NFP, GGNN, Weave etc.) are formulated as the
iterative application of Update and Readout functions [Gilmer et al. 17].
Update: Aggregates neighborhood information and updates node representations.
Readout: Aggregates all node representations and updates the final output.
Gilmer, J., Schoenholz, S. S., Riley, P. F., Vinyals, O., & Dahl, G. E. (2017). Neural message
passing for quantum chemistry. arXiv preprint arXiv:1704.01212.
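The Update/Readout scheme can be sketched generically in NumPy. The specific choices here (sum over neighbors, a shared weight matrix W, tanh, and a sum readout) are simplifying assumptions for illustration and do not correspond to any one of the named models.

```python
import numpy as np


def update(h, adj, W):
    """Aggregate neighbor features (adj @ h), add the node's own feature, transform."""
    return np.tanh((h + adj @ h) @ W)


def readout(h):
    """Combine all node representations into one graph-level vector."""
    return h.sum(axis=0)


def message_passing(h, adj, W, n_steps=2):
    for _ in range(n_steps):
        h = update(h, adj, W)
    return readout(h)


# A chain of three atoms 0-1-2 with 4-dimensional features (values are arbitrary).
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
h0 = rng.normal(size=(3, 4))
W = rng.normal(size=(4, 4))
out = message_passing(h0, adj, W)

# Relabel the atoms: permute the features and the adjacency matrix consistently.
P = np.eye(3)[[2, 0, 1]]
out_relabelled = message_passing(P @ h0, P @ adj @ P.T, W)
```

Because both the neighbor aggregation and the readout are sums, the graph-level output does not change when the atoms are relabelled, which addresses the order-invariance problem of conventional representations.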
Graph convolution neural network variants
- NFP: Neural Fingerprint
- GGNN: Gated-Graph Neural Network
- WeaveNet: Molecular Graph Convolutions
- SchNet: A continuous-filter convolutional NN
NFP: Neural Fingerprint
“Convolutional Networks on Graphs for Learning Molecular Fingerprints”
Message passing
- update feature r
- extract output f from r
David K Duvenaud, Dougal Maclaurin, Jorge Iparraguirre, Rafael Bombarell, Timothy Hirzel, Alan Aspuru-Guzik,
and Ryan P Adams. Convolutional networks on graphs for learning molecular fingerprints.
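NFP: Neural Fingerprint
[Figure: update of an atom feature, h7 → h'7]
$h'_v = \sigma\bigl(W_{\deg(v)} (h_v + \sum_{w \in N(v)} h_w)\bigr)$
(e.g. $W_3$ for an atom with three neighbors, $W_2$ for an atom with two neighbors)
Graph convolution operation depends on degree of each atom
→ Bonding type information is not utilized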
NFP: Neural Fingerprint
Readout operation is basically a simple sum over the atoms
→ No selective operation/attention mechanism is adopted.
$R = \sum_i \mathrm{softmax}(W h_i)$
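A minimal NumPy sketch of an NFP-style layer, following the description above: the update uses a weight matrix selected by the degree of each atom, and the readout sums softmax(W h_i) over the atoms. The exact form of the update (own feature plus the sum of neighbor features, followed by a sigmoid) is taken from Duvenaud et al. as I understand it; the graph, dimensions, and parameter values are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


def nfp_update(h, adj, W_by_degree):
    """Update every atom with the weight matrix that matches its degree."""
    degree = adj.sum(axis=1).astype(int)
    aggregated = h + adj @ h  # own feature + sum of neighbor features
    rows = [sigmoid(W_by_degree[int(d)] @ a) for d, a in zip(degree, aggregated)]
    return np.stack(rows)


def nfp_readout(h, W_out):
    """R = sum_i softmax(W h_i): a plain sum over atoms, no gating."""
    return softmax(h @ W_out.T).sum(axis=0)


# Four atoms; atom 1 is bonded to atoms 0, 2 and 3 (degree 3), the others have degree 1.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 1],
                [0, 1, 0, 0],
                [0, 1, 0, 0]], dtype=float)
rng = np.random.default_rng(0)
h = rng.normal(size=(4, 5))
W_by_degree = {d: rng.normal(size=(5, 5)) for d in (1, 2, 3)}
W_out = rng.normal(size=(8, 5))  # fingerprint length 8 (hypothetical)

h_new = nfp_update(h, adj, W_by_degree)
R = nfp_readout(h_new, W_out)
```

Note that the adjacency matrix only records whether two atoms are bonded, so single and double bonds are treated identically in this update.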
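GGNN: Gated Graph Neural Network
[Figure: update of an atom feature, h7 → h'7]
$h'_v = \mathrm{GRU}\bigl(h_v, \sum_{w \in N(v)} W_{e_{vw}} h_w\bigr)$
(a separate weight matrix, e.g. $W_1$, is used for each bond type $e_{vw}$)
Graph convolution operation depends on bonding type of each atom pair
GRU: Gated Recurrent Unit
Yujia Li, Daniel Tarlow, Marc Brockschmidt, and Richard Zemel. Gated graph sequence neural networks.
arXiv preprint arXiv:1511.05493, 2015.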
GGNN: Gated Graph Neural Network
Readout operation contains selective operation (gating)
$R = \sum_v \sigma(i(h_v, h_v^0)) \odot j(h_v)$
Simplified version:
$R = \sum_v \sigma(W_i h_v) \odot W_j h_v$
Here, i and j represent some functions (neural networks), and
σ is the sigmoid function.
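The simplified gated readout, with linear maps standing in for the functions i and j, can be sketched in NumPy as follows. The dimensions and random values are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def gated_readout(h, W_i, W_j):
    """R = sum_v sigmoid(W_i h_v) * (W_j h_v), with an element-wise product."""
    gate = sigmoid(h @ W_i.T)   # values in (0, 1): how much each atom contributes
    value = h @ W_j.T           # what each atom contributes
    return (gate * value).sum(axis=0)


rng = np.random.default_rng(0)
h = rng.normal(size=(4, 5))     # 4 atoms, 5-dimensional features
W_i = rng.normal(size=(3, 5))
W_j = rng.normal(size=(3, 5))

R = gated_readout(h, W_i, W_j)

# With a zero gate matrix every gate equals 0.5, so the readout reduces to
# half of the plain sum over atoms.
R_ungated = gated_readout(h, np.zeros((3, 5)), W_j)
```

The gate lets the model weight atoms differently per output dimension, in contrast to a readout that sums all atoms with equal weight.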
Weave: Molecular Graph Convolutions
● The Weave module convolves atom features using the features of atom pairs.
A: atom feature, P: feature of atom pair
● P → A operation:
g() is a function for order invariance;
sum() is used in the paper.
Molecular Graph Convolutions: Moving Beyond Fingerprints
Steven Kearnes, Kevin McCloskey, Marc Berndl, Vijay Pande, Patrick Riley arXiv:1603.00856
SchNet: A continuous-filter convolutional neural network
Kristof Schütt, Pieter-Jan Kindermans, Huziel Enoc Sauceda Felix, Stefan Chmiela, Alexandre Tkatchenko, and Klaus-Robert Müller.
SchNet: A continuous-filter convolutional neural network for modeling quantum interactions.
1. All atom-pair distances $\|r_i - r_j\|$ are used as input
2. An energy-conserving condition can additionally be used to constrain the model
for energy prediction tasks
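The distance input can be sketched as a pairwise distance matrix computed from 3D atom positions. The coordinates below are made up for the illustration; the check at the end shows the well-known property that such distances do not change under rotation and translation of the molecule.

```python
import numpy as np


def pairwise_distances(r):
    """Return the matrix D with D[i, j] = ||r_i - r_j|| for positions r of shape (n, 3)."""
    diff = r[:, None, :] - r[None, :, :]
    return np.linalg.norm(diff, axis=-1)


# Three atoms with hypothetical coordinates.
r = np.array([[0.0, 0.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 2.0, 0.0]])
D = pairwise_distances(r)

# Rotate by 90 degrees around the z axis, then translate.
rotation = np.array([[0.0, -1.0, 0.0],
                     [1.0, 0.0, 0.0],
                     [0.0, 0.0, 1.0]])
r_moved = r @ rotation.T + np.array([1.0, -2.0, 0.5])
D_moved = pairwise_distances(r_moved)
```

Using distances rather than raw coordinates means the model input is the same however the molecule is placed in space.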
Comparison between graph convolution networks
 | NFP | GGNN | Weave | SchNet
Atom feature | Man-made or | Man-made or | Man-made or | Man-made or
Graph convolution | Adjacent atoms only | Adjacent atoms only | All atom-atom pairs | All atom-atom pairs
How to represent | Degree | Bond type | Man-made pair features | distance etc.
Example: IT Drug Discovery Contest
• Find new seed compounds for a target protein (Sirtuin 1) from 2.5 million
compounds using IT
• Each team needs to prepare its own data, such as training datasets.
• Each team can submit up to 400 candidate compounds
• Judges check all submitted compounds
by a 2-stage biological experiment.
– Thermal Shift Assay
– Inhibitory assay → IC50 measurement
Contest website (Japanese)
Our result
 | Ours | Average (18 teams in total)
1st screening (TSA) | 23 / 200 (11.5%) | 69 / 3559 (1.9%)
2nd screening (IC50) | 1 | 5
We found one hit compound and won
one of the Grand prizes (IPAB prize)
Extension to semi-supervised learning
Compute representations of subgraphs inductively
with neural message passing (→)
Optimize the representations in an unsupervised
manner, in the same way as Paragraph Vector (↓)
Nguyen, H., Maeda, S. I., & Oono, K. (2017). Semi-supervised learning of
hierarchical representations of molecules using neural message passing.
arXiv preprint arXiv:1711.10168.
How can we incorporate ML into Chemistry and Biology?
• Optimized graph convolution algorithms are hard to implement
from scratch.
• ML and Chemistry/Biology researchers sometimes use different
Solution: Create tools so that …
• Chemistry/Biology researchers need not bother with the details of DL
algorithms and can concentrate on their research.
• ML and Chemistry researchers can work in collaboration.
-> We are developing Chainer Chemistry
Picture: irastoya
A Python framework that lets researchers quickly implement, train,
and evaluate deep learning models.
Designing a network Training, evaluation
Speed up research and development of deep learning and its applications.
• Build DL models as a Python program
→ Can write complex network (loop, branch etc.) easily
• Define-by-Run: dynamic model construction
→ Can make full use of Python stacktrace in debugging
→ Can support data-dependent neural networks natively
• CuPy: NumPy-like GPU array library
→ Can write CPU/GPU agnostic code
Basic information
• First release: June 2015
• Version
– v3.3.0 (stable)
– v4.0.0b3 (develop)
• License: MIT
• Language: Python
Example: Build and train convolutional Network
import chainer
import chainer.links as L
import chainer.functions as F

class LeNet5(chainer.Chain):
    def __init__(self):
        super(LeNet5, self).__init__()
        with self.init_scope():
            self.conv1 = L.Convolution2D(1, 6, 5, 1)
            self.conv2 = L.Convolution2D(6, 16, 5, 1)
            self.conv3 = L.Convolution2D(16, 120, 4, 1)
            self.fc4 = L.Linear(None, 84)
            self.fc5 = L.Linear(84, 10)

    def __call__(self, x):
        h = F.sigmoid(self.conv1(x))
        h = F.max_pooling_2d(h, 2, 2)
        h = F.sigmoid(self.conv2(h))
        h = F.max_pooling_2d(h, 2, 2)
        h = F.sigmoid(self.conv3(h))
        h = F.sigmoid(self.fc4(h))
        return self.fc5(h)
Example: Build and train convolutional Network
from chainer import iterators, optimizers, training

model = LeNet5()
model = L.Classifier(model)
# Dataset is a list! ([] to access, having __len__)
dataset = [(x1, t1), (x2, t2), ...]
# iterator to return a mini-batch retrieved from dataset
it = iterators.SerialIterator(dataset, batch_size=32)
# Optimization methods (you can easily try various methods by changing SGD to
# MomentumSGD, Adam, RMSprop, AdaGrad, etc.)
opt = optimizers.SGD(lr=0.01)
opt.setup(model)  # register the model's parameters with the optimizer
updater = training.StandardUpdater(it, opt, device=0)  # device=-1 if you use CPU
trainer = training.Trainer(updater, stop_trigger=(100, 'epoch'))
trainer.run()  # start the training loop
Add-on packages for Chainer
Chainer Chemistry
Chainer extension library for Biology and Chemistry
Technological Stack
- File Parser (SDF file, CSV file) / QM9, Tox21 dataset
- Preprocessor (Feature Extractor)
- Graph convolution NN (NFP, GGNN, SchNet)
- Train and prediction with QM9/Tox21
Chainer Chemistry
Chainer extension library for Biology and Chemistry
Basic information
Release: 12/14/2017, version: v0.1.0, license: MIT, language: Python
• State-of-the-art deep learning neural network models (especially graph
convolutions) for chemical molecules (NFP, GGNN, Weave, SchNet etc.)
• Preprocessors of molecules tailored for these models
• Parsers for several standard file formats (CSV, SDF etc.)
• Loaders for several well-known datasets (QM9, Tox21 etc.)
Dataset introduction - tox21
Dataset size: Train 11757, Validation 295, Test 645
Label - The following 12 types of toxicity are included:
'NR-AR', 'NR-AR-LBD', 'NR-AhR', 'NR-Aromatase', 'NR-ER', 'NR-ER-LBD',
'NR-PPAR-gamma', 'SR-ARE', 'SR-ATAD5', 'SR-HSE', 'SR-MMP', 'SR-p53'
LABEL: [ 0 1 -1 1 -1 1 -1 -1 1 -1 1 1]
LABEL: [ 0 0 0 -1 1 0 0 -1 -1 -1 0 0]
LABEL: [ 0 0 1 0 1 1 0 1 0 0 -1 -1]
LABEL: [ 0 0 1 -1 1 1 -1 0 0 0 1 0]
2948 3895 6558 7381
Dataset introduction - QM9
Dataset size: 133,885
Label - The following properties are included:
'A', 'B', 'C', 'mu', 'alpha', 'homo', 'lumo', 'gap', 'r2', 'zpve', 'U0', 'U', 'H', 'G', 'Cv'
LABEL: [ 3.51 1.93 1.29 2.54
64.1 -0.236 -2.79e-03 2.34e-01
900.7 0.12 -396.0 -396.0
-396.0 -396.0 26.9]
LABEL: [3.285 2.062 1.3 4.218
68.69 -0.224 -0.056 0.168
914.65 0.131 -379.959 -379.951
-379.95 -379.992 27.934]
LABEL: [2.729 1.853 1.474 4.274
61.94 -0.282 -0.026 0.256
887.402 0.104 -473.876 -473.87
-473.869 -473.907 24.823]
LABEL: [ 3.64 2.218 1.938 0.863
69.48 -0.232 0.074 0.306
756.356 0.128 -400.633 -400.628
-400.627 -400.662 23.434]
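Example: HOMO Prediction by NFP with QM9 dataset
Dataset preprocessing (for NFP Network)
from chainer.datasets import split_dataset_random
from chainer_chemistry import datasets as D
from chainer_chemistry.dataset.preprocessors import preprocess_method_dict
from chainer_chemistry.datasets import NumpyTupleDataset

preprocessor = preprocess_method_dict['nfp']()
dataset = D.get_qm9(preprocessor, labels='homo')
NumpyTupleDataset.save('input/nfp_homo/data.npz', dataset)  # cache dataset for second use
train_data_ratio = 0.7  # example value; not specified on the slide
train_data_size = int(len(dataset) * train_data_ratio)
train, val = split_dataset_random(dataset, train_data_size)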
Example: HOMO Prediction by NFP with QM9 dataset
Model definition
import chainer
from chainer_chemistry.models import MLP, NFP

class GraphConvPredictor(chainer.Chain):
    def __init__(self, graph_conv, mlp):
        super(GraphConvPredictor, self).__init__()
        with self.init_scope():
            self.graph_conv = graph_conv
            self.mlp = mlp

    def __call__(self, atoms, adjs):
        # Graph convolution extracts a molecule-level feature; the MLP predicts from it.
        x = self.graph_conv(atoms, adjs)
        x = self.mlp(x)
        return x

model = GraphConvPredictor(NFP(16, 16, 4), MLP(16, 1))
Once a graph neural network is built, training is the same as for ordinary Chainer models.
Future work
• Primitive operations
– GraphConv, GraphPool, GraphGather
• Graph Convolution models
– Follow state of the art Graph Convolutional Neural Networks
• Pretrained Models
– We do not intend to guarantee reproduction of the papers' results, though.
• Off-the-shelf models
– Neural message passing, 3D convolution, Generative models etc.
• Dataset
– MUTAG, MoleculeNet etc.
From prediction to generation of molecules
Prediction: Find molecules with desired properties
from given compound libraries.
Generation: Produce molecules not in the
libraries that have the desired properties.
Molecule generation with VAE [Gómez-Bombarelli+16]
● Encode and decode molecules
represented as SMILES with a VAE in a
seq2seq manner.
● The latent representation can be used for
semi-supervised learning.
● We can use the learned model to find
molecules with desired properties by
optimizing the representation in latent
space and decoding it.
Generated molecules are not guaranteed
to be syntactically valid :(
Gómez-Bombarelli, R., Wei, J. N., Duvenaud, D., Hernández-Lobato, J. M.,
Sánchez-Lengeling, B., Sheberla, D., ... & Aspuru-Guzik, A. (2016). Automatic chemical
design using a data-driven continuous representation of molecules. ACS Central Science.
Grammar VAE [Kusner+17]
Convert a molecule to a
parse tree to get a
sequence of production
rules and feed the
sequence to RNN-VAE.
Generated molecules are guaranteed to be valid syntactically !
Kusner, M. J., Paige, B., & Hernández-Lobato, J. M.
(2017). Grammar Variational Autoencoder. arXiv
preprint arXiv:1703.01925.
Generate sequence of
production rules of syntax
of SMILES represented by
• Data-based approach for chemical property prediction is
getting more attention.
• New material/drug discovery research may be
accelerated by deep learning technology.

Minimax statistical learning with Wasserstein distances (NeurIPS2018 Reading ...
Comparison of deep learning frameworks from a viewpoint of double backpropaga...
Comparison of deep learning frameworks from a viewpoint of double backpropaga...Comparison of deep learning frameworks from a viewpoint of double backpropaga...
Comparison of deep learning frameworks from a viewpoint of double backpropaga...
20170422 数学カフェ Part2
20170422 数学カフェ Part220170422 数学カフェ Part2
20170422 数学カフェ Part2
20170422 数学カフェ Part1
20170422 数学カフェ Part120170422 数学カフェ Part1
20170422 数学カフェ Part1
GTC Japan 2016 Chainer feature introduction
GTC Japan 2016 Chainer feature introductionGTC Japan 2016 Chainer feature introduction
GTC Japan 2016 Chainer feature introduction
On the benchmark of Chainer
On the benchmark of ChainerOn the benchmark of Chainer
On the benchmark of Chainer
Tokyo Webmining Talk1
Tokyo Webmining Talk1Tokyo Webmining Talk1
Tokyo Webmining Talk1
VAE-type Deep Generative Models
VAE-type Deep Generative ModelsVAE-type Deep Generative Models
VAE-type Deep Generative Models
Common Design of Deep Learning Frameworks
Common Design of Deep Learning FrameworksCommon Design of Deep Learning Frameworks
Common Design of Deep Learning Frameworks
Introduction to Chainer and CuPy
Introduction to Chainer and CuPyIntroduction to Chainer and CuPy
Introduction to Chainer and CuPy
Stochastic Gradient MCMC
Stochastic Gradient MCMCStochastic Gradient MCMC
Stochastic Gradient MCMC
Chainer Contribution Guide
Chainer Contribution GuideChainer Contribution Guide
Chainer Contribution Guide
2015年9月18日 (GTC Japan 2015) 深層学習フレームワークChainerの導入と化合物活性予測への応用
2015年9月18日 (GTC Japan 2015) 深層学習フレームワークChainerの導入と化合物活性予測への応用 2015年9月18日 (GTC Japan 2015) 深層学習フレームワークChainerの導入と化合物活性予測への応用
2015年9月18日 (GTC Japan 2015) 深層学習フレームワークChainerの導入と化合物活性予測への応用
Introduction to Chainer (LL Ring Recursive)
Introduction to Chainer (LL Ring Recursive)Introduction to Chainer (LL Ring Recursive)
Introduction to Chainer (LL Ring Recursive)

Recently uploaded

Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10 CEO/Founder: Sri Ambati Keynote at Wells Fargo Day CEO/Founder: Sri Ambati Keynote at Wells Fargo CEO/Founder: Sri Ambati Keynote at Wells Fargo Day CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity

Recently uploaded (20)

Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite CEO/Founder: Sri Ambati Keynote at Wells Fargo Day CEO/Founder: Sri Ambati Keynote at Wells Fargo CEO/Founder: Sri Ambati Keynote at Wells Fargo Day CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web

Deep learning for molecules: Introduction to Chainer Chemistry

  • 1. Kenta Oono (github: delta2323), Kosuke Nakago (github: corochann). Deep learning for molecules: Introduction to Chainer Chemistry
  • 2. Table of contents 1. What is machine learning? a. Data driven approach b. Primer of deep learning (MLP/ CNN) 2. Prediction of chemical characteristics a. Rule-based approach vs. Learning-based approach b. Neural Message passing (NFP / GGNN etc.) 3. Chainer Chemistry a. Primer of Chainer b. Coding examples 4. Other topics a. Generation of chemical compounds b. Automatic chemical synthesis
  • 3. Why machine learning? Example: Prediction of age from pictures. Challenges: ● What criteria can we use? ○ Height, hair, clothes, physique, etc.? ○ Not all criteria are perfect. ● Even if we have good criteria, how could we extract them? ○ People in pictures can have different positions, scales, and postures. ○ How can we detect each part (face, hair, etc.) within a body? => It is very difficult to list rules manually. Picture: irastoya
  • 4. Approach by machine learning: Provide machines with a vast number of images labeled with age and have them discover the trends characteristic of each generation. Humans do not explicitly tell the machines where in the images to look. Photo: Flickr
  • 5. Application of machine learning
      Task | Input | Output
      Chemical prediction | Molecule | Chemical characteristics (HOMO etc.)
      Mail classification | E-mail (sentences, header) | Spam or Normal or Important
      Data center electricity optimization | Packets of each server | Estimated electricity demand
      Web marketing | Access history, ad contents | Click or not
      Surveillance camera | Movie | Suspicious behavior or not
  • 6. Categorization of machine learning algorithms ● By dataset types ● Supervised learning (with ground truth labels) ● Unsupervised learning (without ground truth labels) ● Semi-supervised learning (only some of the samples have ground truth labels) ● Reinforcement learning (reward instead of labels) ● By methods ● Classification, regression, clustering, nearest neighbor ● Others ● Discriminative model vs. generative model / Bayesian vs. frequentist etc.
  • 7. Deep Learning: A general term for the subcategory of machine learning that uses models consisting of (typically many) simple and differentiable transformations.
  • 8. Multi Layer Perceptron (MLP). Each transform consists of a fully-connected layer and an activation function.
      Learnable parameters: $W_1, W_2$ (parameter matrices) and $b_1, b_2$ (bias vectors).
      Forward propagation:
        $h = f_1(x) = \mathrm{Sigmoid}(W_1 x + b_1)$
        $k = f_2(h) = \mathrm{Sigmoid}(W_2 h + b_2)$
        $y = f_3(k) = \mathrm{SoftMax}(k)$, equivalently $y_i = \exp(k_i) / \sum_j \exp(k_j)$
      Training dataset: feature vectors $x_1, x_2, \dots, x_N$ and ground truth labels $t_1, t_2, \dots, t_N$.
      [Figure: network diagram. Forward: input $x$ → $f_1$ → $h$ → $f_2$ → $k$ → $f_3$ → output $y$. The difference between the output $y$ and the ground truth $t$ is evaluated and propagated backward.]
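The forward propagation on this slide can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the layer sizes and the toy parameter values below are hypothetical and do not come from the slides, and the helper names (`affine`, `mlp_forward`) are made up for this sketch.

```python
import math

def sigmoid(v):
    # Elementwise logistic function
    return [1.0 / (1.0 + math.exp(-a)) for a in v]

def softmax(v):
    m = max(v)  # subtract the max for numerical stability
    e = [math.exp(a - m) for a in v]
    s = sum(e)
    return [a / s for a in e]

def affine(W, b, x):
    # W is a list of rows; returns W x + b
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def mlp_forward(x, W1, b1, W2, b2):
    h = sigmoid(affine(W1, b1, x))   # h = f1(x)
    k = sigmoid(affine(W2, b2, h))   # k = f2(h)
    return softmax(k)                # y = f3(k)

# Hypothetical toy parameters: 3 inputs -> 2 hidden units -> 2 classes
W1 = [[0.1, -0.2, 0.3], [0.0, 0.5, -0.5]]
b1 = [0.0, 0.1]
W2 = [[1.0, -1.0], [-1.0, 1.0]]
b2 = [0.0, 0.0]

y = mlp_forward([1.0, 2.0, 3.0], W1, b1, W2, b2)
print(y)  # a probability vector over the two classes
```

Because the last step is a softmax, the output can be compared directly with a one-hot ground truth label when evaluating the difference.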
  • 9. Fully connected layer ● Learnable parameters: ● $W$ (weight matrix of size $M \times N$) ● $b$ (bias vector of size $M$) ● Input: vector $x$ of size $N$ ● Output: vector $y = Wx + b$ of size $M$ (affine transformation)
  • 10. Activation function ● Function (usually) without learnable parameters for introducing non-linearity ● Input: vector (or tensor) $x = (x_1, \dots, x_n)$ ● Output: vector (or tensor) $y = (y_1, \dots, y_n)$ with $y_i = \sigma(x_i)$ $(i = 1, \dots, n)$. Examples of $\sigma$: ● $\mathrm{Sigmoid}(x) = 1 / (1 + \exp(-x))$ ● $\tanh(x)$ ● $\mathrm{ReLU}(x) = \max(0, x)$ ● $\mathrm{LeakyReLU}(x) = x$ $(x \ge 0)$, $ax$ $(x < 0)$ ○ $a > 0$ is a small fixed constant
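A minimal sketch of these activation functions applied elementwise, in plain Python. The default slope `a=0.2` for the leaky ReLU is an assumption chosen for illustration (any small positive constant works), and `apply_elementwise` is a hypothetical helper name.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    return max(0.0, x)

def leaky_relu(x, a=0.2):
    # a is an assumed small positive slope used for negative inputs
    return x if x >= 0 else a * x

def apply_elementwise(sigma, xs):
    # y_i = sigma(x_i): no learnable parameters are involved
    return [sigma(x) for x in xs]

xs = [-2.0, 0.0, 3.0]
for name, fn in [("sigmoid", sigmoid), ("tanh", math.tanh),
                 ("relu", relu), ("leaky_relu", leaky_relu)]:
    print(name, apply_elementwise(fn, xs))
```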
  • 11. Convolutional Neural Network (CNN) [LeCun+98] • A neural network consisting of convolutional layers and pooling layers • Many variants: AlexNet, VGG, Inception, GoogLeNet, ResNet etc. • Widely used in image recognition and recently applied to biology and chemistry. Figure: LeNet-5 [LeCun+98]. LeCun, Yann, et al. "Gradient-based learning applied to document recognition." Proceedings of the IEEE 86.11 (1998): 2278-2324.
  • 12. Convolution operation (stride = 1 case): input * filter = output
      filter (3 × 3):
        1 0 1
        0 1 0
        1 0 1
      input (6 × 6):
        1 1 1 0 0 0
        0 1 1 1 0 0
        0 0 1 1 1 0
        0 0 1 1 0 0
        0 1 1 0 0 0
        0 0 0 0 0 0
      output (4 × 4):
        4 3 4 1
        2 4 3 3
        2 3 4 1
        2 2 1 1
  • 13. Convolution operation (stride = 3 case): input * filter = output
      filter (3 × 3):
        1 0 1
        0 1 0
        1 0 1
      input (6 × 6):
        1 1 1 0 0 0
        0 1 1 1 0 0
        0 0 1 1 1 0
        0 0 1 1 0 0
        0 1 1 0 0 0
        0 0 0 0 0 0
      output (2 × 2):
        4 1
        2 1
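The two convolution slides can be reproduced with a short sketch, reading the slide's numbers as a 3 × 3 filter applied to a 6 × 6 input. The function name `conv2d` is hypothetical, and the operation is written without flipping the filter (cross-correlation), which is what deep learning frameworks usually call convolution; the filter here is symmetric, so the distinction does not change the result.

```python
def conv2d(image, kernel, stride=1):
    """Valid (no padding) 2D convolution of a list-of-rows image."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(0, len(image) - kh + 1, stride):
        row = []
        for j in range(0, len(image[0]) - kw + 1, stride):
            # Sum of elementwise products over the current window
            row.append(sum(image[i + di][j + dj] * kernel[di][dj]
                           for di in range(kh) for dj in range(kw)))
        out.append(row)
    return out

kernel = [[1, 0, 1],
          [0, 1, 0],
          [1, 0, 1]]
image = [[1, 1, 1, 0, 0, 0],
         [0, 1, 1, 1, 0, 0],
         [0, 0, 1, 1, 1, 0],
         [0, 0, 1, 1, 0, 0],
         [0, 1, 1, 0, 0, 0],
         [0, 0, 0, 0, 0, 0]]

print(conv2d(image, kernel, stride=1))
# [[4, 3, 4, 1], [2, 4, 3, 3], [2, 3, 4, 1], [2, 2, 1, 1]]
print(conv2d(image, kernel, stride=3))
# [[4, 1], [2, 1]]
```

With stride 3 the window only visits every third position, so the 4 × 4 output shrinks to 2 × 2 and keeps the corner entries of the stride-1 result.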
  • 15. Convolutional layer: stacks several filters whose parameters are learnable.
  • 16. Stacking convolutional layers: a convolutional layer with stride k produces an output whose height and width are approximately 1/k of the input's.
  • 18. How can we generalize convolution operations to arbitrary graphs? Images : grid graph Molecules : arbitrary graph
  • 19. Table of contents 1. What is machine learning? a. Data driven approach b. Primer of deep learning (MLP / CNN) 2. Prediction of chemical characteristics a. Rule-based approach vs. Learning-based approach b. Neural Message passing (NFP / GGNN etc.) 3. Chainer Chemistry a. Primer of Chainer b. Coding examples 4. Other topics a. Generation of chemical compounds b. Automatic chemical synthesis
  • 20. Chemical prediction - Two approaches. Quantum simulation: theory-based approach, e.g. DFT (Density Functional Theory). → Pros: precision is guaranteed. Cons: high calculation cost. Machine learning: data-based approach; learn the properties of known compounds and predict the properties of new compounds. → Pros: low cost, high-speed calculation. Cons: precision is not guaranteed. “Neural message passing for quantum chemistry”, Justin Gilmer et al.
  • 21. Extended Connectivity Fingerprint (ECFP): converts a molecule into a fixed-length bit representation. Pros: - Calculation is fast - Shows the presence of particular substructures. Cons: - Bit collision: two (or more) different substructural features could be represented by the same bit position
  • 22. Problems of conventional methods 1. Input representation is not unique, and the result depends on the representation of the input, e.g. in SMILES representation CC#C and C#CC are the same molecule. 2. Order invariance is not guaranteed: the representation is not guaranteed to be invariant to relabeling (i.e. permutation of indexes) of the atoms in a molecule.
  • 23. How graph convolution works. CNN on image: image → class label. Graph convolution: molecular graph → chemical property.
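The bit-collision drawback can be illustrated with a toy sketch. This is not the real ECFP algorithm (which iteratively hashes circular atom environments); it only shows the final folding step, under the assumption that each substructure is identified by a string label and hashed with CRC32. The labels and the function name `fold_to_bits` are hypothetical.

```python
import zlib

def fold_to_bits(substructures, n_bits):
    # Hash each substructure label and fold it into a fixed-length bit vector
    bits = [0] * n_bits
    for s in substructures:
        idx = zlib.crc32(s.encode("utf-8")) % n_bits
        bits[idx] = 1
    return bits

# Nine hypothetical substructure labels folded into only eight bits:
# by the pigeonhole principle at least two of them must share a bit position.
features = ["C-C", "C=O", "C-N", "c1ccccc1", "O-H", "C#C", "N-H", "C-Cl", "C-O"]
fp = fold_to_bits(features, n_bits=8)
print(fp, "bits set:", sum(fp), "features:", len(features))
```

Real fingerprints use far more bits (often 1024 or 2048), which makes collisions rarer but does not eliminate them.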
  • 24. Atom feature embedding: 1. Man-made features
      Atom | atom type (one-hot: C, N, O) | charge | chirality
      C | 1.0 0.0 0.0 | 6.0 | 1.0
      N | 0.0 1.0 0.0 | 7.0 | 1.0
      O | 0.0 0.0 1.0 | 8.0 | 1.0
      Molecular Graph Convolutions: Moving Beyond Fingerprints. Steven Kearnes, Kevin McCloskey, Marc Berndl, Vijay Pande, Patrick Riley. arXiv:1603.00856
  • 25. Atom feature embedding: 2. Embed in vector space
      Atom | embedding vector
      C | 0.5 1.2 1.0 1.0 1.8
      N | 0.8 1.0 1.3 0.1 1.5
      O | 0.5 1.0 0.5 2.0 0.0
      Each atom type is randomly assigned to some position in vector space. The embedding W is a learnable parameter.
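The two ways of building atom features can be sketched as follows. The atom vocabulary, the embedding dimension, and the Gaussian initialization are assumptions made for illustration; in a real model the embedding table would be updated by gradient descent together with the other parameters.

```python
import random

ATOM_TYPES = ["C", "N", "O"]  # assumed toy vocabulary

def one_hot(symbol):
    # Man-made feature: a fixed indicator vector per atom type
    return [1.0 if symbol == t else 0.0 for t in ATOM_TYPES]

def init_embedding(dim, seed=0):
    # Learnable feature: each atom type starts at a random point in R^dim
    rng = random.Random(seed)
    return {t: [rng.gauss(0.0, 1.0) for _ in range(dim)] for t in ATOM_TYPES}

W = init_embedding(dim=5)
atoms = ["C", "C", "O"]             # heavy atoms of ethanol (SMILES: CCO)
features = [W[a] for a in atoms]    # embedding lookup, one row per atom

print(one_hot("N"))   # [0.0, 1.0, 0.0]
print(features[0] == features[1])  # True: both carbons share one embedding
```

Looking up a row of the table is equivalent to multiplying the one-hot vector by the embedding matrix, which is why the embedding can be trained like any other linear layer.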
  • 26. Graph Convolution: update each node’s (atom) feature Feature of each node is updated (several times) by Graph Convolution operation. Han Altae-Tran, Bharath Ramsundar, Aneesh S. Pappu, & Vijay Pande (2017). Low Data Drug Discovery with One-Shot Learning. ACS Cent. Sci., 3 (4)
  • 27. Graph Gather: Extract whole graph (molecule) feature Updated feature of each node is finally combined to form graph’s (molecule’s) feature by Graph Gather operation. Han Altae-Tran, Bharath Ramsundar, Aneesh S. Pappu, & Vijay Pande (2017). Low Data Drug Discovery with One-Shot Learning. ACS Cent. Sci., 3 (4)
  • 28. Unified view of graph convolution: Many message-passing algorithms (NFP, GGNN, Weave etc.) are formulated as the iterative application of Update and Readout functions [Gilmer et al. 17]. Update: aggregates neighborhood information and updates node representations. Readout: aggregates all node representations and produces the final output. Gilmer, J., Schoenholz, S. S., Riley, P. F., Vinyals, O., & Dahl, G. E. (2017). Neural message passing for quantum chemistry. arXiv preprint arXiv:1704.01212.
  • 29. Graph convolution neural network variants - NFP: Neural Fingerprint - GGNN: Gated-Graph Neural Network - WeaveNet: Molecular Graph Convolutions - SchNet: A continuous-filter convolutional NN. “Convolutional Networks on Graphs for Learning Molecular Fingerprints”
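The Update/Readout view can be sketched as a generic loop. This is a simplified illustration, not the formulation from the paper: it uses scalar node features, ignores edge features, and the toy `mean_update` function and the path graph are assumptions chosen so the result is easy to check by hand.

```python
def message_passing(h, neighbors, update, readout, n_steps):
    for _ in range(n_steps):
        # Every node is updated synchronously from the previous step's states
        h = [update(h[v], [h[w] for w in neighbors[v]]) for v in range(len(h))]
    return readout(h)

def mean_update(hv, msgs):
    # Toy Update: average of the node's own state and its neighbors' states
    return (hv + sum(msgs)) / (1 + len(msgs))

# Path graph 0 - 1 - 2 with scalar node features; Readout is a plain sum
neighbors = {0: [1], 1: [0, 2], 2: [1]}
out = message_passing([1.0, 2.0, 3.0], neighbors, mean_update, sum, n_steps=1)

# The same graph with its nodes relabeled (old node 1 is now node 0)
neighbors_perm = {0: [1, 2], 1: [0], 2: [0]}
out_perm = message_passing([2.0, 1.0, 3.0], neighbors_perm, mean_update, sum, n_steps=1)

print(out, out_perm)  # 6.0 6.0
```

Because the update only depends on the set of neighbors and the readout is a sum, relabeling the atoms leaves the output unchanged, which addresses the order-invariance problem of conventional representations.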
  • 30. NFP: Neural Fingerprint Message passing - update feature r Readout - extract output f from r Convolution David K Duvenaud, Dougal Maclaurin, Jorge Iparraguirre, Rafael Bombarell, Timothy Hirzel, Alan Aspuru-Guzik, and Ryan P Adams. Convolutional networks on graphs for learning molecular fingerprints.
  • 31. NFP: Neural Fingerprint. Update: the graph convolution operation depends on the degree of each atom → bond type information is not utilized.
        $h'_7 = \sigma(W_3 (h_7 + h_6 + h_8 + h_9))$
        $h'_3 = \sigma(W_2 (h_3 + h_2 + h_4))$
      [Figure: molecular graph with atom features $h_1, \dots, h_{10}$; atom 7 has three neighbors (6, 8, 9) and uses $W_3$, atom 3 has two neighbors (2, 4) and uses $W_2$.]
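A minimal sketch of the degree-dependent NFP update, h'_v = σ(W_deg(v) (h_v + Σ_w h_w)), in plain Python. The graph, the feature values, and the weight matrices are hypothetical toy values, and biases are omitted.

```python
import math

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def nfp_update(h, neighbors, W_by_degree):
    new_h = []
    for v, hv in enumerate(h):
        s = list(hv)
        for w in neighbors[v]:
            s = [a + b for a, b in zip(s, h[w])]   # h_v + sum of neighbor features
        W = W_by_degree[len(neighbors[v])]          # weight chosen by atom degree
        new_h.append([1.0 / (1.0 + math.exp(-z)) for z in matvec(W, s)])
    return new_h

# Toy path graph 0 - 1 - 2 with 2-dimensional atom features
neighbors = {0: [1], 1: [0, 2], 2: [1]}
h = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W_by_degree = {1: [[1.0, 0.0], [0.0, 1.0]],   # used by degree-1 atoms
               2: [[0.5, 0.0], [0.0, 0.5]]}   # used by degree-2 atoms

new_h = nfp_update(h, neighbors, W_by_degree)
print(new_h)
```

The weight matrix is selected only by the number of neighbors, so a single bond and a double bond to the same neighbor contribute identically.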
  • 32. NFP: Neural Fingerprint. Readout: $R = \sum_i \mathrm{softmax}(W h_i)$. The readout operation is basically a simple sum over the atoms → no selective operation/attention mechanism is adopted.
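The readout R = Σ_i softmax(W h_i) can be sketched as below. The weight matrix and atom features are hypothetical toy values; the point of the sketch is that the result does not depend on the order of the atoms.

```python
import math

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def softmax(v):
    m = max(v)  # subtract the max for numerical stability
    e = [math.exp(a - m) for a in v]
    s = sum(e)
    return [a / s for a in e]

def nfp_readout(h, W):
    R = [0.0] * len(W)
    for hv in h:
        # Each atom contributes one softmax vector; contributions are summed
        R = [r + p for r, p in zip(R, softmax(matvec(W, hv)))]
    return R

W = [[1.0, -1.0], [0.5, 2.0]]               # hypothetical readout weights
h = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]    # hypothetical atom features

R = nfp_readout(h, W)
R_reversed = nfp_readout(list(reversed(h)), W)
print(R, R_reversed)
```

Since every softmax vector sums to one, the entries of R add up to the number of atoms, so R behaves like a soft count of which fingerprint slot each atom activates.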
  • 33. GGNN: Gated Graph Neural Network. Update: the graph convolution operation depends on the bond type of each atom pair.
        $h'_7 = \mathrm{GRU}(h_7, W_1 h_6 + W_2 h_8 + W_1 h_9)$
        $h'_3 = \mathrm{GRU}(h_3, W_1 h_2 + W_2 h_4)$
      GRU: Gated Recurrent Unit.
      [Figure: molecular graph with atom features $h_1, \dots, h_{10}$; messages to atom 7 are $W_1 h_6$, $W_2 h_8$, $W_1 h_9$, and messages to atom 3 are $W_1 h_2$, $W_2 h_4$.]
      Yujia Li, Daniel Tarlow, Marc Brockschmidt, and Richard Zemel. Gated graph sequence neural networks. arXiv preprint arXiv:1511.05493, 2015.
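A rough sketch of this update in plain Python, under several assumptions: the GRU uses one common formulation with biases omitted, bonds are treated as undirected with one weight matrix per bond type, and all parameter names and values are hypothetical. It is meant to show the data flow, not to match any particular implementation.

```python
import math

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-x)) for x in v]

def gru(h, m, p):
    # p holds six hypothetical weight matrices: Wz, Uz, Wr, Ur, Wh, Uh
    z = sigmoid(add(matvec(p["Wz"], m), matvec(p["Uz"], h)))   # update gate
    r = sigmoid(add(matvec(p["Wr"], m), matvec(p["Ur"], h)))   # reset gate
    rh = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(x) for x in add(matvec(p["Wh"], m), matvec(p["Uh"], rh))]
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

def ggnn_update(h, bonds, W_by_bond, p):
    # bonds: list of (v, w, bond_type); the message weight depends on bond type
    msgs = [[0.0] * len(h[0]) for _ in h]
    for v, w, t in bonds:
        msgs[v] = add(msgs[v], matvec(W_by_bond[t], h[w]))
        msgs[w] = add(msgs[w], matvec(W_by_bond[t], h[v]))
    return [gru(hv, mv, p) for hv, mv in zip(h, msgs)]

zero = [[0.0, 0.0], [0.0, 0.0]]
params = {k: zero for k in ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh")}
W_by_bond = {"single": [[1.0, 0.0], [0.0, 1.0]],
             "double": [[2.0, 0.0], [0.0, 2.0]]}
h = [[1.0, 2.0], [3.0, 4.0]]
new_h = ggnn_update(h, [(0, 1, "double")], W_by_bond, params)
print(new_h)  # with all-zero GRU weights each state is simply halved
```

With zero GRU weights both gates equal 0.5 and the candidate state is zero, so the toy output is easy to verify; trained weights would instead mix the incoming bond-specific messages into each atom's state.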
  • 34. GGNN: Gated Graph Neural Network. Readout: the readout operation contains a selective operation (gating).
        $R = \sum_v \sigma(i(h_v, h_v^0)) \odot j(h_v)$
        Simplified version: $R = \sum_v \sigma(W_i h_v) \odot W_j h_v$
      Here, $i$ and $j$ represent some functions (neural networks) and $\sigma$ is the sigmoid non-linear function.
  • 35. Weave: Molecular Graph Convolutions ● The Weave module updates atom features using the features of atom pairs. A: atom feature, P: feature of atom pair ● P → A operation: g() is a function that provides order invariance; sum() is used in the paper. Molecular Graph Convolutions: Moving Beyond Fingerprints. Steven Kearnes, Kevin McCloskey, Marc Berndl, Vijay Pande, Patrick Riley. arXiv:1603.00856
  • 36. SchNet: A continuous-filter convolutional neural network. 1. The distances $\|r_i - r_j\|$ between all atom pairs are used as input. 2. An energy-conserving condition can additionally be used to constrain the model for the energy prediction task. Kristof Schütt, Pieter-Jan Kindermans, Huziel Enoc Sauceda Felix, Stefan Chmiela, Alexandre Tkatchenko, and Klaus-Robert Müller. SchNet: A continuous-filter convolutional neural network for modeling quantum interactions.
  • 37. Comparison between graph convolution networks
      Aspect | NFP | GGNN | Weave | SchNet
      Atom feature extraction | Man-made or Embed | Man-made or Embed | Man-made or Embed | Man-made or Embed
      Graph convolution strategy | Adjacent atoms only | Adjacent atoms only | All atom-atom pairs | All atom-atom pairs
      How to represent connection information | Degree | Bond type | Man-made pair features (bond type, distance etc.) | Distance
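The simplified gated readout, R = Σ_v σ(W_i h_v) ⊙ W_j h_v, can be sketched as follows. The weight matrices and atom features are hypothetical toy values picked so that the result can be checked by hand.

```python
import math

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def ggnn_readout(h, Wi, Wj):
    R = [0.0] * len(Wj)
    for hv in h:
        gate = [1.0 / (1.0 + math.exp(-z)) for z in matvec(Wi, hv)]  # soft selection in (0, 1)
        value = matvec(Wj, hv)
        R = [r + g * x for r, g, x in zip(R, gate, value)]           # gated sum over atoms
    return R

Wi = [[0.0, 0.0], [0.0, 0.0]]   # zero gate weights: every gate equals 0.5
Wj = [[1.0, 0.0], [0.0, 1.0]]   # identity value weights
h = [[1.0, 2.0], [3.0, 4.0]]

R = ggnn_readout(h, Wi, Wj)
print(R)  # [2.0, 3.0], i.e. half of the plain sum [4.0, 6.0]
```

With trained gate weights, atoms whose gate is close to zero are effectively ignored, which is the selective behavior that the plain sum in NFP lacks.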
  • 38. Example: IT Drug Discovery Contest. Task: • Find new seed compounds for a target protein (Sirtuin 1) from 2.5 million compounds using IT. Rules: • Each team must prepare its own data, such as training datasets. • Each team can submit up to 400 candidate compounds. • The judges check all submitted compounds with a 2-stage biological experiment: Thermal Shift Assay, then inhibitory assay → IC50 measurement. Contest website (Japanese)
  • 39. Our result
      Stage | Ours | Average (18 teams in total)
      1st screening (TSA) | 23 / 200 (11.5%) | 69 / 3559 (1.9%)
      2nd screening (IC50) | 1 | 5
      We found one hit compound and won one of the grand prizes (IPAB prize).
  • 40. Extension to semi-supervised learning: Compute representations of subgraphs inductively with neural message passing. Optimize the representations in an unsupervised manner, in the same way as Paragraph Vector. Nguyen, H., Maeda, S. I., & Oono, K. (2017). Semi-supervised learning of hierarchical representations of molecules using neural message passing. arXiv preprint arXiv:1711.10168.
  • 41. Table of contents 1. What is machine learning? a. Data driven approach b. Primer of deep learning (MLP/ CNN / Graph convolution network) 2. Prediction of chemical characteristics a. Rule-based approach vs. Learning-based approach b. Neural Message passing (NFP / GGNN etc.) 3. Chainer Chemistry a. Primer of Chainer b. Coding examples 4. Other topics a. Generation of chemical compounds b. Automatic chemical synthesis
  • 42. How can we incorporate ML into Chemistry and Biology? Problems • Optimized graph convolution algorithms are hard to implement from scratch. • ML and Chemistry/Biology researchers sometimes use different “languages”. Solution: Create tools so that … • Chemistry/Biology researchers need not worry about the details of DL algorithms and can concentrate on their research. • ML and Chemistry researchers can work in collaboration. → We are developing Chainer Chemistry. Picture: irastoya
  • 43. A Python framework that lets researchers quickly implement, train, and evaluate deep learning models. Designing a network Training, evaluation Data set
  • 44. Speed up research and development of deep learning and its applications. Features: • Build DL models as a Python program → Can write complex networks (loops, branches, etc.) easily • Define-by-Run: dynamic model construction → Can make full use of the Python stack trace in debugging → Can support data-dependent neural networks natively • CuPy: NumPy-like GPU array library → Can write CPU/GPU-agnostic code. Basic information: • First release: June 2015 • Version: v3.3.0 (stable), v4.0.0b3 (develop) • License: MIT • Language: Python
  • 45. Example: Build and train convolutional Network

        import chainer
        import chainer.links as L
        import chainer.functions as F

        class LeNet5(chainer.Chain):
            def __init__(self):
                super(LeNet5, self).__init__()
                with self.init_scope():
                    # Convolution2D(in_channels, out_channels, ksize, stride)
                    self.conv1 = L.Convolution2D(1, 6, 5, 1)
                    self.conv2 = L.Convolution2D(6, 16, 5, 1)
                    self.conv3 = L.Convolution2D(16, 120, 4, 1)
                    # None: the input size is inferred at the first forward pass
                    self.fc4 = L.Linear(None, 84)
                    self.fc5 = L.Linear(84, 10)

            def __call__(self, x):
                h = F.sigmoid(self.conv1(x))
                h = F.max_pooling_2d(h, 2, 2)
                h = F.sigmoid(self.conv2(h))
                h = F.max_pooling_2d(h, 2, 2)
                h = F.sigmoid(self.conv3(h))
                h = F.sigmoid(self.fc4(h))
                return self.fc5(h)
  • 46. Example: Build and train convolutional Network

        from chainer import iterators, optimizers, training

        model = LeNet5()
        model = L.Classifier(model)

        # Dataset is a list! ([] to access, having __len__)
        dataset = [(x1, t1), (x2, t2), ...]

        # Iterator to return a mini-batch retrieved from dataset
        it = iterators.SerialIterator(dataset, batch_size=32)

        # Optimization methods (you can easily try various methods by changing SGD to
        # MomentumSGD, Adam, RMSprop, AdaGrad, etc.)
        opt = optimizers.SGD(lr=0.01)
        opt.setup(model)

        updater = training.StandardUpdater(it, opt, device=0)  # device=-1 if you use CPU
        trainer = training.Trainer(updater, stop_trigger=(100, 'epoch'))
  • 49. Chainer Chemistry: Chainer extension library for Biology and Chemistry
  • 50. Technological Stack
      Example: Train and prediction with QM9/tox21 dataset
      Model: Graph convolution NN
      Layer/Function: GraphLinear
      Dataset: QM9, Tox21 dataset
      Preprocessor (Feature Extractor): Preprocessing (NFP, GGNN, SchNet)
      File Parser (SDF file, CSV file)
      Pretrained Model (TBD)
  • 51. Chainer Chemistry: Chainer extension library for Biology and Chemistry. Basic information: release: 12/14/2017, version: v0.1.0, license: MIT, language: Python. Features • State-of-the-art deep learning neural network models (especially graph convolutions) for chemical molecules (NFP, GGNN, Weave, SchNet etc.) • Preprocessors of molecules tailored for these models • Parsers for several standard file formats (CSV, SDF etc.) • Loaders for several well-known datasets (QM9, Tox21 etc.)
  • 52. Dataset introduction - tox21. # of Dataset: Train 11757, Validation 295, Test 645. Label: the following 12 types of toxicity are included: 'NR-AR', 'NR-AR-LBD', 'NR-AhR', 'NR-Aromatase', 'NR-ER', 'NR-ER-LBD', 'NR-PPAR-gamma', 'SR-ARE', 'SR-ATAD5', 'SR-HSE', 'SR-MMP', 'SR-p53'
      Example:
        SMILES: C(=O)C1(O)Cc2c(O)c3c(c(O)c2C(OC2CC(N)C(O)C(C)O2)C1)C(=O)c1c(O)cccc1C3=O
        LABEL: [ 0 1 -1 1 -1 1 -1 -1 1 -1 1 1]
        SMILES: CCCOc1ccc(C(=O)CCN2CCCCC2)cc1.Cl
        LABEL: [ 0 0 0 -1 1 0 0 -1 -1 -1 0 0]
        SMILES: CCOP(=S)(OCC)SC(CCl)N1C(=O)c2ccccc2C1=O
        LABEL: [ 0 0 1 0 1 1 0 1 0 0 -1 -1]
        SMILES: O=c1c(O)c(-c2ccc(O)cc2)oc2cc(O)cc(O)c12
        LABEL: [ 0 0 1 -1 1 1 -1 0 0 0 1 0]
      2948 3895 6558 7381
  • 53. Dataset introduction - QM9. # of Dataset: 133,885. Label: the following properties are included: 'A', 'B', 'C', 'mu', 'alpha', 'homo', 'lumo', 'gap', 'r2', 'zpve', 'U0', 'U', 'H', 'G', 'Cv'
      Example:
        SMILES: NC1=NCCC(=O)N1
        LABEL: [ 3.51 1.93 1.29 2.54 64.1 -0.236 -2.79e-03 2.34e-01 900.7 0.12 -396.0 -396.0 -396.0 -396.0 26.9]
        SMILES: CN1CCC(=O)C1=N
        LABEL: [3.285 2.062 1.3 4.218 68.69 -0.224 -0.056 0.168 914.65 0.131 -379.959 -379.951 -379.95 -379.992 27.934]
        SMILES: N=C1OC2CC1C(=O)O2
        LABEL: [2.729 1.853 1.474 4.274 61.94 -0.282 -0.026 0.256 887.402 0.104 -473.876 -473.87 -473.869 -473.907 24.823]
        SMILES: C1N2C3C4C5OC13C2C5
        LABEL: [ 3.64 2.218 1.938 0.863 69.48 -0.232 0.074 0.306 756.356 0.128 -400.633 -400.628 -400.627 -400.662 23.434]
  • 54. Example: HOMO Prediction by NFP with QM9 dataset. Dataset preprocessing (for NFP Network)

        preprocessor = preprocess_method_dict['nfp']()
        dataset = D.get_qm9(preprocessor, labels='homo')

        # Cache dataset for second use
        'input/nfp_homo/data.npz', dataset)

        train_data_size = int(len(dataset) * train_data_ratio)
        train, val = split_dataset_random(dataset, train_data_size)
  • 55. Example: HOMO Prediction by NFP with QM9 dataset. Model definition

        import chainer
        from chainer_chemistry.models import MLP, NFP

        class GraphConvPredictor(chainer.Chain):
            def __init__(self, graph_conv, mlp):
                super(GraphConvPredictor, self).__init__()
                with self.init_scope():
                    self.graph_conv = graph_conv
                    self.mlp = mlp

            def __call__(self, atoms, adjs):
                # Graph convolution extracts a molecule feature; the MLP predicts from it
                x = self.graph_conv(atoms, adjs)
                x = self.mlp(x)
                return x

        model = GraphConvPredictor(NFP(16, 16, 4), MLP(16, 1))

      Once a graph neural network is built, training is the same as for ordinary Chainer models.
  • 56. Future work • Primitive operations – GraphConv, GraphPool, GraphGather • Graph Convolution models – Follow state-of-the-art Graph Convolutional Neural Networks • Pretrained Models – We do not plan to guarantee that they reproduce the results of the papers, though. • Off-the-shelf models – Neural message passing, 3D convolution, generative models etc. • Dataset – MUTAG, MoleculeNet etc.
  • 57. Table of contents 1. What is machine learning? a. Data driven approach b. Primer of deep learning (MLP/ CNN / Graph convolution network) 2. Prediction of chemical characteristics a. Rule-based approach vs. Learning-based approach b. Neural Message passing (NFP / GGNN etc.) 3. Chainer Chemistry a. Primer of Chainer b. Coding examples 4. Other topics (5 min.) a. Generation of chemical compounds b. Automatic chemical synthesis
  • 58. From prediction to generation of molecules. Prediction: find molecules with desired properties from given compound libraries. Generation: produce molecules that are not in the libraries and that have desired properties.
  • 59. Molecule generation with VAE [Gómez-Bombarelli+16] ● Encode and decode molecules represented as SMILES with a VAE in a seq2seq manner. ● The latent representation can be used for semi-supervised learning. ● We can use the learned model to find a molecule with a desired property by optimizing the representation in latent space and decoding it. Generated molecules are not guaranteed to be syntactically valid :( Gómez-Bombarelli, R., Wei, J. N., Duvenaud, D., Hernández-Lobato, J. M., Sánchez-Lengeling, B., Sheberla, D., ... & Aspuru-Guzik, A. (2016). Automatic chemical design using a data-driven continuous representation of molecules. ACS Central Science.
  • 60. Grammar VAE [Kusner+17]. Encode: convert a molecule to a parse tree to get a sequence of production rules, and feed the sequence to an RNN-VAE. Decode: generate a sequence of production rules of the SMILES syntax, represented as a CFG. Generated molecules are guaranteed to be syntactically valid! Kusner, M. J., Paige, B., & Hernández-Lobato, J. M. (2017). Grammar Variational Autoencoder. arXiv preprint arXiv:1703.01925.
  • 61. Conclusion • Data-based approaches to chemical property prediction are getting more attention. • New material/drug discovery research may be accelerated by deep learning technology.