SlideShare a Scribd company logo
1 of 68
Download to read offline
Convolutional Neural Networks
and Natural Language Processing
Thomas Delteil – github.com/thomasdelteil – linkedin.com/in/thomasdelteil
Applied Scientist @ AWS Deep Engine
Goals
§ Explain what convolutions are
§ Show how to handle textual data
§ Analyze a reference neural network
architecture for text classification
§ Demonstrate how to train and deploy a CNN for
Natural Language Processing
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Convolutions
And where to find them
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
2012 - ImageNet Classification with Deep Convolutional Neural Networks
ImageNet classification with Deep Convolutional Neural Networks, Alex Krizhevsky, Ilya Sutskever, Geoffrey E.
Hinton, Advances in Neural Information Processing Systems, 2012
AlexNet architecture
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
ImageNet competition
Classify images among 1000 classes:
AlexNet Top-5 error-rate, 25% => 16%!
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Actual photo of the reaction from the computer vision community*
*might just be a stock photo
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
I told you
so!
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
What made Convolutional Neural Networks viable?
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
GPUs!
- Nvidia V100, float16 Ops:
~ 120 TFLOPS, 5000+ cuda cores
- #1 Super computer 2005 ~135 TFLOPS
Source: Mathworks
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Sea/Land segmentation via satellite images
DeepUNet: A Deep Fully Convolutional Network for Pixel-level Sea-Land Segmentation, Ruirui Li et al, 2017
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Automatic Galaxy classication
Deep Galaxy: Classification of Galaxies based on Deep Convolutional Neural Networks , Nour Eldeen M. Khalifa, 2017
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Medical Imaging, MRI, X-ray, surgical cameras
Review of MRI-based Brain Tumor Image Segmentation Using Deep Learning Methods, Ali Isn et al. 2016
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
What is a convolution ?
It is the cross-channel sum of the element-wise
multiplication of a convolutional filter (kernel/mask)
computed over a sliding window on an input tensor
given a certain stride and padding, plus a bias term.
The result is called a feature map.
2 2 1
3 1 -1
4 3 2
1 -1
-1 0
Input matrix (3x3)
no padding
1 channel
Kernel (2x2)
Stride 1
Bias = 2
Feature map (2x2)
-1 2
0 1
1*2 –1*2 –1*3 + 0*1 + 2 = – 1
1*2 –1*2 –1*1 + 0*-1 + 2. = 2
1*3 –1*1 –1*4 + 0*3 + 2 = 0
1*1 – (-1)*1 –1*3 + 0*2 + 2 = 1
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
What is a convolution ? Padding
Source: Machine Learning guru - Neural Networks CNN
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
What is a convolution ? Stride = 2
Source: Machine Learning guru - Neural Networks CNN
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
What is a convolution ? Multi Channel
1 convolutional filter
(3)x(3x3)
Source: Machine Learning guru - Neural Networks CNN
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
What is a convolution ? Multi Channel
source: Convolutional Neural Networks on the iphone with vggnet
N: Number of input channels
W:Width of the kernel
H: Height of the kernel
M: Number of output channels
Kernel size = ! ∗ # ∗ $
#Params = % ∗ ! ∗ # ∗ $ + %
256 convolutions of kernel (3,3) on 256 input channels
256*256*3*3 = ~0.5M
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Easily parallelizable
Convolution computations are:
- Independent (across filters and within
filter)
- Simple (multiplication and sums)
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Why does it work?
Sharpening filter
Laplacian filter
Sobel x-axis filter
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Why does it work?
- Detect patterns at larger and larger scale by stacking convolution
layers on top of each others to grow the receptive field
- Applicable to spatially correlated data
Source: AlexNet first 96 (55x55) filters learned represented in RGB
space (3 input channels)
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Growing receptive field
Source: ML Review, A guide to receptive field arithmetic
Deeper in the
network
Visualize convolutions
http://scs.ryerson.ca/~aharley/vis/conv/flat.html
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Visualize convolutions
Source: Neural Network 3D Simulation
(warning flashing lights)
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
State of the art networks are getting deeper and more complex
Source: Inception v3
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
input
Learn Data Science – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
High number of parameters => Requires a lot of data to train
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Advanced type of convolutions
Source: An introduction to different types of convolutions
Transposed Convolutions
(deconvolution)
EnhanceNet
Dilated Convolutions
WaveNet
Depth-wise separable
Convolutions
MobileNet
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
On to Natural Language
Processing
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
NLP
Machine
translation
OCR
Q&A
Sentiment
Analysis
Speech
Recognition
TTS
Topic
Modelling
Information
Retrieval
Natural
Language
Understanding
Document
Classification
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
NLP Domains
8.4PB
of information per second
as of 2020
source: business2comunity, 2016
70%
of companies
use customer feedback
Source: business2comunity, 2016
£1.3Tvalue of company
data
source: IDC, 2014
10%
of organizations expect to
commercialise their data by 2020
source: Gartner, 2016
NLP Industry Facts
Source: Ticary, What is natural language processing Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Convolutions and Natural Language Processing
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Data Representation
?
source: Ossama Abdel-Hamid, Abdel-rahman Mohamed, Hui Jiang, Li Deng, Gerald Penn,and Dong Yu,. Classification Convolutional Neural Networks for
Speech Recognition, IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2014
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Encoding Data word-level
- Word-level embedding (word2vec). Word -> N-dimensional vector
Source: Convolutional Neural Networks for Sentence Classification,Yoon Kim, 2014
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
N
time
different
embeddings
V A N C O U V E R N L P …
_ 0 0 0 0 0 0 0 0 0 1 0 0 0
- 0 0 0 0 0 0 0 0 0 0 0 0 0
. 0 0 0 0 0 0 0 0 0 0 0 0 0
A 0 1 0 0 0 0 0 0 0 0 0 0 0
B 0 0 0 0 0 0 0 0 0 0 0 0 0
C 0 0 0 1 0 0 0 0 0 0 0 0 0
D 0 0 0 0 0 0 0 0 0 0 0 0 0
E 0 0 0 0 0 0 0 1 0 0 0 0 0
F 0 0 0 0 0 0 0 0 0 0 0 0 0
G 0 0 0 0 0 0 0 0 0 0 0 0 0
H 0 0 0 0 0 0 0 0 0 0 0 0 0
I 0 0 0 0 0 0 0 0 0 0 0 0 0
J 0 0 0 0 0 0 0 0 0 0 0 0 0
K 0 0 0 0 0 0 0 0 0 0 0 0 0
L 0 0 0 0 0 0 0 0 0 0 0 1 0
M 0 0 0 0 0 0 0 0 0 0 0 0 0
N 0 0 1 0 0 0 0 0 0 0 1 0 0
O 0 0 0 0 1 0 0 0 0 0 0 0 0
P 0 0 0 0 0 0 0 0 0 0 0 0 1
Q 0 0 0 0 0 0 0 0 0 0 0 0 0
R 0 0 0 0 0 0 0 0 1 0 0 0 0
S 0 0 0 0 0 0 0 0 0 0 0 0 0
T 0 0 0 0 0 0 0 0 0 0 0 0 0
U 0 0 0 0 0 1 0 0 0 0 0 0 0
V 1 0 0 0 0 0 1 0 0 0 0 0 0
W 0 0 0 0 0 0 0 0 0 0 0 0 0
X 0 0 0 0 0 0 0 0 0 0 0 0 0
Y 0 0 0 0 0 0 0 0 0 0 0 0 0
Z 0 0 0 0 0 0 0 0 0 0 0 0 0
Encoding Data – Character-level
- One-hot encoding
- Alphabet
- Sparse representation
- Character embedding
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Text classification, N categories
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Text classification, N categories
Neural
Network
- Fiction: 0%
- Biography: 6%
…
- Play: 80%
…
- Documentation: 0%
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
source: Xiang Zhang, Junbo Zhao, Yann LeCun. Character-level Convolutional Networks for Text Classification. NIPS 2015
Visualization with Netro
Deep Neural Network: Crepe Model
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Visualization with Netron
Intuition: convolutions act similarly as n-grams
V A N C O U V E R … 1013
_ 0 0 0 0 0 0 0 0 0 1
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
- 0 0 0 0 0 0 0 0 0 0 …
. 0 0 0 0 0 0 0 0 0 0 …
A 0 1 0 0 0 0 0 0 0 0 …
B 0 0 0 0 0 0 0 0 0 0 …
C 0 0 0 1 0 0 0 0 0 0 …
D 0 0 0 0 0 0 0 0 0 0 …
E 0 0 0 0 0 0 0 1 0 0 …
F 0 0 0 0 0 0 0 0 0 0 …
G 0 0 0 0 0 0 0 0 0 0 …
H 0 0 0 0 0 0 0 0 0 0 …
I 0 0 0 0 0 0 0 0 0 0 …
J 0 0 0 0 0 0 0 0 0 0 …
K 0 0 0 0 0 0 0 0 0 0 …
L 0 0 0 0 0 0 0 0 0 0 …
M 0 0 0 0 0 0 0 0 0 0 …
N 0 0 1 0 0 0 0 0 0 0 …
O 0 0 0 0 1 0 0 0 0 0 …
P 0 0 0 0 0 0 0 0 0 0 …
Q 0 0 0 0 0 0 0 0 0 0 …
R 0 0 0 0 0 0 0 0 1 0 …
S 0 0 0 0 0 0 0 0 0 0 …
T 0 0 0 0 0 0 0 0 0 0 …
U 0 0 0 0 0 1 0 0 0 0 …
V 1 0 0 0 0 0 1 0 0 0 …
W 0 0 0 0 0 0 0 0 0 0 …
X 0 0 0 0 0 0 0 0 0 0 …
Y 0 0 0 0 0 0 0 0 0 0 …
Z 0 0 0 0 0 0 0 0 0 0 …
0 1 2 3 4 … … … … … … … … 1007
0 6.4 1.1 3.2 0.3 -0.4 … … … … … … … … …
1 -2.1 0.2 -3.4 … … … … … … … … … … …
… … … … … … … … … … … … … … …
… … … … … … … … … … … … … … …
… … … … … … … … … … … … … … …
254 … … … … … … … … … … … … … …
255 1.2 3.4 -1 1.2 3.2 … … … … … … … … …
x 256
69x1014x1 = ~70k
1x1008x256 = ~256k
x 1008
Temporal Convolution (256 69*7/1)
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
1x1008x256 = ~256k
1x1008x256 = ~ 256k
Activation Function: Rectified Linear Unit (ReLU)
! " = $
", " ≥ 0
0, " < 0
0 1 2 3 4 5 … 1007
0 6.4 1.1 3.2 0.3 -0.4 0.2 … …
… … … … … … … … …
255 1.2 3.4 -1 1.2 3.2 2.8 … …
0 1 2 3 4 5 … 1007
0 6.4 1.1 3.2 0.3 0 0.2 … …
… … … … … … … … …
255 1.2 3.4 0 1.2 3.2 2.8 … …
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
0 1 2 3 4 5 … 1007
0 6.4 1.1 3.2 0.3 0 0.2 … …
… … … … … … … … …
255 1.2 3.4 0 1.2 3.2 2.8 … …
0 1 … 335
0 6.4 0.3 … …
… … … … …
255 3.4 3.2 … …
1x1008x256 = ~256k
1x336x256 = ~86k
x 336
x 256
Down-sampling: Max-Pooling (256 1*3/3)
source : Stanford's CS231n
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Fast forward…
1x336x256 = ~86k <- after 1 convolution layer (69*7/1) and 1 max pooling (3x1/3)
1x330x256 = ~85k <- after 1 convolution layer (1*7/1)
1x110x256 = ~28k <- 1 max-pooling (1*3/3)
3x102x256 = ~26k <- 4 convolutions layers (1*3/1)
1x34x256 = ~9k <- 1 max-pooling (1*3/3)
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
0 1 2 3 4 5 6 7 8 … 33
0 6.4 0.1 … … … … … … … … …
1 2.1 24.9 … … … … … … … … …
… … … … … … … … … … … …
255 … … … … … … … … … … 9.9
0
0 6.4
1 0.1
… …
34 2.1
35 24.9
… …
… …
… …
8703 9.9 8704x1x1 = ~9k
1x34x256 = ~9k
x 256
Flattening Layer
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
0
0 6.4
1 0.1
… …
8703 9.9
8704x1x1 = ~9k
0
1
k
1023
x 1024
1024x1x1 = ~1k
!" # = %
&'(
)*(+
,"& ∗ .& + 0"
0
0 8.7
1 -2.1
… …
1023 32.1
Fully Connected / Dense layer (1024)
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
0
0 8.7
1 0
… …
… …
… …
… …
… …
… …
… …
1023 32.1
DROP OUT
1024x1x1 = ~1k
0
1
k
1023
x 1024
1024x1x1 = ~1k
!" # = %
&'(
)*(+
,"& ∗ .& + 0"
0
0 9.2
1 5.3
… …
1023 0.1
ignored
Dropout (p=0.5) + Fully Connected Layer (1024)
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
0
0 6.4
1 0.1
… …
… …
… …
… …
… …
… …
… …
1023 9.9
1024x1x1 = ~1k
0
…
N-1
x N
Nx1x1 = N
0
0 2.7
1 0.1
… …
… …
N-1 12.5
ignored
Softmax
0
0 0.1
1 0.01
… …
… …
N-1 0.8
Nx1x1 = N
!"#$%&' ( ) =
+,-
∑/01
234 +
,/
Output: Dropout + Dense + Softmax for N categories
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Text classification, N categories
Neural
Network
- Fiction: 0%
- Biography: 6%
…
- Play: 6%
…
- Documentation: 80%
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
How to train the network? Backward propagation!
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Backward propagation – Efficient Gradient Descent
- Fiction: 0%
- Biography: 6% 0%
…
- Play: 6% 100%
…
- Documentation: 80% 0%
- Fiction: 0%
- Biography: 6%
…
- Play: 6%
…
- Documentation: 80%
Update the weights of the convolutional masks and fully
connected units so that the error will be minimized next time
Neural
Network
!"# = !"# − &.
()
(*+,
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Learning Rate ! : How much to update the weights for every batch of documents?
Training Parameters: Learning Rate
Source:Towards data Science: Gradient descent in a nutshell
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Training parameters: Batch Size
Batch size: How many examples to learn from in one step?
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Training parameters: Number of epochs
Number of epochs: How many times should we feed the network the entire training set?
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Jupyter notebook demo – Crepe in Apache MXNet/Gluon
https://github.com/ThomasDelteil/CNN_NLP_MXNet
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Results
Traditional approaches
Word-level CNN
Character-level CNN
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
For images
For text
Humans to rephrase the examples
Synonyms
Similar semantic meaning
Data Augmentation
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Data Augmentation
The quick brown fox jumps over the lazy dog
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Data Augmentation
The quick brown fox jumps over the lazy dog
fast
swift
speedy
idle
indolent
slothful
hound
pup
mutt
leaps
springs
bounds
hops
hazel
brunette
chestnut
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Data Augmentation
The quick brown fox jumps over the lazy dog
fast
swift
speedy
idle
indolent
slothful
hound
pup
mutt
leaps
springs
bounds
hops
hazel
brunette
chestnut
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Data Augmentation
The quick brown fox jumps over the lazy dog
fast
swift
speedy
idle
indolent
slothful
hound
pup
mutt
leaps
springs
bounds
hops
The swift brunette fox leaps over the slothful pup
hazel
brunette
chestnut
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
You need a large dataset
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
…A very large dataset!
Live Demo – Classification of product category for Amazon Reviews
https://thomasdelteil.github.io/CNN_NLP_MXNet/
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
- Develop model using a Jupyter notebook
- Train model on GPU instance
- Package model behind web API in a Docker container, e.g using MXNet Model Server
- Upload container to container registry
- Deploy container to an elastic container service
- Enjoy quick and linear scaling
- Put the API behind a load balancer with SSL termination
- Enjoy J
Workflow and Operationalization
Elastic
Container
Service
GPU instance Container
Registry
Auto-scaling Load
Balancer
Container
HTTPS request
“Loved this
book”
HTTPS response
{
“prediction” : {
“book”: 0.99
}
}
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Advanced use-cases for
Convolutions and NLP
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
CNN + LSTM: Spatially and Temporally Deep Neural Networks
- CNN for feature extraction
- LSTM for temporal representation
Applications:
- Video (CNN for frames, LSTM to
combine them temporally)
- Text tasks
- Audio (Language detection)
Source: Combining CNN and RNN for spoken language detection
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Advanced use-case: Speech Generation WaveNet
Source: DeepMind Wavenet generative model raw audio
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
WaveNet: Dilated Causal Convolution
Source: DeepMind Wavenet generative model raw audio
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
WaveNet: Dilated Causal Convolution
Source: DeepMind Wavenet generative model raw audio
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Summary
§ Learned about convolutions
§ Applied them to textual data
§ Studied the crepe architecture from
Zhang et al. in details
§ Learned about advanced use cases and
operationalization
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Thank you!
Connect here
github.com/thomasdelteil
linkedin.com/in/thomasdelteil
tdelteil@amazon.com
Photos credits: https://pexels.com and https://unsplash.com/

More Related Content

What's hot

Deep learning - A Visual Introduction
Deep learning - A Visual IntroductionDeep learning - A Visual Introduction
Deep learning - A Visual IntroductionLukas Masuch
 
A Simple Introduction to Word Embeddings
A Simple Introduction to Word EmbeddingsA Simple Introduction to Word Embeddings
A Simple Introduction to Word EmbeddingsBhaskar Mitra
 
Seq2Seq (encoder decoder) model
Seq2Seq (encoder decoder) modelSeq2Seq (encoder decoder) model
Seq2Seq (encoder decoder) model佳蓉 倪
 
NLP Bootcamp 2018 : Representation Learning of text for NLP
NLP Bootcamp 2018 : Representation Learning of text for NLPNLP Bootcamp 2018 : Representation Learning of text for NLP
NLP Bootcamp 2018 : Representation Learning of text for NLPAnuj Gupta
 
Vectorland: Brief Notes from Using Text Embeddings for Search
Vectorland: Brief Notes from Using Text Embeddings for SearchVectorland: Brief Notes from Using Text Embeddings for Search
Vectorland: Brief Notes from Using Text Embeddings for SearchBhaskar Mitra
 
Artificial Neural Network Tutorial | Deep Learning With Neural Networks | Edu...
Artificial Neural Network Tutorial | Deep Learning With Neural Networks | Edu...Artificial Neural Network Tutorial | Deep Learning With Neural Networks | Edu...
Artificial Neural Network Tutorial | Deep Learning With Neural Networks | Edu...Edureka!
 
Deep Learning for Natural Language Processing
Deep Learning for Natural Language ProcessingDeep Learning for Natural Language Processing
Deep Learning for Natural Language ProcessingDevashish Shanker
 
NLP using transformers
NLP using transformers NLP using transformers
NLP using transformers Arvind Devaraj
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language ProcessingVeenaSKumar2
 
Word Embeddings, why the hype ?
Word Embeddings, why the hype ? Word Embeddings, why the hype ?
Word Embeddings, why the hype ? Hady Elsahar
 
Introduction to Deep Learning
Introduction to Deep LearningIntroduction to Deep Learning
Introduction to Deep LearningOswald Campesato
 
Recurrent neural networks rnn
Recurrent neural networks   rnnRecurrent neural networks   rnn
Recurrent neural networks rnnKuppusamy P
 
Natural language processing
Natural language processing Natural language processing
Natural language processing Md.Sumon Sarder
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language ProcessingYasir Khan
 

What's hot (20)

Deep learning - A Visual Introduction
Deep learning - A Visual IntroductionDeep learning - A Visual Introduction
Deep learning - A Visual Introduction
 
A Simple Introduction to Word Embeddings
A Simple Introduction to Word EmbeddingsA Simple Introduction to Word Embeddings
A Simple Introduction to Word Embeddings
 
NLP
NLPNLP
NLP
 
Seq2Seq (encoder decoder) model
Seq2Seq (encoder decoder) modelSeq2Seq (encoder decoder) model
Seq2Seq (encoder decoder) model
 
NLP Bootcamp 2018 : Representation Learning of text for NLP
NLP Bootcamp 2018 : Representation Learning of text for NLPNLP Bootcamp 2018 : Representation Learning of text for NLP
NLP Bootcamp 2018 : Representation Learning of text for NLP
 
Vectorland: Brief Notes from Using Text Embeddings for Search
Vectorland: Brief Notes from Using Text Embeddings for SearchVectorland: Brief Notes from Using Text Embeddings for Search
Vectorland: Brief Notes from Using Text Embeddings for Search
 
Artificial Neural Network Tutorial | Deep Learning With Neural Networks | Edu...
Artificial Neural Network Tutorial | Deep Learning With Neural Networks | Edu...Artificial Neural Network Tutorial | Deep Learning With Neural Networks | Edu...
Artificial Neural Network Tutorial | Deep Learning With Neural Networks | Edu...
 
Document similarity
Document similarityDocument similarity
Document similarity
 
Deep Learning for Natural Language Processing
Deep Learning for Natural Language ProcessingDeep Learning for Natural Language Processing
Deep Learning for Natural Language Processing
 
NLP using transformers
NLP using transformers NLP using transformers
NLP using transformers
 
Machine Learning for dummies!
Machine Learning for dummies!Machine Learning for dummies!
Machine Learning for dummies!
 
Word2Vec
Word2VecWord2Vec
Word2Vec
 
Text Classification
Text ClassificationText Classification
Text Classification
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Word Embeddings, why the hype ?
Word Embeddings, why the hype ? Word Embeddings, why the hype ?
Word Embeddings, why the hype ?
 
Introduction to Deep Learning
Introduction to Deep LearningIntroduction to Deep Learning
Introduction to Deep Learning
 
Recurrent neural networks rnn
Recurrent neural networks   rnnRecurrent neural networks   rnn
Recurrent neural networks rnn
 
Word embedding
Word embedding Word embedding
Word embedding
 
Natural language processing
Natural language processing Natural language processing
Natural language processing
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 

Similar to Convolutional Neural Networks and Natural Language Processing

The Web of Data: do we actually understand what we built?
The Web of Data: do we actually understand what we built?The Web of Data: do we actually understand what we built?
The Web of Data: do we actually understand what we built?Frank van Harmelen
 
Introduction to Semantic Web for GIS Practitioners
Introduction to Semantic Web for GIS PractitionersIntroduction to Semantic Web for GIS Practitioners
Introduction to Semantic Web for GIS PractitionersEmanuele Della Valle
 
Deep Content Learning in Traffic Prediction and Text Classification
Deep Content Learning in Traffic Prediction and Text ClassificationDeep Content Learning in Traffic Prediction and Text Classification
Deep Content Learning in Traffic Prediction and Text ClassificationHPCC Systems
 
Deep Learning for IoT : is there a shallow end of the pool?
Deep Learning for IoT : is there a shallow end of the pool?Deep Learning for IoT : is there a shallow end of the pool?
Deep Learning for IoT : is there a shallow end of the pool?Venu Vasudevan
 
The Empirical Turn in Knowledge Representation
The Empirical Turn in Knowledge RepresentationThe Empirical Turn in Knowledge Representation
The Empirical Turn in Knowledge RepresentationFrank van Harmelen
 
building intelligent systems with large scale deep learning
building intelligent systems with large scale deep learningbuilding intelligent systems with large scale deep learning
building intelligent systems with large scale deep learningmustafa sarac
 
Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS G...
Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS G...Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS G...
Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS G...MaRS Discovery District
 
Intro to deep learning
Intro to deep learning Intro to deep learning
Intro to deep learning David Voyles
 
OK festival Lightning Talk - Collaborative Open Geospatial Data
OK festival Lightning Talk - Collaborative Open Geospatial DataOK festival Lightning Talk - Collaborative Open Geospatial Data
OK festival Lightning Talk - Collaborative Open Geospatial DataAndrew Turner
 
Deep learning tutorial 9/2019
Deep learning tutorial 9/2019Deep learning tutorial 9/2019
Deep learning tutorial 9/2019Amr Rashed
 
Deep Learning Tutorial
Deep Learning TutorialDeep Learning Tutorial
Deep Learning TutorialAmr Rashed
 
Nautral Langauge Processing - Basics / Non Technical
Nautral Langauge Processing - Basics / Non Technical Nautral Langauge Processing - Basics / Non Technical
Nautral Langauge Processing - Basics / Non Technical Dhruv Gohil
 
Small, Medium and Big Data
Small, Medium and Big DataSmall, Medium and Big Data
Small, Medium and Big DataPierre De Wilde
 
BDW16 London - Chris von Csefalvay, Helioserv - Cats and What They Tell us Ab...
BDW16 London - Chris von Csefalvay, Helioserv - Cats and What They Tell us Ab...BDW16 London - Chris von Csefalvay, Helioserv - Cats and What They Tell us Ab...
BDW16 London - Chris von Csefalvay, Helioserv - Cats and What They Tell us Ab...Big Data Week
 
Introduction to deep learning
Introduction to deep learningIntroduction to deep learning
Introduction to deep learningAmr Rashed
 
Deep Learning and Watson Studio
Deep Learning and Watson StudioDeep Learning and Watson Studio
Deep Learning and Watson StudioSasha Lazarevic
 
Chatbots in 2017 -- Ithaca Talk Dec 6
Chatbots in 2017 -- Ithaca Talk Dec 6Chatbots in 2017 -- Ithaca Talk Dec 6
Chatbots in 2017 -- Ithaca Talk Dec 6Paul Houle
 

Similar to Convolutional Neural Networks and Natural Language Processing (20)

The Web of Data: do we actually understand what we built?
The Web of Data: do we actually understand what we built?The Web of Data: do we actually understand what we built?
The Web of Data: do we actually understand what we built?
 
Introduction to Semantic Web for GIS Practitioners
Introduction to Semantic Web for GIS PractitionersIntroduction to Semantic Web for GIS Practitioners
Introduction to Semantic Web for GIS Practitioners
 
Novi sad ai event 1-2018
Novi sad ai event 1-2018Novi sad ai event 1-2018
Novi sad ai event 1-2018
 
Deep Content Learning in Traffic Prediction and Text Classification
Deep Content Learning in Traffic Prediction and Text ClassificationDeep Content Learning in Traffic Prediction and Text Classification
Deep Content Learning in Traffic Prediction and Text Classification
 
Deep Learning for IoT : is there a shallow end of the pool?
Deep Learning for IoT : is there a shallow end of the pool?Deep Learning for IoT : is there a shallow end of the pool?
Deep Learning for IoT : is there a shallow end of the pool?
 
The Empirical Turn in Knowledge Representation
The Empirical Turn in Knowledge RepresentationThe Empirical Turn in Knowledge Representation
The Empirical Turn in Knowledge Representation
 
Distributed deep learning_over_spark_20_nov_2014_ver_2.8
Distributed deep learning_over_spark_20_nov_2014_ver_2.8Distributed deep learning_over_spark_20_nov_2014_ver_2.8
Distributed deep learning_over_spark_20_nov_2014_ver_2.8
 
building intelligent systems with large scale deep learning
building intelligent systems with large scale deep learningbuilding intelligent systems with large scale deep learning
building intelligent systems with large scale deep learning
 
Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS G...
Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS G...Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS G...
Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS G...
 
Intro to deep learning
Intro to deep learning Intro to deep learning
Intro to deep learning
 
OK festival Lightning Talk - Collaborative Open Geospatial Data
OK festival Lightning Talk - Collaborative Open Geospatial DataOK festival Lightning Talk - Collaborative Open Geospatial Data
OK festival Lightning Talk - Collaborative Open Geospatial Data
 
Deep learning tutorial 9/2019
Deep learning tutorial 9/2019Deep learning tutorial 9/2019
Deep learning tutorial 9/2019
 
Deep Learning Tutorial
Deep Learning TutorialDeep Learning Tutorial
Deep Learning Tutorial
 
Nautral Langauge Processing - Basics / Non Technical
Nautral Langauge Processing - Basics / Non Technical Nautral Langauge Processing - Basics / Non Technical
Nautral Langauge Processing - Basics / Non Technical
 
Small, Medium and Big Data
Small, Medium and Big DataSmall, Medium and Big Data
Small, Medium and Big Data
 
BDW16 London - Chris von Csefalvay, Helioserv - Cats and What They Tell us Ab...
BDW16 London - Chris von Csefalvay, Helioserv - Cats and What They Tell us Ab...BDW16 London - Chris von Csefalvay, Helioserv - Cats and What They Tell us Ab...
BDW16 London - Chris von Csefalvay, Helioserv - Cats and What They Tell us Ab...
 
Introduction to deep learning
Introduction to deep learningIntroduction to deep learning
Introduction to deep learning
 
Deep Learning and Watson Studio
Deep Learning and Watson StudioDeep Learning and Watson Studio
Deep Learning and Watson Studio
 
Query Understanding
Query UnderstandingQuery Understanding
Query Understanding
 
Chatbots in 2017 -- Ithaca Talk Dec 6
Chatbots in 2017 -- Ithaca Talk Dec 6Chatbots in 2017 -- Ithaca Talk Dec 6
Chatbots in 2017 -- Ithaca Talk Dec 6
 

Recently uploaded

Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAroojKhan71
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSAishani27
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxStephen266013
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingNeil Barnes
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxMohammedJunaid861692
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysismanisha194592
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiSuhani Kapoor
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfLars Albertsson
 

Recently uploaded (20)

Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICS
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data Storytelling
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdf
 

Convolutional Neural Networks and Natural Language Processing

  • 1. Convolutional Neural Networks and Natural Language Processing Thomas Delteil – github.com/thomasdelteil – linkedin.com/in/thomasdelteil Applied Scientist @ AWS Deep Engine
  • 2. Goals § Explain what convolutions are § Show how to handle textual data § Analyze a reference neural network architecture for text classification § Demonstrate how to train and deploy a CNN for Natural Language Processing Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 3. Convolutions And where to find them Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 4. 2012 - ImageNet Classification with Deep Convolutional Neural Networks ImageNet classification with Deep Convolutional Neural Networks, Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton, Advances in Neural Information Processing Systems, 2012 AlexNet architecture Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 5. ImageNet competition Classify images among 1000 classes: AlexNet Top-5 error-rate, 25% => 16%! Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 6. Actual photo of the reaction from the computer vision community* *might just be a stock photo Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 7. I told you so! Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 8. What made Convolutional Neural Networks viable? Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 9. GPUs! - Nvidia V100, float16 Ops: ~ 120 TFLOPS, 5000+ cuda cores - #1 Super computer 2005 ~135 TFLOPS Source: Mathworks Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 10. Sea/Land segmentation via satellite images DeepUNet: A Deep Fully Convolutional Network for Pixel-level Sea-Land Segmentation, Ruirui Li et al, 2017 Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 11. Automatic Galaxy classication Deep Galaxy: Classification of Galaxies based on Deep Convolutional Neural Networks , Nour Eldeen M. Khalifa, 2017 Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 12. Medical Imaging, MRI, X-ray, surgical cameras Review of MRI-based Brain Tumor Image Segmentation Using Deep Learning Methods, Ali Isn et al. 2016 Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 13. What is a convolution ? It is the cross-channel sum of the element-wise multiplication of a convolutional filter (kernel/mask) computed over a sliding window on an input tensor given a certain stride and padding, plus a bias term. The result is called a feature map. 2 2 1 3 1 -1 4 3 2 1 -1 -1 0 Input matrix (3x3) no padding 1 channel Kernel (2x2) Stride 1 Bias = 2 Feature map (2x2) -1 2 0 1 1*2 –1*2 –1*3 + 0*1 + 2 = – 1 1*2 –1*2 –1*1 + 0*-1 + 2. = 2 1*3 –1*1 –1*4 + 0*3 + 2 = 0 1*1 – (-1)*1 –1*3 + 0*2 + 2 = 1 Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 14. What is a convolution ? Padding Source: Machine Learning guru - Neural Networks CNN Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 15. What is a convolution ? Stride = 2 Source: Machine Learning guru - Neural Networks CNN Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 16. What is a convolution ? Multi Channel 1 convolutional filter (3)x(3x3) Source: Machine Learning guru - Neural Networks CNN Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 17. What is a convolution ? Multi Channel source: Convolutional Neural Networks on the iphone with vggnet N: Number of input channels W:Width of the kernel H: Height of the kernel M: Number of output channels Kernel size = ! ∗ # ∗ $ #Params = % ∗ ! ∗ # ∗ $ + % 256 convolutions of kernel (3,3) on 256 input channels 256*256*3*3 = ~0.5M Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 18. Easily parallelizable Convolution computations are: - Independent (across filters and within filter) - Simple (multiplication and sums) Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 19. Why does it work? Sharpening filter Laplacian filter Sobel x-axis filter Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 20. Why does it work? - Detect patterns at larger and larger scale by stacking convolution layers on top of each others to grow the receptive field - Applicable to spatially correlated data Source: AlexNet first 96 (55x55) filters learned represented in RGB space (3 input channels) Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 21. Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil Growing receptive field Source: ML Review, A guide to receptive field arithmetic Deeper in the network
  • 22. Visualize convolutions http://scs.ryerson.ca/~aharley/vis/conv/flat.html Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 23. Visualize convolutions Source: Neural Network 3D Simulation (warning flashing lights) Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 24. State of the art networks are getting deeper and more complex Source: Inception v3 Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil input
  • 25. Learn Data Science – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil High number of parameters => Requires a lot of data to train Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 26. Advanced type of convolutions Source: An introduction to different types of convolutions Transposed Convolutions (deconvolution) EnhanceNet Dilated Convolutions WaveNet Depth-wise separable Convolutions MobileNet Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 27. On to Natural Language Processing Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 28. NLP Machine translation OCR Q&A Sentiment Analysis Speech Recognition TTS Topic Modelling Information Retrieval Natural Language Understanding Document Classification Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil NLP Domains
  • 29. 8.4PB of information per second as of 2020 source: business2comunity, 2016 70% of companies use customer feedback Source: business2comunity, 2016 £1.3Tvalue of company data source: IDC, 2014 10% of organizations expect to commercialise their data by 2020 source: Gartner, 2016 NLP Industry Facts Source: Ticary, What is natural language processing Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 30. Convolutions and Natural Language Processing Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 31. Data Representation ? source: Ossama Abdel-Hamid, Abdel-rahman Mohamed, Hui Jiang, Li Deng, Gerald Penn,and Dong Yu,. Classification Convolutional Neural Networks for Speech Recognition, IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2014 Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 32. Encoding Data word-level - Word-level embedding (word2vec). Word -> N-dimensional vector Source: Convolutional Neural Networks for Sentence Classification,Yoon Kim, 2014 Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil N time different embeddings
  • 33. V A N C O U V E R N L P … _ 0 0 0 0 0 0 0 0 0 1 0 0 0 - 0 0 0 0 0 0 0 0 0 0 0 0 0 . 0 0 0 0 0 0 0 0 0 0 0 0 0 A 0 1 0 0 0 0 0 0 0 0 0 0 0 B 0 0 0 0 0 0 0 0 0 0 0 0 0 C 0 0 0 1 0 0 0 0 0 0 0 0 0 D 0 0 0 0 0 0 0 0 0 0 0 0 0 E 0 0 0 0 0 0 0 1 0 0 0 0 0 F 0 0 0 0 0 0 0 0 0 0 0 0 0 G 0 0 0 0 0 0 0 0 0 0 0 0 0 H 0 0 0 0 0 0 0 0 0 0 0 0 0 I 0 0 0 0 0 0 0 0 0 0 0 0 0 J 0 0 0 0 0 0 0 0 0 0 0 0 0 K 0 0 0 0 0 0 0 0 0 0 0 0 0 L 0 0 0 0 0 0 0 0 0 0 0 1 0 M 0 0 0 0 0 0 0 0 0 0 0 0 0 N 0 0 1 0 0 0 0 0 0 0 1 0 0 O 0 0 0 0 1 0 0 0 0 0 0 0 0 P 0 0 0 0 0 0 0 0 0 0 0 0 1 Q 0 0 0 0 0 0 0 0 0 0 0 0 0 R 0 0 0 0 0 0 0 0 1 0 0 0 0 S 0 0 0 0 0 0 0 0 0 0 0 0 0 T 0 0 0 0 0 0 0 0 0 0 0 0 0 U 0 0 0 0 0 1 0 0 0 0 0 0 0 V 1 0 0 0 0 0 1 0 0 0 0 0 0 W 0 0 0 0 0 0 0 0 0 0 0 0 0 X 0 0 0 0 0 0 0 0 0 0 0 0 0 Y 0 0 0 0 0 0 0 0 0 0 0 0 0 Z 0 0 0 0 0 0 0 0 0 0 0 0 0 Encoding Data – Character-level - One-hot encoding - Alphabet - Sparse representation - Character embedding Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 34. Text classification, N categories Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 35. Text classification, N categories Neural Network - Fiction: 0% - Biography: 6% … - Play: 80% … - Documentation: 0% Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 36. source: Xiang Zhang, Junbo Zhao, Yann LeCun. Character-level Convolutional Networks for Text Classification. NIPS 2015 Visualization with Netro Deep Neural Network: Crepe Model Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil Visualization with Netron Intuition: convolutions act similarly as n-grams
  • 37. V A N C O U V E R … 1013 _ 0 0 0 0 0 0 0 0 0 1 … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … - 0 0 0 0 0 0 0 0 0 0 … . 0 0 0 0 0 0 0 0 0 0 … A 0 1 0 0 0 0 0 0 0 0 … B 0 0 0 0 0 0 0 0 0 0 … C 0 0 0 1 0 0 0 0 0 0 … D 0 0 0 0 0 0 0 0 0 0 … E 0 0 0 0 0 0 0 1 0 0 … F 0 0 0 0 0 0 0 0 0 0 … G 0 0 0 0 0 0 0 0 0 0 … H 0 0 0 0 0 0 0 0 0 0 … I 0 0 0 0 0 0 0 0 0 0 … J 0 0 0 0 0 0 0 0 0 0 … K 0 0 0 0 0 0 0 0 0 0 … L 0 0 0 0 0 0 0 0 0 0 … M 0 0 0 0 0 0 0 0 0 0 … N 0 0 1 0 0 0 0 0 0 0 … O 0 0 0 0 1 0 0 0 0 0 … P 0 0 0 0 0 0 0 0 0 0 … Q 0 0 0 0 0 0 0 0 0 0 … R 0 0 0 0 0 0 0 0 1 0 … S 0 0 0 0 0 0 0 0 0 0 … T 0 0 0 0 0 0 0 0 0 0 … U 0 0 0 0 0 1 0 0 0 0 … V 1 0 0 0 0 0 1 0 0 0 … W 0 0 0 0 0 0 0 0 0 0 … X 0 0 0 0 0 0 0 0 0 0 … Y 0 0 0 0 0 0 0 0 0 0 … Z 0 0 0 0 0 0 0 0 0 0 … 0 1 2 3 4 … … … … … … … … 1007 0 6.4 1.1 3.2 0.3 -0.4 … … … … … … … … … 1 -2.1 0.2 -3.4 … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … 254 … … … … … … … … … … … … … … 255 1.2 3.4 -1 1.2 3.2 … … … … … … … … … x 256 69x1014x1 = ~70k 1x1008x256 = ~256k x 1008 Temporal Convolution (256 69*7/1) Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 38. 1x1008x256 = ~256k 1x1008x256 = ~ 256k Activation Function: Rectified Linear Unit (ReLU) ! " = $ ", " ≥ 0 0, " < 0 0 1 2 3 4 5 … 1007 0 6.4 1.1 3.2 0.3 -0.4 0.2 … … … … … … … … … … … 255 1.2 3.4 -1 1.2 3.2 2.8 … … 0 1 2 3 4 5 … 1007 0 6.4 1.1 3.2 0.3 0 0.2 … … … … … … … … … … … 255 1.2 3.4 0 1.2 3.2 2.8 … … Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 39. 0 1 2 3 4 5 … 1007 0 6.4 1.1 3.2 0.3 0 0.2 … … … … … … … … … … … 255 1.2 3.4 0 1.2 3.2 2.8 … … 0 1 … 335 0 6.4 0.3 … … … … … … … 255 3.4 3.2 … … 1x1008x256 = ~256k 1x336x256 = ~86k x 336 x 256 Down-sampling: Max-Pooling (256 1*3/3) source : Stanford's CS231n Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 40. Fast forward… 1x336x256 = ~86k <- after 1 convolution layer (69*7/1) and 1 max pooling (3x1/3) 1x330x256 = ~85k <- after 1 convolution layer (1*7/1) 1x110x256 = ~28k <- 1 max-pooling (1*3/3) 3x102x256 = ~26k <- 4 convolutions layers (1*3/1) 1x34x256 = ~9k <- 1 max-pooling (1*3/3) Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 41. 0 1 2 3 4 5 6 7 8 … 33 0 6.4 0.1 … … … … … … … … … 1 2.1 24.9 … … … … … … … … … … … … … … … … … … … … … 255 … … … … … … … … … … 9.9 0 0 6.4 1 0.1 … … 34 2.1 35 24.9 … … … … … … 8703 9.9 8704x1x1 = ~9k 1x34x256 = ~9k x 256 Flattening Layer Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 42. 0 0 6.4 1 0.1 … … 8703 9.9 8704x1x1 = ~9k 0 1 k 1023 x 1024 1024x1x1 = ~1k !" # = % &'( )*(+ ,"& ∗ .& + 0" 0 0 8.7 1 -2.1 … … 1023 32.1 Fully Connected / Dense layer (1024) Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 43. 0 0 8.7 1 0 … … … … … … … … … … … … … … 1023 32.1 DROP OUT 1024x1x1 = ~1k 0 1 k 1023 x 1024 1024x1x1 = ~1k !" # = % &'( )*(+ ,"& ∗ .& + 0" 0 0 9.2 1 5.3 … … 1023 0.1 ignored Dropout (p=0.5) + Fully Connected Layer (1024) Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 44. 0 0 6.4 1 0.1 … … … … … … … … … … … … … … 1023 9.9 1024x1x1 = ~1k 0 … N-1 x N Nx1x1 = N 0 0 2.7 1 0.1 … … … … N-1 12.5 ignored Softmax 0 0 0.1 1 0.01 … … … … N-1 0.8 Nx1x1 = N !"#$%&' ( ) = +,- ∑/01 234 + ,/ Output: Dropout + Dense + Softmax for N categories Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 45. Text classification, N categories Neural Network - Fiction: 0% - Biography: 6% … - Play: 6% … - Documentation: 80% Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 46. How to train the network? Backward propagation! Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 47. Backward propagation – Efficient Gradient Descent - Fiction: 0% - Biography: 6% 0% … - Play: 6% 100% … - Documentation: 80% 0% - Fiction: 0% - Biography: 6% … - Play: 6% … - Documentation: 80% Update the weights of the convolutional masks and fully connected units so that the error will be minimized next time Neural Network !"# = !"# − &. () (*+, Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 48. Learning Rate ! : How much to update the weights for every batch of documents? Training Parameters: Learning Rate Source:Towards data Science: Gradient descent in a nutshell Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 49. Training parameters: Batch Size Batch size: How many examples to learn from in one step? Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 50. Training parameters: Number of epochs Number of epochs: How many times should we feed the network the entire training set? Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 51. Jupyter notebook demo – Crepe in Apache MXNet/Gluon https://github.com/ThomasDelteil/CNN_NLP_MXNet Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 52. Results Traditional approaches Word-level CNN Character-level CNN Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 53. For images For text Humans to rephrase the examples Synonyms Similar semantic meaning Data Augmentation Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 54. Data Augmentation The quick brown fox jumps over the lazy dog Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 55. Data Augmentation The quick brown fox jumps over the lazy dog fast swift speedy idle indolent slothful hound pup mutt leaps springs bounds hops hazel brunette chestnut Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 56. Data Augmentation The quick brown fox jumps over the lazy dog fast swift speedy idle indolent slothful hound pup mutt leaps springs bounds hops hazel brunette chestnut Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 57. Data Augmentation The quick brown fox jumps over the lazy dog fast swift speedy idle indolent slothful hound pup mutt leaps springs bounds hops The swift brunette fox leaps over the slothful pup hazel brunette chestnut Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 58. You need a large dataset Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 59. …A very large dataset!
  • 60. Live Demo – Classification of product category for Amazon Reviews https://thomasdelteil.github.io/CNN_NLP_MXNet/ Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 61. - Develop model using a Jupyter notebook - Train model on GPU instance - Package model behind web API in a Docker container, e.g using MXNet Model Server - Upload container to container registry - Deploy container to an elastic container service - Enjoy quick and linear scaling - Put the API behind a load balancer with SSL termination - Enjoy J Workflow and Operationalization Elastic Container Service GPU instance Container Registry Auto-scaling Load Balancer Container HTTPS request “Loved this book” HTTPS response { “prediction” : { “book”: 0.99 } } Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 62. Advanced use-cases for Convolutions and NLP Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 63. CNN + LSTM: Spatially and Temporally Deep Neural Networks - CNN for feature extraction - LSTM for temporal representation Applications: - Video (CNN for frames, LSTM to combine them temporally) - Text tasks - Audio (Language detection) Source: Combining CNN and RNN for spoken language detection Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 64. Advanced use-case: Speech Generation WaveNet Source: DeepMind Wavenet generative model raw audio Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 65. WaveNet: Dilated Causal Convolution Source: DeepMind Wavenet generative model raw audio Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 66. WaveNet: Dilated Causal Convolution Source: DeepMind Wavenet generative model raw audio Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 67. Summary § Learned about convolutions § Applied them to textual data § Studied the crepe architecture from Zhang et al. in details § Learned about advanced use cases and operationalization Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil