DEEP LEARNING FOR
NATURAL LANGUAGE PROCESSING
MACHINE LEARNING ENGINEER
WILDER RODRIGUES
• Coursera Mentor;
• City.AI Ambassador;
• IBM Watson AI XPRIZE contestant;
• Kaggler;
• Guest attendee at AI for Good Global Summit at the UN;
• X-Men geek;
• Family man and father of 5 (3 kids and 2 cats).
@wilderrodrigues
https://medium.com/@wilder.rodrigues/
WHAT'S IN IT FOR YOU?
AGENDA
• The Basics
  • Vector Representation of Words
• The Shallow
  • [Deep] Neural Networks for NLP
• The Deep
  • Convolutional Networks for NLP
• The Recurrent
  • Long Short-Term Memory for NLP
• Where do we go from here?
  • Automation of AWS GPUs with Terraform
VECTOR REPRESENTATION OF WORDS
THE BASICS
REPRESENTATIONS OF LANGUAGE
HOW DOES IT WORK?
WORD2VEC
• Word relationships can be recovered with vector arithmetic, using cosine distance to find the nearest word in the vector space:
  • X = vector("biggest") − vector("big") + vector("small")
  • The word closest to X is "smallest".
• Algorithms:
  • Skip-Gram
    • It predicts the context words from the target word.
  • CBOW
    • It predicts the target word from the bag of all context words.
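The analogy above can be sketched in a few lines of plain Python. This is only an illustration: the `EMBEDDINGS` table is a hypothetical, hand-made set of 2-dimensional vectors (not trained Word2Vec output), and `analogy` is a helper name introduced here. As is common practice, the three query words are excluded from the candidates.

```python
import math

# Hypothetical toy embeddings, hand-made for illustration (not trained vectors).
# Axis 0 loosely encodes size, axis 1 the superlative degree.
EMBEDDINGS = {
    "big": [1.0, 0.0],
    "biggest": [1.0, 1.0],
    "small": [-1.0, 0.0],
    "smallest": [-1.0, 1.0],
    "sky": [0.3, -0.8],
}


def cosine_similarity(u, v):
    """Cosine of the angle between u and v (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(a, b, c, embeddings):
    """Return the word closest to vector(a) - vector(b) + vector(c)."""
    x = [ea - eb + ec
         for ea, eb, ec in zip(embeddings[a], embeddings[b], embeddings[c])]
    # Skip the query words themselves when searching for the nearest word.
    candidates = [w for w in embeddings if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine_similarity(x, embeddings[w]))


result = analogy("biggest", "big", "small", EMBEDDINGS)
print(result)
```

With real embeddings the match is rarely exact, which is why the nearest neighbour by cosine similarity is taken rather than testing for equality.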
[Figure: Cosine Distance vs. Euclidean Distance]
The CBOW architecture predicts the current word based on the context,
and the Skip-gram predicts surrounding words given the current word.
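To make the difference between the two architectures concrete, here is a minimal sketch of the training examples each one would see. The function names and the `window` parameter are assumptions made for this illustration; real implementations add subsampling, negative sampling, and so on.

```python
def skipgram_pairs(tokens, window=2):
    """(target, context) pairs: the target word predicts each neighbour."""
    pairs = []
    for i, target in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs


def cbow_pairs(tokens, window=2):
    """(context_bag, target) pairs: the bag of neighbours predicts the target."""
    pairs = []
    for i, target in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        context = [tokens[j] for j in range(lo, hi) if j != i]
        pairs.append((context, target))
    return pairs


tokens = "the clouds are in the sky".split()
print(skipgram_pairs(tokens, window=1)[:4])
print(cbow_pairs(tokens, window=1)[:2])
```

Skip-gram produces one example per (target, neighbour) pair, while CBOW produces one example per position with all neighbours pooled together.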
DEMO
WORD2VEC
[DEEP] NEURAL NETWORKS
THE SHALLOW
WHERE TO FOCUS FOR NOW?
DEMO
SENTIMENT ANALYSIS
CONVOLUTIONAL NEURAL NETWORKS
THE DEEP
HOW DO THEY WORK?
CNNS
• Filters
• Kernel
• Strides
• Padding
• One equation to rule them all:
[Figure: a 6x6x3 input convolved (*) with 16 filters of size 3x3x3 gives a 4x4x16 output, shown next to a 2x2x16 volume; a second panel works through a numeric convolution example.]
Output height and width, for input size 6, padding 0, filter size 3 and stride 1:
(6 + 2 · 0 - 3) / 1 + 1 = 4
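The output-size arithmetic can be checked with a short sketch. The parameter names `n`, `f`, `p`, `s` (input size, filter size, padding, stride) are assumptions introduced here, and `conv2d_valid` is a deliberately naive helper for a single square channel with no padding, not how a deep learning library implements it.

```python
def conv_output_size(n, f, p=0, s=1):
    """Spatial output size for input size n, filter size f, padding p, stride s."""
    # Floor division: positions where the filter would overhang are dropped.
    return (n + 2 * p - f) // s + 1


def conv2d_valid(image, kernel, stride=1):
    """Slide a square kernel over a square 2-D list without padding."""
    f = len(kernel)
    out = conv_output_size(len(image), f, 0, stride)
    result = []
    for r in range(out):
        row = []
        for c in range(out):
            total = 0
            for i in range(f):
                for j in range(f):
                    total += image[r * stride + i][c * stride + j] * kernel[i][j]
            row.append(total)
        result.append(row)
    return result


image = [[1] * 6 for _ in range(6)]    # 6x6 input, one channel
kernel = [[1] * 3 for _ in range(3)]   # 3x3 filter
feature_map = conv2d_valid(image, kernel)
print(conv_output_size(6, 3, p=0, s=1), len(feature_map), len(feature_map[0]))
```

Stacking the feature maps of 16 such filters is what turns the 4x4 result into a 4x4x16 volume.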
HOW DO THEY WORK WITH TEXT?
CNNS
• Each row of the input matrix corresponds to a word/token: each row is a low-dimensional vector that represents that word/token.
• The width of the filters is usually the same as the width of the input matrix.
• The height may vary, but it’s typically between 2 and 5. A 2x5 filter, for example, covers 2 words (each a 5-dimensional vector) per sliding window.
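As a rough sketch of the sliding window described above: the filter spans the full width of the sentence matrix, so it only moves down the rows, one word at a time. The matrix values and the helper name `text_conv` are made up for illustration; a real model would use learned embeddings and filter weights, plus a bias and a non-linearity.

```python
def text_conv(sentence_matrix, filt):
    """Slide a full-width filter down the rows (words) of the sentence matrix."""
    height = len(filt)
    features = []
    for start in range(len(sentence_matrix) - height + 1):
        window = sentence_matrix[start:start + height]
        features.append(sum(
            w * x
            for filt_row, word_row in zip(filt, window)
            for w, x in zip(filt_row, word_row)
        ))
    return features


# Hypothetical 4-word sentence with 5-dimensional word vectors.
sentence = [
    [1, 1, 1, 1, 1],
    [2, 2, 2, 2, 2],
    [3, 3, 3, 3, 3],
    [4, 4, 4, 4, 4],
]
# A 2x5 filter: covers 2 words per window.
filt = [[1] * 5, [1] * 5]

features = text_conv(sentence, filt)
print(features)
```

A sentence of 4 words and a filter of height 2 gives 4 - 2 + 1 = 3 window positions, so the feature map is a vector of length 3.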
DEMO
SENTIMENT ANALYSIS
LONG SHORT-TERM MEMORY
THE RECURRENT
THE LONG-TERM DEPENDENCY PROBLEM
RNNS
• Small vs. large gap between the relevant information and the point where it is needed for the prediction:
  • Small gap: “the clouds are in the sky.”
  • Large gap: “I grew up in France… I speak fluent French.”
HOW DO THEY WORK?
LSTMS
• LSTMs’ Gates:
  • Forget
    • A sigmoid decides how much of the previous state is passed through.
  • Input
    • A sigmoid decides which values to update, and a tanh outputs the candidate state.
    • The new state is the previous state (scaled by the forget gate) plus the candidate state (scaled by the input gate).
  • Output
    • A sigmoid decides which parts of the state will be output.
    • The state is fed through a tanh and multiplied by the sigmoid result.
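The gates can be sketched as a single time step of the standard LSTM formulation. To keep it readable, this assumes a scalar input, hidden state, and cell state; the `params` layout (one `(w_x, w_h, b)` triple per gate) is hypothetical and chosen only for this example.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step for scalar input, hidden state, and cell state."""
    def pre(gate):
        # Pre-activation: weighted input plus weighted previous hidden state.
        w_x, w_h, b = params[gate]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))               # how much of the old state to keep
    i = sigmoid(pre("input"))                # which candidate values to write
    c_tilde = math.tanh(pre("candidate"))    # candidate state
    c = f * c_prev + i * c_tilde             # new cell state
    o = sigmoid(pre("output"))               # which parts of the state to expose
    h = o * math.tanh(c)                     # new hidden state
    return h, c


# Hypothetical parameters: all zeros, so every gate sits at sigmoid(0) = 0.5.
params = {g: (0.0, 0.0, 0.0) for g in ("forget", "input", "candidate", "output")}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, params=params)
print(h, c)
```

In a real layer the same computation runs on vectors, with weight matrices in place of the scalar weights and element-wise products for the gating.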
DEMO
SENTIMENT ANALYSIS
TERRAFORM
WHERE DO WE GO FROM HERE?
INFRASTRUCTURE AS CODE
BUILDING A LANDSCAPE
• Abstracts resources and providers:
  • Physical hardware;
  • Virtual machines; and
  • Containers.
• Multi-Tier Applications
• Multi-Cloud Deployment
• Software Demos
DEMO
PUT IT ALL TOGETHER
WHERE DID I GET THIS STUFF FROM?
REFERENCES
• Efficient Estimation of Word Representations in Vector Space: Tomas Mikolov, Kai Chen, Greg Corrado, Jeffrey Dean. Google, 2013.
• A Neural Probabilistic Language Model: Yoshua Bengio, Réjean Ducharme, Pascal Vincent, Christian Jauvin. Université de Montréal, Montréal, Québec, Canada, 2003.
• Dropout: A Simple Way to Prevent Neural Networks from Overfitting: Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, Ruslan Salakhutdinov. University of Toronto, Toronto, Ontario, Canada, 2014.
• https://medium.com/cityai/deep-learning-for-natural-language-processing-part-i-8369895ffb98
• https://medium.com/cityai/deep-learning-for-natural-language-processing-part-ii-8b2b99b3fa1e
• https://medium.com/cityai/deep-learning-for-natural-language-processing-part-iii-96cfc6acfcc3
• http://www.wildml.com/2015/11/understanding-convolutional-neural-networks-for-nlp/
• https://github.com/ekholabs/DLinK
• https://github.com/ekholabs/automated_ml