Deep learning for natural language understanding

Dr. David Talby

CONTENTS
• NLP & THE PROMISE OF DEEP LEARNING
• IN ACTION: NAMED ENTITY RECOGNITION
• GOING TO PRODUCTION

AI VS. DOCTORS
Deep Learning, Computer Vision
Access to Care, Diagnostic Accuracy

NLP IN HEALTHCARE
Deep Learning, NLP
Efficiency, Accuracy
Radiology, Diagnostic, Mental Health, Safety Events, Inpatient, Pre-Auth, Key Opinion Leaders, Research, Meta Analysis, Clinical Coding, Financial, Anti-Fraud, Adverse Events, Drug Development, Recruit for Trials

Natural Language Understanding
is an AI-Complete problem.

HUMAN LANGUAGE IS CONTEXTUAL
ED Triage Notes
• states started last night, upper abd, took alka seltzer approx 0500, no relief. nausea no vomiting
• Since yeatreday 10/10 "constant Tylenol 1 hr ago. +nausea. diaphoretic. Mid abd radiates to back
• Generalized abd radiating to lower x 3 days accompanied by dark stools. Now with bloody stool this am. Denies dizzy, sob, fatigue. Visiting from Japan on business.”
Features
• Type of Pain
• Intensity of Pain
• Body part or region
• Symptoms
• Onset of symptoms
• Attempted home remedy

THE PROMISE OF DEEP LEARNING
Rules = get by with rules, search, RegEx, attribute extraction
NLP = welcome to the world of NLP, ML and DL

Social media
• Rules: Does this social media post contain an offensive word?
• NLP: Is this social media post offensive?
Legal
• Rules: Find patents with the terms ‘car’ and ‘battery’, or synonyms
• NLP: Who is patenting next-gen electrical car batteries?
Support
• Rules: Find products mentioned in customer emails or phone calls
• NLP: What is this customer complaining about?
Finance
• Rules: Extract the fee structure from a mutual fund prospectus
• NLP: Are UK pensions allowed to invest in this fund?
Healthcare
• Rules: Extract the patient’s blood pressure reading from a note
• NLP: Does this patient have high blood pressure?
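
The Healthcare example above contrasts extracting a blood pressure reading (a rules and RegEx task) with deciding whether the patient has high blood pressure. A minimal sketch of the first half is shown here. The pattern, the function name `extract_blood_pressure` and the sample note are assumptions made for illustration; real clinical notes vary far more than this pattern allows.

```python
import re

# Hypothetical pattern: a reading written as "152/94", preceded by "BP" or
# "blood pressure". This is an illustrative assumption, not a clinical standard.
BP_PATTERN = re.compile(
    r"\b(?:bp|blood pressure)\s*(?:is|of|:)?\s*(\d{2,3})\s*/\s*(\d{2,3})\b",
    re.IGNORECASE,
)

def extract_blood_pressure(note):
    """Return (systolic, diastolic) from the first match, or None if absent."""
    match = BP_PATTERN.search(note)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

# Made-up note for illustration only.
note = "Pt alert and oriented. BP 152/94, HR 88. Denies chest pain."
print(extract_blood_pressure(note))
```

The extraction step stops at the number. Answering whether the patient has high blood pressure depends on context the pattern never sees (history, medications, negations, repeated readings), which is where the NLP, ML and DL side of the slide comes in.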

NAMED ENTITY RECOGNITION
From Sutton & McCallum’s An Introduction to Conditional Random Fields.

FROM CRF TO DEEP LEARNING (AND BACK)
From Yves Peirsman’s Named Entity Recognition and the Road to Deep Learning
Starting point: “classic” machine learning approach
• CoNLL-2003 shared task dataset
• CRF++ implementation
• Feature engineering:
  • The token itself
  • Its bigram & trigram
  • Their prefix & suffix
  • Its part of speech
  • Its chunk type
  • Does it start with a capital?
  • Is it uppercase?
  • Is it a digit?
  • Surrounding context words
F-score: 81.15%
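
The feature list above can be sketched as a per-token feature function. This is an illustration only: CRF++ itself reads features from column files and templates rather than Python dicts, and the function name `token_features`, the 3-character prefix and suffix length, the reading of "bigram & trigram" as word n-grams, and the sample sentence with its tags are all assumptions.

```python
def token_features(tokens, pos_tags, chunk_tags, i):
    """Build a feature dict for tokens[i], loosely following the slide's list."""
    word = tokens[i]
    # Padding symbols for sentence boundaries (hypothetical names).
    prev_word = tokens[i - 1] if i > 0 else "<START>"
    next_word = tokens[i + 1] if i < len(tokens) - 1 else "<END>"
    return {
        "token": word.lower(),
        "bigram": prev_word.lower() + " " + word.lower(),
        "trigram": " ".join(w.lower() for w in (prev_word, word, next_word)),
        "prefix3": word[:3],
        "suffix3": word[-3:],
        "pos": pos_tags[i],
        "chunk": chunk_tags[i],
        "starts_with_capital": word[:1].isupper(),
        "is_uppercase": word.isupper(),
        "is_digit": word.isdigit(),
        "prev_word": prev_word.lower(),
        "next_word": next_word.lower(),
    }

# Illustrative sentence with assumed part-of-speech and chunk tags.
tokens = ["U.N.", "official", "Ekeus", "heads", "for", "Baghdad", "."]
pos_tags = ["NNP", "NN", "NNP", "VBZ", "IN", "NNP", "."]
chunk_tags = ["I-NP", "I-NP", "I-NP", "I-VP", "I-PP", "I-NP", "O"]

features = token_features(tokens, pos_tags, chunk_tags, 5)
print(features)
```

Every entry in the dict is a hand-written decision, which is the cost the later slides try to remove by letting the network learn its own representations.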

CRF + WORD EMBEDDINGS
Replacing curated dictionaries with embeddings to model semantic similarity
F-score: 84.9%
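
To make "model semantic similarity" concrete, here is a minimal sketch of cosine similarity between word vectors. The three-dimensional vectors are made up for illustration; real embeddings are learned from text and have far more dimensions.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical toy vectors, not taken from any trained model.
toy_embeddings = {
    "london": [0.9, 0.1, 0.0],
    "paris": [0.8, 0.2, 0.1],
    "tuesday": [0.0, 0.1, 0.9],
}

city_pair = cosine_similarity(toy_embeddings["london"], toy_embeddings["paris"])
mixed_pair = cosine_similarity(toy_embeddings["london"], toy_embeddings["tuesday"])
print(city_pair > mixed_pair)
```

With trained embeddings, words that behave alike end up close together, so a model can treat an unseen city name like the ones it saw in training without a curated dictionary of locations.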

FORGET CRF. LET’S USE AN LSTM NETWORK
An LSTM is a type of RNN, well suited for sequential data with long-term dependencies
LSTM F-score: 64.9%
biLSTM F-score: 76.1%

TRANSFER LEARNING: USE PRETRAINED EMBEDDINGS
Reuse embeddings pretrained on Wikipedia instead of training them on CoNLL, which only has 200,000 words
F-score: 85.9%

ADD CHARACTER BASED MODEL: BI-LSTM OR CNN
In addition to the token-based model, add a character-based biLSTM or CNN to learn and model word prefixes and suffixes
F-score: 89.3%

LET’S GET OVER 90% - BRING BACK THE CRF!
Predicting all labels independently of each other, without taking into account the labels predicted for the surrounding words, leaves some accuracy on the table
F-score: 90.3%
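
The gap between independent and joint label prediction can be sketched with Viterbi decoding over per-token scores plus label-to-label transition scores. This is a sketch under stated assumptions, not the model from the talk: the scores are hand-set toy numbers, whereas a CRF layer would learn the transitions during training, and the function name `viterbi` is illustrative.

```python
def viterbi(emissions, transitions, labels):
    """Return the highest-scoring label sequence.

    emissions[t][y]   : score of label y at position t (e.g. from a biLSTM)
    transitions[a][b] : score of moving from label a to label b
    """
    # best[t][y] = best score of any path ending in label y at position t
    best = [{y: emissions[0][y] for y in labels}]
    back = []
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for y in labels:
            prev = max(labels, key=lambda p: best[-1][p] + transitions[p][y])
            scores[y] = best[-1][prev] + transitions[prev][y] + emissions[t][y]
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final label.
    last = max(labels, key=lambda y: best[-1][y])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

labels = ["O", "B-PER", "I-PER"]
# Toy transition scores: everything neutral except O -> I-PER, which is
# penalised because an entity cannot continue without having started.
transitions = {a: {b: 0.0 for b in labels} for a in labels}
transitions["O"]["I-PER"] = -10.0
# Toy per-token scores for a two-token sentence.
emissions = [
    {"O": 1.0, "B-PER": 0.8, "I-PER": 0.0},
    {"O": 0.2, "B-PER": 0.1, "I-PER": 1.0},
]

independent = [max(labels, key=lambda y: e[y]) for e in emissions]
joint = viterbi(emissions, transitions, labels)
print(independent, joint)
```

Picking the best label per token yields the invalid sequence O, I-PER, while joint decoding prefers B-PER, I-PER because the transition penalty outweighs the small gain from labelling the first token O.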

In deep learning, architecture engineering
is the new feature engineering.
Stephen Merity

Moving from research to production?
• Business Case
• All four roles on the team: Data Curation, Data Science, Data Engineering, Data Operations

Get the data
• “Inception v3 was trained on 1.28 million images”
• “used over 120,000 retinal images to train a neural network to detect diabetic retinopathy”
Get expert labels
• “In the study, the algorithm went head-to-head against 21 board-certified dermatologists”
• “All images were graded by 3 to 7 different ophthalmologists, from a panel of 54 US-licensed senior residents & ophthalmologists”
Get pretrained datasets & embeddings
• Facebook open sourced pre-trained word vectors for 294 languages, trained on Wikipedia using fastText
• UMLS has over 1 million biomedical concepts and 5 million concept names, from over 100 controlled vocabularies

Read up on state-of-the-art, domain-specific research
• “How to Train Good Word Embeddings for Biomedical NLP”. Chiu et al., In Proceedings of BioNLP’16, August 2016.
• “Entity Recognition from Clinical Texts via Recurrent Neural Network”. Liu et al., BMC Medical Informatics & Decision Making, July 2017.
Are your ML/DL/NLP libraries research or industrial grade?

High Performance Natural Language Understanding at Scale
DeepLearning4j, Spark-NLP
APIs: Data Sources API; Spark Core API (RDDs, Project Tungsten); Spark SQL API (DataFrame, Catalyst Optimizer); Spark ML API (Pipeline, Transformer, Estimator)
Components: Part of Speech Tagger, Named Entity Recognition, Sentiment Analysis, Spell Checker, Tokenizer, Stemmer, Lemmatizer, Entity Extraction, Topic Modeling, Word2Vec, TF-IDF, String distance calculation, N-grams calculation, Stop word removal, Train/Test & Cross-Validate, Ensembles

From: Post by Ben Lorica

david@pacific.ai
@davidtalby
in/davidtalby
THANK YOU!
