SlideShare a Scribd company logo
1 of 45
Download to read offline
Making natural language processing
robust to sociolinguistic variation
Jacob Eisenstein
@jacobeisenstein
Georgia Institute of Technology
September 9, 2017
Machine reading
From text to structured
representations.
Annotate
and train
Machine reading
From text to structured
representations.
Annotate
and train
Machine reading
From text to structured
representations.
New domains of digitized
texts offer opportunities as
well as challenges.
Language data then and now
Then: news text, small set
of authors, professionally
edited, fixed style
Language data then and now
Then: news text, small set
of authors, professionally
edited, fixed style
Now: open domain,
everyone is an author,
unedited, many styles
Social media has forced
NLP to confront the
challenge of missing
social context
(Eisenstein, 2013):
(Gimpel et al., 2011)
(Ritter et al., 2011)
(Foster et al., 2011)
Social media has forced
NLP to confront the
challenge of missing
social context
(Eisenstein, 2013):
tacit assumptions
about audience
knowledge
language variation
across social groups
(Gimpel et al., 2011)
(Ritter et al., 2011)
(Foster et al., 2011)
Social media has forced
NLP to confront the
challenge of missing
social context
(Eisenstein, 2013):
tacit assumptions
about audience
knowledge
language variation
across social groups
(Gimpel et al., 2011)
(Ritter et al., 2011)
(Foster et al., 2011)
Finding tacit context in the social network
Social media texts lack
context, because it is
implicit between the
writer and the reader.
Homophily: socially
connected individuals tend
to share traits.
Assortativity of entity references
We project embeddings for entities, words, and
authors into a shared semantic space.
“Dirk Novitsky”
“the warriors”
Inner products in this space indicate compatibility.
Socially-Infused	En,ty	Linking
47
Socially-Infused	En,ty	Linking
47
tweet
en,ty	assignments
author
Socially-Infused	En,ty	Linking
47
tweet
en,ty	assignments
author
‣						is	employed	to	model	surface	features.g1
Socially-Infused	En,ty	Linking
47
tweet
en,ty	assignments
author
‣						is	used	to	capture	two	assump,ons:
‣	En,ty	homophily
‣						is	employed	to	model	surface	features.
‣	Seman,cally	related	men,ons	tend	to	refer	similar	en,,es
g1
g2
Socially-Infused	En,ty	Linking
48
g2(x, yt, u, t; ⇥2) = v(u)
u
>
W(u,e)
v(e)
yt
+ v
(m)
t
>
W(m,e)
v(e)
yt
author	embedding men,on	embedding
v(u)
u
v(e)
yt
v(e)
yt v
(m)
t
g2
(x, yt, t)
g1
en,ty	embedding
Loss-augmented	
training
Socially-Infused	En,ty	Linking
48
g2(x, yt, u, t; ⇥2) = v(u)
u
>
W(u,e)
v(e)
yt
+ v
(m)
t
>
W(m,e)
v(e)
yt
author	embedding men,on	embedding
v(u)
u
v(e)
yt
v(e)
yt v
(m)
t
g2
(x, yt, t)
g1
en,ty	embedding
Learning
49
Learning
49
‣	Loss-augmented	inference:
Learning
49
‣	Loss-augmented	inference: hamming	loss
Learning
49
‣	Loss-augmented	inference:
‣	Op,miza,on:	stochas,c	gradient	descent
hamming	loss
Inference
50
‣	Non-overlapping	structure
In	order	to	link	‘Red	Sox’	to	a	real	en,ty,	‘Red’	and	‘Sox’	
should	be	linked	to	Nil.
Classifier Struct Struct+Social S-MART
64
66
68
70
72
74
76
78
F1
Dataset
NEEL
TACL
+3.2
+2.0
Structure prediction improves accuracy.
Social context yields further improvements.
S-MART is the prior state-of-the-art
(Yang & Chang, 2015).
Social media has forced
NLP to confront the
challenge of missing
social context
(Eisenstein, 2013):
tacit assumptions
about audience
knowledge
language variation
across social groups
(Gimpel et al., 2011)
(Ritter et al., 2011)
(Foster et al., 2011)
Social media has forced
NLP to confront the
challenge of missing
social context
(Eisenstein, 2013):
tacit assumptions
about audience
knowledge
language variation
across social groups
(Gimpel et al., 2011)
(Ritter et al., 2011)
(Foster et al., 2011)
Language variation: a challenge for NLP
“I would like to believe he’s
sick rather than just mean
and evil.”
Language variation: a challenge for NLP
“I would like to believe he’s
sick rather than just mean
and evil.”
“You could’ve been getting
down to this sick beat.”
(Yang & Eisenstein, 2017)
Personalization by ensemble
Goal: personalized conditional likelihood,
P(y | x, a), where a is the author.
Problem: We have labeled examples for only a
few authors.
Personalization by ensemble
Goal: personalized conditional likelihood,
P(y | x, a), where a is the author.
Problem: We have labeled examples for only a
few authors.
Personalization ensemble
P(y | x, a) =
k
Pk(y | x)πa(k)
Pk(y | x) is a basis model
πa(·) are the ensemble weights for author a
Homophily to the rescue?
Sick!
Sick!
Sick!Sick!
Labeled
data
Unlabeled
data
Are language styles assortative on the social
network?
Evidence for linguistic homophily
Pilot study: is classifier accuracy assortative on the
Twitter social network?
assort(G) =
1
#|G|
(i,j)∈G
δ(yi = ˆyi)δ(yj = ˆyj)
+ δ(yi = ˆyi)δ(yj = ˆyj)
Evidence for linguistic homophily
Pilot study: is classifier accuracy assortative on the
Twitter social network?
assort(G) =
1
#|G|
(i,j)∈G
δ(yi = ˆyi)δ(yj = ˆyj)
+ δ(yi = ˆyi)δ(yj = ˆyj)
0 20 40 60 80 100
rewiring epochs
0.700
0.705
0.710
0.715
0.720
0.725
0.730
0.735
assortativity
follow
0 20 40 60 80 100
rewiring epochs
mention
0 20 40 60 80 100
rewiring epochs
retweet
original network
random rewiring
Network-driven personalization
For each author, estimate
a node embedding
ea (Tang et al., 2015).
Nodes who share
neighbors get similar
embeddings.
πa =SoftMax(f (ea))
P(y | x, a) =
K
k=1
Pk(y | x)πa(k)
Results
Mixture of Experts NLSE Social Personalization
0.0
0.5
1.0
1.5
2.0
2.5
3.0F1improvementoverConvNet
+0.10
+1.90
+2.80
Twitter Sentiment Analysis
Improvements over ConvNet baseline:
+2.8% on Twitter Sentiment Analysis
+2.7% on Ciao Product Reviews
NLSE is prior state-of-the-art (Astudillo et al., 2015).
Variable sentiment words
More positive More negative
1 banging loss fever broken
fucking
dear like god yeah wow
2 chilling cold ill sick suck satisfy trust wealth strong
lmao
3 ass damn piss bitch shit talent honestly voting win
clever
4 insane bawling fever weird cry lmao super lol haha hahaha
5 ruin silly bad boring dreadful lovatics wish beliebers ariana-
tors kendall
Summary
Robustness is a key challenge for making NLP effective on
social media data:
Tacit assumptions about shared knowledge; language
variation
Social metadata gives NLP systems the flexibility to
handle each author differently.
Summary
Robustness is a key challenge for making NLP effective on
social media data:
Tacit assumptions about shared knowledge; language
variation
Social metadata gives NLP systems the flexibility to
handle each author differently.
The long tail of rare events is the other big challenge.
Word embeddings for unseen words (Pinter et al., 2017)
Lexicon-based supervision (Eisenstein, 2017)
Applications to finding rare events in electronic health
records (ongoing work with Jimeng Sun)
Acknowledgments
Students and collaborators:
Yi Yang (GT → Bloomberg)
Mingwei Chang (Google Research)
See https://gtnlp.wordpress.com/ for more!
Funding: National Science Foundation,
National Institutes for Health, Georgia Tech
References I
Astudillo, R. F., Amir, S., Lin, W., Silva, M., & Trancoso, I. (2015). Learning word representations from scarce and
noisy data with embedding sub-spaces. In Proceedings of the Association for Computational Linguistics
(ACL), Beijing.
Eisenstein, J. (2013). What to do about bad language on the internet. In Proceedings of the North American
Chapter of the Association for Computational Linguistics (NAACL), (pp. 359–369).
Eisenstein, J. (2017). Unsupervised learning for lexicon-based classification. In Proceedings of the National
Conference on Artificial Intelligence (AAAI), San Francisco.
Foster, J., Cetinoglu, O., Wagner, J., Le Roux, J., Nivre, J., Hogan, D., & van Genabith, J. (2011). From news to
comment: Resources and benchmarks for parsing the language of web 2.0. In Proceedings of the International
Joint Conference on Natural Language Processing (IJCNLP), (pp. 893–901)., Chiang Mai, Thailand. Asian
Federation of Natural Language Processing.
Gimpel, K., Schneider, N., O’Connor, B., Das, D., Mills, D., Eisenstein, J., Heilman, M., Yogatama, D., Flanigan,
J., & Smith, N. A. (2011). Part-of-speech tagging for Twitter: annotation, features, and experiments. In
Proceedings of the Association for Computational Linguistics (ACL), (pp. 42–47)., Portland, OR.
Pinter, Y., Guthrie, R., & Eisenstein, J. (2017). Mimicking word embeddings using subword rnns. In Proceedings of
Empirical Methods for Natural Language Processing (EMNLP).
Ritter, A., Clark, S., Mausam, & Etzioni, O. (2011). Named entity recognition in tweets: an experimental study. In
Proceedings of EMNLP.
Tang, J., Qu, M., Wang, M., Zhang, M., Yan, J., & Mei, Q. (2015). Line: Large-scale information network
embedding. In Proceedings of the Conference on World-Wide Web (WWW), (pp. 1067–1077).
Yang, Y. & Chang, M.-W. (2015). S-mart: Novel tree-based structured learning algorithms applied to tweet entity
linking. In Proceedings of the Association for Computational Linguistics (ACL), (pp. 504–513)., Beijing.
Yang, Y. & Eisenstein, J. (2017). Overcoming language variation in sentiment analysis with social attention.
Transactions of the Association for Computational Linguistics (TACL), in press.

More Related Content

Viewers also liked

Aran Khanna, Software Engineer, Amazon Web Services at MLconf ATL 2017
Aran Khanna, Software Engineer, Amazon Web Services at MLconf ATL 2017Aran Khanna, Software Engineer, Amazon Web Services at MLconf ATL 2017
Aran Khanna, Software Engineer, Amazon Web Services at MLconf ATL 2017MLconf
 
Ryan West, Machine Learning Engineer, Nexosis at MLconf ATL 2017
Ryan West, Machine Learning Engineer, Nexosis at MLconf ATL 2017Ryan West, Machine Learning Engineer, Nexosis at MLconf ATL 2017
Ryan West, Machine Learning Engineer, Nexosis at MLconf ATL 2017MLconf
 
LN Renganarayana, Architect, ML Platform and Services and Madhura Dudhgaonkar...
LN Renganarayana, Architect, ML Platform and Services and Madhura Dudhgaonkar...LN Renganarayana, Architect, ML Platform and Services and Madhura Dudhgaonkar...
LN Renganarayana, Architect, ML Platform and Services and Madhura Dudhgaonkar...MLconf
 
Talha Obaid, Email Security, Symantec at MLconf ATL 2017
Talha Obaid, Email Security, Symantec at MLconf ATL 2017Talha Obaid, Email Security, Symantec at MLconf ATL 2017
Talha Obaid, Email Security, Symantec at MLconf ATL 2017MLconf
 
Ashrith Barthur, Security Scientist, H2o.ai, at MLconf 2017
Ashrith Barthur, Security Scientist, H2o.ai, at MLconf 2017Ashrith Barthur, Security Scientist, H2o.ai, at MLconf 2017
Ashrith Barthur, Security Scientist, H2o.ai, at MLconf 2017MLconf
 
Alexandra Johnson, Software Engineer, SigOpt at MLconf ATL 2017
Alexandra Johnson, Software Engineer, SigOpt at MLconf ATL 2017Alexandra Johnson, Software Engineer, SigOpt at MLconf ATL 2017
Alexandra Johnson, Software Engineer, SigOpt at MLconf ATL 2017MLconf
 
Will Murphy, VP of Business Development & Co-Founder, Talla at The AI Confere...
Will Murphy, VP of Business Development & Co-Founder, Talla at The AI Confere...Will Murphy, VP of Business Development & Co-Founder, Talla at The AI Confere...
Will Murphy, VP of Business Development & Co-Founder, Talla at The AI Confere...MLconf
 
Rahul Mehrotra, Product Manager, Maluuba at The AI Conference 2017
Rahul Mehrotra, Product Manager, Maluuba at The AI Conference 2017Rahul Mehrotra, Product Manager, Maluuba at The AI Conference 2017
Rahul Mehrotra, Product Manager, Maluuba at The AI Conference 2017MLconf
 
Tim Chartier, Chief Academic Officer, Tresata at MLconf ATL 2017
Tim Chartier, Chief Academic Officer, Tresata at MLconf ATL 2017Tim Chartier, Chief Academic Officer, Tresata at MLconf ATL 2017
Tim Chartier, Chief Academic Officer, Tresata at MLconf ATL 2017MLconf
 
Jessica Rudd, PhD Student, Analytics and Data Science, Kennesaw State Univers...
Jessica Rudd, PhD Student, Analytics and Data Science, Kennesaw State Univers...Jessica Rudd, PhD Student, Analytics and Data Science, Kennesaw State Univers...
Jessica Rudd, PhD Student, Analytics and Data Science, Kennesaw State Univers...MLconf
 
Dr. Bryce Meredig, Chief Science Officer, Citrine at The AI Conference
Dr. Bryce Meredig, Chief Science Officer, Citrine at The AI Conference Dr. Bryce Meredig, Chief Science Officer, Citrine at The AI Conference
Dr. Bryce Meredig, Chief Science Officer, Citrine at The AI Conference MLconf
 
Malika Cantor, Operations Partner, Comet Labs at The AI Conference 2017
Malika Cantor, Operations Partner, Comet Labs at The AI Conference 2017Malika Cantor, Operations Partner, Comet Labs at The AI Conference 2017
Malika Cantor, Operations Partner, Comet Labs at The AI Conference 2017MLconf
 
Claudia Perlich, Chief Scientist, Dstillery
Claudia Perlich, Chief Scientist, Dstillery Claudia Perlich, Chief Scientist, Dstillery
Claudia Perlich, Chief Scientist, Dstillery MLconf
 
Mukund Narasimhan, Engineer, Pinterest at MLconf Seattle 2017
Mukund Narasimhan, Engineer, Pinterest at MLconf Seattle 2017Mukund Narasimhan, Engineer, Pinterest at MLconf Seattle 2017
Mukund Narasimhan, Engineer, Pinterest at MLconf Seattle 2017MLconf
 
Erik Bernhardsson, CTO, Better Mortgage
Erik Bernhardsson, CTO, Better MortgageErik Bernhardsson, CTO, Better Mortgage
Erik Bernhardsson, CTO, Better MortgageMLconf
 
Yuri M. Brovman, Data Scientist, eBay
Yuri M. Brovman, Data Scientist, eBayYuri M. Brovman, Data Scientist, eBay
Yuri M. Brovman, Data Scientist, eBayMLconf
 
Byron Galbraith, Chief Data Scientist, Talla, at MLconf NYC 2017
Byron Galbraith, Chief Data Scientist, Talla, at MLconf NYC 2017 Byron Galbraith, Chief Data Scientist, Talla, at MLconf NYC 2017
Byron Galbraith, Chief Data Scientist, Talla, at MLconf NYC 2017 MLconf
 
Garrett Goh, Scientist, Pacific Northwest National Lab
Garrett Goh, Scientist, Pacific Northwest National Lab Garrett Goh, Scientist, Pacific Northwest National Lab
Garrett Goh, Scientist, Pacific Northwest National Lab MLconf
 

Viewers also liked (18)

Aran Khanna, Software Engineer, Amazon Web Services at MLconf ATL 2017
Aran Khanna, Software Engineer, Amazon Web Services at MLconf ATL 2017Aran Khanna, Software Engineer, Amazon Web Services at MLconf ATL 2017
Aran Khanna, Software Engineer, Amazon Web Services at MLconf ATL 2017
 
Ryan West, Machine Learning Engineer, Nexosis at MLconf ATL 2017
Ryan West, Machine Learning Engineer, Nexosis at MLconf ATL 2017Ryan West, Machine Learning Engineer, Nexosis at MLconf ATL 2017
Ryan West, Machine Learning Engineer, Nexosis at MLconf ATL 2017
 
LN Renganarayana, Architect, ML Platform and Services and Madhura Dudhgaonkar...
LN Renganarayana, Architect, ML Platform and Services and Madhura Dudhgaonkar...LN Renganarayana, Architect, ML Platform and Services and Madhura Dudhgaonkar...
LN Renganarayana, Architect, ML Platform and Services and Madhura Dudhgaonkar...
 
Talha Obaid, Email Security, Symantec at MLconf ATL 2017
Talha Obaid, Email Security, Symantec at MLconf ATL 2017Talha Obaid, Email Security, Symantec at MLconf ATL 2017
Talha Obaid, Email Security, Symantec at MLconf ATL 2017
 
Ashrith Barthur, Security Scientist, H2o.ai, at MLconf 2017
Ashrith Barthur, Security Scientist, H2o.ai, at MLconf 2017Ashrith Barthur, Security Scientist, H2o.ai, at MLconf 2017
Ashrith Barthur, Security Scientist, H2o.ai, at MLconf 2017
 
Alexandra Johnson, Software Engineer, SigOpt at MLconf ATL 2017
Alexandra Johnson, Software Engineer, SigOpt at MLconf ATL 2017Alexandra Johnson, Software Engineer, SigOpt at MLconf ATL 2017
Alexandra Johnson, Software Engineer, SigOpt at MLconf ATL 2017
 
Will Murphy, VP of Business Development & Co-Founder, Talla at The AI Confere...
Will Murphy, VP of Business Development & Co-Founder, Talla at The AI Confere...Will Murphy, VP of Business Development & Co-Founder, Talla at The AI Confere...
Will Murphy, VP of Business Development & Co-Founder, Talla at The AI Confere...
 
Rahul Mehrotra, Product Manager, Maluuba at The AI Conference 2017
Rahul Mehrotra, Product Manager, Maluuba at The AI Conference 2017Rahul Mehrotra, Product Manager, Maluuba at The AI Conference 2017
Rahul Mehrotra, Product Manager, Maluuba at The AI Conference 2017
 
Tim Chartier, Chief Academic Officer, Tresata at MLconf ATL 2017
Tim Chartier, Chief Academic Officer, Tresata at MLconf ATL 2017Tim Chartier, Chief Academic Officer, Tresata at MLconf ATL 2017
Tim Chartier, Chief Academic Officer, Tresata at MLconf ATL 2017
 
Jessica Rudd, PhD Student, Analytics and Data Science, Kennesaw State Univers...
Jessica Rudd, PhD Student, Analytics and Data Science, Kennesaw State Univers...Jessica Rudd, PhD Student, Analytics and Data Science, Kennesaw State Univers...
Jessica Rudd, PhD Student, Analytics and Data Science, Kennesaw State Univers...
 
Dr. Bryce Meredig, Chief Science Officer, Citrine at The AI Conference
Dr. Bryce Meredig, Chief Science Officer, Citrine at The AI Conference Dr. Bryce Meredig, Chief Science Officer, Citrine at The AI Conference
Dr. Bryce Meredig, Chief Science Officer, Citrine at The AI Conference
 
Malika Cantor, Operations Partner, Comet Labs at The AI Conference 2017
Malika Cantor, Operations Partner, Comet Labs at The AI Conference 2017Malika Cantor, Operations Partner, Comet Labs at The AI Conference 2017
Malika Cantor, Operations Partner, Comet Labs at The AI Conference 2017
 
Claudia Perlich, Chief Scientist, Dstillery
Claudia Perlich, Chief Scientist, Dstillery Claudia Perlich, Chief Scientist, Dstillery
Claudia Perlich, Chief Scientist, Dstillery
 
Mukund Narasimhan, Engineer, Pinterest at MLconf Seattle 2017
Mukund Narasimhan, Engineer, Pinterest at MLconf Seattle 2017Mukund Narasimhan, Engineer, Pinterest at MLconf Seattle 2017
Mukund Narasimhan, Engineer, Pinterest at MLconf Seattle 2017
 
Erik Bernhardsson, CTO, Better Mortgage
Erik Bernhardsson, CTO, Better MortgageErik Bernhardsson, CTO, Better Mortgage
Erik Bernhardsson, CTO, Better Mortgage
 
Yuri M. Brovman, Data Scientist, eBay
Yuri M. Brovman, Data Scientist, eBayYuri M. Brovman, Data Scientist, eBay
Yuri M. Brovman, Data Scientist, eBay
 
Byron Galbraith, Chief Data Scientist, Talla, at MLconf NYC 2017
Byron Galbraith, Chief Data Scientist, Talla, at MLconf NYC 2017 Byron Galbraith, Chief Data Scientist, Talla, at MLconf NYC 2017
Byron Galbraith, Chief Data Scientist, Talla, at MLconf NYC 2017
 
Garrett Goh, Scientist, Pacific Northwest National Lab
Garrett Goh, Scientist, Pacific Northwest National Lab Garrett Goh, Scientist, Pacific Northwest National Lab
Garrett Goh, Scientist, Pacific Northwest National Lab
 

Similar to Jacob Eisenstein, Assistant Professor, School of Interactive Computing, Georgia Institute of Technology at MLconf ATL 2017

Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processingpunedevscom
 
Wei Xu - Innovative Applications of AI Panel
Wei Xu - Innovative Applications of AI PanelWei Xu - Innovative Applications of AI Panel
Wei Xu - Innovative Applications of AI PanelRehgan Avon
 
___ __ Newlanguage evolution ___ BernabeuatLangUE.pdf
___ __ Newlanguage evolution ___  BernabeuatLangUE.pdf___ __ Newlanguage evolution ___  BernabeuatLangUE.pdf
___ __ Newlanguage evolution ___ BernabeuatLangUE.pdftkobelt
 
2022 AAAI DSTC10 Invited Talk
2022 AAAI DSTC10 Invited Talk2022 AAAI DSTC10 Invited Talk
2022 AAAI DSTC10 Invited TalkVerena Rieser
 
New Frontiers in IA: Design in the Era of Cognitive Computing
New Frontiers in IA: Design in the Era of Cognitive ComputingNew Frontiers in IA: Design in the Era of Cognitive Computing
New Frontiers in IA: Design in the Era of Cognitive ComputingPaul King
 
Natural Language Processing for Games Research
Natural Language Processing for Games ResearchNatural Language Processing for Games Research
Natural Language Processing for Games ResearchJose Zagal
 
Grammarly AI-NLP Club #1 - Domain and Social Bias in NLP: Case Study in Langu...
Grammarly AI-NLP Club #1 - Domain and Social Bias in NLP: Case Study in Langu...Grammarly AI-NLP Club #1 - Domain and Social Bias in NLP: Case Study in Langu...
Grammarly AI-NLP Club #1 - Domain and Social Bias in NLP: Case Study in Langu...Grammarly
 
Speaker Profession Xiomara Mejia, Melanie Sanoff, Claudia Le.docx
Speaker Profession Xiomara Mejia, Melanie Sanoff, Claudia Le.docxSpeaker Profession Xiomara Mejia, Melanie Sanoff, Claudia Le.docx
Speaker Profession Xiomara Mejia, Melanie Sanoff, Claudia Le.docxwilliame8
 
Speaker Profession Xiomara Mejia, Melanie Sanoff, Claudia Le.docx
Speaker Profession Xiomara Mejia, Melanie Sanoff, Claudia Le.docxSpeaker Profession Xiomara Mejia, Melanie Sanoff, Claudia Le.docx
Speaker Profession Xiomara Mejia, Melanie Sanoff, Claudia Le.docxrafbolet0
 
The Social Impact of NLP
The Social Impact of NLPThe Social Impact of NLP
The Social Impact of NLPantonellarose
 
Questions On Natural Language Processing
Questions On Natural Language ProcessingQuestions On Natural Language Processing
Questions On Natural Language ProcessingAdriana Wilson
 
Can you tell if they're learning? ICALT 7 July 2015
Can you tell if they're learning? ICALT 7 July 2015Can you tell if they're learning? ICALT 7 July 2015
Can you tell if they're learning? ICALT 7 July 2015studywbv
 
From NLP to NLU: Why we need varied, comprehensive, and stratified knowledge,...
From NLP to NLU: Why we need varied, comprehensive, and stratified knowledge,...From NLP to NLU: Why we need varied, comprehensive, and stratified knowledge,...
From NLP to NLU: Why we need varied, comprehensive, and stratified knowledge,...Amit Sheth
 
Communication between open source developers
Communication between open source developersCommunication between open source developers
Communication between open source developersAlexander Serebrenik
 
Unlocking the Power of Social Chatter; Recent Endeavors @ Netflix | Wrangle C...
Unlocking the Power of Social Chatter; Recent Endeavors @ Netflix | Wrangle C...Unlocking the Power of Social Chatter; Recent Endeavors @ Netflix | Wrangle C...
Unlocking the Power of Social Chatter; Recent Endeavors @ Netflix | Wrangle C...Cloudera, Inc.
 
OpenmHealth Overview
OpenmHealth OverviewOpenmHealth Overview
OpenmHealth OverviewOpen mHealth
 
Edet 637 Dual Coding Theory
Edet 637 Dual Coding TheoryEdet 637 Dual Coding Theory
Edet 637 Dual Coding Theoryguestb8ed61
 
Meta design and social creativity
Meta design and social creativityMeta design and social creativity
Meta design and social creativityJohn Thomas
 

Similar to Jacob Eisenstein, Assistant Professor, School of Interactive Computing, Georgia Institute of Technology at MLconf ATL 2017 (20)

Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Wei Xu - Innovative Applications of AI Panel
Wei Xu - Innovative Applications of AI PanelWei Xu - Innovative Applications of AI Panel
Wei Xu - Innovative Applications of AI Panel
 
___ __ Newlanguage evolution ___ BernabeuatLangUE.pdf
___ __ Newlanguage evolution ___  BernabeuatLangUE.pdf___ __ Newlanguage evolution ___  BernabeuatLangUE.pdf
___ __ Newlanguage evolution ___ BernabeuatLangUE.pdf
 
2022 AAAI DSTC10 Invited Talk
2022 AAAI DSTC10 Invited Talk2022 AAAI DSTC10 Invited Talk
2022 AAAI DSTC10 Invited Talk
 
New Frontiers in IA: Design in the Era of Cognitive Computing
New Frontiers in IA: Design in the Era of Cognitive ComputingNew Frontiers in IA: Design in the Era of Cognitive Computing
New Frontiers in IA: Design in the Era of Cognitive Computing
 
Natural Language Processing for Games Research
Natural Language Processing for Games ResearchNatural Language Processing for Games Research
Natural Language Processing for Games Research
 
Grammarly AI-NLP Club #1 - Domain and Social Bias in NLP: Case Study in Langu...
Grammarly AI-NLP Club #1 - Domain and Social Bias in NLP: Case Study in Langu...Grammarly AI-NLP Club #1 - Domain and Social Bias in NLP: Case Study in Langu...
Grammarly AI-NLP Club #1 - Domain and Social Bias in NLP: Case Study in Langu...
 
Speaker Profession Xiomara Mejia, Melanie Sanoff, Claudia Le.docx
Speaker Profession Xiomara Mejia, Melanie Sanoff, Claudia Le.docxSpeaker Profession Xiomara Mejia, Melanie Sanoff, Claudia Le.docx
Speaker Profession Xiomara Mejia, Melanie Sanoff, Claudia Le.docx
 
Speaker Profession Xiomara Mejia, Melanie Sanoff, Claudia Le.docx
Speaker Profession Xiomara Mejia, Melanie Sanoff, Claudia Le.docxSpeaker Profession Xiomara Mejia, Melanie Sanoff, Claudia Le.docx
Speaker Profession Xiomara Mejia, Melanie Sanoff, Claudia Le.docx
 
The Social Impact of NLP
The Social Impact of NLPThe Social Impact of NLP
The Social Impact of NLP
 
Why is My Team Failing? (By Christine Loch)
Why is My Team Failing? (By Christine Loch)Why is My Team Failing? (By Christine Loch)
Why is My Team Failing? (By Christine Loch)
 
Questions On Natural Language Processing
Questions On Natural Language ProcessingQuestions On Natural Language Processing
Questions On Natural Language Processing
 
Can you tell if they're learning? ICALT 7 July 2015
Can you tell if they're learning? ICALT 7 July 2015Can you tell if they're learning? ICALT 7 July 2015
Can you tell if they're learning? ICALT 7 July 2015
 
From NLP to NLU: Why we need varied, comprehensive, and stratified knowledge,...
From NLP to NLU: Why we need varied, comprehensive, and stratified knowledge,...From NLP to NLU: Why we need varied, comprehensive, and stratified knowledge,...
From NLP to NLU: Why we need varied, comprehensive, and stratified knowledge,...
 
Communication between open source developers
Communication between open source developersCommunication between open source developers
Communication between open source developers
 
08 09.4.what is-discourse-2
08 09.4.what is-discourse-208 09.4.what is-discourse-2
08 09.4.what is-discourse-2
 
Unlocking the Power of Social Chatter; Recent Endeavors @ Netflix | Wrangle C...
Unlocking the Power of Social Chatter; Recent Endeavors @ Netflix | Wrangle C...Unlocking the Power of Social Chatter; Recent Endeavors @ Netflix | Wrangle C...
Unlocking the Power of Social Chatter; Recent Endeavors @ Netflix | Wrangle C...
 
OpenmHealth Overview
OpenmHealth OverviewOpenmHealth Overview
OpenmHealth Overview
 
Edet 637 Dual Coding Theory
Edet 637 Dual Coding TheoryEdet 637 Dual Coding Theory
Edet 637 Dual Coding Theory
 
Meta design and social creativity
Meta design and social creativityMeta design and social creativity
Meta design and social creativity
 

More from MLconf

Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...MLconf
 
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingTed Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingMLconf
 
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...MLconf
 
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushIgor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushMLconf
 
Josh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceJosh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceMLconf
 
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...MLconf
 
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...MLconf
 
Meghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMeghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMLconf
 
Noam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionNoam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionMLconf
 
June Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of MLJune Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of MLMLconf
 
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksSneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksMLconf
 
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...MLconf
 
Vito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldVito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldMLconf
 
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...MLconf
 
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...MLconf
 
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...MLconf
 
Neel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeNeel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeMLconf
 
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...MLconf
 
Soumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareSoumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareMLconf
 
Roy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesRoy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesMLconf
 

More from MLconf (20)

Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
 
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingTed Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
 
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
 
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushIgor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
 
Josh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceJosh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious Experience
 
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
 
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
 
Meghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMeghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the Cheap
 
Noam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionNoam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data Collection
 
June Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of MLJune Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of ML
 
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksSneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
 
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
 
Vito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldVito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI World
 
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
 
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
 
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
 
Neel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeNeel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to code
 
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
 
Soumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareSoumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better Software
 
Roy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesRoy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime Changes
 

Recently uploaded

Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Scott Andery
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesThousandEyes
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...AliaaTarek5
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 

Recently uploaded (20)

Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 

Jacob Eisenstein, Assistant Professor, School of Interactive Computing, Georgia Institute of Technology at MLconf ATL 2017

  • 1. Making natural language processing robust to sociolinguistic variation Jacob Eisenstein @jacobeisenstein Georgia Institute of Technology September 9, 2017
  • 2. Machine reading From text to structured representations.
  • 3. Annotate and train Machine reading From text to structured representations.
  • 4. Annotate and train Machine reading From text to structured representations. New domains of digitized texts offer opportunities as well as challenges.
  • 5. Language data then and now Then: news text, small set of authors, professionally edited, fixed style
  • 6. Language data then and now Then: news text, small set of authors, professionally edited, fixed style Now: open domain, everyone is an author, unedited, many styles
  • 7. Social media has forced NLP to confront the challenge of missing social context (Eisenstein, 2013): (Gimpel et al., 2011) (Ritter et al., 2011) (Foster et al., 2011)
  • 8. Social media has forced NLP to confront the challenge of missing social context (Eisenstein, 2013): tacit assumptions about audience knowledge language variation across social groups (Gimpel et al., 2011) (Ritter et al., 2011) (Foster et al., 2011)
  • 9. Social media has forced NLP to confront the challenge of missing social context (Eisenstein, 2013): tacit assumptions about audience knowledge language variation across social groups (Gimpel et al., 2011) (Ritter et al., 2011) (Foster et al., 2011)
  • 10.
  • 11.
  • 12. Finding tacit context in the social network Social media texts lack context, because it is implicit between the writer and the reader. Homophily: socially connected individuals tend to share traits.
  • 14.
  • 15.
  • 16.
  • 17. We project embeddings for entities, words, and authors into a shared semantic space. “Dirk Novitsky” “the warriors” Inner products in this space indicate compatibility.
  • 22. Socially-Infused En,ty Linking 48 g2(x, yt, u, t; ⇥2) = v(u) u > W(u,e) v(e) yt + v (m) t > W(m,e) v(e) yt author embedding men,on embedding v(u) u v(e) yt v(e) yt v (m) t g2 (x, yt, t) g1 en,ty embedding
  • 23. Loss-augmented training Socially-Infused En,ty Linking 48 g2(x, yt, u, t; ⇥2) = v(u) u > W(u,e) v(e) yt + v (m) t > W(m,e) v(e) yt author embedding men,on embedding v(u) u v(e) yt v(e) yt v (m) t g2 (x, yt, t) g1 en,ty embedding
  • 29. Classifier Struct Struct+Social S-MART 64 66 68 70 72 74 76 78 F1 Dataset NEEL TACL +3.2 +2.0 Structure prediction improves accuracy. Social context yields further improvements. S-MART is the prior state-of-the-art (Yang & Chang, 2015).
  • 30. Social media has forced NLP to confront the challenge of missing social context (Eisenstein, 2013): tacit assumptions about audience knowledge language variation across social groups (Gimpel et al., 2011) (Ritter et al., 2011) (Foster et al., 2011)
  • 31. Social media has forced NLP to confront the challenge of missing social context (Eisenstein, 2013): tacit assumptions about audience knowledge language variation across social groups (Gimpel et al., 2011) (Ritter et al., 2011) (Foster et al., 2011)
  • 32. Language variation: a challenge for NLP “I would like to believe he’s sick rather than just mean and evil.”
  • 33. Language variation: a challenge for NLP “I would like to believe he’s sick rather than just mean and evil.” “You could’ve been getting down to this sick beat.” (Yang & Eisenstein, 2017)
  • 34. Personalization by ensemble Goal: personalized conditional likelihood, P(y | x, a), where a is the author. Problem: We have labeled examples for only a few authors.
  • 35. Personalization by ensemble Goal: personalized conditional likelihood, P(y | x, a), where a is the author. Problem: We have labeled examples for only a few authors. Personalization ensemble P(y | x, a) = k Pk(y | x)πa(k) Pk(y | x) is a basis model πa(·) are the ensemble weights for author a
  • 36. Homophily to the rescue? Sick! Sick! Sick!Sick! Labeled data Unlabeled data Are language styles assortative on the social network?
  • 37. Evidence for linguistic homophily Pilot study: is classifier accuracy assortative on the Twitter social network? assort(G) = 1 #|G| (i,j)∈G δ(yi = ˆyi)δ(yj = ˆyj) + δ(yi = ˆyi)δ(yj = ˆyj)
  • 38. Evidence for linguistic homophily Pilot study: is classifier accuracy assortative on the Twitter social network? assort(G) = 1 #|G| (i,j)∈G δ(yi = ˆyi)δ(yj = ˆyj) + δ(yi = ˆyi)δ(yj = ˆyj) 0 20 40 60 80 100 rewiring epochs 0.700 0.705 0.710 0.715 0.720 0.725 0.730 0.735 assortativity follow 0 20 40 60 80 100 rewiring epochs mention 0 20 40 60 80 100 rewiring epochs retweet original network random rewiring
  • 39. Network-driven personalization For each author, estimate a node embedding ea (Tang et al., 2015). Nodes who share neighbors get similar embeddings. πa =SoftMax(f (ea)) P(y | x, a) = K k=1 Pk(y | x)πa(k)
  • 40. Results Mixture of Experts NLSE Social Personalization 0.0 0.5 1.0 1.5 2.0 2.5 3.0F1improvementoverConvNet +0.10 +1.90 +2.80 Twitter Sentiment Analysis Improvements over ConvNet baseline: +2.8% on Twitter Sentiment Analysis +2.7% on Ciao Product Reviews NLSE is prior state-of-the-art (Astudillo et al., 2015).
  • 41. Variable sentiment words More positive More negative 1 banging loss fever broken fucking dear like god yeah wow 2 chilling cold ill sick suck satisfy trust wealth strong lmao 3 ass damn piss bitch shit talent honestly voting win clever 4 insane bawling fever weird cry lmao super lol haha hahaha 5 ruin silly bad boring dreadful lovatics wish beliebers ariana- tors kendall
  • 42. Summary Robustness is a key challenge for making NLP effective on social media data: Tacit assumptions about shared knowledge; language variation Social metadata gives NLP systems the flexibility to handle each author differently.
  • 43. Summary Robustness is a key challenge for making NLP effective on social media data: Tacit assumptions about shared knowledge; language variation Social metadata gives NLP systems the flexibility to handle each author differently. The long tail of rare events is the other big challenge. Word embeddings for unseen words (Pinter et al., 2017) Lexicon-based supervision (Eisenstein, 2017) Applications to finding rare events in electronic health records (ongoing work with Jimeng Sun)
  • 44. Acknowledgments Students and collaborators: Yi Yang (GT → Bloomberg) Mingwei Chang (Google Research) See https://gtnlp.wordpress.com/ for more! Funding: National Science Foundation, National Institutes for Health, Georgia Tech
  • 45. References I Astudillo, R. F., Amir, S., Lin, W., Silva, M., & Trancoso, I. (2015). Learning word representations from scarce and noisy data with embedding sub-spaces. In Proceedings of the Association for Computational Linguistics (ACL), Beijing. Eisenstein, J. (2013). What to do about bad language on the internet. In Proceedings of the North American Chapter of the Association for Computational Linguistics (NAACL), (pp. 359–369). Eisenstein, J. (2017). Unsupervised learning for lexicon-based classification. In Proceedings of the National Conference on Artificial Intelligence (AAAI), San Francisco. Foster, J., Cetinoglu, O., Wagner, J., Le Roux, J., Nivre, J., Hogan, D., & van Genabith, J. (2011). From news to comment: Resources and benchmarks for parsing the language of web 2.0. In Proceedings of the International Joint Conference on Natural Language Processing (IJCNLP), (pp. 893–901)., Chiang Mai, Thailand. Asian Federation of Natural Language Processing. Gimpel, K., Schneider, N., O’Connor, B., Das, D., Mills, D., Eisenstein, J., Heilman, M., Yogatama, D., Flanigan, J., & Smith, N. A. (2011). Part-of-speech tagging for Twitter: annotation, features, and experiments. In Proceedings of the Association for Computational Linguistics (ACL), (pp. 42–47)., Portland, OR. Pinter, Y., Guthrie, R., & Eisenstein, J. (2017). Mimicking word embeddings using subword rnns. In Proceedings of Empirical Methods for Natural Language Processing (EMNLP). Ritter, A., Clark, S., Mausam, & Etzioni, O. (2011). Named entity recognition in tweets: an experimental study. In Proceedings of EMNLP. Tang, J., Qu, M., Wang, M., Zhang, M., Yan, J., & Mei, Q. (2015). Line: Large-scale information network embedding. In Proceedings of the Conference on World-Wide Web (WWW), (pp. 1067–1077). Yang, Y. & Chang, M.-W. (2015). S-mart: Novel tree-based structured learning algorithms applied to tweet entity linking. In Proceedings of the Association for Computational Linguistics (ACL), (pp. 504–513)., Beijing. Yang, Y. & Eisenstein, J. (2017). Overcoming language variation in sentiment analysis with social attention. Transactions of the Association for Computational Linguistics (TACL), in press.