Understanding email traffic
David Graus, University of Amsterdam
d.p.graus@uva.nl
@dvdgrs
2
3
Recipient recommendation
Ò Given a sender, an email, all possible recipients
(in an enterprise);
Ò Predict which recipient(s) are most likely to
receive the email
4
Why?
Ò Understanding communication in/structure of an
enterprise
Ò Applications in:
Ò enterprise search
Ò expert finding
Ò community detection
Ò spam classification
Ò anomaly detection
5
How?
Ò Gmail
Ò Who do you frequently “co-address”
Ò egonetwork
Ò Related work
Ò Social Network Analysis (SNA)
Ò Email content
Ò Us
Ò SNA + Email content
6
Part 1: Social Network Analysis?
d.p.graus@uva.nl z.ren@uva.nl
derijke@uva.nl
7
image by Calvinius - Creative Commons Attribution-Share Alike 3.0
8
SNA for predicting recipients?
1. Importance of a node in the network
   More important people are more likely to be the recipient of an email
2. Strength of connection between two nodes
   Given the sender of an email, the people they address frequently are more likely to be the recipient
9
SNA for predicting recipients?
1. Importance of a node in the network
   1. Number of received emails
   2. PageRank score of node
2. Strength of connection between two nodes
   1. Number of emails sent between nodes
   2. Number of times two nodes are addressed together
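As a rough illustration, the four signals listed on this slide could be computed from a list of (sender, recipients) pairs as sketched below. The toy emails, the function names, the damping factor of 0.85, and the iteration count are assumptions made for the example, not details from the talk; PageRank is written as a plain power iteration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each email is (sender, [recipients]).
EMAILS = [
    ("ann", ["bob", "cat"]),
    ("bob", ["ann"]),
    ("cat", ["bob"]),
    ("ann", ["bob"]),
]

def received_counts(emails):
    """Importance signal 1: number of emails each node received."""
    counts = Counter()
    for _, recipients in emails:
        counts.update(recipients)
    return counts

def sent_between(emails):
    """Strength signal 1: number of emails sent from one node to another."""
    counts = Counter()
    for sender, recipients in emails:
        for recipient in recipients:
            counts[(sender, recipient)] += 1
    return counts

def co_addressed(emails):
    """Strength signal 2: number of times two nodes are addressed together."""
    counts = Counter()
    for _, recipients in emails:
        for a, b in combinations(sorted(set(recipients)), 2):
            counts[(a, b)] += 1
    return counts

def pagerank(emails, damping=0.85, iterations=50):
    """Importance signal 2: PageRank over the sender -> recipient graph."""
    edges = sent_between(emails)
    nodes = sorted({node for pair in edges for node in pair})
    if not nodes:
        return {}
    out_weight = Counter()
    for (src, _), weight in edges.items():
        out_weight[src] += weight
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes that never send mail is spread evenly.
        dangling = sum(rank[node] for node in nodes if out_weight[node] == 0)
        base = (1 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for (src, dst), weight in edges.items():
            new_rank[dst] += damping * rank[src] * weight / out_weight[src]
        rank = new_rank
    return rank
```

On the toy data, `received_counts(EMAILS)["bob"]` is 3 and `co_addressed(EMAILS)[("bob", "cat")]` is 1; the PageRank scores sum to 1.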
10
Part 2: Email content
Ò Statistical Language Models (LMs)
!
Ò Assign a probability to a sequence of words;
Ò Compute models for different corpora;
!
Ò Used in lots of places;
Ò Information Retrieval
Ò Machine Translation
Ò Speech Recognition
16
Language Models
Ò Language models as communication “profiles”
1. Incoming LM (how people talk to user)
2. Outgoing LM (how user talks to people)
3. Interpersonal LM (how node1 

talks with node2)
4. Corpus LM (how everyone 

talks)
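The four profiles could be collected in a single pass over the emails, as in the sketch below. The toy corpus, the whitespace tokenizer, and the choice to key the interpersonal profile on an unordered pair (so it covers both directions) are assumptions for the example.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus: (sender, [recipients], body).
EMAILS = [
    ("ann", ["bob"], "budget meeting tomorrow"),
    ("bob", ["ann", "cat"], "budget approved"),
    ("cat", ["ann"], "lunch tomorrow"),
]

def tokenize(text):
    """Very simple tokenizer: lowercase and split on whitespace."""
    return text.lower().split()

def build_profiles(emails):
    """Collect term counts for the four communication profiles."""
    incoming = defaultdict(Counter)       # how people talk to a user
    outgoing = defaultdict(Counter)       # how a user talks to people
    interpersonal = defaultdict(Counter)  # how node1 talks with node2
    corpus = Counter()                    # how everyone talks
    for sender, recipients, body in emails:
        words = tokenize(body)
        corpus.update(words)
        outgoing[sender].update(words)
        for recipient in recipients:
            incoming[recipient].update(words)
            # Unordered pair: assumes "talks with" covers both directions.
            interpersonal[frozenset((sender, recipient))].update(words)
    return incoming, outgoing, interpersonal, corpus

incoming, outgoing, interpersonal, corpus = build_profiles(EMAILS)
```

Each profile is a bag of term counts; turning the counts into smoothed probabilities gives the corresponding language model.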
17
Why language models?
Ò Comparisons between communication profiles:
Ò Find nodes with most similar communication
18
SNA
1. Importance of a node in the network
2. Strength of connection between nodes

Email Content
1. Incoming LM
2. Outgoing LM
3. Interpersonal LM
4. Corpus-based LM
19
Approach: time-based
t=0  1 email, 2 addresses
t=1  2 emails, 2 addresses
t=2  3 emails, 4 addresses
t=3  4 emails, 5 addresses
etc…
t=n  607,011 emails, 2,068 addresses
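The time-based setup amounts to replaying the emails in chronological order while the network grows. In the sketch below, the tuple layout `(timestamp, sender, recipients)` and the function name `replay` are assumptions; the toy data is chosen so that the counts mirror the first four steps listed on the slide.

```python
# Hypothetical toy data, deliberately out of order: (timestamp, sender, [recipients]).
EMAILS = [
    (3, "a", ["c", "d"]),
    (1, "a", ["b"]),
    (4, "d", ["e"]),
    (2, "b", ["a"]),
]

def replay(emails):
    """Yield (t, emails_seen, addresses_seen) after each email, oldest first."""
    addresses = set()
    ordered = sorted(emails, key=lambda email: email[0])
    for t, (_, sender, recipients) in enumerate(ordered):
        addresses.add(sender)
        addresses.update(recipients)
        yield t, t + 1, len(addresses)

for t, n_emails, n_addresses in replay(EMAILS):
    print(f"t={t} {n_emails} email(s), {n_addresses} addresses")
```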
20
At some time interval t
Ò Given the email, sender, and network
Ò Remove recipients from email
Ò Rank all nodes in the network
Ò By computing for each candidate (recipient)
node:
1. Importance of candidate
2. Strength of connection between sender and
candidate
3. Similarity between sender and candidate LMs
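The ranking step could look like the sketch below. The slide does not say how the three scores are combined, so the weighted sum, the `weights` parameter, the dictionary-based score tables, and all numbers are assumptions for illustration only.

```python
def rank_candidates(sender, candidates, importance, strength, lm_similarity,
                    weights=(1.0, 1.0, 1.0)):
    """Rank candidate recipients for one email, best first.

    importance:    {candidate: score}
    strength:      {(sender, candidate): score}
    lm_similarity: {(sender, candidate): score}
    The weighted sum is an assumed way of combining the three signals.
    """
    w_imp, w_str, w_sim = weights
    scores = {}
    for candidate in candidates:
        scores[candidate] = (
            w_imp * importance.get(candidate, 0.0)
            + w_str * strength.get((sender, candidate), 0.0)
            + w_sim * lm_similarity.get((sender, candidate), 0.0)
        )
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scores, key=lambda c: (-scores[c], c))

# Hypothetical scores at some time step t.
importance = {"bob": 0.5, "cat": 0.3, "dan": 0.2}
strength = {("ann", "bob"): 0.6, ("ann", "cat"): 0.1}
lm_similarity = {("ann", "bob"): 0.2, ("ann", "cat"): 0.7, ("ann", "dan"): 0.1}

ranking = rank_candidates("ann", ["bob", "cat", "dan"],
                          importance, strength, lm_similarity)
content_only = rank_candidates("ann", ["bob", "cat", "dan"],
                               importance, strength, lm_similarity,
                               weights=(0.0, 0.0, 1.0))
```

With equal weights the ranking is `["bob", "cat", "dan"]`; using only the LM similarity it becomes `["cat", "bob", "dan"]`. Comparing such a ranking against the recipients that were removed from the email shows how well each signal predicts them.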
21
22
Findings: what works for predicting recipients?
• Importance of node: number of received emails of node
• Strength of connection: number of emails between nodes
• LM similarity: interpersonal LM is most important
23
Findings: SNA vs email content
Ò SNA:
Ò SNA signals deteriorate over time
Ò SNA signals are most informative on highly
active users
!
Ò Email content:
Ò LM signal improves over time
Ò LM signal does worse with highly active users
24
Finally
Ò Combining Social Network Analysis with
Language Modeling is better than doing either.
25
Why for E-Discovery
Ò Anomaly detection
Ò Given a working prediction model; identify
“unexpected” communication
Ò Language models for communication
Ò For a node, find the most different
interpersonal communication
Ò Friends/family vs colleagues?
Ò Find communication that differs from the
corpus-based communication

Understanding Email Traffic (talk @ E-Discovery NL Symposium)
