Computing Professional Identity for the Economic Graph
Agenda 
1 Introduction 
2 LinkedIn’s Vision 
3 Computing Professional Identity 
4 Selected Topics 
5 Summary 
6 Final Words
1 Introduction
About me 
©2013 LinkedIn Corporation. All Rights Reserved. 
Vitaly Gordon
@bigdatasc /in/vitalygordon 
What’s in it for you? 
1. You will learn how LinkedIn takes a massive vision and breaks it down into small data problems
2. You will learn how hard cleaning data can be
3. You will learn why LinkedIn needs endorsements
What’s in it for me? 
@bigdatasc
2 LinkedIn’s Vision
Create economic opportunity for 
every professional in the world
• CEO 
• Chief Executive Officer 
• CEO and Founder 
• CEO & Co-founder 
• President and CEO 
• Owner
• IBM 
• International Business Machines 
• International Bus. Machines 
• IBM Research 
• IBM T.J. Watson Research Center 
• IBM Canada 
• IBM India
• UCLA 
• University of California, Los Angeles 
• UC Los Angeles 
• The Anderson School of Management
3 Computing Professional Identity
Why Do We Need Identity Standardization? 
Text-Based Solution
• Applies acronym expansion (e.g. vp -> vice president)
• Applies abbreviation expansion (e.g. sr. -> senior)
• Selects the most common standard titles
• Selects standard substrings (e.g. software engineer and tech lead in search -> [software engineer, tech lead])
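The text-based steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not LinkedIn's implementation: the lookup tables hold just the two expansions named on the slide, and `STANDARD_TITLES`, `expand`, and `standardize` are hypothetical names introduced for the example.

```python
import re

# Hypothetical lookup tables; the entries are the examples given on the slide.
ACRONYMS = {"vp": "vice president"}
ABBREVIATIONS = {"sr": "senior"}

# Hypothetical list of standard titles, assumed to have been chosen
# beforehand as the most common titles.
STANDARD_TITLES = ["software engineer", "tech lead"]


def expand(title):
    """Lowercase a raw title and expand known acronyms and abbreviations."""
    tokens = re.findall(r"[a-z0-9&]+", title.lower())
    expanded = [ACRONYMS.get(t, ABBREVIATIONS.get(t, t)) for t in tokens]
    return " ".join(expanded)


def standardize(title):
    """Return every standard title found as a whole-word substring."""
    text = expand(title)
    return [
        s for s in STANDARD_TITLES
        if re.search(r"\b" + re.escape(s) + r"\b", text)
    ]
```

For example, `standardize("Software Engineer and Tech Lead in Search")` returns both standard titles, while `standardize("Architect")` returns an empty list because no standard substring matches.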
Problems with a Text-Based Approach
• Senior Software Engineer, Software Engineer
• Software Engineer, Software Developer, Programmer
• Architect
Profile, Inferred Skills, Endorsements, Skill Vectors
[Chart: percentage (0% to 30%) for each skill: Data Mining, Hadoop, Machine Learning, Java, Algorithms, Python, MapReduce, Data Science]
http://www.slideshare.net/s_shah/strata-endorsements
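One way skill vectors could be used to compare titles is to measure how similar two vectors are. The sketch below assumes cosine similarity as the measure; the deck does not say which measure is used. The titles, skills, and weights are hypothetical values invented for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse skill vectors (skill -> weight)."""
    dot = sum(w * b.get(skill, 0.0) for skill, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical skill vectors: the share of members with a title who hold a skill.
software_engineer = {"Java": 0.30, "Algorithms": 0.20, "Python": 0.15}
software_developer = {"Java": 0.28, "Algorithms": 0.18, "Python": 0.17}
building_architect = {"AutoCAD": 0.35, "Construction": 0.25}

similar = cosine_similarity(software_engineer, software_developer)
different = cosine_similarity(software_engineer, building_architect)
```

Under these made-up numbers, the two software titles score close to 1 even though their text differs, while the two vectors with no shared skills score 0.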
Ontology Creation
Classification
4 Selected Topics
Normalization 
Clustering 
• Each topic is a distribution over words 
• Each document is a mixture of corpus-wide topics 
• Each word is drawn from one of those topics
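The three bullets above read like the generative story of a topic model such as latent Dirichlet allocation, although the slide does not name the model. A minimal sketch of that story follows, using only the standard library. The topics, vocabulary, and probabilities are hypothetical, and the document's topic mixture is fixed by hand rather than drawn from a prior.

```python
import random


def generate_document(topics, topic_mixture, length, rng):
    """Sample a document word by word.

    topics: topic name -> {word: probability} (a distribution over words)
    topic_mixture: topic name -> probability (this document's topic mix)
    """
    topic_names = list(topic_mixture)
    topic_weights = [topic_mixture[t] for t in topic_names]
    words = []
    for _ in range(length):
        # Pick one topic for this word according to the document's mixture.
        topic = rng.choices(topic_names, weights=topic_weights)[0]
        # Draw the word from that topic's distribution over words.
        vocab = list(topics[topic])
        word_weights = [topics[topic][w] for w in vocab]
        words.append(rng.choices(vocab, weights=word_weights)[0])
    return words


# Hypothetical corpus-wide topics.
TOPICS = {
    "engineering": {"software": 0.5, "engineer": 0.3, "developer": 0.2},
    "management": {"director": 0.6, "manager": 0.4},
}

rng = random.Random(0)
mixed_doc = generate_document(TOPICS, {"engineering": 0.7, "management": 0.3}, 8, rng)
pure_doc = generate_document(TOPICS, {"management": 1.0}, 5, rng)
```

Fitting a topic model runs this story in reverse: given only the documents, it estimates the topics and each document's mixture.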
Anomaly Detection 
http://www.slideshare.net/tdunning/strata-2014-anomaly-detection
5 Summary
Summary 
1. User-generated content from 300M members creates 300M problems
2. Data cleaning is so much more than filtering out empty values
3. Try to be creative and work around difficult language problems
6 Final Words
We’re building the next big thing 
Join Us! 
DJ Patil, Gary Flake, Beau Cronin
@bigdatasc /in/vitalygordon

Editor's Notes

  1. Computing Professional Identity for the Economic Graph
  2. How do you evaluate whether people do the same thing?
  3. The crowdsourcing turkers left very confused
  4. Clustering titles is like clustering geography: it depends on the context
  5. Elephants can run, but that’s not what you should use them for