INTRODUCTION
AN INITIATIVE TO ESTABLISH THE ANALYTICS
AND DATA SCIENCE STANDARDS
Strata Data NYC
September 26th, 2019
© 2019 Forte Partners. All rights protected and reserved. 2
DO YOU BELIEVE A PILOT NEEDS A LICENSE? SPECIAL TRAINING?
CERTIFICATION?
Would you rather ride
with him?
Or with them at the
controls?
Who would you trust to pilot a commercial plane you are riding on?
HOW ABOUT A SURGEON PERFORMING SURGERY ON YOU?
Bad things happen when we cannot define
the necessary skills and knowledge…
How do you know he or she is qualified to operate?
WHO IS ANALYZING YOUR DATA?
ARE THEY QUALIFIED?
Are you extracting the right value from your data assets?
What happens when the wrong outcomes are provided?
Do you think bad things can happen
when people who are not qualified
are analyzing your data?
In the new world, failure to use data
properly likely means the failure of
your business…
DATA SCIENCE HAS BEEN A GLOBAL HOT TOPIC FOR THE LAST DECADE AND
IS CONSIDERED A STRATEGIC CAPABILITY ACROSS EVERY SECTOR TODAY
Timeline of coverage, 2012-2019: Harvard Business Review, McKinsey, Coursera, World Economic Forum, The Economist
Glassdoor: Best Job in America for 2016, 2017, 2018, 2019
NUMBER OF ANALYTICS PROFESSIONALS IS INCREASING AT A HIGH RATE
ACCORDING TO KAGGLE
* Kaggle Blog - Reviewing 2018 and Previewing 2019
[Chart: Kaggle members by year, 2009-2018. Data labels: 4,466; 24,313; 70,980; 137,873; 240,933; 437,442; 589,552; 1,400,000; 2,500,000]
CAGR: 120%
The number of Kaggle
members gives insight
into the rapid increase
in the number of analytics
professionals
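The growth rate quoted for the chart can be reproduced with the standard compound annual growth rate formula. The sketch below assumes that the first and last data labels (4,466 and 2,500,000) are eight annual periods apart, because the chart carries nine data labels; that mapping is an assumption and is not stated on the slide.

```python
def cagr(first_value: float, last_value: float, periods: int) -> float:
    """Compound annual growth rate as a fraction (1.0 means 100% per year)."""
    if first_value <= 0 or periods <= 0:
        raise ValueError("first_value and periods must be positive")
    return (last_value / first_value) ** (1 / periods) - 1


# Assumed inputs: first and last data labels from the Kaggle chart,
# treated as eight annual periods apart (nine labels in total).
kaggle_cagr = cagr(4_466, 2_500_000, periods=8)
print(f"CAGR: {kaggle_cagr:.1%}")
```

Under that assumption the result comes out at roughly 120% per year, in line with the figure on the slide.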
A MULTITUDE OF GROUPS WITH GROWING MEMBERSHIPS IS AN IMPORTANT
INDICATOR FOR ANALYTICS MARKET SIZE
26,000,000+ LINKEDIN MEMBERS:
CAPABILITY TARGETING*
1,000,000+ GLOBAL LINKEDIN
MEMBERS: TITLE TARGETING*
Estimated audience according to LinkedIn
* Selected countries are United States, United Kingdom, China, India, Germany
A QUICK KEYWORD SEARCH ON LINKEDIN SHOWS A LARGE # OF
PROFESSIONALS DEFINING THEMSELVES IN ANALYTICS RELATED SPACES
Group Name Members
Big Data and Analytics 341,381
Data Science Central 279,831
Big Data, Analytics, Business Intelligence & Visualization Experts Community 266,508
Big Data | Analytics | Strategy | Finance | Innovation 241,666
Business Intelligence Professionals (BI, Big Data, Analytics, IoT) 237,694
Business Analytics, Big Data, and Artificial Intelligence 199,966
Data Mining, Statistics, Big Data, Data Visualization, and Data Science 193,050
Python Community 139,698
Microsoft Business Intelligence 130,730
Change Consulting | Digital Transformation Data Analytics Security 109,738
Big Data 99,790
Analytics and Artificial Intelligence (AI) in Marketing and Retail 73,185
TDWI: Analytics and Data Management Discussion Group 70,766
Hadoop Users 69,167
Data Warehouse - Big Data - Hadoop - Cloud - Data Science - ETL 68,693
Business Analyst forum [BA forum] 68,388
Data Scientists 66,491
Python Professionals 59,526
Big Data & Hadoop Professionals 55,678
Data Warehousing (Business Intelligence, ETL) Professional's Group 53,994
Business Intelligence 52,110
KDnuggets Machine Learning, Data Science, Data Mining, Big Data, AI 48,345
Top 100 LinkedIn Groups’ Member Base: 2,600,000+
HOW MANY DATA SCIENTISTS ARE THERE IN THE WORLD?
WHAT DO YOU THINK?
• 200K - 700K new grads join the analytics job market annually
• The number of jobs for all US data professionals will increase to 2,720,000 openings by 2020.
• Annual demand for the fast-growing new roles of data scientists, developers, and engineers in US
will reach nearly 700,000 openings.
“There are between 1.5-3 million data scientists in the world.”
- Anthony Goldbloom, Co-founder & CEO @Kaggle
https://www.huffpost.com/entry/where-will-data-science-b_b_12375864
https://www.forbes.com/sites/louiscolumbus/2017/05/13/ibm-predicts-demand-for-data-scientists-will-soar-28-by-2020/
DESPITE THE INCREASINGLY LARGE NUMBERS, THERE IS STILL A DATA SCIENCE
SKILLS SHORTAGE IN THE US
https://economicgraph.linkedin.com/research/LinkedIns-2017-US-Emerging-Jobs-Report
https://www.mckinsey.com/business-functions/digital-mckinsey/our-insights/big-data-the-next-frontier-for-innovation
https://economicgraph.linkedin.com/resources/linkedin-workforce-report-august-2018
In 2011, McKinsey forecast that the US
could face a shortage of 150-190K people
with deep analytical skills by 2018 …
… as verified by the LinkedIn 2018 Workforce
Report: there is a shortage of 151K people
with “data science skills”
Top Emerging Jobs (2012-2017)
VARIOUS STUDIES ALSO HIGHLIGHT HOW CHALLENGING IT IS TO FIND THE
RIGHT PEOPLE WITH NECESSARY SKILLS
* The Quant Crunch - How the demand for data science skills is disrupting the job market? Burning Glass, IBM, BHEF (2017)
• Shortage of 150K+ people with data science skills in the US based
on LinkedIn August 2018 Workforce Report
• Y/Y increase in data science job posts is 2X the increase in data
science job searches according to Indeed 2019 Hiring Lab Report
• Big Data & Analytics is the scarcest skill in the KPMG 2019 CIO Survey
with 44% of participants struggling to find the right talent
• 82% of Data Scientist job postings require 3+ years prior work
experience, and 43% require a Master's or higher degree
• HR recruiters apply pre-screening cut-offs for additional academic
attributes such as school tier and GPA
• Recruitment processes are lengthy and complex with multiple phone
and on-site interviews combined with various tests
THE “SKILL GAP” RELATED CHALLENGES BETWEEN THE SUPPLY AND DEMAND
SIDE HINDER THE POTENTIAL CONTRIBUTION OF THE FIELD TO THE ECONOMY
Unmet Demand and the Elusive Data Science Talent
• Ineligible due to application requirements
• Gets eliminated through interview bias
• Dissatisfied with job because of skills mismatch
Difficulties in Hiring and Matching Expectations
• Unfilled positions in search of unicorns
• Losing star candidates/hires to competition
• Extra headcount/upskill costs due to bad hires
IN ADDITION TO THE SHORTAGE, WE SEE A HUGE VARIETY OF SKILLS FOR
DATA SCIENTISTS EVEN AT THE SAME COMPANY
IT GETS EVEN MORE COMPLICATED WHEN YOU LOOK ACROSS DIFFERENT
COMPANIES AND SECTORS
SIMILARLY, IN JOB POSTINGS, YOU SEE A WIDE VARIETY OF ROLE
DEFINITIONS AND EXPECTATIONS FOR THE SAME JOB TITLE
INSIGHT
– A job searcher on Quora:
“Most interviewers had me write pseudocode, in something like Python. Most asked me some product-specific
questions, such as, "How would you use data to improve X feature on our website?" Some interviewers asked me to
write SQL, in addition to or instead of pseudocode. Another question I was often asked was how to set up some kind
of experiment, such as, "How would we design an experiment to see whether our new homepage is better?” or "How
can we use data to improve search results?". One or two interviewers asked me algorithms questions (quicksort, etc)
but not in very much depth.
Beyond that, there was little in common. The formats varied a lot. Some interviews were all-day
affairs - back-to-back meetings with programmers all day - and others were just a quick meeting with
a CTO. Some interviews had me filling whiteboards with code, while others just consisted of a face-to-
face conversation. A few of the interviews involved some sort of social/culture component, ranging
from formal interviews with non-technical people to happy hours.”
THERE ARE OVER 250 PROGRAMS IN THE US THAT OFFER GRADUATE
DEGREES IN ANALYTICS OR DATA SCIENCE
* Michael Rappa, Institute for Advanced Analytics - https://analytics.ncsu.edu/?page_id=4184
DESPITE HAVING SIMILAR NAMES AND OBJECTIVES, COURSE OFFERINGS
AND APPROACH OF THESE PROGRAMS VARY WIDELY
Carnegie Mellon
Master of Data Science, Analytics Major
1. Data Science Seminar (11-631)
2. Capstone Planning Seminar (11-634)
3. Data Science Analytics Capstone (11-632)
4. Core Curriculum (five courses)
Choose two courses in ML/Statistics:
- 10-601 Machine Learning
- 11-641 Machine Learning for Text Mining
- 10-701 Advanced Machine Learning
- 10-605 Machine Learning with Big Data Sets
Choose two courses in Software Systems:
- 11-791 Design and Engineering of Intelligent Info Systems
- 15-619 Cloud Computing
- 11-792 Information Systems Project
- 11-642 Search Engines
Choose one course with a focus on Big Data:
- 15-826 Multimedia Databases and Data Mining
- 10-605 Machine Learning with Big Data Sets
- 11-676 Big Data Analytics
5. Three electives – any graduate level course 600 and above in SCS
Columbia University
Master of Data Science
REQUIRED/CORE COURSES
1. STAT W4203 Probability Theory
2. CSOR W4246 Algorithms for Data Science
3. STAT W5703 Statistical Inference and Modeling
4. COMS W4121 Computer Systems for Data Science
5. COMS W4776 Machine Learning for Data Science
6. STAT W4701 Exploratory Data Analysis and Visualization
7. ENGI E4800 Data Science Capstone and Ethics
https://www.cmu.edu/graduate/data-science
https://datascience.columbia.edu/master-of-science-in-data-science
… WHICH IS ALSO VALID FOR UNDERGRADUATE PROGRAMS
Brigham Young University
Data Science
http://www.byui.edu/catalog/#/programs/41PwqJ9RZ
CIT111 - Introduction to Databases
CS101 - Introduction to Programming
CS241 - Survey Object-Oriented Programming/Data Struct.
CS450 - Machine Learning and Data Mining
MATH325 - Intermediate Statistics
MATH425 - Applied Linear Regression
MATH488 - Statistical Consulting
+
CS335 - Data Wrangling, Exploration, and Visualization
MATH221A - Business Statistics
Electives + Project + Internship
Winona State University
Data Science
https://catalog.winona.edu/preview_program.php?catoid=21&poid=4333
DSCI 210 - Data Science
DSCI 310 - Data Summary and Visualization
DSCI 325 - Management of Structured Data
STAT 210 - Statistics
STAT 310 - Intermediate Statistics
STAT 360 - Regression Analysis
+
MATH 140 - Applied Calculus
CS 234-250 - Algorithms and Problem-Solving I-II
CS 385 - Applied Database Management Systems
DSCI 395-495 - Professional Skill Development & Communication
Electives + Project OR Internship
… AS WELL AS ONLINE CERTIFICATES
udacity.com/course/intro-to-data-science--ud359
coursera.org/specializations/data-analysis
LESSON 1
Introduction to Data Science
• Pi-Chaun (Data Scientist @ Google): What
is Data Science?
• Gabor (Data Scientist @ Twitter): What is
Data Science?
• Problems solved by data science.
LESSON 2
Data Wrangling
• What is Data Wrangling?
• Acquiring data.
• Common data formats.
LESSON 3
Data Analysis
• Statistical rigor.
• Kurt (Data Scientist @ Twitter) - Why is
Stats Useful?
• Introduction to normal distribution.
LESSON 4
Data Visualization
• Effective information visualization.
• An analysis of Napoleon's invasion of
Russia!
• Don (Principal Data Scientist @ AT&T):
Communicating Findings.
LESSON 5
MapReduce
• Introduction to Big Data and
MapReduce.
• Learn the basics of MapReduce.
• Mapper.
LESSON 1
Data Management and Visualization
• Managing Data
• Visualizing Data
LESSON 2
Data Analysis Tools
• Hypothesis Testing and ANOVA
• Chi Square Test of Independence
• Pearson Correlation
• Exploring Statistical Interactions
LESSON 3
Regression Modeling in Practice
• Basics of Linear Regression
• Multiple Regression
• Logistic Regression
LESSON 4
Machine Learning for Data Analysis
• Decision Trees
• Random Forests
• Lasso Regression
• K-Means Cluster Analysis
Udacity | Intro to Data Science (first set of lessons above)
Coursera | Intro to Data Science (second set of lessons above)
AS A RESULT OF THE SKILL GAP AND CONFUSION AROUND FAST-GROWING ROLES,
THE WASTED TIME AND COST FOR THE US MARKET ARE NON-NEGLIGIBLE
DS Position Remains Open: 45 days (5 longer)
Overall Engineer Recruiting Cost: $30,000
Data Scientist Hiring Steps: 6
• According to IBM, Data Science and Analytics jobs remain open an average of 45 days, 5 days longer than average.
• As of April 2019, 10% of Data Scientists had changed or begun a new role in the last 90 days.
• The hiring process for data scientists is 12.5% longer and $3,750 costlier.
• 2.7 M job openings in 2020 (700K for fast-growing roles)
The wasted cost and time during recruitment:
- $525 M
- 700 K days = 1,944 years of time
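The per-hire figures behind these totals follow from the numbers on the slide: 45 days is 5 days longer than a 40-day average, which is 12.5%, and 12.5% of a $30,000 recruiting cost is $3,750. The sketch below reproduces that arithmetic. The hire count of 140,000 and the 360-day year are inferred from the slide's totals and are not stated on it, so both should be read as assumptions.

```python
DS_DAYS_OPEN = 45          # days a DS position remains open (from the slide)
EXTRA_DAYS = 5             # "5 days longer than average" (from the slide)
RECRUITING_COST = 30_000   # overall engineer recruiting cost in USD (from the slide)

average_days_open = DS_DAYS_OPEN - EXTRA_DAYS         # 40 days
extra_share = EXTRA_DAYS / average_days_open          # 0.125, i.e. 12.5% longer
extra_cost_per_hire = RECRUITING_COST * extra_share   # 3,750 USD costlier

# Assumption: the hire count implied by dividing the slide's 525 M $ total
# by the extra cost per hire.
implied_hires = 525_000_000 / extra_cost_per_hire
wasted_days = implied_hires * EXTRA_DAYS

# Assumption: a 360-day year, which matches the slide's days-to-years figure.
DAYS_PER_YEAR = 360
wasted_years = wasted_days / DAYS_PER_YEAR

print(f"extra share: {extra_share:.1%}")
print(f"extra cost per hire: ${extra_cost_per_hire:,.0f}")
print(f"implied hires: {implied_hires:,.0f}")
print(f"wasted days: {wasted_days:,.0f} (about {wasted_years:,.0f} years)")
```

Both slide totals (525 M $ and 700 K days) are consistent with the same implied hire count, which suggests they were derived from one volume assumption.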
Fresh starters (last 90 days), as of April 2019:
Data Scientist 10%, ML Eng./Spec. 12%, Data Analysts 6%, Statisticians 3-4%
Software Eng. 4%, Sales Rep 4%, Accountant 2%
The wasted cost and time with wrong hires:
- $868 M
- 441 K days = 1,225 years of time
https://www.forbes.com/sites/louiscolumbus/2017/05/13/ibm-predicts-demand-for-data-scientists-will-soar-28-by-2020/
https://www.quora.com/What-is-the-average-cost-of-recruiting-an-engineer-in-Silicon-Valley
A DEEP DIVE INTO THE ROOT CAUSES OF “THE CONFUSION & SKILL GAP”
REVEALS 10 MAIN CHALLENGES THAT ARE LINKED TO BOTH SIDES OF THE
EQUATION
• Lack of Mentors: 71% of direct managers lack the knowledge to help with technical development
• Unclear Mandate: 30.4% cite lack of clear business questions as a main challenge
• Proof of Expertise: 59% of data scientists are self-proclaimed experts with self-training
• Too Many Options: 750+ US university degrees, 2750+ online courses, vendor trainings
• Diversity Gaps: Only 18% are female, minorities have low representation
• Retention Challenges: 21.4% of junior data scientists change jobs in a year
• Inefficient Utilization: Only 9% can quantify impact, 85% of projects are considered failures
• Ineffective Selection: 61% of managers believe recruiters don’t understand the needs
• Specialized Needs: 60% of job postings list 5 or more different data science skills
• Culture Gaps: 52% believe their company culture is a barrier to AI adoption
A COMMON GROUND NEEDS TO BE DEFINED FIRST TO FACILITATE THE
DISCUSSIONS IN THE ECOSYSTEM TO OVERCOME THESE CHALLENGES
Analytics and Data Science Standards
• Recruitment Agencies
• Data Science Community
• Training Centers
• Universities & Academia
• Regulatory Bodies
• Businesses
• Service Providers
• Application Developers
THE INITIATIVE FOR ANALYTICS AND DATA SCIENCE STANDARDS (IADSS) AIMS TO SUPPORT THE
DATA SCIENCE ECOSYSTEM BY DEFINING PROFESSIONAL STANDARDS
Job Titles, Roles
Knowledge and Skills
Requirements
Assessment and Measurement
Industry Standards
WE ARE WORKING ON A STANDARDIZATION AND ASSESSMENT FRAMEWORK
THROUGH INDUSTRY-WIDE COLLABORATION
Research Study
We reached out to hundreds of analytics leaders / practitioners
globally through interviews and a detailed questionnaire
Community Outreach
We are active online as well as at conferences and meet-ups including
ICDM, KDD, ODSC, Strata
UNDER THE GUIDANCE OF IADSS ADVISORY BOARD ON THE RESEARCH,
CONTENT AND ENGAGEMENT
Advisory Board
WE USE SEVERAL DATA SOURCES TO GET A COMPREHENSIVE VIEW OF THE
DATA SCIENCE PROFESSIONAL LANDSCAPE
Literature Review
Extensive review of existing
research and content on data
science and analytics
landscape in academia and
industry
LinkedIn Analysis
Analyzing real job-postings
and employee profiles to
address skill gaps in the job
market.
Detailed Questionnaire
Reaching professionals in
the data science field with
organizational and knowledge
questions
1-1 Interviews
In-depth interviews with data
science leaders to get
narrative on organizational
definitions and practices
REPORT I
Literature Review, Essential
Definitions, Body of Knowledge
REPORT II
Skills and knowledge by industry
roles and titles
Conway, D. (2010). The Data Science Venn Diagram. Retrieved July 15, 2019, from Drew Conway website
NIST (2015), NIST Big Data Interoperability: 2015 NIST Big Data Public Working Group Definitions and Taxonomies Subgroup.
AS EXPECTED, THERE IS NO CONSENSUS ON WHAT DATA SCIENCE IS
ACCORDING TO EXISTING LITERATURE
While Conway (2010) defines “Machine
Learning” as a combination of “Hacking
Skills” and “Math & Statistics
Knowledge”; NIST (2015) has its own
Venn diagram.
In our KDD 2019 workshop, we had as
many definitions of data science as there
were participants.
WHAT IS AGREED ABOUT DATA SCIENCE?
“The need for computing with data, under varied computational
constraints and in the presence of highly disparate,
unstructured data, is a defining tenet of data science practice.
Data science implies a set of activities that, in today’s view, can
be considered transdisciplinary.
Data scientists deliver against the main challenges that our
sources highlight: the computational challenge, the knowledge
discovery challenge, and organizational challenges.”*
*IADSS’ Analytics and Data Science Standardization and Assessment Framework Report Part I
IADSS’ “WORKING” DEFINITION OF DATA SCIENCE
*IADSS’ Analytics and Data Science Standardization and Assessment Framework Report Part I
“the ability to extract knowledge
and insights from large and
complex data sets.”
DJ Patil, Former Chief Data Scientist of the
United States
“to understand [data], to
process it, to extract value from
it, to visualize it, to
communicate it.”
Hal Varian, Chief Economist, Google
and UC Berkeley Professor of
Information Sciences, Business, and
Economics
“a field of big data geared toward
providing meaningful information
based on large amounts of complex
data. Data science, or data-driven
science, combines different fields
of work in statistics and
computation in order to interpret
data for the purpose of decision
making.”
Investopedia
“Solving Problems by Inference from Data”
IADSS’ “WORKING” DEFINITION OF DATA SCIENCE
*IADSS’ Analytics and Data Science Standardization and Assessment Framework Report Part I
- Decision
making/support
- Prediction
- Insights extraction
- Reporting
- Explaining, confirming,
refuting
- Exploration
- Discovery
- Optimization
- ML/Induction
- Unsupervised Learning
- Computational
Statistical modeling
- Pattern/Summary
Extraction
- Structured/Unstructured
- Metadata
- Data representation,
Cleaning, &
Transformation
- Data collection and
preparation
- Entity extraction
“Solving Problems by Inference from Data”
1-1 INTERVIEW INSIGHTS PROVIDE NARRATIVE ON THE CHALLENGES AND
PRACTICES OF DATA SCIENCE MANAGERS
“You want to curb attrition and that ends up affecting your decisions on
recognizing and promoting people between levels which might be
inconsistent with actual skill sets and how they're progressing in their roles.
But I don't see a solution for either because the market is so hot and they're
getting bombarded with job offers. And that leads to a lot of frustration and
cultural impact on the organization”
“I take courses and certifications on platforms
like Coursera as an expression of interest
rather than expertise. It shows commitment
to lifelong learning and which I think is really
important for the Data Science community
and participants.”
“I've been on a panel where the panelist next to me who took a
statistics course some time when she was at university and doesn't
think math is important for Data Science.”
“I just hired a data scientist and we started
with about 60 applicants. The role was
fairly well described and as a result I
immediately eliminated 40 without a
screening call. So I got down to about 20
for screening calls… down to 5 interviews
onsite.
At the end none of them were acceptable
except for one. So from 60 to 1, this is a
huge effort. After the screening, the ones
that actually made it to interviews almost
all of them failed on the math questions.
There are many of them strong in
engineering and but the math is rare.”
“We've seen folks create a bunch of beautiful dashboards and cost of tools
has gone down precipitously in the last 20 years but that doesn't mean that
you know what you’re looking at or ensure it won’t be misused and
misrepresented. Same thing on the data science front. The most important
thing is not being able to use an algorithm that you picked off a tool but to
know how and why you're using it.”
WE ARE DEVELOPING A BODY OF KNOWLEDGE THAT DETAILS THE SKILL &
KNOWLEDGE UNIVERSE FOR DATA SCIENCE
Under Science & Math and Programming &
Technology Domains, there are 11 Main Areas,
42 Subjects and more than 200 Topics.
A DRILL-DOWN INTO TITLES ON LINKEDIN
[Figures: Job Post Analysis; Professional Profile Analysis]
A GLOBAL SURVEY COLLECTED DETAILED DATA ON EXPECTED SKILLS AND
KNOWLEDGE FOR A VARIETY OF ROLES IN THE DATA SCIENCE SPACE
Insight about analytics/ data science
team(s)
Training, Development & Hiring
• Analytics Director
• Analytics Manager
• BI Analyst / Specialist
• BI Director
• Big Data Engineer
• Chief Data/ Analytics Officer
• Data Analyst
• Data Architect
• Data Engineer
• Data Miner
• Data Modeler
• Data Science Director
• Data Scientist
• Machine Learning Engineer
• Machine Learning Scientist/
Expert/ Specialist
• Scientist / Researcher
Job Titles
• Data mining basics
• Science skills
• Engineering skills
• Business / soft skills
• Leadership related skills
• Business domain skills
• Tool skills
Required Skills and Knowledge
RESEARCH PARTICIPANTS COME FROM A WIDE SPECTRUM OF INDUSTRIES
AND GEOGRAPHIES
• More than 800 survey responses
collected from professionals and
data science/analytics/BI
executives.
THROUGH THE SURVEY WE ALSO GAINED INSIGHT INTO ORGANIZATIONAL
STRUCTURES, RECRUITMENT AND TRAINING PRACTICES
ORGANIZATIONS HAVE A MULTITUDE OF TITLES IN DATA SCIENCE AND
ANALYTICS TEAMS: “DATA SCIENTIST” IS THE MOST COMMON
A MUST-HAVE ANALYSIS: AUTOMATICALLY EXTRACTED PROTOTYPICAL
SKILL-SETS FROM SURVEY RESPONSES
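The slide does not say which method was used to extract the prototypes. As one possible illustration, the sketch below runs a plain k-means clustering over made-up skill ratings and then matches a respondent to the nearest prototype. The skill areas, the ratings, the starting centroids and the choice of k-means itself are all hypothetical and are not taken from the IADSS study.

```python
from math import dist

# Hypothetical survey rows: expected skill level (0-5) per respondent for
# three made-up skill areas: (ML development, data engineering, reporting).
responses = [
    (5, 1, 2), (4, 2, 1), (5, 2, 2),   # ML-heavy profiles
    (1, 5, 2), (2, 4, 1), (1, 5, 1),   # engineering-heavy profiles
    (1, 1, 5), (2, 1, 4), (1, 2, 5),   # reporting-heavy profiles
]


def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def kmeans(points, centroids, max_iter=100):
    """Plain k-means (Lloyd's algorithm) from the given starting centroids."""
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Keep the old centroid if a cluster ends up empty.
        updated = [mean(c) if c else old for c, old in zip(clusters, centroids)]
        if updated == centroids:
            break
        centroids = updated
    return centroids


# Fixed starting centroids keep the example deterministic.
prototypes = kmeans(responses, centroids=[(5, 1, 2), (1, 5, 2), (1, 1, 5)])

# Matching a new (hypothetical) respondent to the closest prototype.
new_respondent = (4, 1, 2)
matched = nearest(new_respondent, prototypes)
print(prototypes, matched)
```

Each resulting centroid can be read as a prototypical skill-set, and the same nearest-centroid lookup is one simple way to match a job title's average profile to a prototype.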
MATCHING TITLES WITH PROTOTYPES
A DEEPER LOOK INTO VARIANCE: 3 TYPES OF DATA SCIENTISTS
Estimated 46% of Data Scientists
(Mostly Data Preparation, ML/DS Development
and Communication & Collaboration)
Estimated 25% of Data Scientists
(Mostly Data Prep., ML/DS Development and
Communication & Collaboration, lower expertise)
Estimated 29% of
Data Scientists
[Chart: DS-1, DS-2 and DS-3 compared across skill areas: DS/ML (dev.), Rep. & Vis., Big Data, ML (Eng.), Data Eng., Comm. & Collaboration, Basic Analysis & Data Preparation, ML (Theory), Rel. DBs, Lead. & Domain Knowledge]
The composition of the skill-sets varies greatly even across
the respondents holding the same job title
© 2019 by IADSS
Contact
129 Newbury Street 3rd Floor, Boston, MA 02116
info@iadss.org
https://www.iadss.org/
IADSS.org
IADSS Discussion Group
@IADSSglobal
IADSS Channel


Licensed to Analyze? Strata Data NY 2019 IADSS Session - Usama Fayyad, Hamit Hamutcu

  • 1. INTRODUCTION AN INITIATIVE TO ESTABLISH THE ANALYTICS AND DATA SCIENCE STANDARDS Strata Data NYC September 26th, 2019
  • 2. DO YOU BELIEVE A PILOT NEEDS A LICENSE? SPECIAL TRAINING? CERTIFICATION? Would you rather ride with him? Or with them at the controls? Who would you trust to pilot a commercial plane you are riding on?
  • 3. HOW ABOUT A SURGEON PERFORMING SURGERY ON YOU? Bad things happen when we cannot define the necessary skills and knowledge… How do you know he or she is qualified to operate?
  • 4. WHO IS ANALYZING YOUR DATA? ARE THEY QUALIFIED? Are you extracting the right value from your data assets? What happens when the wrong outcomes are provided? Do you think bad things can happen when people who are not qualified are analyzing your data? In the new world, failure to use data properly likely means the failure of your business…
  • 5. DATA SCIENCE HAS BEEN A GLOBAL HOT TOPIC FOR THE LAST DECADE AND IS CONSIDERED A STRATEGIC CAPABILITY ACROSS EVERY SECTOR TODAY. [Timeline, 2012-2019, of coverage by: Harvard Business Review; Glassdoor (Best Job in America for 2016, 2017, 2018, 2019); McKinsey; Coursera; World Economic Forum; The Economist]
  • 6. NUMBER OF ANALYTICS PROFESSIONALS IS INCREASING AT A HIGH RATE ACCORDING TO KAGGLE. [Chart: Kaggle members by year, 2009-2018; data labels 4,466; 24,313; 70,980; 137,873; 240,933; 437,442; 589,552; 1,400,000; 2,500,000; CAGR 120%] The number of Kaggle members gives insight into the rapid increase in the number of analytics professionals. * Kaggle Blog - Reviewing 2018 and Previewing 2019
  • 7. A MULTITUDE OF GROUPS WITH GROWING MEMBERSHIPS IS AN IMPORTANT INDICATOR FOR ANALYTICS MARKET SIZE. 26,000,000+ LINKEDIN MEMBERS: CAPABILITY TARGETING*. 1,000,000+ GLOBAL LINKEDIN MEMBERS: TITLE TARGETING*. Estimated audience according to LinkedIn. * Selected countries are United States, United Kingdom, China, India, Germany
  • 8. A QUICK KEYWORD SEARCH ON LINKEDIN SHOWS A LARGE # OF PROFESSIONALS DEFINING THEMSELVES IN ANALYTICS RELATED SPACES
      Members   Group Name
      341,381   Big Data and Analytics
      279,831   Data Science Central
      266,508   Big Data, Analytics, Business Intelligence & Visualization Experts Community
      241,666   Big Data | Analytics | Strategy | Finance | Innovation
      237,694   Business Intelligence Professionals (BI, Big Data, Analytics, IoT)
      199,966   Business Analytics, Big Data, and Artificial Intelligence
      193,050   Data Mining, Statistics, Big Data, Data Visualization, and Data Science
      139,698   Python Community
      130,730   Microsoft Business Intelligence
      109,738   Change Consulting | Digital Transformation Data Analytics Security
       99,790   Big Data
       73,185   Analytics and Artificial Intelligence (AI) in Marketing and Retail
       70,766   TDWI: Analytics and Data Management Discussion Group
       69,167   Hadoop Users
       68,693   Data Warehouse - Big Data - Hadoop - Cloud - Data Science - ETL
       68,388   Business Analyst forum [BA forum]
       66,491   Data Scientists
       59,526   Python Professionals
       55,678   Big Data & Hadoop Professionals
       53,994   Data Warehousing (Business Intelligence, ETL) Professional's Group
       52,110   Business Intelligence
       48,345   KDnuggets Machine Learning, Data Science, Data Mining, Big Data, AI
      Top 100 LinkedIn Group’s Member Base: 2,600,000+
  • 9. HOW MANY DATA SCIENTISTS ARE THERE IN THE WORLD? WHAT DO YOU THINK?
      • 200K - 700K new grads join the analytics job market annually
      • The number of jobs for all US data professionals will increase to 2,720,000 openings by 2020.
      • Annual demand for the fast-growing new roles of data scientists, developers, and engineers in the US will reach nearly 700,000 openings.
      “There are between 1.5-3 million data scientists in the world.” - Anthony Goldbloom, Co-founder & CEO @Kaggle
      https://www.huffpost.com/entry/where-will-data-science-b_b_12375864
      https://www.forbes.com/sites/louiscolumbus/2017/05/13/ibm-predicts-demand-for-data-scientists-will-soar-28-by-2020/
  • 10. DESPITE THE INCREASINGLY LARGE NUMBERS, THERE IS STILL A DATA SCIENCE SKILLS SHORTAGE IN THE US. In 2011, McKinsey forecast that the US could face a shortage of 150-190K people with deep analytical skills by 2018 … as verified by the LinkedIn 2018 Workforce Report, there is a shortage of 151K people with “data science skills”. [Chart: Top Emerging Jobs (2012-2017)]
      https://economicgraph.linkedin.com/research/LinkedIns-2017-US-Emerging-Jobs-Report
      https://www.mckinsey.com/business-functions/digital-mckinsey/our-insights/big-data-the-next-frontier-for-innovation
      https://economicgraph.linkedin.com/resources/linkedin-workforce-report-august-2018
  • 11. VARIOUS STUDIES ALSO HIGHLIGHT HOW CHALLENGING IT IS TO FIND THE RIGHT PEOPLE WITH NECESSARY SKILLS * The Quant Crunch - How the demand for data science skills is disrupting the job market? Burning Glass, IBM, BHEF (2017)
  • 12. THE “SKILL GAP” RELATED CHALLENGES BETWEEN THE SUPPLY AND DEMAND SIDE HINDER THE POTENTIAL CONTRIBUTION OF THE FIELD TO THE ECONOMY
      • Shortage of 150K+ people with data science skills in the US based on LinkedIn August 2018 Workforce Report
      • Y/Y increase in data science job posts is 2X the increase in data science job searches according to Indeed 2019 Hiring Lab Report
      • Big Data & Analytics is the scarcest skill in the KPMG 2019 CIO Survey with 44% of participants struggling to find the right talent
      • 82% of Data Scientist job postings require 3+ years prior work experience, and 43% require a Master's or higher degree
      • HR recruiters apply pre-screening cut-offs for additional academic attributes such as school tier and GPA
      • Recruitment processes are lengthy and complex with multiple phone and on-site interviews combined with various tests
      Unmet Demand and the Elusive Data Science Talent
      • Ineligible due to application requirements
      • Gets eliminated through interview bias
      • Dissatisfied with job because of skills mismatch
      Difficulties in Hiring and Matching Expectations
      • Unfilled positions in search of unicorns
      • Losing star candidates/hires to competition
      • Extra headcount/upskill costs due to bad hires
  • 13. IN ADDITION TO THE SHORTAGE, WE SEE A HUGE VARIETY OF SKILLS FOR DATA SCIENTISTS EVEN AT THE SAME COMPANY
  • 14. IT GETS EVEN MORE COMPLICATED WHEN YOU LOOK ACROSS DIFFERENT COMPANIES AND SECTORS
  • 15. SIMILARLY IN JOB POSTINGS, YOU SEE A WIDE VARIETY OF ROLE DEFINITIONS AND EXPECTATIONS FROM THE SAME JOB TITLE
  • 16. INSIGHT – A job searcher on Quora: “Most interviewers had me write pseudocode, in something like Python. Most asked me some product-specific questions, such as, "How would you use data to improve X feature on our website?" Some interviewers asked me to write SQL, in addition to or instead of pseudocode. Another question I was often asked was how to set up some kind of experiment, such as, "How would we design an experiment to see whether our new homepage is better?" or "How can we use data to improve search results?" One or two interviewers asked me algorithms questions (quicksort, etc.) but not in very much depth. Beyond that, there was little in common. The formats varied a lot. Some interviews were all-day affairs - back-to-back meetings with programmers all day - and others were just a quick meeting with a CTO. Some interviews had me filling whiteboards with code, while others just consisted of a face-to-face conversation. A few of the interviews involved some sort of social/culture component, ranging from formal interviews with non-technical people to happy hours.”
  • 17. THERE ARE OVER 250 PROGRAMS IN THE US THAT OFFER GRADUATE DEGREES IN ANALYTICS OR DATA SCIENCE * Michael Rappa, Institute for Advanced Analytics - https://analytics.ncsu.edu/?page_id=4184
  • 18. DESPITE HAVING SIMILAR NAMES AND OBJECTIVES, COURSE OFFERINGS AND APPROACH OF THESE PROGRAMS VARY WIDELY
      Carnegie Mellon Master of Data Science, Analytics Major (https://www.cmu.edu/graduate/data-science)
      1. Data Science Seminar (11-631)
      2. Capstone Planning Seminar (11-634)
      3. Data Science Analytics Capstone (11-632)
      4. Core Curriculum (five courses)
         Choose two courses in ML/Statistics:
         - 10-601 Machine Learning
         - 11-641 Machine Learning for Text Mining
         - 10-701 Advanced Machine Learning
         - 10-605 Machine Learning with Big Data Sets
         Choose two courses in Software Systems:
         - 11-791 Design and Engineering of Intelligent Info Systems
         - 15-619 Cloud Computing
         - 11-792 Information Systems Project
         - 11-642 Search Engines
         Choose one course with a focus on Big Data:
         - 15-826 Multimedia Databases and Data Mining
         - 10-605 Machine Learning with Big Data Sets
         - 11-676 Big Data Analytics
      5. Three electives: any graduate level course 600 and above in SCS
      Columbia University Master of Data Science, REQUIRED/CORE COURSES (https://datascience.columbia.edu/master-of-science-in-data-science)
      1. STAT W4203 Probability Theory
      2. CSOR W4246 Algorithms for Data Science
      3. STAT W5703 Statistical Inference and Modeling
      4. COMS W4121 Computer Systems for Data Science
      5. COMS W4776 Machine Learning for Data Science
      6. STAT W4701 Exploratory Data Analysis and Visualization
      7. ENGI E4800 Data Science Capstone and Ethics
  • 19. … WHICH IS ALSO VALID FOR UNDERGRADUATE PROGRAMS
      Brigham Young University, Data Science (http://www.byui.edu/catalog/#/programs/41PwqJ9RZ)
      CIT111 - Introduction to Databases
      CS101 - Introduction to Programming
      CS241 - Survey Object-Oriented Programming/Data Struct.
      CS450 - Machine Learning and Data Mining
      MATH325 - Intermediate Statistics
      MATH425 - Applied Linear Regression
      MATH488 - Statistical Consulting
      +
      CS335 - Data Wrangling, Exploration, and Visualization
      MATH221A - Business Statistics
      Electives + Project + Internship
      Winona State University, Data Science (https://catalog.winona.edu/preview_program.php?catoid=21&poid=4333)
      DSCI 210 - Data Science
      DSCI 310 - Data Summary and Visualization
      DSCI 325 - Management of Structured Data
      STAT 210 - Statistics
      STAT 310 - Intermediate Statistics
      STAT 360 - Regression Analysis
      +
      MATH 140 - Applied Calculus
      CS 234-250 - Algorithms and Problem-Solving I-II
      CS 385 - Applied Database Management Systems
      DSCI 395-495 - Professional Skill Development & Communication
      Electives + Project OR Internship
  • 20. … AS WELL AS ONLINE CERTIFICATES
      Udacity | Intro to Data Science (udacity.com/course/intro-to-data-science--ud359)
      LESSON 1 Introduction to Data Science
      • Pi-Chaun (Data Scientist @ Google): What is Data Science?
      • Gabor (Data Scientist @ Twitter): What is Data Science?
      • Problems solved by data science.
      LESSON 2 Data Wrangling
      • What is Data Wrangling?
      • Acquiring data.
      • Common data formats.
      LESSON 3 Data Analysis
      • Statistical rigor.
      • Kurt (Data Scientist @ Twitter) - Why is Stats Useful?
      • Introduction to normal distribution.
      LESSON 4 Data Visualization
      • Effective information visualization.
      • An analysis of Napoleon's invasion of Russia!
      • Don (Principal Data Scientist @ AT&T): Communicating Findings.
      LESSON 5 MapReduce
      • Introduction to Big Data and MapReduce.
      • Learn the basics of MapReduce.
      • Mapper.
      Coursera | Intro to Data Science (coursera.org/specializations/data-analysis)
      LESSON 1 Data Management and Visualization
      • Managing Data
      • Visualizing Data
      LESSON 2 Data Analysis Tools
      • Hypothesis Testing and ANOVA
      • Chi Square Test of Independence
      • Pearson Correlation
      • Exploring Statistical Interactions
      LESSON 3 Regression Modeling in Practice
      • Basics of Linear Regression
      • Multiple Regression
      • Logistic Regression
      LESSON 4 Machine Learning for Data Analysis
      • Decision Trees
      • Random Forests
      • Lasso Regression
      • K-Means Cluster Analysis
  • 21. AS A RESULT OF THE SKILL GAP AND CONFUSION FOR FAST-GROWING ROLES, THE WASTED TIME AND COSTS FOR THE US MARKET ARE NON-NEGLIGIBLE
      DS Position Remains Open: 45 days (5 longer)
      Overall Engineer Recruiting Cost: $30,000
      Data Scientist Hiring Steps: 6
      • According to IBM, Data Science and Analytics jobs remain open an average of 45 days, 5 days longer than average.
      • By April-19, 10% of Data Scientists had changed to or begun a new role in the last 90 days.
      • The hiring process for data scientists is 12.5% longer and $3,750 costlier.
      • 2.7 M job openings in 2020 (700K for fast-growing roles)
      The wasted cost and time during recruitment: $525 M and 700 K days = 1,944 years of time
      Fresh-starters (last 90 days), by April-19: Data Scientist 10%; ML Eng./Spec. 12%; Data Analysts 6%; Statisticians 3-4%; Software Eng. 4%; Sales Rep 4%; Accountant 2%
      The wasted cost and time with wrong hires: $868 M and 441 K days = 1,225 years of time
      https://www.forbes.com/sites/louiscolumbus/2017/05/13/ibm-predicts-demand-for-data-scientists-will-soar-28-by-2020/
      https://www.quora.com/What-is-the-average-cost-of-recruiting-an-engineer-in-Silicon-Valley
  • 22. A DEEP DIVE INTO THE ROOT CAUSES OF “THE CONFUSION & SKILL GAP” REVEALS 10 MAIN CHALLENGES THAT ARE LINKED TO BOTH SIDES OF THE EQUATION
    • Lack of Mentors: 71% of direct managers lack the knowledge to help with technical development
    • Unclear Mandate: 30.4% cite a lack of clear business questions as a main challenge
    • Proof of Expertise: 59% of data scientists are self-claimed experts with self-training
    • Too Many Options: 750+ US university degrees, 2750+ online courses, vendor trainings
    • Diversity Gaps: Only 18% are female; minorities have low representation
    • Retention Challenges: 21.4% of junior data scientists change jobs in a year
    • Inefficient Utilization: Only 9% can quantify impact; 85% of projects are considered failures
    • Ineffective Selection: 61% of managers believe recruiters don’t understand the needs
    • Specialized Needs: 60% of job postings list 5 or more different data science skills
    • Culture Gaps: 52% believe their company culture is a barrier to AI adoption
  • 23. A COMMON GROUND NEEDS TO BE DEFINED FIRST TO FACILITATE THE DISCUSSIONS IN THE ECOSYSTEM TO OVERCOME THESE CHALLENGES
    Analytics and Data Science Standards: Recruitment Agencies • Data Science Community • Training Centers • Universities & Academia • Regulatory Bodies • Businesses • Service Providers • Application Developers
  • 24. INITIATIVE FOR ANALYTICS AND DATA SCIENCE (IADSS) AIMS TO SUPPORT THE DATA SCIENCE ECOSYSTEM BY DEFINING PROFESSIONAL STANDARDS
    Job Titles, Roles • Knowledge and Skills Requirements • Assessment and Measurement • Industry Standards
  • 25. WE ARE WORKING ON A STANDARDIZATION AND ASSESSMENT FRAMEWORK THROUGH INDUSTRY-WIDE COLLABORATION
    Research Study: We reached out to hundreds of analytics leaders and practitioners globally through interviews and a detailed questionnaire.
    Community Outreach: We are active online as well as at conferences and meet-ups, including ICDM, KDD, ODSC and Strata.
  • 26. UNDER THE GUIDANCE OF IADSS ADVISORY BOARD ON THE RESEARCH, CONTENT AND ENGAGEMENT Advisory Board
  • 27. WE USE SEVERAL DATA SOURCES TO GET A COMPREHENSIVE VIEW OF THE DATA SCIENCE PROFESSIONAL LANDSCAPE
    Literature Review: Extensive review of existing research and content on the data science and analytics landscape in academia and industry.
    LinkedIn Analysis: Analyzing real job postings and employee profiles to address skill gaps in the job market.
    Detailed Questionnaire: Reaching professionals in the data science field with organizational and knowledge questions.
    1-1 Interviews: In-depth interviews with data science leaders to get a narrative on organizational definitions and practices.
    REPORT I: Literature Review, Essential Definitions, Body of Knowledge
    REPORT II: Skills and knowledge by industry roles and titles
  • 28. AS EXPECTED, THERE IS NO CONSENSUS ON WHAT DATA SCIENCE IS ACCORDING TO EXISTING LITERATURE
    While Conway (2010) defines “Machine Learning” as a combination of “Hacking Skills” and “Math & Statistics Knowledge”, NIST (2015) has its own Venn diagram. In our KDD 2019 workshop, we had as many definitions of data science as there were participants.
    Conway, D. (2010). The Data Science Venn Diagram. Retrieved July 15, 2019, from Drew Conway website.
    NIST (2015). NIST Big Data Interoperability: 2015 NIST Big Data Public Working Group Definitions and Taxonomies Subgroup.
  • 29. WHAT IS AGREED ABOUT DATA SCIENCE? “The need for computing with data, under varied computational constraints and in the presence of highly disparate, unstructured data, is a defining tenet of data science practice. Data science implies a set of activities that, in today’s view, can be considered transdisciplinary. Data scientists deliver against the main challenges that our sources highlight: the computational challenge, the knowledge discovery challenge, and organizational challenges.”* *IADSS’ Analytics and Data Science Standardization and Assessment Framework Report Part I
  • 30. IADSS’ “WORKING” DEFINITION OF DATA SCIENCE: “Solving Problems by Inference from Data”*
    “the ability to extract knowledge and insights from large and complex data sets.” (DJ Patil, Former Chief Data Scientist of the United States)
    “to understand [data], to process it, to extract value from it, to visualize it, to communicate it.” (Hal Varian, Chief Economist, Google and UC Berkeley Professor of Information Sciences, Business, and Economics)
    “a field of big data geared toward providing meaningful information based on large amounts of complex data. Data science, or data-driven science, combines different fields of work in statistics and computation in order to interpret data for the purpose of decision making.” (Investopedia)
    *IADSS’ Analytics and Data Science Standardization and Assessment Framework Report Part I
  • 31. IADSS’ “WORKING” DEFINITION OF DATA SCIENCE: “Solving Problems by Inference from Data”*
    - Decision making/support
    - Prediction
    - Insights extraction
    - Reporting
    - Explaining, confirming, refuting
    - Exploration
    - Discovery
    - Optimization
    - ML/Induction
    - Unsupervised Learning
    - Computational Statistical modeling
    - Pattern/Summary Extraction
    - Structured/Unstructured
    - Metadata
    - Data representation, Cleaning, & Transformation
    - Data collection and preparation
    - Entity extraction
    *IADSS’ Analytics and Data Science Standardization and Assessment Framework Report Part I
  • 32. 1-1 INTERVIEW INSIGHTS PROVIDE NARRATIVE ON THE CHALLENGES AND PRACTICES OF DATA SCIENCE MANAGERS
    “You want to curb attrition and that ends up affecting your decisions on recognizing and promoting people between levels which might be inconsistent with actual skill sets and how they're progressing in their roles. But I don't see a solution for either because the market is so hot and they're getting bombarded with job offers. And that leads to a lot of frustration and cultural impact on the organization”
    “I take courses and certifications on platforms like Coursera as an expression of interest rather than expertise. It shows commitment to lifelong learning and which I think is really important for the Data Science community and participants.”
    “I've been on a panel where the panelist next to me who took a statistics course some time when she was at university and doesn't think math is important for Data Science.”
    “I just hired a data scientist and we started with about 60 applicants. The role was fairly well described and as a result I immediately eliminated 40 without a screening call. So I got down to about 20 for screening calls… down to 5 interviews onsite. At the end none of them were acceptable except for one. So from 60 to 1, this is a huge effort. After the screening, the ones that actually made it to interviews almost all of them failed on the math questions. There are many of them strong in engineering and but the math is rare.”
    “We've seen folks create a bunch of beautiful dashboards and cost of tools has gone down precipitously in the last 20 years but that doesn't mean that you know what you’re looking at or ensure it won’t be misused and misrepresented. Same thing on the data science front. The most important thing is not being able to use an algorithm that you picked off a tool but to know how and why you're using it.”
  • 33. WE ARE DEVELOPING A BODY OF KNOWLEDGE THAT DETAILS THE SKILL & KNOWLEDGE UNIVERSE FOR DATA SCIENCE Under Science & Math and Programming & Technology Domains, there are 11 Main Areas, 42 Subjects and more than 200 Topics.
  • 34. A DRILL-DOWN INTO TITLES ON LINKEDIN
    Job Post Analysis • Professional Profile Analysis
  • 35. A GLOBAL SURVEY COLLECTED DETAILED DATA ON EXPECTED SKILLS AND KNOWLEDGE FOR A VARIETY OF ROLES IN THE DATA SCIENCE SPACE
    Insight about analytics/data science team(s)
    Training, Development & Hiring
    Job Titles: • Analytics Director • Analytics Manager • BI Analyst / Specialist • BI Director • Big Data Engineer • Chief Data/Analytics Officer • Data Analyst • Data Architect • Data Engineer • Data Miner • Data Modeler • Data Science Director • Data Scientist • Machine Learning Engineer • Machine Learning Scientist / Expert / Specialist • Scientist / Researcher
    Required Skills and Knowledge: • Data mining basics • Science skills • Engineering skills • Business / soft skills • Leadership related skills • Business domain skills • Tool skills
  • 36. RESEARCH PARTICIPANTS COME FROM A WIDE SPECTRUM OF INDUSTRIES AND GEOGRAPHIES • More than 800 survey responses collected from professionals and data science/analytics/BI executives.
  • 37. THROUGH THE SURVEY WE ALSO GAINED INSIGHT INTO ORGANIZATIONAL STRUCTURES, RECRUITMENT AND TRAINING PRACTICES
  • 38. ORGANIZATIONS HAVE A MULTITUDE OF TITLES IN DATA SCIENCE AND ANALYTICS TEAMS: “DATA SCIENTIST” IS THE MOST COMMON
  • 39. A MUST-HAVE ANALYSIS: AUTOMATICALLY EXTRACTED PROTOTYPICAL SKILL-SETS FROM SURVEY RESPONSES
  • 40. MATCHING TITLES WITH PROTOTYPES
  • 41. A DEEPER LOOK INTO VARIANCE: 3 TYPES OF DATA SCIENTISTS (DS-1, DS-2, DS-3)
    Estimated 46% of Data Scientists (Mostly Data Preparation, ML/DS Development and Communication & Collaboration)
    Estimated 25% of Data Scientists (Mostly Data Prep., ML/DS Development and Communication & Collaboration, lower expertise)
    Estimated 29% of Data Scientists
    Skill dimensions shown: DS/ML (dev.) • Rep. & Vis. • Big Data • ML (Eng.) • Data Eng. • Comm. & Collaboration • Basic Analysis & Data Preparation • ML (Theory) • Rel. DBs • Lead. & Domain Knowledge
    The composition of the skill-sets varies greatly even across the respondents holding the same job title.
  • 42. © 2019 by IADSS. Contact: 129 Newbury Street, 3rd Floor, Boston, MA 02116 • info@iadss.org • https://www.iadss.org/ • IADSS.org • IADSS Discussion Group • @IADSSglobal • IADSS Channel
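
The hiring figures quoted on slide 21 can be cross-checked with a few lines of arithmetic. The sketch below is not part of the deck. It assumes that the "12.5% longer" figure is the 5 extra days measured against a 40-day baseline, that the extra cost is the same percentage applied to the recruiting cost, and that days were converted to years with a 360-day year (an inference, chosen only because it reproduces the slide's numbers). All variable names are illustrative.

```python
# Figures quoted on slide 21.
DAYS_OPEN_DS = 45            # days a data science position stays open
EXTRA_DAYS = 5               # days longer than the average position
RECRUITING_COST = 30_000     # overall engineer recruiting cost, in USD

# Assumption: the baseline is the average position, 45 - 5 = 40 days.
baseline_days = DAYS_OPEN_DS - EXTRA_DAYS
pct_longer = EXTRA_DAYS / baseline_days        # fraction by which hiring is longer
extra_cost = RECRUITING_COST * pct_longer      # extra cost per hire, in USD


def days_to_years(days, days_per_year=360):
    """Convert days to years; the 360-day year is an assumption."""
    return days / days_per_year


recruitment_years = days_to_years(700_000)     # "700 K days" on the slide
wrong_hire_years = days_to_years(441_000)      # "441 K days" on the slide

# Both recruitment totals imply the same number of hires (an inference,
# not a figure stated on the slide).
hires_from_cost = 525_000_000 / extra_cost
hires_from_days = 700_000 / EXTRA_DAYS

print(f"{pct_longer:.1%} longer, ${extra_cost:,.0f} costlier per hire")
print(f"{recruitment_years:,.0f} and {wrong_hire_years:,.0f} years")
print(f"{hires_from_cost:,.0f} vs {hires_from_days:,.0f} implied hires")
```

Under these assumptions the per-hire figures and the day-to-year conversions match the slide, which suggests the totals were derived from a single hire count rather than measured independently.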

Editor's Notes

  1. Apart from the survey results, let's talk about what we are trying to do: add a quote; LinkedIn research; survey structure slide + how much reach we have + organizational insights from the Exec. Survey
  2. Cosmetic touch-ups
  3. What is Utku's source for the number of university degrees? There is an inconsistency