The Analytics & Data Science
Landscape
Philip E. Bourne
peb6a@virginia.edu
Analytics Challenges in Modern Tax Administration
November 16, 2020
Disclaimer
• I pay taxes but typically do not get
it right
• My PhD is actually in physical
chemistry
• I did work for the NIH for 3 years
On a Positive Note
• I have been working with “big” data for many years
• As Dean I am very interested in mapping the
capabilities of our students to the needs of the
workplace
• As a researcher I am concerned that the research our
school undertakes is for societal benefit
This moment in time….
What of the future?
One view is the 6D’s
Digitization
Deception
Disruption
Demonetization
Dematerialization
Democratization
Time
Volume, Velocity, Variety
Digital camera invented by Kodak but shelved
Megapixels & quality improve slowly; Kodak slow to react
Film market collapses; Kodak goes bankrupt
Phones replace cameras
Instagram, Flickr become the value proposition
Digital media becomes bona fide form of communication
From a presentation to the Advisory Board to the NIH Director
Example - photography
Everything is Digital Data to be Analyzed…
Play the data science game – pick an
object/subject and you will immediately see a
reason why data science is important …
If I were a tie maker I would be
undertaking a data science analysis right
now…
Large collection of
random images with
metadata before and
during the pandemic
Who is still wearing
ties?
• Age
• Profession
• Ethnicity
• Socioeconomic
status
• Location
• …..
Causality –
Does the pandemic
represent a shift in tie
wearing? If so by how
much?
Prediction –
What will be the market
post COVID?
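The "by how much?" question above can be sketched as a comparison of two proportions: the share of images showing a tie before the pandemic versus during it. This is a minimal sketch, not the speaker's method. The function name `proportion_shift` and all counts are hypothetical, and a difference like this measures association only; a causal claim would also need the covariates listed on the slide (age, profession, location and so on).

```python
import math


def proportion_shift(ties_before, n_before, ties_during, n_during, z=1.96):
    """Return (difference, lower, upper) for the change in tie-wearing share.

    Uses a simple Wald interval for the difference of two independent
    proportions; z=1.96 corresponds to roughly 95% coverage.
    """
    p_before = ties_before / n_before
    p_during = ties_during / n_during
    diff = p_during - p_before
    # Standard error of the difference between two independent proportions.
    se = math.sqrt(
        p_before * (1 - p_before) / n_before
        + p_during * (1 - p_during) / n_during
    )
    return diff, diff - z * se, diff + z * se


# Hypothetical counts, invented purely for illustration.
diff, lower, upper = proportion_shift(300, 1000, 120, 1000)
print(f"shift = {diff:+.3f}, interval = [{lower:+.3f}, {upper:+.3f}]")
```

If the whole interval sits below zero, the sample suggests fewer ties during the pandemic; prediction of the post-COVID market would need a model on top of this.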
https://en.wikipedia.org/wiki/Jim_Gray_(computer_scientist)
https://www.microsoft.com/en-us/research/wp-content/uploads/2009/10/Fourth_Paradigm.pdf
https://twitter.com/aip_publishing/status/856825353645559808
This is a paradigm
shift ..
A Paradigm Shift Reflected in the Workforce
Increased Demand over the Past Five Years
74%
Artificial Intelligence specialists
Top industries hiring this talent: Computer software, internet,
information technology and services, higher education, consumer
electronics
37%
Data Scientist
Top industries hiring this talent: Information technology and
services, computer software, internet, financial services, higher
education
33%
Data Engineer
Top industries hiring this talent: Information technology and
services, internet, computer software, financial services, hospital
and healthcare
How is Academia Responding?
Every University has Some Initiative
Workforce Demand Outweighs
Supply – A Problem for the IRS?
The Rising Demand for Data Scientists
UVA School of Data Science
Graduate Job Placement*
2019: 100%
2018: 100%
2017: 100%
2016: 98%
2015: 97%
*for graduates seeking employment
Roles
Machine Learning Engineer, Director of Data
Science, Deep Learning Research Scientist,
Senior Data Analyst, Data Science Developer,
Consultant, Product Data Analyst, Financial
Engineer, Engagement Manager & more
Industries
● Finance
● Government
● Healthcare & Medicine
● Professional Sports
● Commerce
● Media
● Higher Ed
● Technology
Recent Poll of Machine Learning PhD Students
A New School for a New Century
A School Without Walls
Mission
To be a national and international leader in responsible data science
emphasizing interdisciplinary collaboration which results in furthering
discovery, sharing knowledge, and societal benefit
Our Working Definition
• Use of the ever increasing amount of open, complex, diverse
digital data frequently in ubiquitous cloud environments
• Finding ways to ask and then answer relevant questions by
combining such diverse data sets
• Arriving at statistically significant conclusions not otherwise
obtainable
• Sharing such findings in a useful way
• Translating such findings into actions that improve the human
condition
Use Case – Data Integration
Researcher and Assistant Professor of Medicine
Dr. Thomas Hartka, also a current online Masters
in Data Science student, is combining two
disparate data sets—electronic health records and
DMV crash data—to save lives after motor vehicle
crashes.
“I enrolled in the MSDS program to
expand my research on automotive
safety. I have already used
techniques from classes in my work.
I hope to expand my research to
real-time analytics to improve
emergency room care.”
— Dr. Thomas Hartka, UVA School
of Medicine
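Combining two disparate data sets, as in the use case above, can be sketched as a join on a shared key. This is an illustrative sketch only: the field names (`person_id`, `injury_severity`, `impact_speed_mph`) and the records are hypothetical, and it assumes a clean shared identifier exists. Real health and crash records rarely share one, so linkage in practice may need fuzzy matching on date, location and demographics.

```python
from collections import defaultdict


def link_records(health_records, crash_records, key="person_id"):
    """Inner-join two lists of dicts on a shared key (hypothetical schema)."""
    # Index the crash records by key so each health record is matched quickly.
    crashes_by_key = defaultdict(list)
    for crash in crash_records:
        crashes_by_key[crash[key]].append(crash)

    linked = []
    for record in health_records:
        for crash in crashes_by_key.get(record[key], []):
            # Merge the two rows; only rows present in both sets survive.
            linked.append({**crash, **record})
    return linked


# Hypothetical records, invented for illustration.
health = [
    {"person_id": 1, "injury_severity": "serious"},
    {"person_id": 2, "injury_severity": "minor"},
]
crashes = [
    {"person_id": 1, "impact_speed_mph": 45},
    {"person_id": 3, "impact_speed_mph": 20},
]
linked = link_records(health, crashes)
print(linked)
```

Only the person who appears in both sets is kept, which is the point of integration: questions about crash conditions and medical outcome can be asked of the same row.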
Guiding Principles
• Excellence
• Integrity
• Diversity
• Openness and transparency
• FAIR data - the ability to Find, Access, Interoperate and Reuse data
• Innovation
• For the social good
• Data/code as first
class citizen – Part
of promotion and
tenure
http://www.ncbi.nlm.nih.gov/pubmed/26207759
Why FAIR?
[Adapted from Carole Goble]
Only 12% of data
from research is
preserved
Infrastructure
Commons - Platform Stack
https://datascience.nih.gov/commons
Compute Platform: Cloud or HPC
Services: APIs, Containers, Indexing,
Software: Services & Tools
scientific analysis tools/workflows
Data
“Reference” Data Sets
User defined data
Digital Object Compliance
App store/User Interface
Why not more like AirBnB?
https://doi.org/10.1371/journal.pbio.2001818
How and What are We Teaching …
The 4+1 Model
The model is based
on the core insight
that all definitions of
data science assume
a pipeline and that
this pipeline forms a
parallel process
[From Raf Alvarado]
Our Representation of Data Science
The 4+1 Model
• Value – assuring
societal benefit
• Design -
Communication of the
value of data
• Systems – the means
to communicate and
convey benefit
• Analytics – models
and methods
• Practice – where
everything happens
[From Raf Alvarado]
The 4+1 Model Interplay
[From Raf Alvarado]
• Value + Design = Openness,
responsibility
• Value + Analytics = Human
centered AI, algorithmic bias
• Value + Systems =
sustainability, access,
environmental impact
• Design + Analytics = literate
programming, visualization
• Design + Systems =
dashboards, engineering
design
• Analytics + Systems = ML
engineering
The 4+1 Model
Integration Practice of DS, Capstones
Analytics Linear Models, Data Mining, Bayesian ML, Deep
Learning, Text Analytics, Foundations of CS
Systems Programming and Systems, Big Data Analytics
Value Ethics of Big Data
Design Practice of DS, Visualization
We strive to build a curriculum that aligns with our model
Distinctive Features
• Foundational topics in analytics from linear models to data
mining and machine learning
• An integrated curriculum developed in consultation with
practicing data scientists that incorporates a challenging
capstone experience
• Applications and data drawn from different disciplines, e.g.,
science, business, and health
• Instruction in the best practices in the management and
conduct of data science projects
• Computational methods built on the latest techniques in R and
Python
• Required course in data ethics
• Emphasis on team science — data science is a team sport
Analytics and Machine Learning
STAT 6021 Linear Models - Multiple linear regression, logistic
regression. (R)
SYS 6018 Data Mining - Tree-based methods, kernel methods,
unsupervised learning. Uses An Introduction to Statistical Learning
by James, Witten, Hastie and Tibshirani. (R)
SYS 6014 Bayesian Machine Learning - Methods to handle
uncertainty and to apply per-variable weight distributions (as
opposed to a single optimal value). (Python)
SYS 6016 Machine Learning - Focuses on neural networks,
including deep learning, convolutional neural networks, recurrent
neural networks, and autoencoders. (Python)
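The contrast drawn in the Bayesian Machine Learning description, a distribution over each weight rather than a single optimal value, can be sketched with a one-weight linear model. This is an illustrative sketch, not course material. It assumes a model y = w·x plus Gaussian noise with known variance and a zero-mean Gaussian prior on w; the function names, data and variances are hypothetical.

```python
def point_estimate(xs, ys):
    """Least-squares weight for y = w * x: a single optimal value."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return sxy / sxx


def weight_posterior(xs, ys, noise_var=1.0, prior_var=10.0):
    """Posterior mean and variance of w under a N(0, prior_var) prior.

    With Gaussian noise of known variance the posterior is also Gaussian,
    so two numbers describe the whole distribution over the weight.
    """
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    precision = 1.0 / prior_var + sxx / noise_var
    mean = (sxy / noise_var) / precision
    return mean, 1.0 / precision


# Hypothetical data, invented for illustration.
xs = [1.0, 2.0, 3.0]
ys = [2.1, 3.9, 6.2]
w_hat = point_estimate(xs, ys)
post_mean, post_var = weight_posterior(xs, ys)
print(f"point estimate: {w_hat:.3f}")
print(f"posterior: mean {post_mean:.3f}, variance {post_var:.3f}")
```

The posterior mean is pulled slightly toward the prior mean of zero, and the variance states how uncertain the weight remains after seeing the data.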
Computer Science
CS 5010 Programming and Systems for Data Science -- Python,
Pandas, data analysis at scale and in context, some development
practices.
CS 5021 Foundations of CS -- Data structures, algorithms,
complexity, relational and NoSQL databases; "CS in a box."
Data Ethics
Virtuous cycle between {computer science,
statistics, applied mathematics} and the
humanities
Exemplified with use cases
It’s a culture not a tick of the box
Computer Science
Statistics
Applied Mathematics
Humanities
Practice and Application of Data
Science
DS 6001 and 6003 focus on data design
Flow of data between Human and Machine domains
H → M: Establishing data so that it can be analyzed
M → H: Presenting results of analysis to the world
6001: Data engineering pipeline -- acquiring, cleaning, exploring
6003: Data product development -- presenting, visualizing, app
dev
Electives
CS 6160 Theory of Computation
CS 6444 Parallel Computing
CS 6501 Text Mining
CS 6750 Database Systems
DS 5001 Exploratory Text Analytics
DS 6559 Biomedical Cloud Computing Seminar
SARC 5400 Data Visualization
STAT 6250 Longitudinal Data Analysis
STAT 6260 Categorical Data Analysis
SYS 6023 Cognitive Systems Engineering
SYS 6050 Risk Analysis
SYS 6582 Reinforcement Learning
SYS 7001 System and Decision Sciences
Capstones
A parallel and culminating experience that focuses on a real world
data science problem
Emphasizes problem definition and scoping
Employs project management
Involves developing, evaluating, and creating a data product for a
client
Requires presentations, a proposal and a published paper (IEEE)
Teams of students work on separate projects under guidance of an
advisor
Furthering Discovery to Build a Better World
RESEARCH
Cybersecurity
Detecting broad-spectrum cyber
threats almost immediately after
they are launched through a $7.6
million Defense Advanced
Research Projects Agency
(DARPA) grant.
Environment
Using NASA data collected aboard the
International Space Station to examine climate
change in the Shenandoah National Forest
and beyond, and find solutions
Health & Medicine
Securing high-performance computing
equipment and personnel to allow
collaboration across the university on brain
science research like Autism, Alzheimer’s,
mental health disorders, traumatic brain
injuries and more.
Business
Discovering what makes a job
interview successful for the
candidate and the recruiter, and
how to mitigate bias in the
recruiting process
Democracy
Investigating how terrorist groups recruit
women through propaganda and examining
risk and threat assessment for extremist
violence perpetrated by women.
Education
Helping economically disadvantaged,
underrepresented populations pursue tailored
educational workforce pathways that have a
higher probability of leading them to success.
Applying Data Science Across Industries
“To tackle challenges in science and medicine.”
— Elizabeth Driskell, MSDS ‘20
“To inform public policy and government.”
— Bradley Katcher, MSDS ‘20
“I want to use data science to find a new way of
thinking.” — Alex Gromadzki, MSDS ‘21
“I want to use data science to solve complex business
problems.” — Ruslan Askerov, MSDS ‘21
“To address poverty and income inequality.”
— Arti Patel, MSDS ‘20
Growing the School
M.S. IN DATA SCIENCE
Residential & Online
2020
2020-2023
UNDERGRADUATE
COURSES
increase to 18
courses per AY
2021
PH.D. PROGRAM
2023
UNDERGRADUATE
MAJOR
Building occupied
Team Size (FTEs)
5
40
60
80
120
Exec. Ed.
SDS and IRS - Actions
• Workforce pipeline - awareness
• Continuing Ed opportunities
• Provision of synthetic data
• Funded and collaborative research
• Faculty, Capstone, Presidential Fellowship, PhD Internships
• IRS Admits – MSDS, PhD
• Join the corporate commons
• ….
QUESTIONS?
peb6a@virginia.edu
@pebourne
SDS Faculty Research
Data Science Faculty member or affiliated faculty | Website | Research Interests
Nada Basit | https://engineering.virginia.edu/faculty/nada-basit | Machine Learning, Bioinformatics, Data Mining, Pattern Recognition
Phil Bourne | https://engineering.virginia.edu/faculty/philip-e-bourne | Multiscale Modeling Using Data Science Techniques; Early Stage Drug Discovery and Drug Repurposing; Early Stage Drug Methods and Tools for Macromolecular
Don Brown | https://engineering.virginia.edu/faculty/donald-e-brown-phd | Data Fusion, Knowledge Discovery, and Simulation Optimization
Sallie Keller | https://biocomplexity.virginia.edu/sallie-keller | social and decision informatics, statistical underpinnings of data science, and data access and confidentiality.
Daniel Mietchen | https://tools.wmflabs.org/scholia/author/Q20895785 | Computational Biology, Biodiversity integrating research workflows with the World Wide Web through open licensing, open standards, and open collaboration via
Rafael Alvarado | http://transducer.ontoligent.com/ | Cultural Analytics and Machine Learning, Digital Humanities, Text Analysis
Heman Shakeri | https://www.hemanshakeri.com/ | structure and function of interconnected networks, often expressed via graphs that comprise a set of nodes and a set of connections between them.
Jonathan Kropko | https://facultydirectory.virginia.edu/faculty/jk8sd | methods to examine historical data, to test theories of voting in U.S. presidential elections, and to handle nonresponse in surveys.
Michael Porter | https://engineering.virginia.edu/faculty/michael-d-porter | event prediction, pattern and anomaly detection, and data linkage - applications for Criminology, Transportation, Terrorism, Defense, Security, Forensics, Business
Mohammad Fallahi-Sichani | new hire | designing and building new experimental and computational tools to enable the analysis, interpretation and rational modulation of multi-scale processes that
Jack Van Horn | https://scholar.google.com/citations?user=i9bGqbgAAAAJ&hl=en | Psychology and Data Science, Cognitive Neuroscience
Pete Alonzi | https://github.com/alonzi |
Vicente Ordonez | https://engineering.virginia.edu/faculty/vicente-ordonez-roman | Computer Vision, Natural Language Processing and Machine Learning
Tim Clark | https://scholar.google.com/citations?user=k-iwlCUAAAAJ&hl=en | next generation approaches for biomedical communications and data integration, including semantically integrated data repositories, claims and
Gerard Learmonth | https://www.researchgate.net/profile/Gerard_Learmonth | Generation and testing of pseudorandom number generators; Abstract database design; Strategic applications of information systems and technology
Hongning Wang | http://www.cs.virginia.edu/~hw5x/ | data mining, machine learning, and information retrieval, with a special emphasis on computational user behavior modelin
Stephen Adams | http://www.nsfcvdi.org/wordpress/cvdi_personnel/steven-adams-ph-d/ | Adaptive Decision Systems Lab at UVA and his research is applied to several domains including activity recognition, prognostics and health management for manufacturing
Aidong Zhang | https://engineering.virginia.edu/faculty/aidong-zhang | ML, Data mining, bioinformatics
Jundong Li | http://people.virginia.edu/~jl6qk/ | Data Mining, Machine Learning, Social Computing, and Deep Learning
Brian Wright | https://www.linkedin.com/in/brian-wright-ph-d-90063027/ |

More Related Content

What's hot

Data mining presentation.ppt
Data mining presentation.pptData mining presentation.ppt
Data mining presentation.pptneelamoberoi1030
 
Big Data Analytics MIS presentation
Big Data Analytics MIS presentationBig Data Analytics MIS presentation
Big Data Analytics MIS presentationAASTHA PANDEY
 
Big Data Analytics | What Is Big Data Analytics? | Big Data Analytics For Beg...
Big Data Analytics | What Is Big Data Analytics? | Big Data Analytics For Beg...Big Data Analytics | What Is Big Data Analytics? | Big Data Analytics For Beg...
Big Data Analytics | What Is Big Data Analytics? | Big Data Analytics For Beg...Simplilearn
 
Data Analyst vs Data Engineer vs Data Scientist | Data Analytics Masters Prog...
Data Analyst vs Data Engineer vs Data Scientist | Data Analytics Masters Prog...Data Analyst vs Data Engineer vs Data Scientist | Data Analytics Masters Prog...
Data Analyst vs Data Engineer vs Data Scientist | Data Analytics Masters Prog...Edureka!
 
What Is Data Science? | Introduction to Data Science | Data Science For Begin...
What Is Data Science? | Introduction to Data Science | Data Science For Begin...What Is Data Science? | Introduction to Data Science | Data Science For Begin...
What Is Data Science? | Introduction to Data Science | Data Science For Begin...Simplilearn
 
Introduction To Data Science
Introduction To Data ScienceIntroduction To Data Science
Introduction To Data ScienceSpotle.ai
 
End-to-End Machine Learning Project
End-to-End Machine Learning ProjectEnd-to-End Machine Learning Project
End-to-End Machine Learning ProjectEng Teong Cheah
 
Data Scientist vs Data Analyst vs Data Engineer - Role & Responsibility, Skil...
Data Scientist vs Data Analyst vs Data Engineer - Role & Responsibility, Skil...Data Scientist vs Data Analyst vs Data Engineer - Role & Responsibility, Skil...
Data Scientist vs Data Analyst vs Data Engineer - Role & Responsibility, Skil...Simplilearn
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data ScienceEdureka!
 
Predictive Analytics - An Overview
Predictive Analytics - An OverviewPredictive Analytics - An Overview
Predictive Analytics - An OverviewMachinePulse
 
Big Data Architecture
Big Data ArchitectureBig Data Architecture
Big Data ArchitectureGuido Schmutz
 
Introduction to Big Data Analytics and Data Science
Introduction to Big Data Analytics and Data ScienceIntroduction to Big Data Analytics and Data Science
Introduction to Big Data Analytics and Data ScienceData Science Thailand
 
Data scientist roadmap
Data scientist roadmapData scientist roadmap
Data scientist roadmapSonu Kumar
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data ScienceSrishti44
 

What's hot (20)

Data analytics vs. Data analysis
Data analytics vs. Data analysisData analytics vs. Data analysis
Data analytics vs. Data analysis
 
Data mining presentation.ppt
Data mining presentation.pptData mining presentation.ppt
Data mining presentation.ppt
 
Big Data Analytics MIS presentation
Big Data Analytics MIS presentationBig Data Analytics MIS presentation
Big Data Analytics MIS presentation
 
Data Analytics Life Cycle
Data Analytics Life CycleData Analytics Life Cycle
Data Analytics Life Cycle
 
Predictive analytics
Predictive analytics Predictive analytics
Predictive analytics
 
Big Data Analytics | What Is Big Data Analytics? | Big Data Analytics For Beg...
Big Data Analytics | What Is Big Data Analytics? | Big Data Analytics For Beg...Big Data Analytics | What Is Big Data Analytics? | Big Data Analytics For Beg...
Big Data Analytics | What Is Big Data Analytics? | Big Data Analytics For Beg...
 
Big data architecture
Big data architectureBig data architecture
Big data architecture
 
Data Analyst vs Data Engineer vs Data Scientist | Data Analytics Masters Prog...
Data Analyst vs Data Engineer vs Data Scientist | Data Analytics Masters Prog...Data Analyst vs Data Engineer vs Data Scientist | Data Analytics Masters Prog...
Data Analyst vs Data Engineer vs Data Scientist | Data Analytics Masters Prog...
 
What Is Data Science? | Introduction to Data Science | Data Science For Begin...
What Is Data Science? | Introduction to Data Science | Data Science For Begin...What Is Data Science? | Introduction to Data Science | Data Science For Begin...
What Is Data Science? | Introduction to Data Science | Data Science For Begin...
 
Introduction To Data Science
Introduction To Data ScienceIntroduction To Data Science
Introduction To Data Science
 
End-to-End Machine Learning Project
End-to-End Machine Learning ProjectEnd-to-End Machine Learning Project
End-to-End Machine Learning Project
 
Data Scientist vs Data Analyst vs Data Engineer - Role & Responsibility, Skil...
Data Scientist vs Data Analyst vs Data Engineer - Role & Responsibility, Skil...Data Scientist vs Data Analyst vs Data Engineer - Role & Responsibility, Skil...
Data Scientist vs Data Analyst vs Data Engineer - Role & Responsibility, Skil...
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Predictive Analytics - An Overview
Predictive Analytics - An OverviewPredictive Analytics - An Overview
Predictive Analytics - An Overview
 
Big Data
Big DataBig Data
Big Data
 
Introduction to Data Engineering
Introduction to Data EngineeringIntroduction to Data Engineering
Introduction to Data Engineering
 
Big Data Architecture
Big Data ArchitectureBig Data Architecture
Big Data Architecture
 
Introduction to Big Data Analytics and Data Science
Introduction to Big Data Analytics and Data ScienceIntroduction to Big Data Analytics and Data Science
Introduction to Big Data Analytics and Data Science
 
Data scientist roadmap
Data scientist roadmapData scientist roadmap
Data scientist roadmap
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 

Similar to The Analytics and Data Science Landscape

Data Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has ChangedData Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has ChangedPhilip Bourne
 
Real-time applications of Data Science.pptx
Real-time applications  of Data Science.pptxReal-time applications  of Data Science.pptx
Real-time applications of Data Science.pptxshalini s
 
Data_Science_Applications_&_Use_Cases.pptx
Data_Science_Applications_&_Use_Cases.pptxData_Science_Applications_&_Use_Cases.pptx
Data_Science_Applications_&_Use_Cases.pptxwahiba ben abdessalem
 
Data_Science_Applications_&_Use_Cases.pdf
Data_Science_Applications_&_Use_Cases.pdfData_Science_Applications_&_Use_Cases.pdf
Data_Science_Applications_&_Use_Cases.pdfvishal choudhary
 
Data_Science_Applications_&_Use_Cases.pptx
Data_Science_Applications_&_Use_Cases.pptxData_Science_Applications_&_Use_Cases.pptx
Data_Science_Applications_&_Use_Cases.pptxssuser1a4f0f
 
Data Science - An emerging Stream of Science with its Spreading Reach & Impact
Data Science - An emerging Stream of Science with its Spreading Reach & ImpactData Science - An emerging Stream of Science with its Spreading Reach & Impact
Data Science - An emerging Stream of Science with its Spreading Reach & ImpactDr. Sunil Kr. Pandey
 
Data+Science : A First Course
Data+Science : A First CourseData+Science : A First Course
Data+Science : A First CourseArnab Majumdar
 
The UVA School of Data Science
The UVA School of Data ScienceThe UVA School of Data Science
The UVA School of Data SciencePhilip Bourne
 
Luciano uvi hackfest.28.10.2020
Luciano uvi hackfest.28.10.2020Luciano uvi hackfest.28.10.2020
Luciano uvi hackfest.28.10.2020Joanne Luciano
 
Data Science Meets Biomedicine, Does Anything Change
Data Science Meets Biomedicine, Does Anything ChangeData Science Meets Biomedicine, Does Anything Change
Data Science Meets Biomedicine, Does Anything ChangePhilip Bourne
 
Analytics Unleashed_ Navigating the World of Data Science.pdf
Analytics Unleashed_ Navigating the World of Data Science.pdfAnalytics Unleashed_ Navigating the World of Data Science.pdf
Analytics Unleashed_ Navigating the World of Data Science.pdfkhushnuma khan
 
One View of Data Science
One View of Data ScienceOne View of Data Science
One View of Data SciencePhilip Bourne
 
L3 Big Data and Application.pptx
L3  Big Data and Application.pptxL3  Big Data and Application.pptx
L3 Big Data and Application.pptxShambhavi Vats
 
UVA School of Data Science
UVA School of Data ScienceUVA School of Data Science
UVA School of Data SciencePhilip Bourne
 
Biomedical Data Science: We Are Not Alone
Biomedical Data Science: We Are Not AloneBiomedical Data Science: We Are Not Alone
Biomedical Data Science: We Are Not AlonePhilip Bourne
 
Proposal for the Theme on Big Data.pdf
Proposal for the Theme on Big Data.pdfProposal for the Theme on Big Data.pdf
Proposal for the Theme on Big Data.pdfshayamiticharles
 
Introduction to Data Science and Analytics
Introduction to Data Science and AnalyticsIntroduction to Data Science and Analytics
Introduction to Data Science and AnalyticsDhruv Saxena
 

Similar to The Analytics and Data Science Landscape (20)

Data Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has ChangedData Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has Changed
 
Real-time applications of Data Science.pptx
Real-time applications  of Data Science.pptxReal-time applications  of Data Science.pptx
Real-time applications of Data Science.pptx
 
Data_Science_Applications_&_Use_Cases.pptx
Data_Science_Applications_&_Use_Cases.pptxData_Science_Applications_&_Use_Cases.pptx
Data_Science_Applications_&_Use_Cases.pptx
 
Data_Science_Applications_&_Use_Cases.pdf
Data_Science_Applications_&_Use_Cases.pdfData_Science_Applications_&_Use_Cases.pdf
Data_Science_Applications_&_Use_Cases.pdf
 
Data_Science_Applications_&_Use_Cases.pptx
Data_Science_Applications_&_Use_Cases.pptxData_Science_Applications_&_Use_Cases.pptx
Data_Science_Applications_&_Use_Cases.pptx
 
Data Science - An emerging Stream of Science with its Spreading Reach & Impact
Data Science - An emerging Stream of Science with its Spreading Reach & ImpactData Science - An emerging Stream of Science with its Spreading Reach & Impact
Data Science - An emerging Stream of Science with its Spreading Reach & Impact
 
ppt1.pptx
ppt1.pptxppt1.pptx
ppt1.pptx
 
Data+Science : A First Course
Data+Science : A First CourseData+Science : A First Course
Data+Science : A First Course
 
Big data
Big dataBig data
Big data
 
The UVA School of Data Science
The UVA School of Data ScienceThe UVA School of Data Science
The UVA School of Data Science
 
Luciano uvi hackfest.28.10.2020
Luciano uvi hackfest.28.10.2020Luciano uvi hackfest.28.10.2020
Luciano uvi hackfest.28.10.2020
 
Data Science Meets Biomedicine, Does Anything Change
Data Science Meets Biomedicine, Does Anything ChangeData Science Meets Biomedicine, Does Anything Change
Data Science Meets Biomedicine, Does Anything Change
 
Analytics Unleashed_ Navigating the World of Data Science.pdf
Analytics Unleashed_ Navigating the World of Data Science.pdfAnalytics Unleashed_ Navigating the World of Data Science.pdf
Analytics Unleashed_ Navigating the World of Data Science.pdf
 
One View of Data Science
One View of Data ScienceOne View of Data Science
One View of Data Science
 
L3 Big Data and Application.pptx
L3  Big Data and Application.pptxL3  Big Data and Application.pptx
L3 Big Data and Application.pptx
 
UVA School of Data Science
UVA School of Data ScienceUVA School of Data Science
UVA School of Data Science
 
Biomedical Data Science: We Are Not Alone
Biomedical Data Science: We Are Not AloneBiomedical Data Science: We Are Not Alone
Biomedical Data Science: We Are Not Alone
 
Proposal for the Theme on Big Data.pdf
Proposal for the Theme on Big Data.pdfProposal for the Theme on Big Data.pdf
Proposal for the Theme on Big Data.pdf
 
Introduction to Data Science and Analytics
Introduction to Data Science and AnalyticsIntroduction to Data Science and Analytics
Introduction to Data Science and Analytics
 
Big Data & DS Analytics for PAARL
Big Data & DS Analytics for PAARLBig Data & DS Analytics for PAARL
Big Data & DS Analytics for PAARL
 

More from Philip Bourne

AI in Medical Education A Meta View to Start a Conversation
AI in Medical Education A Meta View to Start a ConversationAI in Medical Education A Meta View to Start a Conversation
AI in Medical Education A Meta View to Start a ConversationPhilip Bourne
 
AI+ Now and Then How Did We Get Here And Where Are We Going
AI+ Now and Then How Did We Get Here And Where Are We GoingAI+ Now and Then How Did We Get Here And Where Are We Going
AI+ Now and Then How Did We Get Here And Where Are We GoingPhilip Bourne
 
Thoughts on Biological Data Sustainability
Thoughts on Biological Data SustainabilityThoughts on Biological Data Sustainability
Thoughts on Biological Data SustainabilityPhilip Bourne
 
What is FAIR Data and Who Needs It?
What is FAIR Data and Who Needs It?What is FAIR Data and Who Needs It?
What is FAIR Data and Who Needs It?Philip Bourne
 
Data Science Meets Drug Discovery
Data Science Meets Drug DiscoveryData Science Meets Drug Discovery
Data Science Meets Drug DiscoveryPhilip Bourne
 
BIMS7100-2023. Social Responsibility in Research
BIMS7100-2023. Social Responsibility in ResearchBIMS7100-2023. Social Responsibility in Research
BIMS7100-2023. Social Responsibility in ResearchPhilip Bourne
 
AI from the Perspective of a School of Data Science
AI from the Perspective of a School of Data ScienceAI from the Perspective of a School of Data Science
AI from the Perspective of a School of Data SciencePhilip Bourne
 
What Data Science Will Mean to You - One Person's View
What Data Science Will Mean to You - One Person's ViewWhat Data Science Will Mean to You - One Person's View
What Data Science Will Mean to You - One Person's ViewPhilip Bourne
 
Novo Nordisk 080522.pptx
Novo Nordisk 080522.pptxNovo Nordisk 080522.pptx
Novo Nordisk 080522.pptxPhilip Bourne
 
Towards a US Open research Commons (ORC)
Towards a US Open research Commons (ORC)Towards a US Open research Commons (ORC)
Towards a US Open research Commons (ORC)Philip Bourne
 
COVID and Precision Education
COVID and Precision EducationCOVID and Precision Education
COVID and Precision EducationPhilip Bourne
 
Cancer Research Meets Data Science — What Can We Do Together?
Cancer Research Meets Data Science — What Can We Do Together?Cancer Research Meets Data Science — What Can We Do Together?
Cancer Research Meets Data Science — What Can We Do Together?Philip Bourne
 
Data Science Meets Open Scholarship – What Comes Next?
Data Science Meets Open Scholarship – What Comes Next?Data Science Meets Open Scholarship – What Comes Next?
Data Science Meets Open Scholarship – What Comes Next?Philip Bourne
 
Data to Advance Sustainability
Data to Advance SustainabilityData to Advance Sustainability
Data to Advance SustainabilityPhilip Bourne
 
Frontiers of Computing at the Cellular and Molecular Scales
Frontiers of Computing at the Cellular and Molecular ScalesFrontiers of Computing at the Cellular and Molecular Scales
Frontiers of Computing at the Cellular and Molecular ScalesPhilip Bourne
 
Social Responsibility in Research
Social Responsibility in ResearchSocial Responsibility in Research
Social Responsibility in ResearchPhilip Bourne
 
SWOT Analysis - What Does it Tell Us?
SWOT Analysis - What Does it Tell Us?SWOT Analysis - What Does it Tell Us?
SWOT Analysis - What Does it Tell Us?Philip Bourne
 
The Most Important Ten Simple Rules
The Most Important Ten Simple RulesThe Most Important Ten Simple Rules
The Most Important Ten Simple RulesPhilip Bourne
 
Capstone Experience - SWOT Analysis
Capstone Experience - SWOT AnalysisCapstone Experience - SWOT Analysis
Capstone Experience - SWOT AnalysisPhilip Bourne
 
Data Science During and After COVID-19
Data Science During and After COVID-19Data Science During and After COVID-19
Data Science During and After COVID-19Philip Bourne
 


The Analytics and Data Science Landscape

  • 1. The Analytics & Data Science Landscape Philip E. Bourne peb6a@virginia.edu Analytics Challenges in Modern Tax Administration November 16, 2020
  • 2. Disclaimer • I pay taxes but typically do not get it right • My PhD is actually in physical chemistry • I did work for the NIH for 3 years
  • 3. On a Positive Note • I have been working with “big” data for many years • As Dean I am very interested in mapping the capabilities of our students to the needs of the workplace • As a researcher I am concerned that the research our school undertakes is for societal benefit
  • 4. This moment in time….
  • 5. What of the future? One view is the 6D’s
  • 6. Example - photography. The 6D’s: Digitization, Deception, Disruption, Demonetization, Dematerialization, Democratization [figure labels: Time; Volume, Velocity, Variety]. Kodak timeline: digital camera invented by Kodak but shelved; megapixels & quality improve slowly, Kodak slow to react; film market collapses, Kodak goes bankrupt; phones replace cameras; Instagram, Flickr become the value proposition; digital media becomes bona fide form of communication. From a presentation to the Advisory Board to the NIH Director
  • 7. Everything is Digital Data to be Analyzed… Play the data science game – pick an object/subject and you will immediately see a reason why data science is important …
  • 8.
  • 9. If I were a tie maker I would be undertaking a data science analysis right now… Large collection of random images with metadata before and during the pandemic Who is still wearing ties? • Age • Profession • Ethnicity • Socioeconomic status • Location • ….. Causality – Does the pandemic represent a shift in tie wearing? If so by how much? Prediction – What will be the market post COVID?
  • 11. A Paradigm Shift Reflected in the Workforce: Increased Demand over the Past Five Years • 74% Artificial Intelligence specialists. Top industries hiring this talent: computer software, internet, information technology and services, higher education, consumer electronics • 37% Data Scientist. Top industries hiring this talent: information technology and services, computer software, internet, financial services, higher education • 33% Data Engineer. Top industries hiring this talent: information technology and services, internet, computer software, financial services, hospital and healthcare
  • 12. How is Academia Responding?
  • 13. Every University has Some Initiative
  • 14. Workforce Demand Outweighs Supply – A Problem for the IRS?
  • 15. The Rising Demand for Data Scientists. UVA School of Data Science Graduate Job Placement (*for graduates seeking employment): 2019: 100%; 2018: 100%; 2017: 100%; 2016: 98%; 2015: 97%. Roles: Machine Learning Engineer, Director of Data Science, Deep Learning Research Scientist, Senior Data Analyst, Data Science Developer, Consultant, Product Data Analyst, Financial Engineer, Engagement Manager & more. Industries: Finance, Government, Healthcare & Medicine, Professional Sports, Commerce, Media, Higher Ed, Technology
  • 16. Recent Poll of Machine Learning PhD Students
  • 17. A New School for a New Century A School Without Walls Mission To be a national and international leader in responsible data science emphasizing interdisciplinary collaboration which results in furthering discovery, sharing knowledge, and societal benefit
  • 18. Our Working Definition • Use of the ever increasing amount of open, complex, diverse digital data frequently in ubiquitous cloud environments • Finding ways to ask and then answer relevant questions by combining such diverse data sets • Arriving at statistically significant conclusions not otherwise obtainable • Sharing such findings in a useful way • Translating such findings into actions that improve the human condition
  • 19. Use Case – Data Integration Researcher and Assistant Professor of Medicine Dr. Thomas Hartka, also a current online Masters in Data Science student, is combining two disparate data sets—electronic health records and DMV crash data—to save lives after motor vehicle crashes. “I enrolled in the MSDS program to expand my research on automotive safety. I have already used techniques from classes in my work. I hope to expand my research to real-time analytics to improve emergency room care.” — Dr. Thomas Hartka, UVA School of Medicine
  • 20. Guiding Principles • Excellence • Integrity • Diversity • Openness and transparency • FAIR data - the ability to Find, Access, Interoperate and Reuse data • Innovation • For the social good
  • 21. Why FAIR? [Adapted from Carole Goble] • Only 12% of data from research is preserved • Data/code as first class citizen – Part of promotion and tenure http://www.ncbi.nlm.nih.gov/pubmed/26207759
  • 22. Infrastructure Commons - Platform Stack https://datascience.nih.gov/commons • Compute Platform: Cloud or HPC • Services: APIs, Containers, Indexing • Software: Services & Tools (scientific analysis tools/workflows) • Data: “Reference” Data Sets, user defined data • Digital Object Compliance • App store/User Interface. Why not more like AirBnB? https://doi.org/10.1371/journal.pbio.2001818
  • 23. How and What are We Teaching …
  • 24. The 4+1 Model The model is based on the core insight that all definitions of data science assume a pipeline and that this pipeline forms a parallel process [From Raf Alvarado]
  • 25. Our Representation of Data Science The 4+1 Model • Value – assuring societal benefit • Design - Communication of the value of data • Systems – the means to communicate and convey benefit • Analytics – models and methods • Practice – where everything happens [From Raf Alvarado]
  • 26. The 4+1 Model Interplay [From Raf Alvarado] • Value + Design = Openness, responsibility • Value + Analytics = Human centered AI, algorithmic bias • Value + Systems = sustainability, access, environmental impact • Design + Analytics = literate programming, visualization • Design + Systems = dashboards, engineering design • Analytics + Systems = ML engineering
  • 27. The 4+1 Model • Integration: Practice of DS, Capstones • Analytics: Linear Models, Data Mining, Bayesian ML, Deep Learning, Text Analytics, Foundations of CS • Systems: Programming and Systems, Big Data Analytics • Value: Ethics of Big Data • Design: Practice of DS, Visualization. We strive to build a curriculum that aligns with our model
  • 28. Distinctive Features • Foundational topics in analytics from linear models to data mining and machine learning • An integrated curriculum developed in consultation with practicing data scientists that incorporates a challenging capstone experience • Applications and data drawn from different disciplines, e.g., science, business, and health • Instruction in the best practices in the management and conduct of data science projects • Computational methods built on the latest techniques in R and Python • Required course in data ethics • Emphasis on team science — data science is a team sport
  • 29. Analytics and Machine Learning • STAT 6021 Linear Models - Multiple linear regression, logistic regression. (R) • SYS 6018 Data Mining - Tree-based methods, kernel methods, unsupervised learning. Uses An Introduction to Statistical Learning by James, Witten, Hastie and Tibshirani. (R) • SYS 6014 Bayesian Machine Learning - Methods to handle uncertainty and to apply per variable weight distributions (as opposed to a single optimal value). (Python) • SYS 6016 Machine Learning - Focuses on neural networks, including deep learning, convolutional neural networks, recurrent neural networks, and autoencoders. (Python)
  • 30. Computer Science • CS 5010 Programming and Systems for Data Science -- Python, Pandas, data analysis at scale and in context, some development practices. • CS 5021 Foundations of CS -- Data structures, algorithms, complexity, relational and noSQL databases; "CS in a box."
  • 31. Data Ethics • Virtuous cycle between {computer science, statistics, applied mathematics} and the humanities • Exemplified with use cases • It’s a culture, not a tick of the box [Diagram labels: Computer Science, Statistics, Applied Mathematics, Humanities]
  • 32. Practice and Application of Data Science • DS 6001 and 6003 focus on data design • Flow of data between Human and Machine domains • H → M: Establishing data so that it can be analyzed • M → H: Presenting results of analysis to the world • 6001: Data engineering pipeline -- acquiring, cleaning, exploring • 6003: Data product development -- presenting, visualizing, app dev
  • 33. Electives • CS 6160 Theory of Computation • CS 6444 Parallel Computing • CS 6501 Text Mining • CS 6750 Database Systems • DS 5001 Exploratory Text Analytics • DS 6559 Biomedical Cloud Computing Seminar • SARC 5400 Data Visualization • STAT 6250 Longitudinal Data Analysis • STAT 6260 Categorical Data Analysis • SYS 6023 Cognitive Systems Engineering • SYS 6050 Risk Analysis • SYS 6582 Reinforcement Learning • SYS 7001 System and Decision Sciences
  • 34. Capstones • A parallel and culminating experience that focuses on a real world data science problem • Emphasizes problem definition and scoping • Employs project management • Involves developing, evaluating, and creating a data product for a client • Requires presentations, a proposal and a published paper (IEEE) • Teams of students work on separate projects under guidance of an advisor
  • 35. Furthering Discovery to Build a Better World: Research • Cybersecurity: Detecting broad-spectrum cyber threats almost immediately after they are launched through a $7.6 million Defense Advanced Research Projects Agency (DARPA) grant. • Environment: Using NASA data collected aboard the International Space Station to examine climate change in the Shenandoah National Forest and beyond, and find solutions. • Health & Medicine: Securing high-performance computing equipment and personnel to allow collaboration across the university on brain science research like Autism, Alzheimer’s, mental health disorders, traumatic brain injuries and more. • Business: Discovering what makes a job interview successful for the candidate and the recruiter, and how to mitigate bias in the recruiting process. • Democracy: Investigating how terrorist groups recruit women through propaganda and examining risk and threat assessment for extremist violence perpetrated by women. • Education: Helping economically disadvantaged, underrepresented populations pursue tailored educational workforce pathways that have a higher probability of leading them to success.
  • 36. Applying Data Science Across Industries “To tackle challenges in science and medicine.” — Elizabeth Driskell, MSDS ‘20 “To inform public policy and government.” — Bradley Katcher, MSDS ‘20 “I want to use data science to find a new way of thinking.” — Alex Gromadzki, MSDS ‘21 “I want to use data science to solve complex business problems.” — Ruslan Askerov, MSDS ‘21 “To address poverty and income inequality.” — Arti Patel, MSDS ‘20
  • 37. Growing the School M.S. IN DATA SCIENCE Residential & Online 2020 2020-2023 UNDERGRADUATE COURSES increase to 18 courses per AY 2021 PH.D. PROGRAM 2023 UNDERGRADUATE MAJOR Building occupied Team Size (FTEs) 5 40 60 80 120 Exec. Ed.
  • 38. SDS and IRS - Actions • Workforce pipeline - awareness • Continuing Ed opportunities • Provision of synthetic data • Funded and collaborative research • Faculty, Capstone, Presidential Fellowship, PhD Internships • IRS Admits – MSDS, PhD • Join the corporate commons • ….
  • 40. SDS Faculty Research (columns: data science faculty member or affiliated faculty; website; research interests)
      • Nada Basit; https://engineering.virginia.edu/faculty/nada-basit; Machine Learning, Bioinformatics, Data Mining, Pattern Recognition
      • Phil Bourne; https://engineering.virginia.edu/faculty/philip-e-bourne; Multiscale Modeling Using Data Science Techniques; Early Stage Drug Discovery and Drug Repurposing; Early Stage Drug Methods and Tools for Macromolecular
      • Don Brown; https://engineering.virginia.edu/faculty/donald-e-brown-phd; Data Fusion, Knowledge Discovery, and Simulation Optimization
      • Sallie Keller; https://biocomplexity.virginia.edu/sallie-keller; social and decision informatics, statistical underpinnings of data science, and data access and confidentiality
      • Daniel Mietchen; https://tools.wmflabs.org/scholia/author/Q20895785; Computational Biology, Biodiversity; integrating research workflows with the World Wide Web through open licensing, open standards, and open collaboration via
      • Rafael Alvarado; http://transducer.ontoligent.com/; Cultural Analytics and Machine Learning, Digital Humanities, Text Analysis
      • Heman Shakeri; https://www.hemanshakeri.com/; structure and function of interconnected networks, often expressed via graphs that comprise a set of nodes and a set of connections between them
      • Jonathan Kropko; https://facultydirectory.virginia.edu/faculty/jk8sd; methods to examine historical data, to test theories of voting in U.S. presidential elections, and to handle nonresponse in surveys
      • Michael Porter; https://engineering.virginia.edu/faculty/michael-d-porter; event prediction, pattern and anomaly detection, and data linkage - applications for Criminology, Transportation, Terrorism, Defense, Security, Forensics, Business
      • Mohammad Fallahi-Sichani; new hire; designing and building new experimental and computational tools to enable the analysis, interpretation and rational modulation of multi-scale processes that
      • Jack Van Horn; https://scholar.google.com/citations?user=i9bGqbgAAAAJ&hl=en; Psychology and Data Science, Cognitive Neuroscience
      • Pete Alonzi; https://github.com/alonzi
      • Vicente Ordonez; https://engineering.virginia.edu/faculty/vicente-ordonez-roman; Computer Vision, Natural Language Processing and Machine Learning
      • Tim Clark; https://scholar.google.com/citations?user=k-iwlCUAAAAJ&hl=en; next generation approaches for biomedical communications and data integration, including semantically integrated data repositories, claims and
      • Gerard Learmonth; https://www.researchgate.net/profile/Gerard_Learmonth; Generation and testing of pseudorandom number generators; Abstract database design; Strategic applications of information systems and technology
      • Hongning Wang; http://www.cs.virginia.edu/~hw5x/; data mining, machine learning, and information retrieval, with a special emphasis on computational user behavior modeling
      • Stephen Adams; http://www.nsfcvdi.org/wordpress/cvdi_personnel/steven-adams-ph-d/; Adaptive Decision Systems Lab at UVA and his research is applied to several domains including activity recognition, prognostics and health management for manufacturing
      • Aidong Zhang; https://engineering.virginia.edu/faculty/aidong-zhang; ML, Data mining, bioinformatics
      • Jundong Li; http://people.virginia.edu/~jl6qk/; Data Mining, Machine Learning, Social Computing, and Deep Learning
      • Brian Wright; https://www.linkedin.com/in/brian-wright-ph-d-90063027/