This document discusses analytics education in the era of big data. It begins with an overview of different terms used such as analytics, data mining, data science, and knowledge discovery. It then discusses trends in big data including the 3 V's of volume, velocity, and variety. It notes that skills and jobs in analytics are in high demand but there is a shortage of people with deep analytical skills. The document provides an overview of analytics education including various certificate programs and online courses available. It emphasizes that analytics education works best when combined with learning by doing through competitions and hands-on projects.
9. >50% of “Analytics” searches are for “Google Analytics”
Google Analytics introduced, Dec 2005
(c) KDnuggets 2012 9
10. Google Trends observations (as of Jan 2013)
• Decline in analytics in 2012?
• data mining: 16; analytics -google: 54
• Competing on Analytics book, Tom Davenport, Apr 2007
• Vacation drops
11. Global View: searches for data mining, analytics -google (Google Trends)
18. 3 Vs of Big Data
• Volume
– Gigabytes to Terabytes to Petabytes …
• Velocity
– online streaming
• Variety
– numbers, text, links, images, audio, video, …
19. Volume + Velocity => No consistency
• CAP Theorem (Eric Brewer, 2000)
For highly scalable distributed systems, you can have only
two of the following:
– 1) consistency,
– 2) high availability, and
– 3) (network) partition tolerance (network failure tolerance)
http://www.julianbrowne.com/article/viewer/brewers-cap-theorem
Implication: Big data solutions must relax strict consistency
if they want high availability when the network partitions
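The CAP trade-off on this slide can be illustrated with a toy model. The sketch below is only an assumption-laden illustration, not a real database: `Replica` and `TwoNodeStore` are hypothetical names, the "network" is a single boolean flag, and replication is an in-process copy. It shows a "CP" store refusing writes during a partition, and an "AP" store accepting them and letting the replicas diverge.

```python
class Replica:
    """One copy of the data (hypothetical, for illustration only)."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class TwoNodeStore:
    """Toy key-value store with two replicas and a simulated network link."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a = Replica("a")
        self.b = Replica("b")
        self.partitioned = False  # True simulates a network failure between a and b

    def write(self, replica, key, value):
        """Return True if the write was accepted, False if it was refused."""
        other = self.b if replica is self.a else self.a
        if self.partitioned:
            if self.mode == "CP":
                # Prefer consistency: refuse the write instead of diverging.
                return False
            # Prefer availability: accept locally, the other replica goes stale.
            replica.data[key] = value
            return True
        # Healthy network: replicate synchronously to both copies.
        replica.data[key] = value
        other.data[key] = value
        return True

    def read(self, replica, key):
        return replica.data.get(key)


for mode in ("CP", "AP"):
    store = TwoNodeStore(mode)
    store.write(store.a, "x", 1)      # replicated to both nodes
    store.partitioned = True          # the link between the nodes fails
    accepted = store.write(store.a, "x", 2)
    print(mode, accepted, store.read(store.a, "x"), store.read(store.b, "x"))
# prints:
# CP False 1 1
# AP True 2 1
```

In the "AP" run both replicas keep answering, but they disagree about `x` until the partition heals and some reconciliation step (not modeled here) runs.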
20. Big Data
• 2nd Industrial Revolution
• Do old activities better
• Create new activities/businesses
21. Doing Old Things Better
“Classical” Analytics Application areas
– Churn prediction
– Direct marketing/Customer modeling
– Recommendations
– Fraud detection
– Security/Intelligence
–…
• Competition will level companies
22. Limit to Predicting Human Behavior?
• There is randomness in human behavior, and
once we find first-level effects, there are
diminishing returns in prediction at the individual
level
• Many examples: Netflix Prize, Customer
modeling…
Gregory Piatetsky-Shapiro, Big Data Hype and Reality,
Harvard Business Review blog, Oct 2012
26. Big Data Enables New Things!
– Google – first big success of big data
– Social networks (Facebook, Twitter, LinkedIn, …)
success depends on network size, i.e. big data
– Location analytics
– Health-care
• Personalized medicine
– Semantics and AI?
• Imagine IBM Watson, Siri in 2020?
– Beware of loss of privacy
34. Shortage of Skills
• McKinsey: shortage by 2018 in the US of
– 140,000-190,000 people with deep analytical skills
– 1.5 M managers/analysts with the know-how to
use the analysis of big data to make effective
decisions.
Source: www.mckinsey.com/mgi/publications/big_data/
40. Rebranding from “Data Mining” to “Big Data”
[Chart series: Data Mining, Big Data, Data Scientist]
“Data mining” jobs are much more common, but
“Big Data” jobs are surging much faster than “Data Scientist” jobs
41. LinkedIn Analytics/Data Mining Skills
• “Ground” analytics skills most common
• “Cloud” analytics skills grow fastest
• Text Analytics skills less common
• Sentiment Analysis: fastest growing
Boris Evelson (Forrester) also adds a 4th V: Variability (meaning not constant).
Churn: the best algorithms for predicting churn have a lift of 5-7, i.e. 5-7 times better than random. Behavioral advertising: 2-3% CTR, 10 times better than random.
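One common way to read a lift figure like the one above is as the churn rate among the customers a model scores highest, divided by the churn rate in the whole population. The sketch below assumes that definition. The function name `lift`, the `top_fraction` parameter, and the data are all hypothetical and synthetic, chosen only to make the arithmetic visible.

```python
def lift(y_true, scores, top_fraction=0.1):
    """Lift of the top-scored fraction over the overall positive rate.

    y_true: 0/1 labels (1 = churned); scores: model scores, higher = more likely.
    """
    n = len(y_true)
    if n == 0 or n != len(scores):
        raise ValueError("y_true and scores must be non-empty and equal length")
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    total_positives = sum(y_true)
    if total_positives == 0:
        raise ValueError("lift is undefined when there are no positives")

    k = max(1, round(n * top_fraction))  # size of the targeted segment
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    top_positives = sum(label for _, label in ranked[:k])

    # (top_positives / k) / (total_positives / n), written to avoid rounding noise
    return (top_positives * n) / (k * total_positives)


# Synthetic example: 100 customers, 10 churners overall (10% base rate).
# The hypothetical model ranks 6 of those churners inside its top 10 scores.
y_true = [1] * 6 + [0] * 4 + [1] * 4 + [0] * 86
scores = [100 - i for i in range(100)]  # already in descending order

print(lift(y_true, scores, top_fraction=0.1))  # prints 6.0
```

Here targeting the top 10% finds churners at a 60% rate against a 10% base rate, so the lift is 6, inside the 5-7 range quoted in the note.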