In this talk, delivered to the Cincinnati Business Intelligence Group on June 30, 2015, I discuss how businesses can benefit from the accelerating commoditization of advanced technology. From Hadoop, Spark and Splunk to IBM's Watson, smart machines are here to stay. Recent advances in machine learning, predictive analytics and Big Data will soon be available to mid-sized and smaller companies. Learn what these techniques are, how to build the cultural support necessary for change, and where to start in your own organization. Organizations that time their adoption of these technologies well will gain an edge on the competition. Through numerous real-life examples, I show that the future is already here.
Links to case studies in the presentation are below.
10 Use Cases for Hadoop:
http://www.intelcloudbuilders.com/docs/cloudera_WP_10_Common_Hadoopable_Problems.pdf
Market segmentation using clustering: http://www.greenbook.org/Content/TRC/4ExMarketSeg.pdf
Product recommendations at banks:
http://readwrite.com/2008/07/16/strands_brings_recommendation
http://pivotal.io/big-data/case-study/facilitating-data-analysis-to-better-understand-and-serve-customers-zions-bancorporation
1. Main Street, Meet Mr. Watson
The Accelerating Commoditization of Smart Machines and How Businesses Can React
Matt Coatney
Director, WilmerHale LLP
Founder, Five Spot Research Ltd
matt@fivespotresearch.com
/in/mattcoatney
@mattdcoatney
2. A call to action: smart machines are here to stay
Key technologies and when to use them
How to prepare your organization
Key takeaways
5. Technology introduction AND adoption are accelerating
Source: Wall St Journal/Asymco
[Chart, timeline 1900 to 2020: Stove, Telephone, Electricity, Car, Radio, Washer, Refrigerator, TV, Dryer, AC, Dishwasher, Color TV, Microwave, VCR, Game Console, PC, Cellphone, Internet, Smartphone, HDTV, Tablet]
6. Advanced technologies ARE within reach
1-2 years 3-5 years
Grouping similar items: clustering and classification
Predicting behavior and making recommendations: classification, regression and association mining
Handling very large data: cluster computing, NoSQL, etc.
Automating decisions and improving interaction: deep learning/AI
7. Grouping similar items
What it is:
Clustering: groups similar objects together (no pre-assigned categories)
Classification: groups similar objects together based on category samples
Common uses:
Content categorization and tagging
Customer and market segmentation
Medicine and insurance
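The difference between the two techniques can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's implementation: the customer data, the two features (annual spend and store visits), the label names, and the naive "first k points" initialization are all assumptions made for the example. Clustering (a basic k-means) discovers groups with no labels; classification (nearest centroid here) assigns a new item to a category learned from labeled samples.

```python
import math

def centroid(points):
    """Mean of a list of equal-length numeric tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def nearest(point, centers):
    """Index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))

def kmeans(points, k, iterations=10):
    """Clustering: no labels are given; groups are discovered from the data."""
    centers = list(points[:k])  # naive initialization (assumption): first k points
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # keep the previous center if a group ends up empty
        centers = [centroid(g) if g else c for g, c in zip(groups, centers)]
    return [nearest(p, centers) for p in points]

def classify(point, labeled_samples):
    """Classification: pick the label whose sample centroid is closest."""
    labels = sorted(labeled_samples)
    centers = [centroid(labeled_samples[label]) for label in labels]
    return labels[nearest(point, centers)]

# Hypothetical customers: (annual spend in $k, store visits per month).
customers = [(1.0, 1.0), (9.0, 8.0), (1.5, 2.0), (8.0, 9.0), (1.0, 1.5), (9.5, 9.0)]
cluster_ids = kmeans(customers, k=2)  # one group id per customer

# Hypothetical category samples for the classifier.
samples = {
    "occasional": [(1.0, 1.0), (1.5, 2.0)],
    "frequent": [(9.0, 8.0), (8.0, 9.0)],
}
label = classify((8.5, 8.5), samples)
```

In practice a library implementation with better initialization and feature scaling would be the usual choice; the point here is only that clustering needs no category samples while classification does.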
8. Grouping similar items: classification example
Full-text concepts
Document title
Document metadata (date, author, practice, etc.)
Decision tree
Association rules
Brief
Contract
Memorandum
9. Grouping similar items: clustering example
http://www.greenbook.org/Content/TRC/4ExMarketSeg.pdf
Non-Traditionals (internet)
Direct Buyers (mail/phone)
Budget Conscious
Agent Loyals (personal touch)
Hassle-Free (passive)
Survey data
Geodemographic data
Credit information
Self-Organizing Map (Neural Network)
10. Predicting behavior and making recommendations
What it is:
Classification: predict new category outcomes based on similar objects
Regression: predict new numeric values based on similar objects/past performance
Association: predict commonly co-occurring objects, e.g. items bought together
Common uses:
E-commerce – more like this, frequently bought together, people who viewed this also viewed, etc.
Commerce – stocking, supply chain/distribution
Fraud detection and cyber security
11. Predicting behavior and making recommendations: regression example
Doctors’ visits
Procedures
Prescriptions
Hospital stays
Clustering and Regression Models (Multiple Approaches – Ensemble/Panel of Experts)
Likelihood of hospitalization in the next year
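To make "predict new numeric values based on past performance" concrete, here is a minimal sketch of the simplest regression model: a one-variable least-squares line. It is not the multi-model ensemble described on the slide. The data (doctors' visits versus a made-up risk score) and the function names are purely hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept):
    """Apply the fitted line to a new observation."""
    return slope * x + intercept

# Hypothetical history: doctors' visits last year -> made-up risk score.
visits = [1, 2, 3, 4]
risk_score = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(visits, risk_score)
forecast = predict(5, slope, intercept)  # score for a patient with 5 visits
```

A real model would use many inputs (procedures, prescriptions, hospital stays) and, for a yes/no outcome such as hospitalization, a method that outputs a probability.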
12. Predicting behavior and making recommendations: association example
http://readwrite.com/2008/07/16/strands_brings_recommendation
http://pivotal.io/big-data/case-study/facilitating-data-analysis-to-better-understand-and-serve-customers-zions-bancorporation
C1 checking biz card merchant payroll
C2 biz card merchant
C3 merchant payroll
C4 merchant biz card checking
….
Association Rule Algorithms
Business card => Merchant Acct
Checking => Business card
…
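The two rules above can be checked against the four customers listed on this slide. The sketch below computes support and confidence by brute force; it is not a production algorithm such as Apriori, and the 0.5 support and 0.9 confidence thresholds are assumptions chosen for illustration.

```python
from itertools import permutations

# Product holdings for customers C1-C4, as listed on the slide.
transactions = {
    "C1": {"checking", "biz card", "merchant", "payroll"},
    "C2": {"biz card", "merchant"},
    "C3": {"merchant", "payroll"},
    "C4": {"merchant", "biz card", "checking"},
}

def support(items):
    """Fraction of customers holding every product in items."""
    held = sum(1 for products in transactions.values() if items <= products)
    return held / len(transactions)

def confidence(antecedent, consequent):
    """Of customers holding antecedent, the fraction also holding consequent."""
    return support({antecedent, consequent}) / support({antecedent})

products = sorted(set().union(*transactions.values()))

# Keep rules "a => b" that clear both thresholds (threshold values are assumptions).
rules = {
    (a, b): confidence(a, b)
    for a, b in permutations(products, 2)
    if support({a, b}) >= 0.5 and confidence(a, b) >= 0.9
}
```

On this tiny sample, both "biz card => merchant" and "checking => biz card" hold with confidence 1.0. Because every customer here has a merchant account, any rule pointing to "merchant" passes trivially, which is why real systems also look at measures such as lift.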
13. Handling very large data
What it is:
Cluster computing: large arrays of commodity hardware
NoSQL: efficient storage and retrieval for very large, semi-structured data
Scalable machine learning: clustering, classification, etc. optimized for large data
Data visualization: techniques and tools for meaningful display of massive data sets
Common uses:
High-volume transactions (e.g. customer interactions, web logs)
Social media interactions
“Internet of Things” (IoT) sensor data
Scientific computing
15. Handling very large data: Hadoop example
http://www.intelcloudbuilders.com/docs/cloudera_WP_10_Common_Hadoopable_Problems.pdf
Customer churn at a telecom company
Customer information
Call log data
Social media data
Location/cell coverage
Handset replacement and current market options
Hadoop cluster and related analysis components (e.g. Mahout)
Likelihood a customer would leave the carrier (e.g. friends leaving, coverage issues)
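Hadoop's MapReduce engine spreads work across a cluster using a map, shuffle, reduce pattern. The sketch below runs that pattern in a single Python process to show the idea only; it does not use Hadoop or Mahout, and the call-log records and function names are hypothetical. It counts dropped calls per customer, the kind of feature a churn model might consume.

```python
from collections import defaultdict

# Hypothetical call-log records: (customer_id, call_was_dropped).
call_log = [
    ("A", True), ("B", False), ("A", True),
    ("C", True), ("B", False), ("A", False),
]

def map_phase(record):
    """Map: emit a (key, value) pair for each input record."""
    customer, dropped = record
    yield (customer, 1 if dropped else 0)

def shuffle(pairs):
    """Shuffle: group all values by key (the framework does this on a cluster)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)

mapped = [pair for record in call_log for pair in map_phase(record)]
dropped_calls = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
```

Because each map call looks at one record and each reduce call looks at one key, both phases can be split across many machines, which is what makes the approach practical for very large logs.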
16. Handling very large data: more examples
http://www.intelcloudbuilders.com/docs/cloudera_WP_10_Common_Hadoopable_Problems.pdf
Risk Modeling
“… A very clear picture of a customer’s financial situation, his risk of default or late payment and his satisfaction with the bank and its service.”
Ad Targeting
“The model uses large amounts of historical data on user behavior to cluster ads and users, and to deduce preferences.”
17. Handling very large data: more examples
http://www.intelcloudbuilders.com/docs/cloudera_WP_10_Common_Hadoopable_Problems.pdf
Retail Promotion Campaigns
“The retailer loaded 20 years of sales transactions history into a Hadoop cluster. It built analytic applications on the SQL system for Hadoop, called Hive, to perform the same analyses that it had done in its data warehouse system—but over much larger quantities of data, and at much lower cost.”
Power Failure Prediction
“Hadoop was able to store the data from the sensors inexpensively, so that the power company could afford to keep long-term historical data around for forensic analysis. As a result, the power company can see, and react to, long-term trends and emerging problems in the grid.”
18. Automating decisions and improving interaction
What it is:
More processing + better algorithms + much more data
Deep learning: incrementally trained, stacked neural networks; allows more complex, nuanced patterns to be learned
Ensemble/panel of experts: improved performance combining multiple approaches
Common uses:
Cognitive tasks: answering questions, speech/image/audio recognition, game play
Improved human-computer interaction
Healthcare (e.g. Watson cancer diagnosis and treatment)
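The "panel of experts" idea can be sketched as a simple majority vote. This is a toy illustration, not how Watson or any specific product combines models: the three rule-based "experts", their thresholds, and the patient record are all hypothetical stand-ins for independently trained models.

```python
from collections import Counter

# Three hypothetical "experts"; each returns True if it flags the patient as high risk.
def expert_visits(patient):
    return patient["doctor_visits"] >= 6

def expert_prescriptions(patient):
    return patient["prescriptions"] >= 5

def expert_history(patient):
    return patient["hospital_stays"] >= 1

EXPERTS = [expert_visits, expert_prescriptions, expert_history]

def panel_vote(patient):
    """Ensemble: return the answer given by the majority of experts."""
    votes = [expert(patient) for expert in EXPERTS]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical patient record.
patient = {"doctor_visits": 8, "prescriptions": 2, "hospital_stays": 1}
high_risk = panel_vote(patient)  # two of the three experts flag this patient
```

An odd number of voters avoids ties for a yes/no answer. Real ensembles typically weight each model by how well it has performed rather than counting votes equally.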