Batter Up! Advanced Sports Analytics with R and Storm 
Meeting the Real-Time Analytics Opportunity Head-On 
Bill Jacobs 
VP Product Marketing 
Revolution Analytics 
@bill_jacobs 
December 11, 2014 
Allen Day 
Principal Data Scientist 
MapR Technologies 
@allenday 
Vineet Sharma 
Dir., Partner Marketing 
MapR Technologies
Who Am I? 
Bill Jacobs, VP Product Marketing 
Revolution Analytics 
@bill_jacobs
Polling Question #1: 
Who Are You? (choose one) 
–Statistician or modeler 
–Data Scientist 
–Hadoop Expert 
–Application builder 
–Data guru 
–Business user 
–Baseball fan
Sports Analytics as Analogy. 
Sports Teams Are Like Other Corporations. 
–Great Value Achievable With Data 
–Vast Range of Data Sources 
–Timely Analysis Amplifies Value 
And apologies if you came to learn whom to bet upon in next year’s season.
Game Changing Big Data Analytics Applications 
Marketing: Clickstream & Campaign Analyses 
Digital Media: Recommendation Engines 
Social Media: Sentiment Analysis 
Retail: Purchase Prediction 
Insurance: Fraud Waste and Abuse 
Healthcare Delivery: Treatment Outcome Prediction 
Risk Analysis: Insurance Underwriting 
Manufacturing: Predictive Maintenance 
Operations: Supply Chain Optimization 
Econometrics: Market Prediction 
Marketing: Mix and Price Optimization 
Life Sciences: Pharmacogenetics 
Transportation: Asset Utilization
Polling Question #2: 
What Language or Tools Is In Use for Analytics (check all that apply) 
–R 
–SAS or SPSS 
–Python 
–Java 
–BI tools including: MSTR, Qlik, Tableau, Business Objects, Cognos 
–Salford Systems or MATLAB 
–H2O, RapidMiner, KNIME or similar 
–Other data mining tools 
–Other programming languages 
–None or Don’t know
WELCOME & INTRODUCTIONS 
R Open Source 
-Language, Community, Collaboration 
-Robert Gentleman & Ross Ihaka, 1993 
-Version 1.0 released 2000 
-2.5 Million Global Users 
-Over 4,800 add-on “Packages” 
-Why R? 
–R in Universities = New Talent 
–Emerging Modeling/Visualization 
–Lower Cost Alternative 
–Open Source = Flexible & Innovative 
–Access to Free Packages
R is exploding in popularity & functionality 
R Usage Growth Rexer Data Miner Survey, 2007-2013 
70% of data miners report using R 
R is the first choice of more data miners than any other software 
Source: www.rexeranalytics.com
Innovate with R 
Most widely used data analysis software 
•Used by 2M+ data scientists, statisticians and analysts 
Most powerful statistical programming language 
•Flexible, extensible and comprehensive for productivity 
Create beautiful and unique data visualizations 
•As seen in New York Times, Twitter and Flowing Data 
Thriving open-source community 
•Leading edge of analytics research 
Fills the talent gap 
•New graduates prefer R 
White Paper R is Hot bit.ly/r-is-hot
Polling Question #3: 
How are you using R today? (choose one) 
–Not using R 
–Studying R now 
–Initial R project(s) underway 
–R is widely used for exploration & modeling 
–R is deployed into production
Revolution Analytics In A Nutshell 
Our Vision: 
R is becoming the de facto standard for enterprise predictive analytics 
Our Mission: 
Drive enterprise adoption of R by providing enhanced R products tailored to meet enterprise challenges
Revolution Analytics Builds & Delivers: 
Software Products: 
Stable Distributions 
Broad Platform Support 
Big Data Analytics in R 
Application Integration 
Deployment Platforms 
Agile Development Tooling 
Future Platform Support 
Support & Services 
Commercial Support Programs 
Training Programs 
Professional Services 
Academic Support Programs 
IP Indemnification
Revolution R Advantages for Analytics Professionals: 
Broadly-used, scalable R language 
Large (2M+), collaborative, young R analytics community 
Largest repository of statistical & analytical algorithms 
Big data analytics capabilities 
–Scales from workstations to Hadoop 
–Transparent parallelism 
–Cross platform compatibility 
–Multi-platform architectures 
Broadens career opportunities
Revolution R Advantages for Business Executives 
Viable Alternative to Legacy Analytics Solutions 
–Predictable Time To Results 
–Simplified Licensing 
–All-Inclusive Environment 
Lower Staffing Costs 
Controllable Open Source Risks 
–Support 
–IP Infringement Protections
Revolution R Advantages for IT Organizations 
Consistency Across Platforms Avoids Sprawl 
Support for Workstations, Servers, Hadoop, EDWs and Grids 
Heterogeneous Architecture Capabilities 
Integrates With Major BI & Application Tools 
Streamline Model Deployment 
Run Complex Analytics in the “Data Lake” 
Be a “Good Citizen” in shared systems 
Commercial Support Reduces Project Risks 
Quick Start Programs Accelerate Results 
Platform Continuity Future-Proofs Architectures
Revolution R Enterprise: Predictive Analytics Across Huge Data in Hadoop 
Exploration, Visualization, Predictive Modeling 
[Diagram labels: YARN; HDFS]
Polling Question #4: 
Stage of Hadoop Adoption? (choose one) 
–No Need 
–Studying 
–Setting-Up Hadoop 
–Experimenting with Hadoop 
–Deploying Hadoop Now 
–Hadoop in Production
© 2014 MapR Technologies 18 
Introducing: 
Vineet Sharma 
Director, Partner Marketing 
MapR Technologies
MapR + Revolution Leverages MapR As A Scalable Enterprise R Engine. 
• Plus: 
– Run RRE Analytics In MapR Hadoop Without Change 
– Eliminate Need To Design Parallel Software or “Think In MapReduce” 
– Leverage All Revolution R Enterprise Pre-Parallelized Algorithms 
– Enable Users To Build Custom Apps That Leverage Hadoop’s Parallelism 
– Slash Data Movement by Analyzing Data Inside the MapR Data Platform 
– Expand Deployment and Integration Options 
[Diagram labels: Rapid Adoption of R; MapR Enterprise Data Platform Capabilities; Broad Adoption of Hadoop for Big Data Analytics]
Desktop Users with Analytical Access to Huge Data in Hadoop 
[Diagram labels: Predictive Modeling Algorithms; MapR-FS; Data]
MapR: Best Solution for Customer Success 
Top Ranked 
Exponential Growth 
500+ Customers 
Premier Investors 
>2x annual bookings 
80% of accounts expand 3X 
90% software licenses 
< 1% lifetime churn 
> $1B in incremental revenue generated by 1 customer
The Power of the Open Source Community 
APACHE HADOOP AND OSS ECOSYSTEM 
EXECUTION ENGINES 
–Batch: Pig, Cascading, Spark, MapReduce v1 & v2 
–Streaming: Spark Streaming, Storm* 
–NoSQL & Search: HBase, Solr, Accumulo* 
–ML, Graph: Mahout, MLLib, GraphX 
–SQL: Hive, Impala, Shark, Drill 
–YARN, Tez* 
DATA GOVERNANCE AND OPERATIONS 
–Data Integration & Access: Sqoop, Flume, HttpFS, Hue 
–Security: Sentry*, Knox* 
–Workflow & Data Governance: Oozie, Falcon* 
–Provisioning & coordination: ZooKeeper, Whirr, Juju, Savannah* 
Management 
MapR Data Platform: MapR-FS, MapR-DB 
* Certification/support planned
MapR Distribution for Hadoop 
MapR-FS MapR-DB 
Enterprise-grade 
• High availability 
• Data protection 
• Disaster recovery 
Interoperability 
• Standard file access 
• Standard database access 
• Pluggable services 
• Broad developer support 
Performance 
• 2X to 7X higher performance 
• Consistent, low latency 
Multi-tenancy 
• Ability to logically divide a cluster to support different use cases, job types, user groups, and administrators 
Security 
• Enterprise security authorization 
• Wire-level authentication 
• Data governance 
Operational 
• Ability to support predictive analytics, real-time database operations, and support high arrival rate data
Revolution R Enterprise and MapR Hadoop 
[Diagram labels: R on the Desktop; BI and other Apps; Via Browsers; Mobile; Edge Node; YARN; MapReduce; MapR Data Platform; Management] 
APACHE HADOOP & OSS ECOSYSTEM: ZooKeeper, Oozie, Hue, Pig, Hive, Impala, Shark, Flume, HttpFS, Cascading, Solr, Juju, Mahout, MLLib, Storm, Spark, Spark Streaming, Sqoop, Whirr, HBase, YARN, Drill, Tez, Knox, Sentry, Falcon
Introducing: 
Allen Day 
Principal Data Scientist 
MapR Technologies 
@allenday
Talk Overview 
• Agile Real-time Stats 
• R + Storm 
github.com/allenday/R-Storm 
• DEMO 
• How to do it? 
• Q & A @allenday 
Agile Methods 
Advanced Statistics 
Continuous Real-time Delivery 
github.com/allenday/hadoop-summit-r-storm-demo-public
Architecting R into the Storm 
Application Development Process
Quick intro 
• Allen Day, Principal Data Scientist [ @allenday ] 
7yr Hadoop dev, 12yr R dev/author 
PhD, Human Genetics, UCLA Medicine
What’s Storm? What’s R? 
• What’s Storm? 
– Processes a data stream. Akin to UNIX pipe + tee & merge commands 
– Runs on a cluster. Fault-tolerant and designed to scale out 
– Used for: real-time analytics & machine learning 
• What’s R? 
– Programming language with advanced statistics libraries 
– Does not scale out. Can scale up 
– Used for: prototyping, data modeling, visualization 
How to combine these?
R outside, Storm inside: not practical. Why? 
• Model-building and QA are done on data snapshots 
• However, R => Hadoop is realistic. Key difference: referenced data can be static 
– Use MapR snapshots for dev and QA 
– See also: RHIPE (Purdue) and RHadoop (Revolution Analytics) 
[Diagram labels: R; Storm; User]
Storm outside, R inside: a good fit 
• Enables separation of concerns 
– Independently manage modeling, ops timelines, and version control 
– Integrate as needed 
• Enables role specialization 
– R built-ins allow faster iteration and more concise stats-type code 
– Do DevOps with specific SW engineering tech, e.g. Java 
[Diagram labels: Storm; R; User]
Q: Who really likes statistics? 
A: Baseball fans 
A: Team Managers = Portfolio Managers
Famous Vintage Data 
Oakland Athletics, 2002 Season 
20 consecutive wins – the American League record at the time
Goal: Detect “Moneyball” 2002 Winning Streak
Methods: Change Point Detection 
Find natural breakpoints in a time-series set of data points 
R packages implement this: 
–changepoint: more sensitive, but not streaming 
–bcp: streaming, but less sensitive
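As a rough illustration of what finding a breakpoint means, the Java sketch below locates a single change point by choosing the split that minimizes the summed squared deviation from each segment's mean. This is a deliberately simple stand-in, not the algorithm used by the changepoint or bcp packages, and the names MeanShift and bestSplit are hypothetical.

```java
/** Hypothetical helper for illustration; not part of changepoint, bcp, or the demo code. */
final class MeanShift {
    /**
     * Returns the index k (1 <= k < n) that splits xs into xs[0..k) and xs[k..n)
     * with the smallest total squared deviation from each segment's own mean,
     * or -1 if xs has fewer than two points.
     */
    static int bestSplit(double[] xs) {
        int n = xs.length;
        if (n < 2) return -1;

        double total = 0, totalSq = 0;
        for (double x : xs) { total += x; totalSq += x * x; }

        double leftSum = 0, leftSq = 0;
        double bestCost = Double.POSITIVE_INFINITY;
        int best = -1;
        for (int k = 1; k < n; k++) {
            double x = xs[k - 1];
            leftSum += x;
            leftSq += x * x;
            double rightSum = total - leftSum;
            double rightSq = totalSq - leftSq;
            // Sum of squared deviations of a segment = sum(x^2) - (sum x)^2 / count
            double cost = (leftSq - leftSum * leftSum / k)
                        + (rightSq - rightSum * rightSum / (n - k));
            if (cost < bestCost) { bestCost = cost; best = k; }
        }
        return best;
    }
}
```

A single pass like this always reports some split, even on data with no real shift, so a practical detector would also need a threshold or a statistical test before declaring a change point.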
Methods: R+Storm Demo Architecture 
–Oakland A’s Data (accelerated) 
–Storm Bolt: R online change point detector 
–Storm Bolt (write to Jetty) 
–Jetty Webserver 
–Browser (D3.js) 
–Us 
–GIFs to MapR Filesystem 
github.com/allenday/hadoop-summit-r-storm-demo-public
Demo 
–Raw data (score difference between A’s and opponent) 
–50-game sliding window/buffer to detect change points 
–Cumulative history with detected break points
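The sliding window/buffer idea can be sketched in Java as a fixed-capacity queue that drops the oldest game whenever a new one arrives. This is only an assumed shape for such a buffer; the class name GameWindow and its methods are hypothetical and are not taken from the demo repository.

```java
import java.util.ArrayDeque;

/** Hypothetical fixed-size buffer of per-game score differences. */
final class GameWindow {
    private final int capacity;
    private final ArrayDeque<Double> diffs = new ArrayDeque<>();

    GameWindow(int capacity) {
        if (capacity <= 0) throw new IllegalArgumentException("capacity must be positive");
        this.capacity = capacity;
    }

    /** Adds one game's score difference, evicting the oldest game once the window is full. */
    void add(double scoreDiff) {
        if (diffs.size() == capacity) diffs.removeFirst();
        diffs.addLast(scoreDiff);
    }

    /** True once the window holds `capacity` games. */
    boolean isFull() {
        return diffs.size() == capacity;
    }

    /** Copies the window, oldest game first, for handing to a change point detector. */
    double[] snapshot() {
        double[] out = new double[diffs.size()];
        int i = 0;
        for (double d : diffs) out[i++] = d;
        return out;
    }
}
```

With a capacity of 50, each incoming game would trigger one snapshot() and one detector run over at most 50 points, which keeps the per-tuple cost bounded regardless of how long the stream runs.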
Methods Details: How it’s done 
• Uses R-Storm binding github.com/allenday/R-Storm 
– Storm package on CRAN cran.r-project.org/web/packages/Storm 
[Diagram labels: Storm (dev team); R (stats team); Storm (dev team, pure Java); Producer; Consumer]
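Methods Details: Easy integration 
R: lambda function 
library(Storm)
storm = Storm$new()
# lambda is called for each incoming tuple
storm$lambda = function(s) {
  t = s$tuple
  t$output = vector(mode = "character", length = 1)
  t$output[1] = "tada!"
  s$emit(t)
}
storm$run()
Storm: extend ShellBolt 
public static class MyRBolt extends ShellBolt implements IRichBolt {
  public MyRBolt() {
    // run my.R under Rscript as the bolt's subprocess
    super("Rscript", "my.R");
  }
  // IRichBolt also requires declareOutputFields() and getComponentConfiguration(); not shown on the slide
}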
Results 
• Change points are identified, but none for winning streak 
– Not using score difference, anyway 
• Time to integrate with the modeling team! 
– Send @kunpognr or @allenday a pull request on GitHub 
• Applicable to many other use cases, e.g. 
– Security (fraud detection, intrusion detection) 
– Marketing (intent to purchase / social media streams) 
– Customer Support (help desk voice calls) 
Discussion
Polling Question #5 
How important will Real-Time analytical apps become? (choose one) 
–Uncertain 
–Not important 
–Necessary 
–Critical
Real-Time and Internet of Things: Foundation of a Compelling Trend for 2015 
Big Data Analytics Meets The Internet of Things 
–Transactions + 
–Human Behavior + 
–Internet of Things: Sensors 
… and extracting value using 
–Traditional Statistics 
–Visualization 
–Machine Learning 
… plus adaptability 
–Real-Time 
–Agile Modeling & Fast Model Execution 
–Production Capable, Stable and Secure 
–Rapidly-Evolving Data Science
Where Does Real Time Impact The Analytical Lifecycle? 
Data Engineering 
–Collection and Ingest 
–“Blending” 
Modeling 
–Aggregation, Segmentation & Exploration 
–Model Development & Optimization 
–Testing & Validation 
Operationalization 
–Deployment & Scoring 
–Delivery 
–Monitoring & Evaluation
Typical Analytical Lifecycle 
Ingest 
Explore 
Model 
Deploy 
Score 
Act 
Measure 
Model 
Score
More Complex Event Driven Analytical Cycle 
Historic Ingest 
Explore 
Model 
Deploy 
Act 
Measure 
Data Analytics & Process Design 
Scoring 
Event Ingest 
Transform 
“Blend” 
Append 
Improve 
Enrich
Real-Time Analytics Best Practices 
Develop a Common Lexicon for Real-Time 
Discriminate Between Needs of Each Stage in Lifecycle 
–Data Ingest & Manipulation and Enrichment 
–Data Source / Repository Integration Needs 
–Processes that “Fill the Lake” 
–Process that “Act on the Stream” 
–Vastly different computationally 
–Big differences in data ingest volume & latency 
Start with Tractable Goals 
–Anticipate Growing Requirements: Microbatched >> Interactive >> Autonomy 
Build for today, Architect for tomorrow
Real-Time Realities 
Plan for Diverse Needs 
–Real-Time Score Retrieval, Scoring, Modeling 
–Wide-Ranging Performance – Microbatch – Interactive - Autonomous 
Fragmentation 
–Data Delivery Systems Pre-Exist 
–Will Vary Widely by Vertical Market 
–Competing Proprietary Solutions 
Growing Demand 
–Numerous high-value targets 
–“The next step”: Put big data analytics to work
What’s Needed 
Real-Time Performance… plus… 
Agility 
–Deployment models 
–Organization 
–Infrastructure 
–Analytics 
Manageable Costs 
–Hadoop 
–Open Source R 
Production Platform(s) 
–Proven 
–Performant
Next Steps… 
www.revolutionanalytics.com 
Whitepaper: Revolution R for Hadoop: 
–http://www.revolutionanalytics.com/whitepaper/delivering-value-big-data-revolution-r-enterprise-and-hadoop 
–…or http://bit.ly/1ua43bu 
www.maprtech.com 
Resources: 
R foundation URL: www.r-project.org 
Download Revolution R: http://mran.revolutionanalytics.com/download/ 
Learn about Apache Storm: https://storm.apache.org/ 
R-Storm bindings: github.com/allenday/R-Storm 
Storm package on CRAN: cran.r-project.org/web/packages/Storm
Thank you. 
www.revolutionanalytics.com 
1.855.GET.REVO 
Twitter: @RevolutionR

More Related Content

What's hot

Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed
Is Revolution R Enterprise Faster than SAS? Benchmarking Results RevealedIs Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed
Is Revolution R Enterprise Faster than SAS? Benchmarking Results RevealedRevolution Analytics
 
Introduction to Microsoft R Services
Introduction to Microsoft R ServicesIntroduction to Microsoft R Services
Introduction to Microsoft R ServicesGregg Barrett
 
Predicting Loan Delinquency at One Million Transactions per Second
Predicting Loan Delinquency at One Million Transactions per SecondPredicting Loan Delinquency at One Million Transactions per Second
Predicting Loan Delinquency at One Million Transactions per SecondRevolution Analytics
 
The Business Economics and Opportunity of Open Source Data Science
The Business Economics and Opportunity of Open Source Data ScienceThe Business Economics and Opportunity of Open Source Data Science
The Business Economics and Opportunity of Open Source Data ScienceRevolution Analytics
 
How the growth of R helps data-driven organizations succeed
How the growth of R helps data-driven organizations succeedHow the growth of R helps data-driven organizations succeed
How the growth of R helps data-driven organizations succeedRevolution Analytics
 
Applications in R - Success and Lessons Learned from the Marketplace
Applications in R - Success and Lessons Learned from the MarketplaceApplications in R - Success and Lessons Learned from the Marketplace
Applications in R - Success and Lessons Learned from the MarketplaceRevolution Analytics
 
Revolution Analytics: a 5-minute history
Revolution Analytics: a 5-minute historyRevolution Analytics: a 5-minute history
Revolution Analytics: a 5-minute historyRevolution Analytics
 
High Performance Predictive Analytics in R and Hadoop
High Performance Predictive Analytics in R and HadoopHigh Performance Predictive Analytics in R and Hadoop
High Performance Predictive Analytics in R and HadoopRevolution Analytics
 
Big Data - Analytics with R
Big Data - Analytics with RBig Data - Analytics with R
Big Data - Analytics with RTechsparks
 
What's New in Revolution R Enterprise 6.2
What's New in Revolution R Enterprise 6.2What's New in Revolution R Enterprise 6.2
What's New in Revolution R Enterprise 6.2Revolution Analytics
 
Revolution R Enterprise - Portland R User Group, November 2013
Revolution R Enterprise - Portland R User Group, November 2013Revolution R Enterprise - Portland R User Group, November 2013
Revolution R Enterprise - Portland R User Group, November 2013Revolution Analytics
 
The network structure of cran 2015 07-02 final
The network structure of cran 2015 07-02 finalThe network structure of cran 2015 07-02 final
The network structure of cran 2015 07-02 finalRevolution Analytics
 
The Power of Unified Analytics with Ali Ghodsi
The Power of Unified Analytics with Ali Ghodsi The Power of Unified Analytics with Ali Ghodsi
The Power of Unified Analytics with Ali Ghodsi Databricks
 
Basics of Digital Design and Verilog
Basics of Digital Design and VerilogBasics of Digital Design and Verilog
Basics of Digital Design and VerilogGanesan Narayanasamy
 
Big Data Analytics with R
Big Data Analytics with RBig Data Analytics with R
Big Data Analytics with RGreat Wide Open
 

What's hot (20)

Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed
Is Revolution R Enterprise Faster than SAS? Benchmarking Results RevealedIs Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed
Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed
 
Introduction to Microsoft R Services
Introduction to Microsoft R ServicesIntroduction to Microsoft R Services
Introduction to Microsoft R Services
 
Predicting Loan Delinquency at One Million Transactions per Second
Predicting Loan Delinquency at One Million Transactions per SecondPredicting Loan Delinquency at One Million Transactions per Second
Predicting Loan Delinquency at One Million Transactions per Second
 
The Business Economics and Opportunity of Open Source Data Science
The Business Economics and Opportunity of Open Source Data ScienceThe Business Economics and Opportunity of Open Source Data Science
The Business Economics and Opportunity of Open Source Data Science
 
How the growth of R helps data-driven organizations succeed
How the growth of R helps data-driven organizations succeedHow the growth of R helps data-driven organizations succeed
How the growth of R helps data-driven organizations succeed
 
Data Science At Zillow
Data Science At ZillowData Science At Zillow
Data Science At Zillow
 
Applications in R - Success and Lessons Learned from the Marketplace
Applications in R - Success and Lessons Learned from the MarketplaceApplications in R - Success and Lessons Learned from the Marketplace
Applications in R - Success and Lessons Learned from the Marketplace
 
R and Data Science
R and Data ScienceR and Data Science
R and Data Science
 
The R Ecosystem
The R EcosystemThe R Ecosystem
The R Ecosystem
 
Revolution Analytics: a 5-minute history
Revolution Analytics: a 5-minute historyRevolution Analytics: a 5-minute history
Revolution Analytics: a 5-minute history
 
High Performance Predictive Analytics in R and Hadoop
High Performance Predictive Analytics in R and HadoopHigh Performance Predictive Analytics in R and Hadoop
High Performance Predictive Analytics in R and Hadoop
 
Applications of R (DataWeek 2014)
Applications of R (DataWeek 2014)Applications of R (DataWeek 2014)
Applications of R (DataWeek 2014)
 
Big Data - Analytics with R
Big Data - Analytics with RBig Data - Analytics with R
Big Data - Analytics with R
 
What's New in Revolution R Enterprise 6.2
What's New in Revolution R Enterprise 6.2What's New in Revolution R Enterprise 6.2
What's New in Revolution R Enterprise 6.2
 
R Then and Now
R Then and NowR Then and Now
R Then and Now
 
Revolution R Enterprise - Portland R User Group, November 2013
Revolution R Enterprise - Portland R User Group, November 2013Revolution R Enterprise - Portland R User Group, November 2013
Revolution R Enterprise - Portland R User Group, November 2013
 
The network structure of cran 2015 07-02 final
The network structure of cran 2015 07-02 finalThe network structure of cran 2015 07-02 final
The network structure of cran 2015 07-02 final
 
The Power of Unified Analytics with Ali Ghodsi
The Power of Unified Analytics with Ali Ghodsi The Power of Unified Analytics with Ali Ghodsi
The Power of Unified Analytics with Ali Ghodsi
 
Basics of Digital Design and Verilog
Basics of Digital Design and VerilogBasics of Digital Design and Verilog
Basics of Digital Design and Verilog
 
Big Data Analytics with R
Big Data Analytics with RBig Data Analytics with R
Big Data Analytics with R
 

Viewers also liked

Sloan Sports and Analytics Conference Day 1 Recap
Sloan Sports and Analytics Conference Day 1 RecapSloan Sports and Analytics Conference Day 1 Recap
Sloan Sports and Analytics Conference Day 1 RecapNeil Horowitz
 
Reproducibility with Revolution R Open
Reproducibility with Revolution R OpenReproducibility with Revolution R Open
Reproducibility with Revolution R OpenRevolution Analytics
 
The Network structure of R packages on CRAN & BioConductor
The Network structure of R packages on CRAN & BioConductorThe Network structure of R packages on CRAN & BioConductor
The Network structure of R packages on CRAN & BioConductorRevolution Analytics
 
MYagonism basketball analytics innovation -public-
MYagonism basketball analytics innovation -public-MYagonism basketball analytics innovation -public-
MYagonism basketball analytics innovation -public-Paolo Raineri
 
2016 Sloan Sports Analytics Conference (Sports Business angle)
2016 Sloan Sports Analytics Conference (Sports Business angle)2016 Sloan Sports Analytics Conference (Sports Business angle)
2016 Sloan Sports Analytics Conference (Sports Business angle)Neil Horowitz
 
Reproducibility with Checkpoint & RRO - NYC R Conference
Reproducibility with Checkpoint & RRO - NYC R ConferenceReproducibility with Checkpoint & RRO - NYC R Conference
Reproducibility with Checkpoint & RRO - NYC R ConferenceRevolution Analytics
 
Simple Reproducibility with the checkpoint package
Simple Reproducibilitywith the checkpoint packageSimple Reproducibilitywith the checkpoint package
Simple Reproducibility with the checkpoint packageRevolution Analytics
 
Reproducibility with Revolution R Open and the Checkpoint Package
Reproducibility with Revolution R Open and the Checkpoint PackageReproducibility with Revolution R Open and the Checkpoint Package
Reproducibility with Revolution R Open and the Checkpoint PackageRevolution Analytics
 
In-Database Analytics Deep Dive with Teradata and Revolution
In-Database Analytics Deep Dive with Teradata and RevolutionIn-Database Analytics Deep Dive with Teradata and Revolution
In-Database Analytics Deep Dive with Teradata and RevolutionRevolution Analytics
 
Warranty Predictive Analytics solution
Warranty Predictive Analytics solutionWarranty Predictive Analytics solution
Warranty Predictive Analytics solutionRevolution Analytics
 
Performance and Scale Options for R with Hadoop: A comparison of potential ar...
Performance and Scale Options for R with Hadoop: A comparison of potential ar...Performance and Scale Options for R with Hadoop: A comparison of potential ar...
Performance and Scale Options for R with Hadoop: A comparison of potential ar...Revolution Analytics
 
R + Storm Moneyball - Realtime Advanced Statistics - Hadoop Summit - San Jose
R + Storm Moneyball - Realtime Advanced Statistics - Hadoop Summit - San JoseR + Storm Moneyball - Realtime Advanced Statistics - Hadoop Summit - San Jose
R + Storm Moneyball - Realtime Advanced Statistics - Hadoop Summit - San JoseAllen Day, PhD
 
The Value of Open Source Communities
The Value of Open Source CommunitiesThe Value of Open Source Communities
The Value of Open Source CommunitiesRevolution Analytics
 
Data Stream Algorithms in Storm and R
Data Stream Algorithms in Storm and RData Stream Algorithms in Storm and R
Data Stream Algorithms in Storm and RRadek Maciaszek
 
Building a scalable data science platform with R
Building a scalable data science platform with RBuilding a scalable data science platform with R
Building a scalable data science platform with RRevolution Analytics
 
Predicting Football Using R
Predicting Football Using RPredicting Football Using R
Predicting Football Using RMartin Eastwood
 
2015 Q1 Sports Fan Engagement Conference Day 1 Recap
2015 Q1 Sports Fan Engagement Conference Day 1 Recap2015 Q1 Sports Fan Engagement Conference Day 1 Recap
2015 Q1 Sports Fan Engagement Conference Day 1 RecapNeil Horowitz
 
Clear understanding of conventions and forms
Clear understanding  of conventions and forms Clear understanding  of conventions and forms
Clear understanding of conventions and forms bethjones0312
 

Viewers also liked (20)

Sloan Sports and Analytics Conference Day 1 Recap
Sloan Sports and Analytics Conference Day 1 RecapSloan Sports and Analytics Conference Day 1 Recap
Sloan Sports and Analytics Conference Day 1 Recap
 
Reproducibility with Revolution R Open
Reproducibility with Revolution R OpenReproducibility with Revolution R Open
Reproducibility with Revolution R Open
 
The Network structure of R packages on CRAN & BioConductor
The Network structure of R packages on CRAN & BioConductorThe Network structure of R packages on CRAN & BioConductor
The Network structure of R packages on CRAN & BioConductor
 
MYagonism basketball analytics innovation -public-
MYagonism basketball analytics innovation -public-MYagonism basketball analytics innovation -public-
MYagonism basketball analytics innovation -public-
 
Sports Analytics 2015 Brochure
Sports Analytics 2015 BrochureSports Analytics 2015 Brochure
Sports Analytics 2015 Brochure
 
2016 Sloan Sports Analytics Conference (Sports Business angle)
2016 Sloan Sports Analytics Conference (Sports Business angle)2016 Sloan Sports Analytics Conference (Sports Business angle)
2016 Sloan Sports Analytics Conference (Sports Business angle)
 
Analytics - Sports Style, ESPN
Analytics - Sports Style, ESPNAnalytics - Sports Style, ESPN
Analytics - Sports Style, ESPN
 
Reproducibility with Checkpoint & RRO - NYC R Conference
Reproducibility with Checkpoint & RRO - NYC R ConferenceReproducibility with Checkpoint & RRO - NYC R Conference
Reproducibility with Checkpoint & RRO - NYC R Conference
 
Simple Reproducibility with the checkpoint package
Simple Reproducibilitywith the checkpoint packageSimple Reproducibilitywith the checkpoint package
Simple Reproducibility with the checkpoint package
 
Reproducibility with Revolution R Open and the Checkpoint Package
Reproducibility with Revolution R Open and the Checkpoint PackageReproducibility with Revolution R Open and the Checkpoint Package
Reproducibility with Revolution R Open and the Checkpoint Package
 
In-Database Analytics Deep Dive with Teradata and Revolution
In-Database Analytics Deep Dive with Teradata and RevolutionIn-Database Analytics Deep Dive with Teradata and Revolution
In-Database Analytics Deep Dive with Teradata and Revolution
 
Warranty Predictive Analytics solution
Warranty Predictive Analytics solutionWarranty Predictive Analytics solution
Warranty Predictive Analytics solution
 
Performance and Scale Options for R with Hadoop: A comparison of potential ar...
Performance and Scale Options for R with Hadoop: A comparison of potential ar...Performance and Scale Options for R with Hadoop: A comparison of potential ar...
Performance and Scale Options for R with Hadoop: A comparison of potential ar...
 
R + Storm Moneyball - Realtime Advanced Statistics - Hadoop Summit - San Jose
R + Storm Moneyball - Realtime Advanced Statistics - Hadoop Summit - San JoseR + Storm Moneyball - Realtime Advanced Statistics - Hadoop Summit - San Jose
R + Storm Moneyball - Realtime Advanced Statistics - Hadoop Summit - San Jose
 
The Value of Open Source Communities
The Value of Open Source CommunitiesThe Value of Open Source Communities
The Value of Open Source Communities
 
Data Stream Algorithms in Storm and R
Data Stream Algorithms in Storm and RData Stream Algorithms in Storm and R
Data Stream Algorithms in Storm and R
 
Building a scalable data science platform with R
Building a scalable data science platform with RBuilding a scalable data science platform with R
Building a scalable data science platform with R
 
Predicting Football Using R
Predicting Football Using RPredicting Football Using R
Predicting Football Using R
 
2015 Q1 Sports Fan Engagement Conference Day 1 Recap
2015 Q1 Sports Fan Engagement Conference Day 1 Recap2015 Q1 Sports Fan Engagement Conference Day 1 Recap
2015 Q1 Sports Fan Engagement Conference Day 1 Recap
 
Clear understanding of conventions and forms
Clear understanding  of conventions and forms Clear understanding  of conventions and forms
Clear understanding of conventions and forms
 

Batter Up! Advanced Sports Analytics with R and Storm

  • 1. Batter Up! Advanced Sports Analytics with R and Storm Meeting the Real-Time Analytics Opportunity Head-On Bill Jacobs VP Product Marketing Revolution Analytics @bill_jacobs December 11, 2014 Allen Day Principal Data Scientist MapR Technologies @allenday Vineet Sharma Dir., Partner Marketing MapR Technologies
  • 2. Who Am I? Bill Jacobs, VP Product Marketing Revolution Analytics @bill_jacobs
  • 3. Polling Question #1: Who Are You? (choose one) –Statistician or modeler –Data Scientist –Hadoop Expert –Application builder –Data guru –Business user –Baseball fan
  • 4. Sports Analytics as Analogy. Sports Teams Are Like Other Corporations. –Great Value Achievable With Data –Vast Range of Data Sources –Timely Analysis Amplifies Value And apologies if you came to learn whom to bet upon in next year’s season.
  • 5. Game Changing Big Data Analytics Applications Marketing: Clickstream & Campaign Analyses Digital Media: Recommendation Engines Social Media: Sentiment Analysis Retail: Purchase Prediction Insurance: Fraud Waste and Abuse Healthcare Delivery: Treatment Outcome Prediction Risk Analysis: Insurance Underwriting Manufacturing: Predictive Maintenance Operations: Supply Chain Optimization Econometrics: Market Prediction Marketing: Mix and Price Optimization Life Sciences: Pharmacogenetics Transportation: Asset Utilization
  • 6. Polling Question #2: What Languages or Tools Are In Use for Analytics (check all that apply) –R –SAS or SPSS –Python –Java –BI tools including: MSTR, Qlik, Tableau, Business Objects, Cognos –Salford Systems or MATLAB –H2O, RapidMiner, KNIME or similar –Other data mining tools –Other programming languages –None or Don’t know
  • 7. WELCOME & INTRODUCTIONS R Open Source -Language, Community, Collaboration -Robert Gentleman & Ross Ihaka, 1993 -Version 1.0 released 2000 -2.5 Million Global Users -Over 4,800 add-on “Packages” -Why R? R in Universities = New Talent Emerging Modeling/Visualization Lower Cost Alternative Open Source = Flexible & Innovative Access to Free Packages
  • 8. R is exploding in popularity & functionality R Usage Growth Rexer Data Miner Survey, 2007-2013 70% of data miners report using R R is the first choice of more data miners than any other software Source: www.rexeranalytics.com
  • 9. Innovate with R Most widely used data analysis software •Used by 2M+ data scientists, statisticians and analysts Most powerful statistical programming language •Flexible, extensible and comprehensive for productivity Create beautiful and unique data visualizations •As seen in New York Times, Twitter and Flowing Data Thriving open-source community •Leading edge of analytics research Fills the talent gap •New graduates prefer R White Paper R is Hot bit.ly/r-is-hot
  • 10. Polling Question #3: How are you using R today? (choose one) –Not using R –Studying R now –Initial R project(s) underway –R is widely used for exploration & modeling –R is deployed into production
  • 11. Revolution Analytics In A Nutshell Our Vision: R is becoming the de facto standard for enterprise predictive analytics Our Mission: Drive enterprise adoption of R by providing enhanced R products tailored to meet enterprise challenges
  • 12. Revolution Analytics Builds & Delivers: Software Products: Stable Distributions Broad Platform Support Big Data Analytics in R Application Integration Deployment Platforms Agile Development Tooling Future Platform Support Support & Services Commercial Support Programs Training Programs Professional Services Academic Support Programs IP Indemnification
  • 13. Revolution R Advantages for Analytics Professionals: Broadly-used, scalable R language Large (2M+), collaborative, young R analytics community Largest repository of statistical & analytical algorithms Big data analytics capabilities –Scales from workstations to Hadoop –Transparent parallelism –Cross platform compatibility –Multi-platform architectures Broadens career opportunities
  • 14. Revolution R Advantages for Business Executives Viable Alternative to Legacy Analytics Solutions –Predictable Time To Results –Simplified Licensing –All-Inclusive Environment Lower Staffing Costs Controllable Open Source Risks –Support –IP Infringement Protections
  • 15. Revolution R Advantages for IT Organizations Consistency Across Platforms Avoids Sprawl Support for Workstations, Servers, Hadoop, EDWs and Grids Heterogeneous Architecture Capabilities Integrates With Major BI & Application Tools Streamline Model Deployment Run Complex Analytics in the “Data Lake” Be a “Good Citizen” in shared systems Commercial Support Reduces Project Risks Quick Start Programs Accelerate Results Platform Continuity Future-Proofs Architectures
  • 16. YARN Revolution R Enterprise: Predictive Analytics Across Huge Data in Hadoop Exploration Visualization Predictive Modeling HDFS
  • 17. Polling Question #4: Stage of Hadoop Adoption? (choose one) –No Need –Studying –Setting-Up Hadoop –Experimenting with Hadoop –Deploying Hadoop Now –Hadoop in Production
  • 18. © 2014 MapR Technologies 18 Introducing: Vineet Sharma Director, Partner Marketing MapR Technologies
  • 19. MapR + Revolution Leverages MapR As A Scalable Enterprise R Engine. • Plus: – Run RRE Analytics In MapR Hadoop Without Change – Eliminate Need To Design Parallel Software or “Think In MapReduce” – Leverage All Revolution R Enterprise Pre-Parallelized Algorithms – Enable Users To Build Custom Apps That Leverage Hadoop’s Parallelism – Slash Data Movement by Analyzing Data Inside the MapR Data Platform – Expand Deployment and Integration Options Rapid Adoption of R MapR Enterprise Data Platform Capabilities Broad Adoption of Hadoop for Big Data Analytics
  • 20. Predictive Modeling Algorithms MapR FS Data Desktop Users with Analytical Access to Huge Data in Hadoop
  • 21. MapR: Best Solution for Customer Success Top Ranked Exponential Growth 500+ Customers Premier Investors >2x annual bookings 80% of accounts expand 3X 90% software licenses < 1% lifetime churn > $1B in incremental revenue generated by 1 customer
  • 22. Management MapR Data Platform APACHE HADOOP AND OSS ECOSYSTEM Security YARN Pig Cascading Spark Batch Spark Streaming Storm* Streaming HBase Solr NoSQL & Search Juju Provisioning & coordination Savannah* Mahout MLLib ML, Graph GraphX MapReduce v1 & v2 EXECUTION ENGINES DATA GOVERNANCE AND OPERATIONS Workflow & Data Governance Tez* Accumulo* Hive Impala Shark Drill SQL Sqoop Sentry* Oozie ZooKeeper Flume Knox* Falcon* Whirr Data Integration & Access HttpFS Hue MapR-FS MapR-DB * Certification/support planned The Power of the Open Source Community
  • 23. MapR Distribution for Hadoop Management MapR Data Platform APACHE HADOOP AND OSS ECOSYSTEM Security YARN Pig Cascading Spark Batch Spark Streaming Storm* Streaming HBase Solr NoSQL & Search Juju Provisioning & coordination Savannah* Mahout MLLib ML, Graph GraphX MapReduce v1 & v2 EXECUTION ENGINES DATA GOVERNANCE AND OPERATIONS Workflow & Data Governance Tez* Accumulo* Hive Impala Shark Drill SQL Sqoop Sentry* Oozie ZooKeeper Flume Knox* Falcon* Whirr Data Integration & Access HttpFS Hue MapR-FS MapR-DB * Certification/support planned. Interoperability: standard file access; standard database access; pluggable services; broad developer support. Security: enterprise security authorization; wire-level authentication; data governance. Operational: ability to support predictive analytics, real-time database operations, and high arrival rate data. Multi-tenancy: ability to logically divide a cluster to support different use cases, job types, user groups, and administrators. Performance: 2X to 7X higher performance; consistent, low latency. Enterprise-grade: high availability; data protection; disaster recovery.
  • 24. Management APACHE HADOOP & OSS ECOSYSTEM ZooKeeper Oozie Hue Pig Hive Impala Shark Flume HttpFS Cascading Solr Juju Mahout MLLib Storm Spark Streaming Sqoop Whirr HBase YARN Drill Tez Knox Sentry Spark Falcon Revolution R Enterprise and MapR Hadoop Edge Node BI and other Apps R on the Desktop Via Browsers Mobile YARN MapReduce MapR Data Platform
  • 25. Introducing: Allen Day Principal Data Scientist MapR Technologies @allenday
  • 26. Talk Overview • Agile Real-time Stats • R + Storm github.com/allenday/R-Storm • DEMO • How to do it? • Q & A @allenday Agile Methods Advanced Statistics Continuous Real-time Delivery github.com/allenday/hadoop-summit-r-storm-demo-public
  • 27. Architecting R into the Storm Application Development Process
  • 28. Quick intro • Allen Day, Principal Data Scientist [ @allenday ] 7yr Hadoop dev, 12yr R dev/author PhD, Human Genetics, UCLA Medicine
  • 29. What’s Storm? What’s R? • What’s Storm? – Processes a data stream. Akin to UNIX pipe + tee & merge commands – Runs on a cluster. Fault-tolerant and designed to scale out – Used for: real-time analytics & machine learning • What’s R? – Programming language with advanced statistics libraries – Does not scale out. Can scale up – Used for: prototyping, data modeling, visualization How to combine these?
  • 30. R outside, Storm inside: not practical. Why? • Model-building and QA are done on data snapshots • However, R => Hadoop is realistic. Key difference: referenced data can be static – Use MapR snapshots for dev and QA – See also: RHIPE (Purdue) and RHadoop (Revolution Analytics) R Storm User
  • 31. Storm outside, R inside: a good fit • Enables separation of concerns – Independently manage modeling, ops timelines, and version control – Integrate as needed • Enables role specialization – R built-ins allow faster iteration and more concise stats-type code – Do DevOps with specific SW engineering tech, e.g. Java Storm R User
  • 32. Q: Who really likes statistics? A: Baseball fans A: Team Managers = Portfolio Managers
  • 33. Famous Vintage Data Oakland Athletics 2002 Season 20 consecutive wins – the American League record at the time of this talk
  • 34. Goal: Detect “Moneyball” 2002 Winning Streak
  • 35. Methods: Change Point Detection Find natural breakpoints in a time series of data points R packages implement this: changepoint: more sensitive, but not streaming bcp: streaming, but less sensitive
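To make the idea of a "natural breakpoint" concrete, here is a minimal sketch in plain Java. It is not the algorithm used by the changepoint or bcp packages; it is a simplified single-change-point search that picks the split minimizing the squared error of a two-level (piecewise-constant) fit. The class name `MeanShiftSketch`, the method `bestSplit`, and the sample score differences are all hypothetical and chosen only for illustration.

```java
// Illustrative only: a least-squares search for ONE shift in the mean.
// Not the changepoint/bcp algorithm; names and data are hypothetical.
final class MeanShiftSketch {

    /**
     * Returns the index k (1..n-1) such that splitting the series into
     * x[0..k) and x[k..n) minimizes the total squared error around each
     * segment's own mean. Returns -1 if the series has fewer than 2 points.
     */
    static int bestSplit(double[] x) {
        int n = x.length;
        if (n < 2) {
            return -1;
        }
        // Prefix sums let us compute any segment's squared error in O(1).
        double[] prefix = new double[n + 1];
        double[] prefixSq = new double[n + 1];
        for (int i = 0; i < n; i++) {
            prefix[i + 1] = prefix[i] + x[i];
            prefixSq[i + 1] = prefixSq[i] + x[i] * x[i];
        }
        int best = -1;
        double bestCost = Double.POSITIVE_INFINITY;
        for (int k = 1; k < n; k++) {
            double cost = sse(prefix, prefixSq, 0, k) + sse(prefix, prefixSq, k, n);
            if (cost < bestCost) {
                bestCost = cost;
                best = k;
            }
        }
        return best;
    }

    // Sum of squared deviations from the mean for x[from..to).
    private static double sse(double[] p, double[] pSq, int from, int to) {
        double sum = p[to] - p[from];
        double sumSq = pSq[to] - pSq[from];
        return sumSq - sum * sum / (to - from);
    }

    public static void main(String[] args) {
        // Made-up score differences: five mediocre games, then five strong ones.
        double[] scoreDiff = {-1, 0, -2, -1, 0, 4, 5, 3, 4, 5};
        System.out.println(bestSplit(scoreDiff)); // prints 5
    }
}
```

A real detector would also need a rule for deciding whether the best split is significant at all, and a way to find more than one change point; those are the parts the R packages named on the slide provide.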
  • 36. Methods: R+Storm Demo Architecture Oakland A’s Data (accelerated); Storm Bolt (R online change point detector); Storm Bolt (write to Jetty); Jetty Webserver; Browser (D3.js); Us; GIFs to MapR Filesystem github.com/allenday/hadoop-summit-r-storm-demo-public
  • 37. Demo: Raw data (score difference between A’s and opponent); 50-game sliding window/buffer to detect change points; Cumulative history with detected break points
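The 50-game buffer can be pictured as a fixed-capacity queue that drops the oldest game whenever a new one arrives. The sketch below is a plain-Java illustration of that idea only; it is not taken from the demo repository, and the class name `SlidingWindowSketch`, its methods, and the synthetic input values are assumptions made for this example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative only: a fixed-size sliding window over a stream of values.
// Class and method names are hypothetical, not from the demo code.
final class SlidingWindowSketch {
    private final int capacity;
    private final Deque<Double> buffer = new ArrayDeque<>();

    SlidingWindowSketch(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    // Appends the newest value, evicting the oldest once the window is full.
    void add(double value) {
        if (buffer.size() == capacity) {
            buffer.removeFirst();
        }
        buffer.addLast(value);
    }

    boolean isFull() {
        return buffer.size() == capacity;
    }

    // Copies the current window, oldest value first, for a detector to analyze.
    double[] snapshot() {
        double[] out = new double[buffer.size()];
        int i = 0;
        for (double v : buffer) {
            out[i++] = v;
        }
        return out;
    }

    public static void main(String[] args) {
        SlidingWindowSketch window = new SlidingWindowSketch(50);
        // Synthetic stand-in for a stream of per-game score differences.
        for (int game = 1; game <= 162; game++) {
            window.add(game % 7 - 3);
        }
        System.out.println(window.snapshot().length); // prints 50
    }
}
```

In a streaming setting, each incoming tuple would be added to the window and the snapshot handed to the change point detector, so the detector always sees at most the most recent 50 games.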
  • 38. Methods Details: How it’s done • Uses R-Storm binding github.com/allenday/R-Storm – Storm package on CRAN cran.r-project.org/web/packages/Storm Storm (dev team) R (stats team) Storm (dev team, pure Java) Producer Consumer
  • 39. Methods Details: Easy integration R: lambda function storm = Storm$new(); storm$lambda = function(s) { t = s$tuple; t$output = vector(length=1); t$output[1] = "tada!"; s$emit(t); } Storm: extend ShellBolt public static class MyRBolt extends ShellBolt implements IRichBolt { public MyRBolt() { super("Rscript", "my.R"); } }
  • 40. Results • Change points are identified, but none for winning streak – Not using score difference, anyway • Time to integrate with the modeling team! – Send @kunpognr or @allenday a pull request on GitHub • Applicable to many other use cases, e.g. – Security (fraud detection, intrusion detection) – Marketing (intent to purchase / social media streams) – Customer Support (help desk voice calls) Discussion
  • 41. Polling Question #5 How important will Real-Time analytical apps become? (choose one) –Uncertain –Not important –Necessary –Critical
  • 42. Real-Time and Internet of Things: Foundation of a Compelling Trend for 2015 Big Data Analytics Meets The Internet of Things –Transactions + –Human Behavior + –Internet of Things: Sensors … and extracting value using –Traditional Statistics –Visualization –Machine Learning … plus adaptability –Real-Time –Agile Modeling & Fast Model Execution –Production Capable, Stable and Secure –Rapidly-Evolving Data Science
  • 43. Where Does Real Time Impact The Analytical Lifecycle? Data Engineering –Collection and Ingest –“Blending” Modeling –Aggregation, Segmentation & Exploration –Model Development & Optimization –Testing & Validation Operationalization –Deployment & Scoring –Delivery –Monitoring & Evaluation
  • 44. Typical Analytical Lifecycle Ingest Explore Model Deploy Score Act Measure Model Score
  • 45. More Complex Event Driven Analytical Cycle Historic Ingest Explore Model Deploy Act Measure Data Analytics & Process Design Scoring Event Ingest Transform “Blend” Append Improve Enrich
  • 46. Real-Time Analytics Best Practices Develop a Common Lexicon for Real-Time Discriminate Between Needs of Each Stage in Lifecycle –Data Ingest & Manipulation and Enrichment –Data Source / Repository Integration Needs –Processes that “Fill the Lake” –Processes that “Act on the Stream” –Vastly different computationally –Big differences in data ingest volume & latency Start with Tractable Goals –Anticipate Growing Requirements: Microbatch >> Interactive >> Autonomous Build for today, Architect for tomorrow
  • 47. Real-Time Realities Plan for Diverse Needs –Real-Time Score Retrieval, Scoring, Modeling –Wide-Ranging Performance – Microbatch – Interactive - Autonomous Fragmentation –Data Delivery Systems Pre-Exist –Will Vary Widely by Vertical Market –Competing Proprietary Solutions Growing Demand –Numerous high-value targets –“The next step”: Put big data analytics to work
  • 48. What’s Needed Real-Time Performance… plus… Agility –Deployment models –Organization –Infrastructure –Analytics Manageable Costs –Hadoop –Open Source R Production Platform(s) –Proven –Performant
  • 49. Next Steps… www.revolutionanalytics.com Whitepaper: Revolution R for Hadoop: –http://www.revolutionanalytics.com/whitepaper/delivering-value-big-data-revolution-r-enterprise-and-hadoop –…or http://bit.ly/1ua43bu www.maprtech.com Resources: R foundation URL: www.r-project.org Download Revolution R: http://mran.revolutionanalytics.com/download/ Learn about Apache Storm: https://storm.apache.org/ R-Storm bindings: github.com/allenday/R-Storm Storm package on CRAN: cran.r-project.org/web/packages/Storm
  • 50. Thank you. www.revolutionanalytics.com 1.855.GET.REVO Twitter: @RevolutionR