R users already know why the R language is the lingua franca of statisticians today: because it's the most powerful statistical language in the world. Revolution Analytics builds on the power of open source R, and adds performance, productivity and integration features to create Revolution R Enterprise. In this presentation, author and blogger David Smith will introduce the additional capabilities of Revolution R Enterprise.
3. October 19, 2011: Welcome! Revolution Confidential
Thanks for coming.
Slides and replay available (soon) at:
http://bit.ly/p6ulsu
David Smith
VP Marketing, Revolution Analytics
Editor, Revolutions blog
http://blog.revolutionanalytics.com
Twitter: @revodavid
4. In today’s webcast:
About Revolution Analytics and R
What Revolution R adds to R
Resources for getting more from R
Q&A
Introducing Revolution R
5. What is R?
Download the White Paper "R is Hot": bit.ly/r-is-hot
Data analysis software
A programming language
  Development platform designed by and for statisticians
An environment
  Huge library of algorithms for data access, data manipulation, analysis and graphics
An open-source software project
  Free, open, and active
A community
  Thousands of contributors, 2 million users
  Resources and help in every domain
6. R is exploding in popularity and functionality
Scholarly Activity
Google Scholar hits (’05-’09 CAGR)
  R        46%
  SAS     -11%
  SPSS    -27%
  S-Plus    0%
  Stata    10%
“I’ve been astonished by the rate at which R has been adopted. Four years ago, everyone in my economics department [at the University of Chicago] was using Stata; now, as far as I can tell, R is the standard tool, and students learn it first.”
Deputy Editor for New Products at Forbes
Package Growth
Number of R packages listed on CRAN
[Chart: number of R packages on CRAN (axis 0 to 2500) by year, 2002 to 2010]
“A key benefit of R is that it provides near-instant availability of new and experimental methods created by its user base, without waiting for the development/release cycle of commercial software. SAS recognizes the value of R to our customer base…”
Product Marketing Manager, SAS Institute, Inc.
Source: http://r4stats.com/popularity
7. “R is the most powerful & flexible statistical programming language in the world” [1]
Capabilities
  Sophisticated statistical analyses
  Predictive analytics
  Data visualization
Applications
  Real-time trading
  Finance
  Risk assessment
  Forecasting
  Bio-technology
  Drug development
  Social networks
  ... and more
[Chart: MSFT price, volume, and Moving Average Convergence Divergence (12,26,9), 2009-01-02 to 2010-03-31]
1. Norman Nie, multiple interviews
8. R User Community
From: The R Ecosystem, bit.ly/R-ecosystem
11. R Productivity Environment (Windows)
Script with type ahead and code snippets
Solutions window for organizing code and data
Sophisticated debugging with breakpoints, variable values etc.
Objects loaded in the R Environment
Packages installed and loaded
Object details
http://www.revolutionanalytics.com/demos/revolution-productivity-environment/demo.htm
12. Interactive Debugging
One click to set a breakpoint in an R script
Step in/out/over, inspect variables
Eliminate the edit -> browser -> repair cycle
13. Performance: Multi-threaded Math
Open Source R vs. Revolution R Enterprise

Computation (4-core laptop)           Open Source R   Revolution R   Speedup
Linear Algebra [1]
  Matrix Multiply                     327 sec         13.4 sec       23x
  Cholesky Factorization              31.3 sec        1.8 sec        17x
  Linear Discriminant Analysis        216 sec         74.6 sec       2x
General R Benchmarks [2]
  R Benchmarks (Matrix Functions)     22 sec          3.5 sec        5x
  R Benchmarks (Program Control)      5.6 sec         5.4 sec        Not appreciable

1. http://www.revolutionanalytics.com/why-revolution-r/benchmarks.php
2. http://r.research.att.com/benchmarks/
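As a rough illustration of how timings like these could be collected, the sketch below times a matrix multiply and a Cholesky factorization with base R's system.time(). The matrix size and seed are arbitrary assumptions, not the settings used for the published benchmarks, and the numbers depend entirely on the machine and on the math library R is linked against.

```r
# Time two dense linear-algebra operations in base R.
# The dimensions are assumed and deliberately small so the sketch runs quickly.
set.seed(42)
n <- 500
a <- matrix(rnorm(2 * n * n), nrow = 2 * n)   # 1000 x 500 random matrix

# Matrix multiply: crossprod(a) computes t(a) %*% a (a 500 x 500 result).
t_mult <- system.time(spd <- crossprod(a))[["elapsed"]]

# Cholesky factorization of the symmetric positive-definite product.
t_chol <- system.time(r <- chol(spd))[["elapsed"]]

cat(sprintf("multiply: %.3f s, Cholesky: %.3f s\n", t_mult, t_chol))
```

Running the same script under two R builds and dividing the elapsed times gives a speedup figure comparable in spirit to the last column of the table.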
14. Three Paradigms for Big Data
Standard R engine is constrained by capacity and performance
Revolution R Enterprise offers three methods for big data with R:
  Off-line: high-performance file-based analytics
  Off-line, parallel & distributed analytics
  On-line, in-database analytics (Hadoop, Netezza)
15. Revolution R Enterprise with RevoScaleR: Big Data Statistics in R
www.revolutionanalytics.com/bigdata
Every US airline departure and arrival, 1987-2008
File: AirlineData87to08.xdf
Rows: 123.5 million
Variables: 29
Size on disk: 13.2 GB
arrDelayLm2 <- rxLinMod(ArrDelay ~ DayOfWeek:F(CRSDepTime),cube=TRUE)
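For readers without RevoScaleR, a loosely analogous model can be sketched in base R on a small simulated data set. Everything below is an assumption for illustration: the data are random, the hour coding of CRSDepTime is invented, and factor() stands in for F(), which is assumed here to treat a numeric variable as categorical. The sketch does not reproduce what cube=TRUE does internally; it only shows that a day-by-hour interaction model without an intercept estimates one mean delay per cell.

```r
# Toy stand-in for the airline data: every value here is simulated.
set.seed(1)
flights <- expand.grid(
  DayOfWeek  = c("Mon", "Tue", "Wed"),
  CRSDepTime = c(6, 12, 18),        # scheduled departure hour (assumed coding)
  rep        = 1:20                 # 20 simulated flights per cell
)
flights$DayOfWeek <- factor(flights$DayOfWeek)
flights$ArrDelay  <- rnorm(nrow(flights), mean = flights$CRSDepTime / 2, sd = 5)

# Interaction-only model with no intercept: one coefficient per
# (day, departure hour) cell, each equal to that cell's mean delay.
fit <- lm(ArrDelay ~ DayOfWeek:factor(CRSDepTime) - 1, data = flights)

# The same cell means computed directly, for comparison.
cell_means <- tapply(flights$ArrDelay,
                     list(flights$DayOfWeek, flights$CRSDepTime), mean)
print(round(cell_means, 2))
```

The base R version holds all rows in memory, which is exactly the constraint that a file-based format such as .xdf is meant to remove.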
16. Example: Old Wives Census Analysis
http://info.revolutionanalytics.com/CensusOldWivesWhitePaper.html
17. RevoScaleR – Distributed Computing
[Diagram: a master node (RevoScaleR) connected to four compute nodes (RevoScaleR), each with its own data partition]
• Portions of the data source are made available to each compute node
• RevoScaleR on the master node assigns a task to each compute node
• Each compute node independently processes its data, and returns its intermediate results back to the master node
• The master node aggregates all of the intermediate results from each compute node and produces the final result
*Available for Microsoft HPC Server, November 2011
Video demo: http://bit.ly/riUBgs
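The pattern described on this slide can be sketched in a few lines of base R. This is not RevoScaleR code: the four "nodes" are simulated with lapply() in a single R session, the data are random, and node_task is a hypothetical helper name. The point is only that a statistic such as the mean can be assembled from small per-partition results.

```r
# Simulated data standing in for a large data source.
set.seed(7)
x <- rnorm(10000, mean = 50, sd = 10)

# Make one portion of the data available to each of four "compute nodes".
partitions <- split(x, rep(1:4, length.out = length(x)))

# Each node processes only its own partition and returns a small
# intermediate result (here: a sum and a row count).
node_task <- function(chunk) c(sum = sum(chunk), n = length(chunk))
intermediate <- lapply(partitions, node_task)

# The "master node" aggregates the intermediate results into the final answer.
totals <- Reduce(`+`, intermediate)
final_mean <- totals[["sum"]] / totals[["n"]]

cat("distributed mean:", final_mean, "\n")
```

Only two numbers per partition travel back to the master, regardless of how many rows each partition holds.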
19. RevoConnectR for Hadoop
Write Map-Reduce analytics using only R code with these R packages:
  rhdfs - R and HDFS
  rhbase - R and HBASE
  rmr - R and MapReduce
[Diagram: Revolution R client using rhdfs (HDFS), rhbase (HBASE via Thrift), and rmr (Job Tracker, with map or reduce tasks running R on each Task Node)]
More information at: bit.ly/r-hadoop
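To show the shape of a map-reduce computation without relying on Hadoop or on the rmr API, here is a minimal word count in base R. The function names map_fn and reduce_fn and the input records are hypothetical; a real job would hand equivalent map and reduce functions to the cluster rather than call them locally.

```r
# Input records, as a map-reduce job might read them line by line.
records <- c("r is hot", "r is free", "hadoop is big")

# Map: emit one (key, value) pair per word, with the word as key and 1 as value.
map_fn <- function(record) {
  words <- strsplit(record, " ", fixed = TRUE)[[1]]
  data.frame(key = words, value = 1L, stringsAsFactors = FALSE)
}
pairs <- do.call(rbind, lapply(records, map_fn))

# Shuffle: group the emitted values by key.
grouped <- split(pairs$value, pairs$key)

# Reduce: combine each key's values into a single count.
reduce_fn <- function(values) sum(values)
counts <- vapply(grouped, reduce_fn, integer(1))

print(counts)
```

Because each map call sees one record and each reduce call sees one key, both steps can be spread across many task nodes.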
20. Enterprise Readiness: Revolution R Enterprise Server
Multi-User Support
Production Applications
Integrate R analytics into Web based applications
Data Analysis and Visualization
Reporting
Dashboards
Interactive applications
Revolution R Enterprise Server with RevoDeployR
21. Deployment with Revolution R Enterprise
End User
  Desktop Applications (e.g. Excel)
  Business Intelligence (e.g. Jaspersoft)
  Interactive Web Applications
Application Developer
  Client libraries (JavaScript, Java, .NET)
  HTTP/HTTPS – JSON/XML
R Programmer
  RevoDeployR Web Services
    Session Management
    Authentication
    Data/Script Management
    Administration
  R
26. Why R?
Every data analysis technique at your fingertips
Create beautiful and unique data visualizations
Get better results faster
Draw on the talents of data scientists worldwide
R is hot, and growing fast
27. Revolution R Enterprise
Production-Grade Statistical Analysis for the Workplace
High-performance R for multiprocessor systems
Modern Integrated Development Environment
Statistical Analysis of Terabyte-Class Data Sets
In-database R analytics with Hadoop and Netezza
Deploy R Applications via Web Services
Telephone and email technical support
Training and consulting services
100% compatible with R packages
Easy-to-Use GUI [1]
1. Coming Soon
28. Further Reading
http://bit.ly/revo-r-pdf
http://bit.ly/r-is-hot
29. Revolution R Enterprise: Free to Academia
Personal use
Research
Teaching
Package development
Free Academic Download
www.revolutionanalytics.com/downloads/free-academic.php
Discounted Technical Support Subscriptions Available
30. Thank You!
Download slides, replay (from Oct 20)
http://bit.ly/railcj
Learn more about Revolution R
revolutionanalytics.com/products
Contact Revolution Analytics
http://bit.ly/hey-revo
Special Offer: Revolution R Enterprise Workstation for $499
Including R Productivity Environment (IDE) with visual debugger, multi-processor
capabilities, Big Data analysis with RevoScaleR, and Technical Support
Available until November 15 at http://bit.ly/revo-499
32. The leading commercial provider of software and support for the popular open source R statistics language.
www.revolutionanalytics.com
+1 (650) 646 9545
Twitter: @RevolutionR
Editor's Notes
2M+ users, 3000+ packages
Fully programmable statistical language
Complete library of statistical functions
Unparalleled representational graphics
Supplanted and replaced SAS/SPSS in the Academy
Penetrated enterprises where sophisticated statistical modeling is mission-critical: Finance, Pharma, etc.
Type ahead: the IDE recognizes an R function as you type in the first few characters and shows the completed function and its parameters.
Code snippets: templates for common R constructs, e.g. for loop, xy plot. These are written in XML and users can add their own.
Solution Window: the RPE organizes R scripts and data files in folders by Solution. This facilitates but does not implement versioning.
The list of installed packages and the list of loaded packages are available for inspection. Clicking on a package shows its components in the object window.
The top right Object Browser window shows all of the objects available in the R environment.
The bottom right object window shows the details of particular objects.
Debugging Tools: when running in debugging mode the RPE supports breakpoints, stepping in and out of code, and shows the contents of variables upon “mouse over”. Users may step through all code available in the Solution that is active.