Only one language ?
              R and Python




Polyglot applications with R and Python
                 [BARUG Meeting]


                  Laurent Gautier

                       DMAC / CBS


              November 13th, 2012

Disclaimer
  • This is not about:
      • The comparative merits of scripting languages
      • Only Python & R
  • This is about:
      • Accessing natively libraries implemented in a different language
      • Bridging people and skill sets through a glue language
      • Going from ideas to prototypes faster
      • Putting some production into research
      • Bringing research to production

Preamble



 Scenario
  • data people: Statisticians, data analysts
  • Data people have a method M
  • Data people want to work on something new
  • Management wants an application for method M
  • Management wants an application that uses method M

1   Only one language ?
      Polyglot programs


2   R and Python
      Mapping types
      Functions
      Evaluation and memory
      Building an application

Uniformity and coding standards

Get the job done
 Soloist
   • Multi-talented individual
   • Documentation ?

 Teamwork
   • Pair programming
   • Use the same tools ?
   • Overlapping skills ?

Monolithic development



  • Centralized
  • Top-down
  • A lot of planning
  • Long development, mostly usable only when complete
  • Stands the test of time

Maintainability



 Why use a single language ?
   • A legitimate managerial concern
   • In some places, Java certifications have replaced general programming
     degrees
   • Could good programmers matter more than the language ?
   • Back to finding a needle in a haystack

Modularity at the heart of UNIX philosophy.

|><

  • No branching logic, unless going for shell scripts.
  • Shell scripts are not often sought after for applications
  • The birth of scripting languages (Sed, Awk, Perl, ...)

• Projects are cross-fields, cross-specialization
• Cost of specification - design - implementation too high
• Especially when the lifespan of the application is too short (or the
  user base too small).

Example from video games
   • Engines (generally in C++)
   • Scripting language for the ‘story’ and content
       • Python
       • Lua
       • Proprietary, others, ...

   • Large projects (with a lot of money at stake)
   • Diverse skills (3D engine ≠ story logic)
   • When speed of development is more important than speed of
     execution
 This can apply to other industries
   • Pipelines in visual effects
   • Bioinformatics

1   Only one language ?
      Polyglot programs


2   R and Python
      Mapping types
      Functions
      Evaluation and memory
      Building an application

R




    • Language for statistics, data analysis, and data visualization
    • Unmatched¹ number of libraries for anything having to do with
         data
    • Specialized set of libraries for bioinformatics (Bioconductor)

    ¹ Almost certainly

Python




  • All-purpose scripting language
  • Unmatched² number of libraries for just about anything
  • Specialized sets of libraries for bioinformatics (Biopython, and a
       myriad of smaller projects)

  ² Maybe

Python (continued)

  • Machine learning libraries R does not have: PyBrain
  • Visualization tools R does not have: Mayavi, Blender

Python is popular in Bioinformatics / DNA sequencing.
  • Galaxy pipeline/server framework is in Python
  • Some of the internal tools for the SOLiD are written in Python
  • Ion Torrent Server is a Django server
  • Oxford Nanopore control system is a server running Python

Why use anything other than R ?




  • Build an application
  • Work with very large data
  • ‘Just because it can be done in R doesn’t mean you should do it’³

   ³ John Dennison, R Meetup presentation

R embedded in Python

rpy2




  • Feels like a regular Python library
  • Embeds an R process
  • Can be thought of as a stateful library

Two main parts:
  • Low-level interface
  • High-level interface

Low-level interface




   • Close to R’s C-API
   • Lets you do anything safe⁴ from that API
   • Exposes R data structures as Python built-in structures

   ⁴ Or so is the intent

Types

 R             rpy2                Python
 numeric       FloatSexpVector     float
 integer       IntSexpVector       int
 character     StrSexpVector       str
 logical       BoolSexpVector      bool
 complex       ComplexSexpVector   complex
 list          ListSexpVector      list
 environment   SexpEnvironment     dict
 function      SexpClosure         function
 S4            SexpS4              object
 language      SexpLang            object
 externalptr   SexpExtPtr          object

Vectors and arrays




   • C-like: Contiguous blocks of memory
   • R objects exposed to Python as sequences or C-like arrays, with
     or without copy
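
The "with or without copy" distinction can be illustrated with the standard library alone. This is a minimal sketch, not rpy2 code: `array.array` and `memoryview` stand in for the contiguous memory of an R vector, and all variable names are illustrative.

```python
from array import array

# A contiguous block of C ints, standing in for an R integer vector.
block = array("i", range(1, 11))

# Without copy: a memoryview exposes the same underlying memory.
view = memoryview(block)

# With copy: a list holds independent Python objects.
copied = list(block)

# Writing through the view changes the original block...
view[0] = 123

# ...while the copy is unaffected.
print(block[0], copied[0])
```

Sharing memory avoids duplicating large vectors, at the price that a change made on one side is visible on the other.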

R
    v <- seq(1, 10)
    v[1]   # select the first element
    w <- v + 1 # add 1 to all elts



rpy2.rinterface
import rpy2.rinterface as ri; ri.initr()
v = ri.IntSexpVector(range(1, 11))
v[0]   # select the first element
w = ri.IntSexpVector([x+1 for x in v])


rpy2.robjects
import rpy2.robjects as ro

v = ro.IntVector(range(1, 11))
v[0]   # select the first element
w = v.ro + 1

Missing values
 NaN:
       numeric data type value representing an undefined or
       unrepresentable value, especially in floating-point
       calculations.

  • Also used for missing values.
  • Part of a standard (IEEE 754).

 NA:
  • Used for missing values by R.
  • Not a standard.

  • Pitfall when passing data to C without copy/checks
  • Applies to any C library (including rpy2)
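
The pitfall can be made concrete without R. The sketch below assumes the conventions used by R's C sources: `NA_real_` is a NaN whose low 32 bits hold 1954, and `NA_integer_` is the smallest 32-bit signed integer. The helper `low_word` is a hypothetical name introduced for illustration.

```python
import math
import struct

# Bit pattern R uses for NA_real_: an IEEE 754 NaN whose low 32 bits hold 1954.
R_NA_REAL_BITS = 0x7FF00000000007A2
# R represents NA_integer_ as the smallest 32-bit signed integer.
R_NA_INTEGER = -2**31


def low_word(x):
    """Return the low 32 bits of a double's IEEE 754 representation."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    return bits & 0xFFFFFFFF


na_real = struct.unpack("<d", struct.pack("<Q", R_NA_REAL_BITS))[0]
plain_nan = float("nan")

# Both look like NaN to code that only knows the standard.
print(math.isnan(na_real), math.isnan(plain_nan))
# Only the payload tells them apart.
print(low_word(na_real) == 1954, low_word(plain_nan) == 1954)
# An integer NA is just a very negative number to a C library.
print(R_NA_INTEGER + 1)
```

A C routine that receives the raw buffer sees ordinary numbers, so missing values have to be checked or converted before the data crosses the boundary.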

Functions


 R functions can be called as if they were Python functions
 import rpy2.robjects as ro

 f = ro.r("function(x, y) { 2 * (x + y) }")

 f(1, 2)


   • conversion on-the-fly
   • translated signatures (dot-to-underscore)
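
The dot-to-underscore translation is needed because names such as `na.rm` are not valid Python keyword arguments. The following is a simplified sketch of the idea, not rpy2's implementation; `translate_names` is a hypothetical helper.

```python
def translate_names(r_names):
    """Map R parameter names to Python-friendly names (dots become underscores).

    Raises ValueError when two R names would collide after translation.
    """
    translated = {}
    for r_name in r_names:
        py_name = r_name.replace(".", "_")
        if py_name in translated:
            raise ValueError(
                "%r and %r both translate to %r"
                % (translated[py_name], r_name, py_name)
            )
        translated[py_name] = r_name
    return translated


# Python keyword -> original R parameter name
mapping = translate_names(["x", "na.rm", "use.names"])
print(mapping)
```

The collision check matters because R allows both `a.b` and `a_b` as distinct parameter names, while the translated Python names would be identical.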

Packages and modules
 R
 Namespaces attached to the search path
 > searchpaths()
 [1] ".GlobalEnv"
 [2] "/usr/local/packages/R/2.15/lib/R/library/stats"
 [3] "/usr/local/packages/R/2.15/lib/R/library/graphics"
 [4] "/usr/local/packages/R/2.15/lib/R/library/grDevices"
 [5] "/usr/local/packages/R/2.15/lib/R/library/utils"
 [6] "/usr/local/packages/R/2.15/lib/R/library/datasets"
 [7] "/usr/local/packages/R/2.15/lib/R/library/methods"
 [8] "Autoloads"
 [9] "/usr/local/packages/R/2.15/lib/R/library/base"


 Python
 Python modules as namespaces
 import os
 os.path.basename('/path/to/a/file')

R packages (almost) as Python modules




 from rpy2.robjects.packages import importr
 stats = importr('stats')
 # PCA ! (m: a numeric R matrix defined elsewhere)
 pc = stats.prcomp(m)

R scripts as modules !



 from rpy2.robjects.packages import SignatureTranslatedAnonymousPackage

 # R code in a file as a package
 with open('rflib.R') as f:
     code = f.read()
 rf = SignatureTranslatedAnonymousPackage(code, "rf")

 # dataf: R data frame, response: name of the response variable
 imp = rf.get_importance(dataf, response)

R environments
     • Associate symbols to objects
     • Exposed as Python dictionaries (key - value)

 R
 env <- new.env()
 assign('x', 123, envir = env)

 y <- 456


 Python
 import rpy2.robjects as ro

 env = ro.Environment()
 env['x'] = 123

 ro.globalenv['y'] = 456

R and callback functions
 Common R idiom
    # m: matrix of numerical values
    f <- function(x) sum(x[x > 0])
    res <- apply(m, 1, f)



 How to do that with rpy2 ?

   import rpy2.interactive as r
   import rpy2.rinterface as ri
   r_code = """
     function(x)
       sum(x[x > 0])
   """
   # parse the R source, then evaluate it to get an R function
   tmp = ri.parse(r_code)
   r_eval = r.packages.base.eval
   r_func = r_eval(tmp)
   r.base.apply(m, 1, r_func)

 Alternatively, a Python function can be passed as the callback:

   import rpy2.interactive as r
   import rpy2.rinterface as ri
   def tmp(x):
     gnr = (elt for elt in x
            if elt > 0)
     return sum(gnr)
   # wrap the Python function so that R can call it
   r_func = ri.rternalize(tmp)
   r.base.apply(m, 1, r_func)
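
For comparison, here is what the callback computes when everything stays in Python. This is a sketch with no R involved: the matrix is a plain list of rows, and `positive_sum` is an illustrative name.

```python
# m: a matrix stored as a list of rows (values chosen for illustration)
m = [
    [1.0, -2.0, 3.0],
    [-1.0, -1.0, -1.0],
    [0.5, 0.0, 2.5],
]


def positive_sum(row):
    """Sum the strictly positive elements of a row."""
    return sum(elt for elt in row if elt > 0)


# Equivalent of R's apply(m, 1, f): call the function on every row.
res = [positive_sum(row) for row in m]
print(res)
```

With rpy2 the same Python function is handed to R's `apply`, so R drives the iteration and calls back into Python for each row.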

Evaluation strategies


 R
     • Pass-by-value / Call-by-value
     • Modifying an object locally is always safe
     • Unnecessary copies

 Python
     • Pass-by-reference
     • Copies must be requested explicitly

 rpy2 exposes R objects as if R were pass-by-reference
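
The Python side of this comparison can be checked with built-in lists only. A minimal sketch, with illustrative names and no rpy2 objects:

```python
import copy


def set_first(x):
    """Modify the first element of the sequence passed in."""
    x[0] = 123
    return x


v = list(range(1, 11))

# The caller's object is modified: the function received a reference to it.
set_first(v)

# R-like behaviour must be asked for explicitly by copying first.
w = list(range(1, 11))
w_modified = set_first(copy.copy(w))

print(v[0], w[0], w_modified[0])
```

An R vector wrapped by rpy2 behaves like `v` here: a function that writes into it changes the object the caller holds.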

Python
from rpy2.robjects.vectors import IntVector

def f(x):
     x[0] = 123   # modifies the caller's vector in place
v = IntVector(range(1, 11))
f(v)


R
f <- function(x) {
  x[1] <- 123     # modifies a local copy (R indexing starts at 1)
  return(x)
}
v <- seq(1, 10)
v <- f(v)

Memory management and garbage collection

 R
     • Tracing GC (check for reachability)
     • R PreciousList

 Python
     • Reference counting
     • Tracing GC

     • Bridge the two memory models
     • Intermediate reference counting of the R objects exposed to Python
     • That part could become very generic.
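
The two Python mechanisms listed above can be observed directly. The sketch below assumes CPython, where reference counting frees most objects immediately and the tracing collector only handles reference cycles; the `Node` class is illustrative.

```python
import gc
import weakref


class Node:
    """A small object that can refer to another Node."""

    def __init__(self):
        self.other = None


gc.disable()  # keep the tracing collector from running on its own

# Reference counting: the object is freed as soon as its last reference goes.
a = Node()
a_alive = weakref.ref(a)
del a
freed_by_refcount = a_alive() is None

# A reference cycle keeps the counts above zero...
b, c = Node(), Node()
b.other, c.other = c, b
b_alive = weakref.ref(b)
del b, c
survives_refcount = b_alive() is not None

# ...until the tracing collector finds the cycle unreachable.
gc.collect()
freed_by_tracing = b_alive() is None

gc.enable()
print(freed_by_refcount, survives_refcount, freed_by_tracing)
```

R has no per-object counts of this kind, which is why a bridge has to tell R's collector explicitly which objects Python still holds.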

R objects exposed to Python
 import rpy2.rinterface as ri
 ri.initr()
 baseenv = ri.baseenv
 letters = baseenv.get('letters')

R objects exposed to Python (not so simple)
 import rpy2.rinterface as ri
 ri.initr()
 base = ri.baseenv
 letters = base['letters']

>>> letters = base['letters']
>>> letters.rid # varies
123456
>>> letters.__sexp_refcount__
1
>>> letters2 = base['letters']
>>> letters2.__sexp_refcount__
2
>>> letters.__sexp_refcount__
2
>>> letters2.rid # same R ID
123456
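
The counts shown in this session suggest a bookkeeping layer between the two collectors. The following is a toy model of such a layer, written under that assumption; `PreservedRegistry` is a hypothetical class and does not reproduce rpy2's actual code.

```python
class PreservedRegistry:
    """Toy registry counting how many proxies refer to each foreign object."""

    def __init__(self):
        self._counts = {}
        self.released = []  # ids handed back to the foreign collector

    def acquire(self, obj_id):
        """Register one more proxy; protect the object on first use."""
        self._counts[obj_id] = self._counts.get(obj_id, 0) + 1
        return self._counts[obj_id]

    def release(self, obj_id):
        """Drop one proxy; unprotect the object when none are left."""
        self._counts[obj_id] -= 1
        if self._counts[obj_id] == 0:
            del self._counts[obj_id]
            self.released.append(obj_id)

    def count(self, obj_id):
        """Return the number of live proxies for the object."""
        return self._counts.get(obj_id, 0)


registry = PreservedRegistry()
rid = 123456  # stand-in for an R object's ID

registry.acquire(rid)  # first proxy, like `letters`
registry.acquire(rid)  # second proxy, like `letters2`
print(registry.count(rid))

registry.release(rid)
registry.release(rid)
print(registry.released)
```

Only when the last proxy is released does the object become eligible for collection on the other side, which mirrors the shared count seen for `letters` and `letters2`.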

Exceptions




 RRuntimeError: error while evaluating R code
 KeyError: symbol not found in an environment
 ValueError: invalid value passed to an rpy2 function

Performance
 function(x) {
   total <- 0
   for (elt in x) {
     total <- total + elt
   }
   total
 }


  Function       Sequence       Speedup
  R                                 1.00
  R compiled                        6.52
  R builtin                       329.29
  pure python    FloatVector        0.51
  builtin python FloatVector        0.54
  pure python    SexpVector         7.45
  builtin python SexpVector        20.92
  builtin python array.array       53.62
  builtin python list              90.47
 R through rpy2 can be faster than R
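
The Python-only rows of such a comparison can be reproduced with `timeit`. This is a sketch of the measurement approach, not the benchmark behind the table: the sequence length and repetition count are arbitrary choices, and the resulting ratios depend on the machine and interpreter version.

```python
import timeit
from array import array


def pure_python_sum(x):
    """Accumulate the total with an explicit loop."""
    total = 0.0
    for elt in x:
        total += elt
    return total


values = [float(i) for i in range(10_000)]
as_list = list(values)
as_array = array("d", values)

candidates = {
    ("pure python", "list"): lambda: pure_python_sum(as_list),
    ("builtin python", "list"): lambda: sum(as_list),
    ("builtin python", "array.array"): lambda: sum(as_array),
}

# Time each combination; absolute numbers depend on machine and versions.
timings = {key: timeit.timeit(fn, number=50) for key, fn in candidates.items()}
baseline = timings[("pure python", "list")]
for (function, sequence), seconds in timings.items():
    print(f"{function:<15}{sequence:<13}{baseline / seconds:6.2f}")
```

The last column is a speedup relative to the explicit loop, in the same spirit as the table above.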

Let’s build a web application




   • Why do that ?
      • Allow access to computing resources
      • Use the UI of the browser
      • Good example
   • Micro web framework: Flask

Hello world with Flask

 from flask import Flask
 app = Flask(__name__)

 @app.route('/')
 def hello_world():
     return 'Hello World!'

 if __name__ == '__main__':
     app.run()

 python hello.py

Importance of variables with random forest




  1   Data in a CSV file
  2   Use R to compute a random forest and compute importance of
      variables
  3   Make a pretty plot with ggplot2

## data
dataf <- read.csv("/some/data/file.csv")
response <- 'var_name'

## importance of variables
library(randomForest)
get_importance <- function(dataf, response) {
  fmla <- formula(paste(response, '~ .'))
  dataf_rf <- randomForest(fmla, data = dataf,
                           keep.forest = FALSE,
                           importance = TRUE)
  imp <- importance(dataf_rf, type = 1)
  imp <- as.data.frame(imp[order(imp[,1]), , drop=FALSE])
  return(imp)
}
imp <- get_importance(dataf, response)

## plot
library(ggplot2)
get_plot <- function(imp) {
  rn <- rownames(imp)
  rn <- factor(rn, levels=rn, ordered=TRUE)
  imp <- cbind(as.data.frame(imp), rn = rn)

  p = ggplot(imp) +
    geom_bar(aes(y = `%IncMSE`,
                 x = rn,
                 fill = `%IncMSE`),
             stat = "identity") +
    scale_x_discrete("Variables")
  return(p)
}
p <- get_plot(imp)
print(p)

[Figure: bar plot of %IncMSE per variable (gear, qsec, carb, drat, am, vs, cyl, wt, hp, disp); x axis "Variables", fill legend "%IncMSE"]

R library

 get_dataframe <- function(filename) {
   return(read.csv(filename))
 }

 ## importance of variables
 library(randomForest)
 get_importance <- function(dataf, response) {
   fmla <- formula(paste(response, '~ .'))
   dataf_rf <- randomForest(fmla, data = dataf,
                            keep.forest = FALSE,
                            importance = TRUE)
   imp <- importance(dataf_rf, type = 1)
   imp <- as.data.frame(imp[order(imp[,1]), , drop=FALSE])
   return(imp)
 }

 ## plot
 library(ggplot2)
 get_plot <- function(imp) {
   rn <- rownames(imp)
   rn <- factor(rn, levels=rn, ordered=TRUE)
   imp <- cbind(as.data.frame(imp), rn = rn)
   p = ggplot(imp) +
     geom_bar(aes(y = `%IncMSE`,
                  x = rn,
                  fill = `%IncMSE`),
              stat = "identity") +
     scale_x_discrete("Variables")
   return(p)
 }

 ## write the plot to a PNG file in `dir` and return the file name
 make_PNGplot <- function(imp, dir) {
   filename <- tempfile(tmpdir = dir, fileext = '.png')
   p <- get_plot(imp)
   png(filename)
   print(p)
   dev.off()
   return(basename(filename))
 }

Python application


 import os
 from flask import Flask, render_template, flash
 from flask import url_for, send_from_directory
 from flask import request
 from werkzeug.utils import secure_filename
 from rpy2.robjects.packages import SignatureTranslatedAnonymousPackage

 UPLOAD_FOLDER = '/tmp'

 # R code as a package
 with open('rflib.R') as f:
     code = f.read()
 rf = SignatureTranslatedAnonymousPackage(code, "rf")

 # create application
 app = Flask(__name__)
 app.secret_key = 'change this !!!'
 app.config['UPLOAD_FOLDER'] = UPLOAD_FOLDER

 # serve files
 @app.route('/files/<filename>')
 def files(filename):
     return send_from_directory(UPLOAD_FOLDER,
                                filename)

 def plot(dataf, response):
     # compute importance of variables
     imp = rf.get_importance(dataf, response)
     # plot into a file
     plot_fn = rf.make_PNGplot(imp, UPLOAD_FOLDER)[0]
     return url_for('files', filename = plot_fn)

 # main function
 @app.route('/', methods=['GET', 'POST'])
 def index():
     plot_url = None
     # test if data posted
     if request.method == 'POST':
         f = request.files['data']
         response = request.form['response']
         # test if file 'data' was uploaded
         if f:
             # save the uploaded file
             filename = secure_filename(f.filename)
             path = os.path.join(app.config['UPLOAD_FOLDER'], filename)
             f.save(path)
             # get R data.frame from the saved file
             dataf = rf.get_dataframe(path)
             # check if response variable is present
             if response in dataf.names:
                 plot_url = plot(dataf, response)
             else:
                 flash('No such response variable', category = 'error')
         else:
             flash('Invalid file extension', category = 'error')
     #
     return render_template('index.html', plot_url = plot_url)
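
The "check if response variable is present" step could also be done before the file reaches R, using the standard `csv` module. This is one possible variant rather than what the application above does; `has_column` is a hypothetical helper and the sample text is made up for illustration.

```python
import csv
import io


def has_column(csv_text, name):
    """Return True if `name` appears in the header row of the CSV text."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    return name in header


sample = "mpg,cyl,disp\n21.0,6,160.0\n22.8,4,108.0\n"
print(has_column(sample, "mpg"), has_column(sample, "weight"))
```

Rejecting a bad request early keeps the embedded R process from parsing a file that cannot be used anyway.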

Showtime...



Next steps




  • Generic library to bridge R to anything with a C API
  • Julia and R (hopefully end of 2012)

Bridge to r

  • 1. Only one language ? R and Python Polyglot applications with R and Python [BARUG Meeting] Laurent Gautier DMAC / CBS November 13th, 2012
  • 2. Only one language ? R and Python Disclaimer • This is not about:
  • 3. Only one language ? R and Python Disclaimer • This is not about: • The comparative merits of scripting languages
  • 4. Only one language ? R and Python Disclaimer • This is not about: • The comparative merits of scripting languages • Only Python & R
  • 5. Only one language ? R and Python Disclaimer • This is not about: • The comparative merits of scripting languages • Only Python & R • This is about:
  • 6. Only one language ? R and Python Disclaimer • This is not about: • The comparative merits of scripting languages • Only Python & R • This is about: • Accessing natively libraries implemented in a different language
  • 7. Only one language ? R and Python Disclaimer • This is not about: • The comparative merits of scripting languages • Only Python & R • This is about: • Accessing natively libraries implemented in a different language • Bridging people and skill sets through a glue language
  • 8. Only one language ? R and Python Disclaimer • This is not about: • The comparative merits of scripting languages • Only Python & R • This is about: • Accessing natively libraries implemented in a different language • Bridging people and skill sets through a glue language • Going from ideas to prototypes faster
  • 9. Only one language ? R and Python Disclaimer • This is not about: • The comparative merits of scripting languages • Only Python & R • This is about: • Accessing natively libraries implemented in a different language • Bridging people and skill sets through a glue language • Going from ideas to prototypes faster • Putting some production into research
  • 10. Only one language ? R and Python Disclaimer • This is not about: • The comparative merits of scripting languages • Only Python & R • This is about: • Accessing natively libraries implemented in a different language • Bridging people and skill sets through a glue language • Going from ideas to prototypes faster • Putting some production into research • Bringing research to production
  • 11. Only one language ? R and Python Preamble Scenario • data people: Statisticians, data analysts
  • 12. Only one language ? R and Python Preamble Scenario • data people: Statisticians, data analysts • Data people have a method M
  • 13. Only one language ? R and Python Preamble Scenario • data people: Statisticians, data analysts • Data people have a method M • Data people want to work on something new
  • 14. Only one language ? R and Python Preamble Scenario • data people: Statisticians, data analysts • Data people have a method M • Data people want to work on something new • Management wants an application for method M
  • 15. Only one language ? R and Python Preamble Scenario • data people: Statisticians, data analysts • Data people have a method M • Data people want to work on something new • Management wants an application for method M • Management wants an application that uses method M
  • 16. Only one language ? Polyglot programs R and Python 1 Only one language ? Polyglot programs 2 R and Python Mapping types Functions Evaluation and memory Building an application
  • 17. Only one language ? Polyglot programs R and Python Uniformity and coding standards
  • 18. Only one language ? Polyglot programs R and Python Uniformity and coding standards
  • 19. Only one language ? Polyglot programs R and Python Uniformity and coding standards
  • 20. Only one language ? Polyglot programs R and Python Get the job done
  • 21. Only one language ? Polyglot programs R and Python Get the job done Soloist
  • 22. Only one language ? Polyglot programs R and Python Get the job done Soloist • Multi-talented individual • Documentation ?
  • 23. Only one language ? Polyglot programs R and Python Get the job done Teamwork Soloist • Multi-talented individual • Documentation ?
  • 24. Only one language ? Polyglot programs R and Python Get the job done Teamwork Soloist • Paired-programming • Multi-talented individual • Use the same tools ? • Documentation ? • Overlapping skills ?
  • 25. Only one language ? Polyglot programs R and Python Monolithic development
  • 26. Only one language ? Polyglot programs R and Python Monolithic development • Centralized • Top-down • Lot of planning • Long development, mostly only usable when complete • Stand in time
  • 27. Only one language ? Polyglot programs R and Python Maintainability Why use one unique language ? • A legitimate managerial concern • In places Java Certifications replaced general programming degrees • Could good programmers matter more than the language ? • Back to finding a needle in a haystack
  • 28. Only one language ? Polyglot programs R and Python Modularity at the heart of UNIX philosophy. |>< • No branching logic, unless going for shell scripts. • Shell script no often thought after for applications • The birth of scripting languages (Sed, Awk, Perl, . . . )
  • 29. Only one language ? Polyglot programs R and Python • Projects are cross-fields, cross-specialization • Cost of specification - design - implementation too high • Especially when the lifespan of the application is too short (or the user base too small).
  • 30. Only one language ? Polyglot programs R and Python Example from video games • Engines (generally in C++) • Scripting language for the ‘story’ and content • Python • Lua • Proprietary, others, . . . • Large projects (with a lot of money at stake) • Diverse competences (3D engine = story logic) • When speed of development is more important than speed of execution This can apply to other industries • Pipelines in visual effects • Bioinformatics
  • 31. Mapping types Only one language ? Functions R and Python Evaluation and memory Building an application 1 Only one language ? Polyglot programs 2 R and Python Mapping types Functions Evaluation and memory Building an application
  • 32. Mapping types Only one language ? Functions R and Python Evaluation and memory Building an application R • Language for statistics, data analysis, and data visualization • Unmatched1 number of libraries for anything having to do with data • Specialized set of libraries for bioinformatics (Bioconductor) 1 Almost certainly
  • 33. Mapping types Only one language ? Functions R and Python Evaluation and memory Building an application Python • All-purpose scripting language • Unmatched2 number of libraries for about anything • Specialized sets of libraries for bioinformatics (Biopython, and a myriad smaller projects) 2 May be
  • 34. Mapping types Only one language ? Functions R and Python Evaluation and memory Building an application Python (continued) • Machine learning R does not have: PyBrain • Visualization tools R does not have: Mayavis, Blender
  • 35. Mapping types Only one language ? Functions R and Python Evaluation and memory Building an application Python is popular in Bioinformatics / DNA sequencing. • Galaxy pipeline/server framework is in Python • Some of the internal tools for the SOLiD are written in Python • Ion Torrent Server is a Django server • Oxford Nanopore control system is a server running Python
  • 36. Mapping types Only one language ? Functions R and Python Evaluation and memory Building an application Why use anything else than R ? • Build an application • Work with very large data • ‘Just because it can be done in R doesn’t mean you should do it’3 3 John Dennison, R Meetup presentation
  • 37. Mapping types Only one language ? Functions R and Python Evaluation and memory Building an application R embedded in Python
  • 38. Mapping types Only one language ? Functions R and Python Evaluation and memory Building an application rpy2 • Feels like a regular Python library • Embeds an R process • Can be thought of as a stateful library
  • 39. Mapping types Only one language ? Functions R and Python Evaluation and memory Building an application Two main parts: • Low-level interface • High-level interface
  • 40. Mapping types Only one language ? Functions R and Python Evaluation and memory Building an application Low-level interface • Close to R’s C-API • Let you do anything safe4 from that API • Expose R data structures as Python builtin structures 4 or so is the intent
  • 41. Mapping types Only one language ? Functions R and Python Evaluation and memory Building an application Types R rpy2 Python numeric FloatSexpVector float integer IntSexpVector int char StrSexpVector str logical BoolSexpVector bool complex ComplexSexpVector complex list ListSexpVector list environment SexpEnvironment dict function SexpClosure function S4 SexpS4 object SexpLang object SexpExtPtr object
  • 42. Mapping types Only one language ? Functions R and Python Evaluation and memory Building an application Vectors and arrays • C-like: Contiguous blocks of memory • R objects exposed to Python as sequences or C-like arrays, with or without copy
  • 43. Mapping types Only one language ? Functions R and Python Evaluation and memory Building an application R v <- seq(1, 10) v[1] # select the first element w <- v + 1 # add 1 to all elts
  • 44. Mapping types Only one language ? Functions R and Python Evaluation and memory Building an application R v <- seq(1, 10) v[1] # select the first element w <- v + 1 # add 1 to all elts rpy2.rinterface import rpy2.rinterface as ri; ri.initr() v = ri.IntSexpVector(range(1, 11)) v[0] # select the first element w = ri.IntSexpVector([x+1 for x in v])
  • 45. Mapping types Only one language ? Functions R and Python Evaluation and memory Building an application R v <- seq(1, 10) v[1] # select the first element w <- v + 1 # add 1 to all elts rpy2.rinterface import rpy2.rinterface as ri; ri.initr() v = ri.IntSexpVector(range(1, 11)) v[0] # select the first element w = ri.IntSexpVector([x+1 for x in v]) rpy2.robjects import rpy2.robjects as ro v = ro.IntVector(range(1, 11)) v[0] # select the first element w = v.ro + 1
  • 46. Mapping types Only one language ? Functions R and Python Evaluation and memory Building an application Missing values NaN: numeric data type value representing an undefined or unrepresentable value, especially in floating-point calculations. • Also used for missing values. • Is a standard. NA: • Used for missing values by R. • Not a standard. • Pitfall when passing data to C without copy/checks • Applies to any C libraries (includes rpy2)
  • 47. Mapping types Only one language ? Functions R and Python Evaluation and memory Building an application Functions R functions can be called as if they were Python functions import rpy2.robjects as ro f = ro.r("function(x, y) { 2 * (x + y) }") f(1, 2) • conversion on-the-fly • translated signatures (dot-to-underscore)
  • 48. Mapping types Only one language ? Functions R and Python Evaluation and memory Building an application Packages and modules R Namespaces attached to the search path > searchpaths() [1] ".GlobalEnv" [2] "/usr/local/packages/R/2.15/lib/R/library/stats" [3] "/usr/local/packages/R/2.15/lib/R/library/graphics" [4] "/usr/local/packages/R/2.15/lib/R/library/grDevices" [5] "/usr/local/packages/R/2.15/lib/R/library/utils" [6] "/usr/local/packages/R/2.15/lib/R/library/datasets" [7] "/usr/local/packages/R/2.15/lib/R/library/methods" [8] "Autoloads" [9] "/usr/local/packages/R/2.15/lib/R/library/base" Python Python modules as namespaces import os os.path.basename(’/path/to/a/file’)
  • 49. Mapping types Only one language ? Functions R and Python Evaluation and memory Building an application R packages (almost) as Python modules from rpy2.robjects.packages import importr stats = importr(’stats’) # PCA ! pc = stats.prcomp(m)
  • 50. Mapping types Only one language ? Functions R and Python Evaluation and memory Building an application R scripts as modules ! from rpy2.robjects.packages import SignatureTranslatedAnonymousPackage # R code in a file as a package with open(’rflib.R’) as f: code = ’’.join(f.readlines()) rf = SignatureTranslatedAnonymousPackage(code, "rf") imp = rf.get_importance(dataf, response)
  • 51. Mapping types Only one language ? Functions R and Python Evaluation and memory Building an application R environments • Associate symbols to objects • Exposed as Python dictionaries (key - value) R env <- new.env() assign(’x’, 123, envir = env) y <- 456 Python import rpy2.robjects as ro env = ro.Environment() env[’x’] = 123 ro.globalenv[’y’] = 456
  • 52. Mapping types Only one language ? Functions R and Python Evaluation and memory Building an application R and callback functions Common R idiom # m: matrix of numerical values f <- function(x) sum(x[x > 0]) res <- apply(m, 1, f) How to do that with rpy2 ?
  • 53. Mapping types Only one language ? Functions R and Python Evaluation and memory Building an application R and callback functions Common R idiom # m: matrix of numerical values f <- function(x) sum(x[x > 0]) res <- apply(m, 1, f) How to do that with rpy2 ? import rpy2.interactive as r import rpy2.rinterface as ri r_code = """ function(x) sum(x[x > 0]) """ tmp = ri.parse(r_code) eval = r.packages.base.eval r_func = eval(tmp) r.base.apply(m, 1, r_func)
  • 54. Mapping types Only one language ? Functions R and Python Evaluation and memory Building an application R and callback functions Common R idiom # m: matrix of numerical values f <- function(x) sum(x[x > 0]) res <- apply(m, 1, f) How to do that with rpy2 ? import rpy2.interactive as r import rpy2.rinterface as ri def tmp(x): gnr = elt for elt in x if elt > 0 return sum(gnr) r_func = ri.rternalize(tmp) r.base.apply(m, 1, r_func)
  • 55. Mapping types Only one language ? Functions R and Python Evaluation and memory Building an application Evaluation strategies R • Pass-by-value / Call-by-value • Modifying an object locally is always safe • Unncessary copies Python • Pass-by-reference • Explicit request if copy Rpy2 exposes R as if it was pass-by-reference
  • 56. Mapping types Only one language ? Functions R and Python Evaluation and memory Building an application Python from rpy2.robjects.vectors import IntVector def f(x): x[0] = 123 v = ro.IntVector(range(1, 11)) f(v) R f <- function(x) { x[0] = 123 return(x) } v = seq(1, 11) v = f(v)
  • 57. Mapping types Only one language ? Functions R and Python Evaluation and memory Building an application Memory management and garbage collection
  • 58. Mapping types Only one language ? Functions R and Python Evaluation and memory Building an application Memory management and garbage collection R • Tracing GC (check for reachability) • R PreciousList
  • 59. Mapping types Only one language ? Functions R and Python Evaluation and memory Building an application Memory management and garbage collection R • Tracing GC (check for reachability) • R PreciousList Python • Reference counting • Tracing GC
  • 60. Mapping types Only one language ? Functions R and Python Evaluation and memory Building an application Memory management and garbage collection R • Tracing GC (check for reachability) • R PreciousList Python • Reference counting • Tracing GC • Bridge different memory models • Intermediate reference counting of R objects exposed • That part could become very generic.
  • 61. Mapping types Only one language ? Functions R and Python Evaluation and memory Building an application R objects exposed to R import rpy2.rinterface as ri ri.initr() baseenv = ri.baseenv letters = baseenv.get(’letters’)
  • 62. Mapping types Only one language ? Functions R and Python Evaluation and memory Building an application R objects exposed to R (not so simple) import rpy2.rinterface as ri ri.initr() base = ri.baseenv letters = base[’letters’]
  • 63. Mapping types Only one language ? Functions R and Python Evaluation and memory Building an application >>> letters = base[’letters’] >>> letters.rid # varies 123456 >>> letters.__sexp_refcount__ 1 >>> letters2 = base[’letters’] >>> letters2.__sexp_refcount__ 2 >>> letters.__sexp_refcount__ 2 >>> letters_2.rid # same R ID 123456
  • 64. Mapping types Only one language ? Functions R and Python Evaluation and memory Building an application Exceptions RRuntimeError: error while evaluating R code KeyError: symbol not found in an environment ValueError: invalid value passed to an rpy2 function
  • 65. Mapping types Only one language ? Functions R and Python Evaluation and memory Building an application Performances function(x) { total = 0; for (elt in x) { total <- total + elt } } Function Sequence Speedup R 1.00 R compiled 6.52 R builtin 329.29 pure python FloatVector 0.51 builtin python FloatVector 0.54 pure python SexpVector 7.45 builtin python SexpVector 20.92 builtin python array.array 53.62 builtin python list 90.47 R through rpy2 can be faster than R
  • 66. Let’s build a web application
      • Why do that?
          • Allow access to computing resources
          • Use the UI of the browser
          • Good example
      • Micro web framework: Flask
  • 67. Hello world with Flask
      from flask import Flask
      app = Flask(__name__)

      @app.route('/')
      def hello_world():
          return 'Hello World!'

      if __name__ == '__main__':
          app.run()

      python hello.py
  • 68. Importance of variables with random forest
      1. Data in a CSV file
      2. Use R to compute a random forest and the importance of variables
      3. Make a pretty plot with ggplot2
  • 70.
      ## data
      dataf <- read.csv("/some/data/file.csv")
      response <- 'var_name'

      ## importance of variables
      library(randomForest)
      get_importance <- function(dataf, response) {
        fmla <- formula(paste(response, '~ .'))
        dataf_rf <- randomForest(fmla, data = dataf,
                                 keep.forest = FALSE,
                                 importance = TRUE)
        imp <- importance(dataf_rf, type = 1)
        imp <- as.data.frame(imp[order(imp[,1]), , drop=FALSE])
        return(imp)
      }
      imp <- get_importance(dataf, response)

      ## plot
      library(ggplot2)
      get_plot <- function(imp) {
        rn <- rownames(imp)
        rn <- factor(rn, levels=rn, ordered=TRUE)
        imp <- cbind(as.data.frame(imp), rn = rn)
        p = ggplot(imp) +
          ## stat = "identity": bar heights are taken from the data
          geom_bar(aes(y = `%IncMSE`, x = rn, fill = `%IncMSE`),
                   stat = "identity") +
          scale_x_discrete("Variables")
        return(p)
      }
      p <- get_plot(imp)
      print(p)

      [Figure: bar plot of %IncMSE per variable (x axis "Variables": gear, qsec, carb, drat, am, vs, cyl, wt, hp, disp), bars filled by %IncMSE]
  • 71. R library
      get_dataframe <- function(filename) {
        return(read.csv(filename))
      }

      ## importance of variables
      library(randomForest)
      get_importance <- function(dataf, response) {
        fmla <- formula(paste(response, '~ .'))
        dataf_rf <- randomForest(fmla, data = dataf,
                                 keep.forest = FALSE,
                                 importance = TRUE)
        imp <- importance(dataf_rf, type = 1)
        imp <- as.data.frame(imp[order(imp[,1]), , drop=FALSE])
        return(imp)
      }
  • 72. R library
      ## plot
      library(ggplot2)
      get_plot <- function(imp) {
        rn <- rownames(imp)
        rn <- factor(rn, levels=rn, ordered=TRUE)
        imp <- cbind(as.data.frame(imp), rn = rn)
        p = ggplot(imp) +
          ## stat = "identity": bar heights are taken from the data
          geom_bar(aes(y = `%IncMSE`,
                       x = rn,
                       fill = `%IncMSE`),
                   stat = "identity") +
          scale_x_discrete("Variables")
        return(p)
      }

      ## write the plot to a PNG file in dir and return the file's base name
      make_PNGplot <- function(imp, dir) {
        filename <- tempfile(tmpdir = dir, fileext = '.png')
        p <- get_plot(imp)
        png(filename)
        print(p)
        dev.off()
        return(basename(filename))
      }
  • 73. Python application
      import os
      from flask import Flask, render_template, flash
      from flask import url_for, send_from_directory
      from flask import request
      from werkzeug.utils import secure_filename
      from rpy2.robjects.packages import SignatureTranslatedAnonymousPackage

      UPLOAD_FOLDER = '/tmp'

      # R code as a package
      with open('rflib.R') as f:
          code = f.read()
      rf = SignatureTranslatedAnonymousPackage(code, "rf")
  • 75.
      # create application
      app = Flask(__name__)
      app.secret_key = 'change this !!!'
      app.config['UPLOAD_FOLDER'] = UPLOAD_FOLDER

      # serve files
      @app.route('/files/<filename>')
      def files(filename):
          return send_from_directory(UPLOAD_FOLDER,
                                     filename)

      def plot(dataf, response):
          # compute importance of variables
          imp = rf.get_importance(dataf, response)
          # plot into a file
          plot_fn = rf.make_PNGplot(imp, UPLOAD_FOLDER)[0]
          return url_for('files', filename=plot_fn)
  • 76.
      # main function
      @app.route('/', methods=['GET', 'POST'])
      def index():
          plot_url = None
          # test if data were posted
          if request.method == 'POST':
              f = request.files['data']
              response = request.form['response']
              # test if a file 'data' was uploaded
              if f:
                  # save the uploaded file
                  filename = secure_filename(f.filename)
                  path = os.path.join(app.config['UPLOAD_FOLDER'], filename)
                  f.save(path)
                  # get R data.frame from the saved file
                  dataf = rf.get_dataframe(path)
                  # check if response variable is present
                  if response in dataf.names:
                      plot_url = plot(dataf, response)
                  else:
                      flash('No such response variable', category='error')
              else:
                  flash('Invalid file extension', category='error')
          return render_template('index.html', plot_url=plot_url)
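The "check if response variable is present" step could also be done on the Python side before any R code runs, for instance to reject a bad upload early. The sketch below is an assumption about how one might do that with the standard library only; `csv_has_column` is a hypothetical helper, not part of the application above. Note that R's `read.csv` may rewrite column names by default, so the two checks are not strictly equivalent.

```python
import csv
import io


def csv_has_column(text, column):
    """Return True if `column` appears in the header row of the CSV `text`."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader, [])  # empty input yields an empty header
    return column in header


sample = "mpg,cyl,disp\n21.0,6,160.0\n22.8,4,108.0\n"
print(csv_has_column(sample, "mpg"))
print(csv_has_column(sample, "hp"))
```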
  • 77. Showtime...
  • 78. Next steps
      • Generic library to bridge R to anything with a C API
      • Julia and R (hopefully end of 2012)