SlideShare a Scribd company logo
1 of 35
Performing statistical analyses using 
the Rshell processor 
Stian Soiland-Reyes and Christian Brenninkmeijer 
University of Manchester 
Original material by Peter Li, University of Birmingham, UK 
Adapted by Norman Morrison 
Bonn University, 2014-09-01 
http://www.taverna.org.uk/ 
This work is licensed under a 
Creative Commons Attribution 3.0 Unported License
Introduction 
 R is a popular scripting language oriented towards 
statistical computing 
 There are a large number of modules that add 
functionality to R such as BioConductor and rCDK 
 The Rshell service in Taverna allows workflows to 
include services that run R scripts on an installation of R 
 R can be located on the same machine as you use to 
run the workflow, or on a different machine 
 To allow Taverna to talk to the R installation, Rserve 
must also be running on the same machine as R
R Pages on the Taverna Wiki 
 Installation of a local R Server may be too 
much for today’s tutorial 
 In which case just read through the slides to 
see if Installing R is something you 
need/want 
 More information on Taverna and R can be 
found at: 
 http://dev.mygrid.org.uk/wiki/display/tav250/Rshell
Installation of R 
 Documentation available from: 
 http://cran.r-project.org/doc/manuals/R-admin.html 
 Windows 
 Download executable file 
 http://cran.r-project.org/bin/windows/base/ 
 Linux 
 Depends on the version of linux 
 http://cran.r-project.org/ 
 Mac 
 Download pkg file 
 http://cran.r-project.org/bin/macosx/
Installation of Rserve 
 After installing R, the easiest way to install Rserve is to 
install it from CRAN. Simply use in R: 
 install.packages("Rserve”); 
 Since Rserve comes as an R package, you can start 
Rserve within R by typing: 
 library(Rserve); 
 Rserve(); 
 Please note that if you get an error (Fatal error: you must 
specify '--save', '--no-save' or '--vanilla'), then start Rserve with the 
following command 
 Rserve(args="--no-save")
Configuration of Rserve 
 Rserve is configured by the configuration file located at 
/etc/Rserv.conf 
 Configuration of Rserve on your R installation has 
already done using a Rserv.conf file 
 Documentation on configuring Rserve 
 http://www.rforge.net/Rserve/doc.html#conf
An example statistical analysis 
 To illustrate how to use the Rshell service, we will 
carry out a simple statistical analysis on a small 
hypothetical set of species incidence data from 4 
species measured from 6 sites: 
Site 
Species N1 N2 A1 A2 B1 B2 
Species_A 90 110 190 210 290 310 
Species_B 190 210 390 410 590 610 
Species_C 90 110 110 90 120 80 
Species_D 200 100 400 90 600 200
An example statistical analysis 
 This data set can be found in a comma-separated file 
named biodiv_R_testdata.csv in the myExperiment 
group under “Biodiversity Test Data for Rservice 
Tutorial”. Download the file. 
 To analyse this data set using Rshell, the data has to be 
loaded into memory as part of the workflow 
 This can achieved by using the Read_Text_File service 
available in the Local services/io folder of the service 
palette (as shown on the next slide)
An example statistical analysis 
 Drag this service onto the workflow diagram and 
link it to a String constant containing the path to 
the testdata.csv file (as shown on the next slide)
An example statistical analysis
Adding an Rshell service 
 Test this service works by attaching its output port 
to a workflow output and running the workflow 
 Now add an Rshell service to a workflow by 
locating it under Service templates in the Service 
panel and dragging it onto the workflow diagram
Configuring a Rshell service 
 A window will appear to configure the use of the 
Rshell service 
 The configuration of the Rshell service is split 
into several tabs 
 Each tab has Apply and Close buttons at the 
bottom. Apply saves the configuration as shown 
in the tabs, and Close closes the configuration 
dialog
Configuring a Rshell service
Configuring a Rshell service 
 The first tab of the Rshell configuration is used 
to enter the R script that will be executed 
 We will use an R script that will perform a series 
of t-tests to see if species incidence differs 
significantly between site A and site B. 
 You should be careful about performing a t-test 
on as little as 2 replicates - this example is just 
for illustrative purposes
R script 
#Read in data 
incdata <- read.table(file=data,head=TRUE,sep=","); 
#Perform t-tests on all species between sites A and B 
pvalues <- apply(incdata, 1, function(x) { t.test(x[3:4], x[5:6]) 
$p.value }); 
#Write results into a matrix containing incidence data and p-values 
combined <- cbind(incdata[3:4], incdata[5:6], pvalues); 
#Output data 
write.csv(combined, file = results_table, row.names = TRUE); 
Copy and paste the above script into the Script tab of the 
Rshell configuration box
Rshell input and output ports 
 Input and output ports are the connection points 
between the rest of the workflow and the Rshell 
service 
 Rshell makes input ports available as variables 
in the script named after the port. 
 Output ports read their named variable after 
executing the script. The last assigned value to 
the variable will be the one returned from the 
service via the output port.
Rshell input and output ports 
 To add an input port: 
 Select the Input ports tab from the Rshell 
configuration dialog 
 Click Add port button 
 Enter the name of the input port, for this example 
use ‘data’ 
 Specify the input port type, for this example use 
‘Text-file’
Rshell input and output ports
Rshell input and output ports 
 The input port type indicates the data type this variable will have 
within the R-script. The possible types for R input ports are: 
 Logical 
 Numeric 
 Integer 
 String 
 Logical vector 
 Numeric vector 
 Integer vector 
 String vector 
 Text-file
Rshell input and output ports 
 An output port can be added in a similar way: 
 Select the Output ports tab from the Rshell 
configuration dialog 
 Click Add port button 
 Enter the name of the output port, for this example 
use ‘results_table’ 
 Specify the output port type, for this example use 
‘Text-file’
Rshell input and output ports
Rshell input and output ports 
 The output port type indicates the type this variable has within the 
R-script. The possible types for R output ports are: 
 Numeric 
 Integer 
 String 
 Logical vector 
 Numeric vector 
 Integer vector 
 String vector 
 Text-file
Rshell connection settings 
 Configuration of the connection parameters for Rserve is done 
using the Connection settings tab. This tab can be used to: 
 Configure the Rshell to use an Rserve installation on a different machine to 
where you run the Taverna workbench 
 Configure the access of Rserve on a different port 
 Provide authentication details for accessing Rserve in the form of a 
username and password 
 If you are using Rserve on the same machine that you are running 
Taverna on then you probably do not need to change the 
connection settings
Rshell connection settings
Completing the workflow 
 To complete the workflow: 
 Attach the output port of the Read_text_file to the 
“data” input port of the Rshell service 
 Create a workflow output from the results_table output 
port of the Rshell service 
 Your workflow should now look as follows:
Completing the workflow
Results 
When you run the workflow, you should get the following results:
Results 
Try saving the results as a csv file and opening the file in Excel. You 
see something as follows:
Results 
 The results show that: 
 The incidence of species b is significantly different at 
0.01 level 
 The incidence of species a is not significant at 0.01 
level 
 The incidence of Species c and d have no significant 
difference. For species d that is because even though 
an increasing trend is observed, the variation within 
each category is too high to allow any conclusions.
Output of images from Rshell 
 The results from R scripts may be in the form of images 
such as a plot or a graph 
 These images can be output from an Rshell service
Output of images from Rshell 
 Add another Rshell processor onto the current workflow 
from the service template folder 
 Provide the processor with the R script below:
Output of images from Rshell 
 #Read in data 
 incdata <- read.table(file="data",head=TRUE,sep=","); 
 #Transpose 
 t <- t(incdata); 
 #Calculate means 
 mean_a <- mean(t[,"Species_A"]); 
 mean_b <- mean(t[,"Species_B"]); 
 mean_c <- mean(t[,"Species_C"]); 
 mean_d <- mean(t[,"Species_D"]); 
 #Combine data 
 means <- c(mean_a, mean_b, mean_c, mean_d); 
 #Transform to data frame 
 means <- data.frame(means, row.names = c("Species_A", "Species_B", 
"Species_C", "Species_D")); 
 png(filename=figure, height=400, width=400, bg="white"); 
 #Plot 
 barplot(t(means[1]), main = "mean species incidence levels", xlab = 
"Species"); 
 dev.off(); 

Output of images from Rshell 
 Complete the configuration for this Rshell processor by: 
 Creating an input port called data and associating it with a 
Text-file data type 
 Creating an output port called figure and associating it with a 
png-file data type 
 Also, finish building the workflow by connecting a 
workflow output to the figure output port of the Rshell 
processor 
 Your workflow will now look as follows:
Output of images from Rshell
Output of images from Rshell 
 Now run the workflow. You should get the following 
results: 
 The result is an 
image showing a 
bar plot of the 
mean incidence 
levels of the four 
species

More Related Content

What's hot

2014 Taverna Tutorial Nested workflows
2014 Taverna Tutorial Nested workflows2014 Taverna Tutorial Nested workflows
2014 Taverna Tutorial Nested workflowsmyGrid team
 
2014 Taverna tutorial Advanced Taverna
2014 Taverna tutorial Advanced Taverna2014 Taverna tutorial Advanced Taverna
2014 Taverna tutorial Advanced TavernamyGrid team
 
IMPACT/myGrid Hackathon - Introduction to Taverna
IMPACT/myGrid Hackathon - Introduction to TavernaIMPACT/myGrid Hackathon - Introduction to Taverna
IMPACT/myGrid Hackathon - Introduction to TavernaIMPACT Centre of Competence
 
6.2\9 SSIS 2008R2_Training - DataFlow Transformations
6.2\9 SSIS 2008R2_Training - DataFlow Transformations6.2\9 SSIS 2008R2_Training - DataFlow Transformations
6.2\9 SSIS 2008R2_Training - DataFlow TransformationsPramod Singla
 
Less12 3 e_loadmodule_2
Less12 3 e_loadmodule_2Less12 3 e_loadmodule_2
Less12 3 e_loadmodule_2Suresh Mishra
 
Less10 2 e_testermodule_9
Less10 2 e_testermodule_9Less10 2 e_testermodule_9
Less10 2 e_testermodule_9Suresh Mishra
 

What's hot (10)

2014 Taverna Tutorial Nested workflows
2014 Taverna Tutorial Nested workflows2014 Taverna Tutorial Nested workflows
2014 Taverna Tutorial Nested workflows
 
2014 Taverna tutorial Advanced Taverna
2014 Taverna tutorial Advanced Taverna2014 Taverna tutorial Advanced Taverna
2014 Taverna tutorial Advanced Taverna
 
Lecture14
Lecture14Lecture14
Lecture14
 
IMPACT/myGrid Hackathon - Introduction to Taverna
IMPACT/myGrid Hackathon - Introduction to TavernaIMPACT/myGrid Hackathon - Introduction to Taverna
IMPACT/myGrid Hackathon - Introduction to Taverna
 
2310 b 03
2310 b 032310 b 03
2310 b 03
 
70 562
70 56270 562
70 562
 
6.2\9 SSIS 2008R2_Training - DataFlow Transformations
6.2\9 SSIS 2008R2_Training - DataFlow Transformations6.2\9 SSIS 2008R2_Training - DataFlow Transformations
6.2\9 SSIS 2008R2_Training - DataFlow Transformations
 
Less12 3 e_loadmodule_2
Less12 3 e_loadmodule_2Less12 3 e_loadmodule_2
Less12 3 e_loadmodule_2
 
Less10 2 e_testermodule_9
Less10 2 e_testermodule_9Less10 2 e_testermodule_9
Less10 2 e_testermodule_9
 
Django crush course
Django crush course Django crush course
Django crush course
 

Similar to 2014 Taverna Tutorial R script

A Handbook Of Statistical Analyses Using R
A Handbook Of Statistical Analyses Using RA Handbook Of Statistical Analyses Using R
A Handbook Of Statistical Analyses Using RNicole Adams
 
Get started with R lang
Get started with R langGet started with R lang
Get started with R langsenthil0809
 
R-Language-Lab-Manual-lab-1.pdf
R-Language-Lab-Manual-lab-1.pdfR-Language-Lab-Manual-lab-1.pdf
R-Language-Lab-Manual-lab-1.pdfKabilaArun
 
R-Language-Lab-Manual-lab-1.pdf
R-Language-Lab-Manual-lab-1.pdfR-Language-Lab-Manual-lab-1.pdf
R-Language-Lab-Manual-lab-1.pdfattalurilalitha
 
Data Science - Part II - Working with R & R studio
Data Science - Part II -  Working with R & R studioData Science - Part II -  Working with R & R studio
Data Science - Part II - Working with R & R studioDerek Kane
 
Language-agnostic data analysis workflows and reproducible research
Language-agnostic data analysis workflows and reproducible researchLanguage-agnostic data analysis workflows and reproducible research
Language-agnostic data analysis workflows and reproducible researchAndrew Lowe
 
Introduction to Microsoft R Services
Introduction to Microsoft R ServicesIntroduction to Microsoft R Services
Introduction to Microsoft R ServicesGregg Barrett
 
Modeling in R Programming Language for Beginers.ppt
Modeling in R Programming Language for Beginers.pptModeling in R Programming Language for Beginers.ppt
Modeling in R Programming Language for Beginers.pptanshikagoel52
 
DATA MINING USING R (1).pptx
DATA MINING USING R (1).pptxDATA MINING USING R (1).pptx
DATA MINING USING R (1).pptxmyworld93
 
Workshop presentation hands on r programming
Workshop presentation hands on r programmingWorkshop presentation hands on r programming
Workshop presentation hands on r programmingNimrita Koul
 
1_Introduction.pptx
1_Introduction.pptx1_Introduction.pptx
1_Introduction.pptxranapoonam1
 
Revolution Analytics
Revolution AnalyticsRevolution Analytics
Revolution Analyticstempledf
 
Quantitative Data Analysis using R
Quantitative Data Analysis using RQuantitative Data Analysis using R
Quantitative Data Analysis using RTaddesse Kassahun
 

Similar to 2014 Taverna Tutorial R script (20)

Unit 3
Unit 3Unit 3
Unit 3
 
A Handbook Of Statistical Analyses Using R
A Handbook Of Statistical Analyses Using RA Handbook Of Statistical Analyses Using R
A Handbook Of Statistical Analyses Using R
 
Get started with R lang
Get started with R langGet started with R lang
Get started with R lang
 
R-Language-Lab-Manual-lab-1.pdf
R-Language-Lab-Manual-lab-1.pdfR-Language-Lab-Manual-lab-1.pdf
R-Language-Lab-Manual-lab-1.pdf
 
R-Language-Lab-Manual-lab-1.pdf
R-Language-Lab-Manual-lab-1.pdfR-Language-Lab-Manual-lab-1.pdf
R-Language-Lab-Manual-lab-1.pdf
 
R-Language-Lab-Manual-lab-1.pdf
R-Language-Lab-Manual-lab-1.pdfR-Language-Lab-Manual-lab-1.pdf
R-Language-Lab-Manual-lab-1.pdf
 
Data Science - Part II - Working with R & R studio
Data Science - Part II -  Working with R & R studioData Science - Part II -  Working with R & R studio
Data Science - Part II - Working with R & R studio
 
Language-agnostic data analysis workflows and reproducible research
Language-agnostic data analysis workflows and reproducible researchLanguage-agnostic data analysis workflows and reproducible research
Language-agnostic data analysis workflows and reproducible research
 
Ddl
DdlDdl
Ddl
 
Introduction to Microsoft R Services
Introduction to Microsoft R ServicesIntroduction to Microsoft R Services
Introduction to Microsoft R Services
 
Lecture1_R.ppt
Lecture1_R.pptLecture1_R.ppt
Lecture1_R.ppt
 
Lecture1_R.ppt
Lecture1_R.pptLecture1_R.ppt
Lecture1_R.ppt
 
Lecture1 r
Lecture1 rLecture1 r
Lecture1 r
 
Modeling in R Programming Language for Beginers.ppt
Modeling in R Programming Language for Beginers.pptModeling in R Programming Language for Beginers.ppt
Modeling in R Programming Language for Beginers.ppt
 
DATA MINING USING R (1).pptx
DATA MINING USING R (1).pptxDATA MINING USING R (1).pptx
DATA MINING USING R (1).pptx
 
Workshop presentation hands on r programming
Workshop presentation hands on r programmingWorkshop presentation hands on r programming
Workshop presentation hands on r programming
 
Unit 1 - R Programming (Part 2).pptx
Unit 1 - R Programming (Part 2).pptxUnit 1 - R Programming (Part 2).pptx
Unit 1 - R Programming (Part 2).pptx
 
1_Introduction.pptx
1_Introduction.pptx1_Introduction.pptx
1_Introduction.pptx
 
Revolution Analytics
Revolution AnalyticsRevolution Analytics
Revolution Analytics
 
Quantitative Data Analysis using R
Quantitative Data Analysis using RQuantitative Data Analysis using R
Quantitative Data Analysis using R
 

More from myGrid team

2014 Taverna Tutorial Introduction to eScience and workflows
2014 Taverna Tutorial Introduction to eScience and workflows2014 Taverna Tutorial Introduction to eScience and workflows
2014 Taverna Tutorial Introduction to eScience and workflowsmyGrid team
 
2014 Taverna Tutorial Biodiversity example
2014 Taverna Tutorial Biodiversity example2014 Taverna Tutorial Biodiversity example
2014 Taverna Tutorial Biodiversity examplemyGrid team
 
2014 Taverna Tutorial Components
2014 Taverna Tutorial Components2014 Taverna Tutorial Components
2014 Taverna Tutorial ComponentsmyGrid team
 
2014 Taverna Tutorial Interactions
2014 Taverna Tutorial Interactions2014 Taverna Tutorial Interactions
2014 Taverna Tutorial InteractionsmyGrid team
 
2014 Taverna tutorial Xpath
2014 Taverna tutorial Xpath2014 Taverna tutorial Xpath
2014 Taverna tutorial XpathmyGrid team
 
SWeDe - Scientific Webservice Description
SWeDe - Scientific Webservice DescriptionSWeDe - Scientific Webservice Description
SWeDe - Scientific Webservice DescriptionmyGrid team
 
Taverna workflows in the cloud
Taverna workflows in the cloudTaverna workflows in the cloud
Taverna workflows in the cloudmyGrid team
 
The Taverna Software Suite
The Taverna Software SuiteThe Taverna Software Suite
The Taverna Software SuitemyGrid team
 
The Taverna Workflow Management Software Suite - Past, Present, Future
The Taverna Workflow Management Software Suite - Past, Present, FutureThe Taverna Workflow Management Software Suite - Past, Present, Future
The Taverna Workflow Management Software Suite - Past, Present, FuturemyGrid team
 
2014-06-03-Taverna-IS-ENES2
2014-06-03-Taverna-IS-ENES22014-06-03-Taverna-IS-ENES2
2014-06-03-Taverna-IS-ENES2myGrid team
 
The beauty of workflows and models
The beauty of workflows and modelsThe beauty of workflows and models
The beauty of workflows and modelsmyGrid team
 
If we build it will they come?
If we build it will they come?If we build it will they come?
If we build it will they come?myGrid team
 

More from myGrid team (13)

Taverna summary
Taverna summaryTaverna summary
Taverna summary
 
2014 Taverna Tutorial Introduction to eScience and workflows
2014 Taverna Tutorial Introduction to eScience and workflows2014 Taverna Tutorial Introduction to eScience and workflows
2014 Taverna Tutorial Introduction to eScience and workflows
 
2014 Taverna Tutorial Biodiversity example
2014 Taverna Tutorial Biodiversity example2014 Taverna Tutorial Biodiversity example
2014 Taverna Tutorial Biodiversity example
 
2014 Taverna Tutorial Components
2014 Taverna Tutorial Components2014 Taverna Tutorial Components
2014 Taverna Tutorial Components
 
2014 Taverna Tutorial Interactions
2014 Taverna Tutorial Interactions2014 Taverna Tutorial Interactions
2014 Taverna Tutorial Interactions
 
2014 Taverna tutorial Xpath
2014 Taverna tutorial Xpath2014 Taverna tutorial Xpath
2014 Taverna tutorial Xpath
 
SWeDe - Scientific Webservice Description
SWeDe - Scientific Webservice DescriptionSWeDe - Scientific Webservice Description
SWeDe - Scientific Webservice Description
 
Taverna workflows in the cloud
Taverna workflows in the cloudTaverna workflows in the cloud
Taverna workflows in the cloud
 
The Taverna Software Suite
The Taverna Software SuiteThe Taverna Software Suite
The Taverna Software Suite
 
The Taverna Workflow Management Software Suite - Past, Present, Future
The Taverna Workflow Management Software Suite - Past, Present, FutureThe Taverna Workflow Management Software Suite - Past, Present, Future
The Taverna Workflow Management Software Suite - Past, Present, Future
 
2014-06-03-Taverna-IS-ENES2
2014-06-03-Taverna-IS-ENES22014-06-03-Taverna-IS-ENES2
2014-06-03-Taverna-IS-ENES2
 
The beauty of workflows and models
The beauty of workflows and modelsThe beauty of workflows and models
The beauty of workflows and models
 
If we build it will they come?
If we build it will they come?If we build it will they come?
If we build it will they come?
 

Recently uploaded

The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Steffen Staab
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantAxelRicardoTrocheRiq
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsAndolasoft Inc
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.
 
Active Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdfActive Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdfCionsystems
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AIABDERRAOUF MEHENNI
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionSolGuruz
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 

Recently uploaded (20)

The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Exploring iOS App Development: Simplifying the Process
Exploring iOS App Development: Simplifying the ProcessExploring iOS App Development: Simplifying the Process
Exploring iOS App Development: Simplifying the Process
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
Active Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdfActive Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdf
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdf
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 

2014 Taverna Tutorial R script

  • 1. Performing statistical analyses using the Rshell processor Stian Soiland-Reyes and Christian Brenninkmeijer University of Manchester Original material by Peter Li, University of Birmingham, UK Adapted by Norman Morrison Bonn University, 2014-09-01 http://www.taverna.org.uk/ This work is licensed under a Creative Commons Attribution 3.0 Unported License
  • 2. Introduction  R is a popular scripting language oriented towards statistical computing  There are a large number of modules that add functionality to R such as BioConductor and rCDK  The Rshell service in Taverna allows workflows to include services that run R scripts on an installation of R  R can be located on the same machine as you use to run the workflow, or on a different machine  To allow Taverna to talk to the R installation, Rserve must also be running on the same machine as R
  • 3. R Pages on the Taverna Wiki  Installation of a local R Server may be too much for today’s tutorial  In which case just read through the slides to see if Installing R is something you need/want  More information on Taverna and R can be found at:  http://dev.mygrid.org.uk/wiki/display/tav250/Rshell
  • 4. Installation of R  Documentation available from:  http://cran.r-project.org/doc/manuals/R-admin.html  Windows  Download executable file  http://cran.r-project.org/bin/windows/base/  Linux  Depends on the version of linux  http://cran.r-project.org/  Mac  Download pkg file  http://cran.r-project.org/bin/macosx/
  • 5. Installation of Rserve  After installing R, the easiest way to install Rserve is to install it from CRAN. Simply use in R:  install.packages("Rserve”);  Since Rserve comes as an R package, you can start Rserve within R by typing:  library(Rserve);  Rserve();  Please note that if you get an error (Fatal error: you must specify '--save', '--no-save' or '--vanilla'), then start Rserve with the following command  Rserve(args="--no-save")
  • 6. Configuration of Rserve  Rserve is configured by the configuration file located at /etc/Rserv.conf  Configuration of Rserve on your R installation has already done using a Rserv.conf file  Documentation on configuring Rserve  http://www.rforge.net/Rserve/doc.html#conf
  • 7. An example statistical analysis  To illustrate how to use the Rshell service, we will carry out a simple statistical analysis on a small hypothetical set of species incidence data from 4 species measured from 6 sites: Site Species N1 N2 A1 A2 B1 B2 Species_A 90 110 190 210 290 310 Species_B 190 210 390 410 590 610 Species_C 90 110 110 90 120 80 Species_D 200 100 400 90 600 200
  • 8. An example statistical analysis  This data set can be found in a comma-separated file named biodiv_R_testdata.csv in the myExperiment group under “Biodiversity Test Data for Rservice Tutorial”. Download the file.  To analyse this data set using Rshell, the data has to be loaded into memory as part of the workflow  This can achieved by using the Read_Text_File service available in the Local services/io folder of the service palette (as shown on the next slide)
  • 9. An example statistical analysis  Drag this service onto the workflow diagram and link it to a String constant containing the path to the testdata.csv file (as shown on the next slide)
  • 11. Adding an Rshell service  Test this service works by attaching its output port to a workflow output and running the workflow  Now add an Rshell service to a workflow by locating it under Service templates in the Service panel and dragging it onto the workflow diagram
  • 12. Configuring a Rshell service  A window will appear to configure the use of the Rshell service  The configuration of the Rshell service is split into several tabs  Each tab has Apply and Close buttons at the bottom. Apply saves the configuration as shown in the tabs, and Close closes the configuration dialog
  • 14. Configuring a Rshell service  The first tab of the Rshell configuration is used to enter the R script that will be executed  We will use an R script that will perform a series of t-tests to see if species incidence differs significantly between site A and site B.  You should be careful about performing a t-test on as little as 2 replicates - this example is just for illustrative purposes
  • 15. R script #Read in data incdata <- read.table(file=data,head=TRUE,sep=","); #Perform t-tests on all species between sites A and B pvalues <- apply(incdata, 1, function(x) { t.test(x[3:4], x[5:6]) $p.value }); #Write results into a matrix containing incidence data and p-values combined <- cbind(incdata[3:4], incdata[5:6], pvalues); #Output data write.csv(combined, file = results_table, row.names = TRUE); Copy and paste the above script into the Script tab of the Rshell configuration box
  • 16. Rshell input and output ports  Input and output ports are the connection points between the rest of the workflow and the Rshell service  Rshell makes input ports available as variables in the script named after the port.  Output ports read their named variable after executing the script. The last assigned value to the variable will be the one returned from the service via the output port.
  • 17. Rshell input and output ports  To add an input port:  Select the Input ports tab from the Rshell configuration dialog  Click Add port button  Enter the name of the input port, for this example use ‘data’  Specify the input port type, for this example use ‘Text-file’
  • 18. Rshell input and output ports
  • 19. Rshell input and output ports  The input port type indicates the data type this variable will have within the R-script. The possible types for R input ports are:  Logical  Numeric  Integer  String  Logical vector  Numeric vector  Integer vector  String vector  Text-file
  • 20. Rshell input and output ports  An output port can be added in a similar way:  Select the Output ports tab from the Rshell configuration dialog  Click Add port button  Enter the name of the output port, for this example use ‘results_table’  Specify the output port type, for this example use ‘Text-file’
  • 21. Rshell input and output ports
  • 22. Rshell input and output ports  The output port type indicates the type this variable has within the R-script. The possible types for R output ports are:  Numeric  Integer  String  Logical vector  Numeric vector  Integer vector  String vector  Text-file
  • 23. Rshell connection settings  Configuration of the connection parameters for Rserve is done using the Connection settings tab. This tab can be used to:  Configure the Rshell to use an Rserve installation on a different machine to where you run the Taverna workbench  Configure the access of Rserve on a different port  Provide authentication details for accessing Rserve in the form of a username and password  If you are using Rserve on the same machine that you are running Taverna on then you probably do not need to change the connection settings
  • 25. Completing the workflow  To complete the workflow:  Attach the output port of the Read_text_file to the “data” input port of the Rshell service  Create a workflow output from the results_table output port of the Rshell service  Your workflow should now look as follows:
  • 27. Results When you run the workflow, you should get the following results:
  • 28. Results Try saving the results as a csv file and opening the file in Excel. You see something as follows:
  • 29. Results  The results show that:  The incidence of species b is significantly different at 0.01 level  The incidence of species a is not significant at 0.01 level  The incidence of Species c and d have no significant difference. For species d that is because even though an increasing trend is observed, the variation within each category is too high to allow any conclusions.
  • 30. Output of images from Rshell  The results from R scripts may be in the form of images such as a plot or a graph  These images can be output from an Rshell service
  • 31. Output of images from Rshell  Add another Rshell processor onto the current workflow from the service template folder  Provide the processor with the R script below:
  • 32. Output of images from Rshell  #Read in data  incdata <- read.table(file="data",head=TRUE,sep=",");  #Transpose  t <- t(incdata);  #Calculate means  mean_a <- mean(t[,"Species_A"]);  mean_b <- mean(t[,"Species_B"]);  mean_c <- mean(t[,"Species_C"]);  mean_d <- mean(t[,"Species_D"]);  #Combine data  means <- c(mean_a, mean_b, mean_c, mean_d);  #Transform to data frame  means <- data.frame(means, row.names = c("Species_A", "Species_B", "Species_C", "Species_D"));  png(filename=figure, height=400, width=400, bg="white");  #Plot  barplot(t(means[1]), main = "mean species incidence levels", xlab = "Species");  dev.off(); 
  • 33. Output of images from Rshell  Complete the configuration for this Rshell processor by:  Creating an input port called data and associating it with a Text-file data type  Creating an output port called figure and associating it with a png-file data type  Also, finish building the workflow by connecting a workflow output to the figure output port of the Rshell processor  Your workflow will now look as follows:
  • 34. Output of images from Rshell
  • 35. Output of images from Rshell  Now run the workflow. You should get the following results:  The result is an image showing a bar plot of the mean incidence levels of the four species