An Introduction to Running, Reusing and 
Sharing Workflows with Taverna 
Stian Soiland-Reyes and Christian Brenninkmeijer 
University of Manchester 
materials by Katy Wolstencroft, Aleksandra Pawlik, Alan Williams 
http://orcid.org/0000-0001-9842-9718 
http://orcid.org/0000-0002-2937-7819 
http://orcid.org/0000-0002-1279-5133 
http://orcid.org/0000-0001-8418-6735 
http://orcid.org/0000-0003-3156-2105 
Bonn University, 2014-09-01 / 2014-09-03 
This work is licensed under a Creative Commons Attribution 3.0 Unported License
http://www.taverna.org.uk/
 This tutorial will give you a basic introduction to reusing 
workflows in Taverna and myExperiment. 
 Other tutorials will explore nested workflows and the workflow 
engine (iteration, looping, parallel invocation). 
 As in the previous tutorial, the workflows in this practical use 
small data sets and are designed to run in a few minutes. In 
the real world, you would be using larger data sets and 
workflows would typically run for longer.
Bigger Workflows: Enrichment Analysis 
The previous examples were trivial, small tasks. Taverna’s real 
power is in iterating over large data sets 
 Many experiments result in a list of genes (e.g. microarray 
analysis, ChIP-Seq, SNP identification, etc.). 
 In this exercise, we will use Taverna to analyse a gene set 
from a ChIP-Seq experiment by finding and reusing existing 
workflows 
 We will enrich our dataset by discovering: 
1. Which pathways our genes are involved in 
2. The functions of the genes 
3. Literature evidence for the phenotype/trait of interest
Re-using workflows from myExperiment 
 Go to http://www.myexperiment.org and click on ‘find 
workflows’ 
 You will see a list of the most viewed and downloaded workflows 
– see what the most popular workflow does by reading its 
description 
 Change the rank to ‘Latest’ and see what has been uploaded in 
the last few weeks 
 We will now find and download a workflow to identify the 
pathways each gene in our gene set is involved in
Re-using workflows from myExperiment 
 Find the workflow called “UnigeneID to KEGG Pathways” and look 
at the workflow entry page (uploaded by “Aleksandra Pawlik”) 
 Download the workflow by clicking on the link: “Download 
Workflow” and find out what it does by reading the descriptions in 
myExperiment 
 Open the workflow in Taverna by going to ‘File -> Open Workflow’ 
 Run the workflow using the example values supplied (Hint: when 
you run the workflow you can add the example values by clicking 
on “Use examples”) 
 Look at the workflow output – now you will see pathway 
information and pathway diagrams
Combining workflows from myExperiment 
 To analyse all the genes from our ChIP-Seq study, we need to 
extract the gene list from our results file 
 To make it easier to work through the example, we have 
provided a ChIP-Seq gene list on myExperiment. You can find 
it under “GalaxyGeneList - short : datafile for training” 
 Save this file to your local machine 
 Open the file in Excel 
 Save the file with a .csv extension 
 As you can see, the list of genes is in column D 
 Taverna can process and extract this column automatically
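For readers who want to see what "extracting column D" amounts to outside Taverna, here is a minimal Python sketch. It is not how Taverna's spreadsheet import tool is implemented. The function name `extract_column` and the sample rows are made up for illustration, and the sketch assumes the gene identifiers sit in the fourth column of a comma-separated file.

```python
import csv
import io


def extract_column(csv_text, column_letter="D", skip_header=False):
    """Return the non-empty values from one spreadsheet column of CSV text."""
    index = ord(column_letter.upper()) - ord("A")  # "D" -> 3
    rows = csv.reader(io.StringIO(csv_text))
    if skip_header:
        next(rows, None)  # discard the first row if it holds column titles
    values = []
    for row in rows:
        # Ignore short rows and empty cells instead of failing on them.
        if len(row) > index and row[index].strip():
            values.append(row[index].strip())
    return values


# Made-up rows standing in for the real gene list file.
sample = "chr1,100,200,NM_000001\nchr2,300,400,NM_000002\nchr3,500,600,\n"
genes = extract_column(sample)
print(genes)
```

To try it on the saved file, read the file contents with `open(path, newline="").read()` and pass the text to `extract_column`.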
Combining workflows from myExperiment 
 In myExperiment, find and download the workflow called 
“Import and convert gene list” 
 This workflow will extract the list of genes in column D 
using Taverna’s built-in spreadsheet import tool (which can 
be found in the services panel, for future reference) 
 The next step in the workflow converts RefSeq IDs into 
unigene IDs (required for the pathways workflow – 
converting between different types of identifiers is a 
common problem in bioinformatics!) 
 Run the workflow. This time, in the input window, select 
“set file location” and set the location to the saved .csv 
gene list. 
 Look at the workflow results
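Identifier conversion is essentially a table lookup, with two complications: some IDs have no counterpart, and some map to several. The sketch below illustrates that idea only. In the workflow itself the conversion is done by a BioMart service, not a local table, and the lookup table, the function `convert_ids` and all identifiers here are hypothetical placeholders.

```python
def convert_ids(source_ids, mapping):
    """Map each source ID to its target IDs and report IDs with no mapping."""
    converted = []
    unmapped = []
    for source_id in source_ids:
        targets = mapping.get(source_id, [])
        if targets:
            converted.extend(targets)
        else:
            unmapped.append(source_id)
    # Drop duplicates while keeping first-seen order.
    unique = list(dict.fromkeys(converted))
    return unique, unmapped


# Hypothetical lookup table; the identifiers are placeholders, not real records.
refseq_to_unigene = {
    "NM_000001": ["Mm.1"],
    "NM_000002": ["Mm.2", "Mm.3"],  # one RefSeq ID, two target IDs
    "NM_000003": ["Mm.1"],          # two RefSeq IDs share one target ID
}

unigene_ids, missing = convert_ids(
    ["NM_000001", "NM_000002", "NM_000003", "NM_000004"], refseq_to_unigene
)
print(unigene_ids, missing)
```

Keeping the unmapped IDs in a separate list makes it easy to see how much of the gene list was lost in the conversion.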
Combining workflows from myExperiment 
 We will now combine the two 
workflows 
 While you are still in the “import 
and convert” workflow, go to the 
top of the workbench and select 
“insert -> Nested workflow” 
 In the pop-up window, select 
“import from file” and find the 
pathways workflow you 
downloaded earlier. 
 Click on “import workflow” and 
the pathways workflow will 
appear in the main workflow 
diagram.
Combining workflows from myExperiment 
 Connect the workflows up by linking the output of the 
‘Merge_Gene_List’ with the nested workflow input
Combining workflows from myExperiment 
 Create new output ports for the Nested workflow and connect the 
Nested workflow outputs to the new outputs 
NOTE: you don’t need to connect them all, just pathway descriptions, 
pathway images and gene descriptions 
 Save the workflow 
 Run the workflow (it may take a few minutes)
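Conceptually, the combined workflow behaves like one function calling another for every item in a list, with only some of the inner results passed on as outputs. The Python sketch below illustrates this under stated assumptions: the nested workflow is replaced by a stand-in function returning placeholder text, the port names are hypothetical, and it assumes the nested step is invoked once per gene ID.

```python
def nested_pathways_workflow(gene_id):
    """Stand-in for the nested pathways workflow; returns placeholder outputs."""
    return {
        "pathway_descriptions": f"description for {gene_id}",
        "pathway_images": f"image for {gene_id}",
        "gene_descriptions": f"gene info for {gene_id}",
        "pathway_ids": f"ids for {gene_id}",  # produced but left unconnected
    }


# Only these (hypothetically named) outputs are wired to workflow output ports.
CONNECTED_OUTPUTS = ("pathway_descriptions", "pathway_images", "gene_descriptions")


def run_combined(gene_ids):
    """Feed each gene ID into the nested step and keep only connected outputs."""
    results = {name: [] for name in CONNECTED_OUTPUTS}
    for gene_id in gene_ids:
        outputs = nested_pathways_workflow(gene_id)
        for name in CONNECTED_OUTPUTS:
            results[name].append(outputs[name])
    return results


combined = run_combined(["Mm.1", "Mm.2"])
print(combined["pathway_descriptions"])
```

Each output port ends up holding a list with one entry per input gene, which is why the results view shows several values per port.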
GO Associations 
There are many different tools we could use to find Gene 
Ontology associations for your gene list 
For example, we could simply modify the BioMart/Ensembl 
service in the ‘Import and convert gene list’ workflow we have 
already used 
Reload the ‘Import and convert gene list’ workflow 
Right-click on the ‘mmusculus_gene_ensembl’ service and 
select ‘Copy’ 
Paste an extra copy of this service into the same workflow 
diagram
GO Associations 
This is a BioMart service. It allows you to retrieve omics data 
from Ensembl and other genomics resources. If you are familiar 
with BioMart, you will see the interface in Taverna is very similar 
to the web interface. 
We will modify the BioMart query to find all GO associations for 
each gene associated with a ChIP-Seq peak 
Right-click on the new copy of the service and select ‘Configure 
BioMart Query’
GO Associations 
The inputs (or filters) already accept RefSeq IDs from our input 
file, but we need to modify the outputs (or attributes) 
Select ‘Attributes’ and expand the …‘External’… section. 
Select ‘GO Term Accession’, ‘GO Term Definition’ and ‘GO 
Domain’ 
Unselect ‘UniGeneID’ and select ‘RefSeq mRNA’… 
 (See screenshot on the next slide for an example)
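Behind the configuration dialog, a BioMart query is a small XML document that names a dataset, its filters (inputs) and its attributes (outputs). The Python sketch below builds such a document to make that structure visible. It does not contact any server. The dataset name comes from the workflow, but the internal filter and attribute names (`refseq_mrna`, `go_id`, `definition_1006`, `namespace_1003`) are assumptions that you should check against the attribute list of the BioMart you use, and the RefSeq IDs are placeholders.

```python
import xml.etree.ElementTree as ET


def build_biomart_query(dataset, filters, attributes):
    """Return BioMart-style query XML for one dataset as a string."""
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",       # ask for tab-separated results
        "header": "0",
        "uniqueRows": "1",
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    # Filters restrict which records are returned (the query inputs).
    for name, values in filters.items():
        ET.SubElement(ds, "Filter", {"name": name, "value": ",".join(values)})
    # Attributes choose which columns are returned (the query outputs).
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    return ET.tostring(query, encoding="unicode")


# Internal names below are assumptions; the RefSeq IDs are placeholders.
go_attributes = ["refseq_mrna", "go_id", "definition_1006", "namespace_1003"]
xml_text = build_biomart_query(
    "mmusculus_gene_ensembl",
    {"refseq_mrna": ["NM_000001", "NM_000002"]},
    go_attributes,
)
print(xml_text)
```

Swapping one attribute for another in the dialog corresponds to changing one `Attribute` element in this document. The filter stays the same.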
GO Associations
GO Associations 
Click ‘apply’ to save your changes, and ‘close’, to go back to 
Taverna 
At the top of the workflow diagram, change the workflow view 
to show all ports by clicking on the table icon
GO Associations 
Connect your new service to 
the workflow by linking the 
‘D’ output port of the 
spreadsheet service to the 
input of your new service 
Make the new output ports 
and connect them as shown 
to your new service 
GO Associations 
Save the workflow by going to ‘File -> Save Workflow’ 
Run the workflow 
Set file location to the GalaxyGeneList file saved earlier 
(Download and) view the GO report 
In “GO_REPORT” tab select “Value 1” and press “Save Value” 
Save as a .txt or as a .tsv (tab-separated) file
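Once the report is saved as tab-separated text, it is easy to post-process. As an illustration, the Python sketch below groups GO accessions by gene. It assumes that the first column holds the RefSeq mRNA ID and the second the GO term accession. The real column order depends on the attributes you selected, so check your file first. The function name `group_go_terms` and the sample rows are placeholders.

```python
import csv
import io
from collections import defaultdict


def group_go_terms(tsv_text):
    """Group GO accessions by gene ID from tab-separated report text."""
    terms_by_gene = defaultdict(list)
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if len(row) < 2 or not row[0] or not row[1]:
            continue  # skip blank lines and genes without a GO annotation
        gene_id, go_accession = row[0], row[1]
        if go_accession not in terms_by_gene[gene_id]:
            terms_by_gene[gene_id].append(go_accession)
    return dict(terms_by_gene)


# Placeholder rows: gene ID, GO accession, definition, domain.
sample_report = (
    "NM_000001\tGO:0000001\tplaceholder definition\tbiological_process\n"
    "NM_000001\tGO:0000002\tplaceholder definition\tmolecular_function\n"
    "NM_000002\t\t\t\n"
    "NM_000001\tGO:0000001\tplaceholder definition\tbiological_process\n"
)
go_by_gene = group_go_terms(sample_report)
print(go_by_gene)
```

Genes that come back without any GO annotation are dropped here. Collect them separately if you need to know which ones they were.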
Simple Text Mining 
So far we have looked at enriching the genomic information, but 
we could also use workflows for running data analyses (e.g. 
aligning mouse genes with human homologues) or performing 
literature searches 
Think about the ways you could extend this analysis with 
literature searches (e.g. correlations between pathways, genes, 
GO terms, phenotypes, etc.) 
Search myExperiment for workflows involving text mining, 
using the search terms “text mining” and “Pubmed”
Text Mining 
 Find and open the workflow “Phenotype to pubmed” 
One of the services is no longer available in the nested workflow 
(the faded-out service). Taverna checks the availability of each 
service when you load the workflow and when you run it 
In this case, the workflow will still run without the final nested 
workflow (clean text) 
Delete the ‘clean text’ nested workflow (by selecting it and 
right-clicking), and reconnect the workflow output 
Run the workflow with the search term ‘erythropoiesis’ (or a 
phenotype term to describe the disease you are studying)
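A literature search of this kind can also be expressed as a single web request. The Python sketch below builds a PubMed search URL for the NCBI E-utilities `esearch` endpoint without sending it. This is an illustration only and is not necessarily the service the “Phenotype to pubmed” workflow calls. The function name `build_pubmed_search_url` and the default of 20 results are arbitrary choices.

```python
from urllib.parse import parse_qs, urlencode, urlparse

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_search_url(term, max_results=20):
    """Return an E-utilities esearch URL that searches PubMed for a term."""
    params = {
        "db": "pubmed",               # search the PubMed database
        "term": term,                 # free-text query, e.g. a phenotype
        "retmax": str(max_results),   # upper limit on returned article IDs
        "retmode": "json",
    }
    return ESEARCH_URL + "?" + urlencode(params)


url = build_pubmed_search_url("erythropoiesis")
print(url)
```

Opening the printed URL in a browser returns the matching PubMed IDs. Terms containing spaces are percent-encoded by `urlencode`, so multi-word phenotypes work as well.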
