Increasing the Efficiency of Workflows: Use Cases in the Life Sciences
Sandra Gesing
Center for Research Computing
sandra.gesing@nd.edu
19 August 2016
  
University of Notre Dame
• In the middle of nowhere of northern Indiana (1.5 h from Chicago)
• 4 undergraduate colleges
• ~35 research institutes and centers
• ~12,000 students
  
Center for Research Computing
• Software development and profiling
• Cyberinfrastructure/science gateway development
• Computational Scientist support
• Collaborative research/grant development
• System administration/prototype architectures
• Computational resources: 25,000+ cores
• Storage resources: 3 PB
• National resources (e.g., XSEDE)
• ~40 researchers, research programmers, HPC specialists
CRC and OIT building
CRC HPC Center (old Union Station)
http://crc.nd.edu
  
Life Sciences
• Genomics
• Proteomics
• Metabolomics
• Immunomics
• Systems biology
• Molecular simulations
• Docking
• Epidemiology
• …
Black Swallowtail – larvae and butterfly
  
The Genomics Boom
February 16, 2001: biotech company Celera
February 15, 2001: The Human Genome Project
  	
  
Craig Venter (left) and Francis Collins (right)
  
Big Data
• Explosion in the quantity, variety and complexity of data
• Questions can be answered that were impossible to even ask about 10 years ago
• Costs far reduced (e.g., Human Genome Project: 15 years, ~$2 billion; today: ~3 days, $1000)
  
Figure source: http://www.genome.gov/images/content/cost_per_genome_oct2015.jpg
  
Analysis of Data
12181 acatttctac caacagtgga tgaggttgtt ggtctatgtt ctcaccaaat ttggtgttgt
12241 cagtctttta aattttaacc tttagagaag agtcatacag tcaatagcct tttttagctt
12301 gaccatccta atagatacac agtggtgtct cactgtgatt ttaatttgca ttttcctgct
12361 gactaattat gttgagcttg ttaccattta gacaacttca ttagagaagt gtctaatatt
12421 taggtgactt gcctgttttt ttttaattgg gatcttaatt tttttaaatt attgatttgt
12481 aggagctatt tatatattct ggatacaagt tctttatcag atacacagtt tgtgactatt
12541 ttcttataag tctgtggttt ttatattaat gtttttattg atgactgttt tttacaattg
12601 tggttaagta tacatgacat aaaacggatt atcttaacca ttttaaaatg taaaattcga
12661 tggcattaag tacatccaca atattgtgca actatcacca ctatcatact ccaaaagggc
12721 atccaatacc cattaagctg tcactcccca atctcccatt ttcccacccc tgacaatcaa
12781 taacccattt tctgtctcta tggatttgcc tgttctggat attcatatta atagaatcaa
Slide copied from: Stuart Owen, "Workflows with Taverna"
A sequence of connected steps in a defined order, based on their control and data dependencies
  
Workflows!	
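The definition above treats a workflow as steps ordered by their dependencies. A minimal Python sketch of that idea follows. The step names are hypothetical and not taken from the slides, and the standard-library `graphlib` module (Python 3.9+) only stands in for a real workflow engine.

```python
from graphlib import TopologicalSorter

# Hypothetical analysis steps: each key depends on the steps in its value set.
dependencies = {
    "fetch_sequence": set(),
    "quality_filter": {"fetch_sequence"},
    "align": {"quality_filter"},
    "annotate": {"quality_filter"},
    "report": {"align", "annotate"},
}


def execution_order(deps):
    """Return one valid order in which every step runs after its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

Steps without a mutual dependency (here `align` and `annotate`) may appear in either order, which is exactly the freedom a workflow engine can exploit to run them in parallel.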
  
Workflows
• Different workflow concepts
• Different workflow languages
• Different workflow constructs
Taverna, WorkWays
  
Workflow Editors
• Different technologies (workbenches, web-based)
• Different look-and-feel
  
	
  
An Award-Winning Solution
  
State of the Art
• Data and compute-intensive problems
• High-speed networks
• Users generally not IT specialists
• Tools and workflow engines
• Web-based agile frameworks
• Distributed data and computing infrastructures
  
Challenge for Developers
• Data and compute-intensive problems
• High-speed networks
• Tools and workflow engines
• Web-based agile frameworks
• Distributed data and computing infrastructures
• Users generally not IT specialists
Need for intuitive and efficient workflows!
  
  
Usability
“After all, usability really just means making sure that something works well: that a person … can use the thing - whether it's a Web site, a fighter jet, or a revolving door - for its intended purpose without getting hopelessly frustrated.”
(Steve Krug in “Don't make me think!: A Common Sense Approach to Web Usability”, 2005)
  
Efficiency
• Time
• Computational resources
• Money
  
	
  
Workflow Enhancements
• Logical level: Meta-workflows
  Herres-Pawlis, S., Hoffmann, A., Rösener, T., Krüger, J., Grunzke, R., and Gesing, S. “Multi-layer Meta-metaworkflows for the Evaluation of Solvent and Dispersion Effects in Transition Metal Systems Using the MoSGrid Science Gateways”, Science Gateways (IWSG), 2015 7th International Workshop on, pp. 47-52, 3-5 June 2015, IEEE Xplore, doi: 10.1109/IWSG.2015.13
• System level: Combination of strengths of workflow systems
  Hazekamp, N., Sarro, J., Choudhury, O., Gesing, S., Emrich, S., and Thain, D. “Scaling Up Bioinformatics Workflows with Dynamic Job Expansion: A Case Study Using Galaxy and Makeflow”, e-Science (e-Science), 2015 IEEE 11th International Conference on, pp. 332-341, Aug. 31 2015-Sept. 4 2015
• Prediction: Model for optimization of tasks and threads
  Choudhury, O., Rajan, D., Hazekamp, N., Gesing, S., Thain, D., and Emrich, S. “Balancing Thread-level and Task-level Parallelism for Data-Intensive Workloads on Clusters and Clouds”, Cluster Computing (CLUSTER), 2015 IEEE International Conference on, pp. 390-393, 8-11 Sept. 2015, doi: 10.1109/CLUSTER.2015.60
  
	
  
MoSGrid Science Gateway
Molecular Simulation Grid
• Science gateway integrated with underlying compute and data management infrastructure
• Distributed workflow management
• Data repository
• Metadata management
  
MoSGrid Science Gateway
Architecture diagram (labels): User Interface; WS-PGRADE; Liferay; High-Level Middleware Service Layer; gUSE; Middleware Layer; UNICORE; XtreemFS; DCI Resources
  
Quantum Chemistry

Molecular Dynamics

Docking

Science Case: Polymerisation catalysts

Translation into Workflows

Meta-Workflows

Translation into Meta-Workflows
  
Scaling Up Workflows
Figure labels: # Machines, # Cores, Data Partitioning, Save
  
Galaxy
  
Simple Workflow in Galaxy
Problem: As size increases, so does time
  
Workflow with Parallelism added in Galaxy
Problem: Tools must be updated with every change in parallelism; relies on the scientist
  
Workflow Dynamically Expanded behind Galaxy
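One way to picture the dynamic expansion named above: a single logical tool invocation is split into independent chunk tasks whose number follows the input size, and the partial results are merged afterwards. The sketch below is only a local, thread-based analogy under that assumption. `expand_job`, `chunk_size`, and the GC-counting stand-in tool are hypothetical and are not the Galaxy/Makeflow implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def count_gc(sequence):
    """Stand-in for a bioinformatics tool: count G and C bases in one chunk."""
    return sum(1 for base in sequence if base in "gcGC")


def expand_job(data, chunk_size, tool, merge, max_workers=4):
    """Split `data` into chunks, run `tool` on each chunk in parallel, merge the results."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partial_results = list(pool.map(tool, chunks))
    return merge(partial_results), len(chunks)


# Made-up input: a short fragment repeated to reach 3000 bases.
sequence = "acatttctaccaacagtggatgaggttgtt" * 100
total, n_tasks = expand_job(sequence, chunk_size=500, tool=count_gc, merge=sum)
print(total, n_tasks)
```

The user still sees one job. Only the number of chunk tasks changes with the data size, so the tool definition does not have to be edited when the degree of parallelism changes.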
  
Makeflow
• Task Structure
    OUTPUTS : INPUTS
        COMMAND
• Directed Acyclic Graph (DAG)
• Programmatically Generated
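Makeflow rules are written Make-style: output files, a colon, input files, then the command on a tab-indented line. A minimal sketch of generating such rules programmatically is shown below. The file names and the executables `split_reads`, `align_tool`, and `merge_hits` are hypothetical placeholders, not the tools used in the talk.

```python
def make_rule(outputs, inputs, command):
    """Format one rule: output files, a colon, input files, then a tab-indented command."""
    return f"{' '.join(outputs)} : {' '.join(inputs)}\n\t{command}\n"


def generate_workflow(input_file, n_parts):
    """Build split -> per-part processing -> merge rules for n_parts partitions."""
    parts = [f"part.{i}" for i in range(n_parts)]
    results = [f"{part}.out" for part in parts]
    # Executables are listed as inputs so that each rule names everything it needs.
    rules = [make_rule(parts, [input_file, "split_reads"],
                       f"./split_reads {input_file} {n_parts}")]
    for part, result in zip(parts, results):
        rules.append(make_rule([result], [part, "align_tool"],
                               f"./align_tool {part} > {result}"))
    rules.append(make_rule(["merged.out"], results + ["merge_hits"],
                           f"./merge_hits {' '.join(results)} > merged.out"))
    return "\n".join(rules)


workflow_text = generate_workflow("reads.fastq", 3)
print(workflow_text)
```

Because every rule names its inputs and outputs, the dependency graph (the DAG) follows from the file names alone, and changing `n_parts` regenerates the whole graph without touching any tool definition.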
  
Job Sandbox – Log file creation for cleanup
  
Dynamic Job Expansion
• Work Queue: we utilized 100s of cores from a Condor Pool
• Cleaning Sandbox using knowledge of intermediates and logging
• Explored methods to transmit needed environments such as executables and Java
61.5X speed-up on 32 GB dataset utilizing these methods
  
Thread-level and Task-level Parallelism
• Develop predictive performance models for an application domain
• Achieve acceptable performance the first time
• Optimize resource utilization
  • Execution time
  • Memory usage
• WorkQueue master-worker framework
• Sun Grid Engine (SGE) batch system
  
1. Application-level model for time:
   $T(R, Q, N) = \beta_1 \frac{R Q}{N} + \beta_2$
2. Application-level model for memory:
   $M(R, N) = \gamma_1 R + \gamma_2 N$
3. System-level model for time:
   $T_{Total} = \eta_1 \frac{Q K}{D} + \eta_2 \left( \frac{Q}{B} + \frac{R K N}{B C} \right) + \eta_3 \, T\left(R, \frac{Q}{K}, N\right) \cdot \frac{K N}{M C} + \eta_4 \frac{O}{B} + \eta_5 \frac{O K}{D}$
4. System-level model for memory:
   $M_{Master}(R, Q) = \phi_1 R + \phi_2 Q$
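The four models can be written down directly as functions, which makes it easier to see how the application-level time model is reused inside the system-level one. The slide does not define the symbols, so the sketch keeps them as plain parameters. It also assumes the fractions group as RQ/N, RKN/(BC), and KN/(MC), and all numeric values in the usage lines are made up.

```python
def app_time(R, Q, N, beta1, beta2):
    """Application-level time model: T(R, Q, N) = beta1 * R * Q / N + beta2."""
    return beta1 * R * Q / N + beta2


def app_memory(R, N, gamma1, gamma2):
    """Application-level memory model: M(R, N) = gamma1 * R + gamma2 * N."""
    return gamma1 * R + gamma2 * N


def system_time(R, Q, N, K, O, B, C, D, M, eta, beta1, beta2):
    """System-level time model; eta is the tuple (eta1, ..., eta5)."""
    eta1, eta2, eta3, eta4, eta5 = eta
    return (eta1 * Q * K / D
            + eta2 * (Q / B + R * K * N / (B * C))
            + eta3 * app_time(R, Q / K, N, beta1, beta2) * K * N / (M * C)
            + eta4 * O / B
            + eta5 * O * K / D)


def master_memory(R, Q, phi1, phi2):
    """System-level memory model: M_Master(R, Q) = phi1 * R + phi2 * Q."""
    return phi1 * R + phi2 * Q


# Made-up values, chosen only so the arithmetic is easy to check by hand.
print(app_time(R=2, Q=6, N=3, beta1=0.5, beta2=1.0))
print(system_time(1, 1, 1, 1, 1, 1, 1, 1, 1, eta=(1, 1, 1, 1, 1), beta1=1, beta2=0))
```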
  
Figure (model-building pipeline):
• Data Collection: 7 data points (R) × 7 data points (Q) × 7 data points (N) = 343 data points
• Training: Training data → Regression Model → Regression Coefficients
• Testing: Testing data → Accuracy Test → MAPE
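The pipeline above (collect a 7 × 7 × 7 grid, fit a regression on training data, score it with MAPE on held-out data) can be sketched with ordinary least squares on the application-level time model. Everything numeric here is an assumption for illustration: the grid levels, the "true" coefficients, the 3% noise, and the roughly 80/20 split are invented, and the data are synthetic rather than measurements from the talk.

```python
import random


def fit_time_model(samples):
    """Least-squares fit of T = beta1 * (R*Q/N) + beta2 from (R, Q, N, T) samples."""
    xs = [r * q / n for r, q, n, _ in samples]
    ys = [t for *_, t in samples]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta1 = sxy / sxx
    beta2 = mean_y - beta1 * mean_x
    return beta1, beta2


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Synthetic stand-in for the data collection step: 7 levels each for R, Q, N.
rng = random.Random(0)
levels = [1, 2, 4, 8, 16, 32, 64]
TRUE_BETA1, TRUE_BETA2 = 0.02, 5.0  # invented coefficients
samples = []
for r in levels:
    for q in levels:
        for n in levels:
            noiseless = TRUE_BETA1 * r * q / n + TRUE_BETA2
            samples.append((r, q, n, noiseless * rng.uniform(0.97, 1.03)))

rng.shuffle(samples)
train, test = samples[:274], samples[274:]  # assumed split of the 343 points

beta1, beta2 = fit_time_model(train)
predicted = [beta1 * r * q / n + beta2 for r, q, n, _ in test]
error = mape([t for *_, t in test], predicted)
print(len(samples), beta1, beta2, error)
```

Substituting x = RQ/N turns the time model into a straight line in x, which is why a simple closed-form fit is enough for this sketch.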
  
Avg. MAPE = 3.1
MAPE = Mean Absolute Percentage Error
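For reference, the usual definition of MAPE over $n$ test points is given below. The symbols $y_i$ (measured value) and $\hat{y}_i$ (model prediction) are notation chosen here, not taken from the slide.

```latex
\mathrm{MAPE} = \frac{100}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|
```

Under this definition, and assuming the slide reports the value in percent, an average MAPE of 3.1 would mean that predictions deviate from measurements by about 3.1% on average.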
  
For the given dataset, K* = 90, N* = 4
Result
# Cores/Task   # Tasks   Predicted Time (min)   Speedup   Estimated EC2 Cost ($)   Estimated Azure Cost ($)
1              360       70                     6.6       50.4                     64.8
2              180       38                     12.3      25.2                     32.4
4              90        24                     19.5      18.9                     32.4
8              45        27                     17.3      18.9                     32.4
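The table can be read programmatically to pick a configuration. The short sketch below uses only the numbers from the table. The tuple layout and variable names are choices made here, and the implied baseline time is a derived quantity (predicted time × speedup), not a figure stated on the slide.

```python
# (cores_per_task, tasks, predicted_min, speedup, ec2_cost_usd, azure_cost_usd)
rows = [
    (1, 360, 70, 6.6, 50.4, 64.8),
    (2, 180, 38, 12.3, 25.2, 32.4),
    (4, 90, 24, 19.5, 18.9, 32.4),
    (8, 45, 27, 17.3, 18.9, 32.4),
]

# Configuration with the shortest predicted run time.
fastest = min(rows, key=lambda row: row[2])

# Lowest estimated EC2 cost among all configurations.
lowest_ec2_cost = min(row[4] for row in rows)

# Baseline time implied by each row: predicted time multiplied by speedup.
implied_baseline_min = [round(row[2] * row[3]) for row in rows]

print(fastest[0], fastest[1], lowest_ec2_cost, implied_baseline_min)
```

The fastest row uses 4 cores per task and 90 tasks, which matches the optimum K* = 90, N* = 4 reported for this dataset, and it also ties for the lowest estimated EC2 cost.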
WORKS Workshop
http://works.cs.cardiff.ac.uk/
Keynote Speaker
  
Acknowledgements
Logical level
• Richard Grunzke
• Sonja Herres-Pawlis
• Alexander Hoffmann
• Jens Krüger
• MTA SZTAKI
• University of Westminster
System level and prediction
• Nicholas Hazekamp
• Olivia Choudhury
• Douglas Thain
• Scott Emrich
• Notre Dame Bioinformatics Lab
• The Cooperative Computing Lab, University of Notre Dame
  
Science Gateways Community Institute
01 August 2016 – 31 July 2021
1. Incubator: shared expertise in business and sustainability planning, cybersecurity, user interface design, and software engineering practices
2. Extended Developer Support: expert developers for up to one year
3. Scientific Software Collaborative: open-source, extensible framework for gateway design, integration, and services
4. Community Engagement and Exchange: forum for communication and shared experiences
5. Workforce Development: training programs and helping universities form gateway support groups
http://sciencegateways.org/
  
sandra.gesing@nd.edu	
  

More Related Content

What's hot

2017 bio it world
2017 bio it world2017 bio it world
2017 bio it worldChris Dwan
 
2016 09 cxo forum
2016 09 cxo forum2016 09 cxo forum
2016 09 cxo forumChris Dwan
 
2015 09 emc lsug
2015 09 emc lsug2015 09 emc lsug
2015 09 emc lsugChris Dwan
 
2016 05 sanger
2016 05 sanger2016 05 sanger
2016 05 sangerChris Dwan
 
Gaasterland Laboratory Simplifies Genomics Research with Panasas
Gaasterland Laboratory Simplifies Genomics Research with PanasasGaasterland Laboratory Simplifies Genomics Research with Panasas
Gaasterland Laboratory Simplifies Genomics Research with PanasasPanasas
 
Big Data [sorry] & Data Science: What Does a Data Scientist Do?
Big Data [sorry] & Data Science: What Does a Data Scientist Do?Big Data [sorry] & Data Science: What Does a Data Scientist Do?
Big Data [sorry] & Data Science: What Does a Data Scientist Do?Data Science London
 
CaseStudy_Labguru_AstraZeneca
CaseStudy_Labguru_AstraZenecaCaseStudy_Labguru_AstraZeneca
CaseStudy_Labguru_AstraZeneca123taltal
 
How Safe and Always-Available Data as Lifeblood Helps a University Medical Ce...
How Safe and Always-Available Data as Lifeblood Helps a University Medical Ce...How Safe and Always-Available Data as Lifeblood Helps a University Medical Ce...
How Safe and Always-Available Data as Lifeblood Helps a University Medical Ce...Dana Gardner
 
Real time streaming analytics
Real time streaming analyticsReal time streaming analytics
Real time streaming analyticsAnirudh
 
Training Seminar - The Data Design Process
Training Seminar - The Data Design ProcessTraining Seminar - The Data Design Process
Training Seminar - The Data Design ProcessMaxwell Taylor
 
Some Ideas on Making Research Data: "It's the Metadata, stupid!"
Some Ideas on Making Research Data: "It's the Metadata, stupid!"Some Ideas on Making Research Data: "It's the Metadata, stupid!"
Some Ideas on Making Research Data: "It's the Metadata, stupid!"Anita de Waard
 
Chapman university-big-data-healthcare-final
Chapman university-big-data-healthcare-finalChapman university-big-data-healthcare-final
Chapman university-big-data-healthcare-finalColin McNamara
 
Hadoop Training For Beginners | Hadoop Tutorial | Big Data Training |Edureka
Hadoop Training For Beginners | Hadoop Tutorial | Big Data Training |EdurekaHadoop Training For Beginners | Hadoop Tutorial | Big Data Training |Edureka
Hadoop Training For Beginners | Hadoop Tutorial | Big Data Training |EdurekaEdureka!
 
Taming the Big Data Beast - Together
Taming the Big Data Beast - TogetherTaming the Big Data Beast - Together
Taming the Big Data Beast - TogetherKennisalliantie
 
Scalable Predictive Analysis and The Trend with Big Data & AI
Scalable Predictive Analysis and The Trend with Big Data & AIScalable Predictive Analysis and The Trend with Big Data & AI
Scalable Predictive Analysis and The Trend with Big Data & AIJongwook Woo
 
Opportunities for HPC in pharma R&D - main deck
Opportunities for HPC in pharma R&D - main deckOpportunities for HPC in pharma R&D - main deck
Opportunities for HPC in pharma R&D - main deckPistoia Alliance
 
Panasas ® UCLA Customer Success Story
Panasas ®  UCLA Customer Success StoryPanasas ®  UCLA Customer Success Story
Panasas ® UCLA Customer Success StoryPanasas
 
Big Data Ppt PowerPoint Presentation Slides
Big Data Ppt PowerPoint Presentation Slides Big Data Ppt PowerPoint Presentation Slides
Big Data Ppt PowerPoint Presentation Slides SlideTeam
 

What's hot (20)

2017 bio it world
2017 bio it world2017 bio it world
2017 bio it world
 
2016 09 cxo forum
2016 09 cxo forum2016 09 cxo forum
2016 09 cxo forum
 
2015 09 emc lsug
2015 09 emc lsug2015 09 emc lsug
2015 09 emc lsug
 
2016 05 sanger
2016 05 sanger2016 05 sanger
2016 05 sanger
 
Gaasterland Laboratory Simplifies Genomics Research with Panasas
Gaasterland Laboratory Simplifies Genomics Research with PanasasGaasterland Laboratory Simplifies Genomics Research with Panasas
Gaasterland Laboratory Simplifies Genomics Research with Panasas
 
Big Data [sorry] & Data Science: What Does a Data Scientist Do?
Big Data [sorry] & Data Science: What Does a Data Scientist Do?Big Data [sorry] & Data Science: What Does a Data Scientist Do?
Big Data [sorry] & Data Science: What Does a Data Scientist Do?
 
CaseStudy_Labguru_AstraZeneca
CaseStudy_Labguru_AstraZenecaCaseStudy_Labguru_AstraZeneca
CaseStudy_Labguru_AstraZeneca
 
How Safe and Always-Available Data as Lifeblood Helps a University Medical Ce...
How Safe and Always-Available Data as Lifeblood Helps a University Medical Ce...How Safe and Always-Available Data as Lifeblood Helps a University Medical Ce...
How Safe and Always-Available Data as Lifeblood Helps a University Medical Ce...
 
Real time streaming analytics
Real time streaming analyticsReal time streaming analytics
Real time streaming analytics
 
Training Seminar - The Data Design Process
Training Seminar - The Data Design ProcessTraining Seminar - The Data Design Process
Training Seminar - The Data Design Process
 
Some Ideas on Making Research Data: "It's the Metadata, stupid!"
Some Ideas on Making Research Data: "It's the Metadata, stupid!"Some Ideas on Making Research Data: "It's the Metadata, stupid!"
Some Ideas on Making Research Data: "It's the Metadata, stupid!"
 
Panacea H4D Stanford 2019
Panacea H4D Stanford 2019Panacea H4D Stanford 2019
Panacea H4D Stanford 2019
 
Chapman university-big-data-healthcare-final
Chapman university-big-data-healthcare-finalChapman university-big-data-healthcare-final
Chapman university-big-data-healthcare-final
 
Hadoop Training For Beginners | Hadoop Tutorial | Big Data Training |Edureka
Hadoop Training For Beginners | Hadoop Tutorial | Big Data Training |EdurekaHadoop Training For Beginners | Hadoop Tutorial | Big Data Training |Edureka
Hadoop Training For Beginners | Hadoop Tutorial | Big Data Training |Edureka
 
Data science for developers
Data science for developersData science for developers
Data science for developers
 
Taming the Big Data Beast - Together
Taming the Big Data Beast - TogetherTaming the Big Data Beast - Together
Taming the Big Data Beast - Together
 
Scalable Predictive Analysis and The Trend with Big Data & AI
Scalable Predictive Analysis and The Trend with Big Data & AIScalable Predictive Analysis and The Trend with Big Data & AI
Scalable Predictive Analysis and The Trend with Big Data & AI
 
Opportunities for HPC in pharma R&D - main deck
Opportunities for HPC in pharma R&D - main deckOpportunities for HPC in pharma R&D - main deck
Opportunities for HPC in pharma R&D - main deck
 
Panasas ® UCLA Customer Success Story
Panasas ®  UCLA Customer Success StoryPanasas ®  UCLA Customer Success Story
Panasas ® UCLA Customer Success Story
 
Big Data Ppt PowerPoint Presentation Slides
Big Data Ppt PowerPoint Presentation Slides Big Data Ppt PowerPoint Presentation Slides
Big Data Ppt PowerPoint Presentation Slides
 

Viewers also liked

Advancing life sciences with IBM reference architecture for genomics
Advancing life sciences with IBM reference architecture for genomicsAdvancing life sciences with IBM reference architecture for genomics
Advancing life sciences with IBM reference architecture for genomicsPatrick Berghaeger
 
L’EXÉCUTION STRATÉGIQUE Une affaire d’implication
L’EXÉCUTION STRATÉGIQUE  Une affaire d’implicationL’EXÉCUTION STRATÉGIQUE  Une affaire d’implication
L’EXÉCUTION STRATÉGIQUE Une affaire d’implicationCoefficience
 
Doc3 - BDLS Company Overview Sept2015
Doc3 - BDLS Company Overview Sept2015Doc3 - BDLS Company Overview Sept2015
Doc3 - BDLS Company Overview Sept2015Sarah Van der Kelft
 
Tour sx 01 - importancia del sistema
Tour sx    01 - importancia del sistemaTour sx    01 - importancia del sistema
Tour sx 01 - importancia del sistemafuxionlatino
 
Tour sx 02 - herramientas
Tour sx    02 - herramientasTour sx    02 - herramientas
Tour sx 02 - herramientasfuxionlatino
 
Business Development
Business DevelopmentBusiness Development
Business DevelopmentRobert Meza
 
Plastic Surgeon Letters Of Recommendation
Plastic Surgeon Letters Of RecommendationPlastic Surgeon Letters Of Recommendation
Plastic Surgeon Letters Of Recommendationkeridavis
 
"Curious Learning: using a mobile platform for early literacy education as a ...
"Curious Learning: using a mobile platform for early literacy education as a ..."Curious Learning: using a mobile platform for early literacy education as a ...
"Curious Learning: using a mobile platform for early literacy education as a ...diannepatricia
 
2. 10 factores humanos que causan accidentes
2. 10 factores humanos que  causan accidentes2. 10 factores humanos que  causan accidentes
2. 10 factores humanos que causan accidentesTVPerú
 

Viewers also liked (17)

Advancing life sciences with IBM reference architecture for genomics
Advancing life sciences with IBM reference architecture for genomicsAdvancing life sciences with IBM reference architecture for genomics
Advancing life sciences with IBM reference architecture for genomics
 
L’EXÉCUTION STRATÉGIQUE Une affaire d’implication
L’EXÉCUTION STRATÉGIQUE  Une affaire d’implicationL’EXÉCUTION STRATÉGIQUE  Une affaire d’implication
L’EXÉCUTION STRATÉGIQUE Une affaire d’implication
 
Doc3 - BDLS Company Overview Sept2015
Doc3 - BDLS Company Overview Sept2015Doc3 - BDLS Company Overview Sept2015
Doc3 - BDLS Company Overview Sept2015
 
ESCRITOS
ESCRITOSESCRITOS
ESCRITOS
 
Tour sx 01 - importancia del sistema
Tour sx    01 - importancia del sistemaTour sx    01 - importancia del sistema
Tour sx 01 - importancia del sistema
 
Sdsc pi-mtg-ecss-sgci-7-12-16
Sdsc pi-mtg-ecss-sgci-7-12-16Sdsc pi-mtg-ecss-sgci-7-12-16
Sdsc pi-mtg-ecss-sgci-7-12-16
 
1811
18111811
1811
 
Acuerdo final
Acuerdo finalAcuerdo final
Acuerdo final
 
Tour sx 02 - herramientas
Tour sx    02 - herramientasTour sx    02 - herramientas
Tour sx 02 - herramientas
 
Mec solo
Mec solo Mec solo
Mec solo
 
Business Development
Business DevelopmentBusiness Development
Business Development
 
Célula4º
Célula4ºCélula4º
Célula4º
 
Plastic Surgeon Letters Of Recommendation
Plastic Surgeon Letters Of RecommendationPlastic Surgeon Letters Of Recommendation
Plastic Surgeon Letters Of Recommendation
 
Life sciences Predictions 2016
Life sciences Predictions 2016Life sciences Predictions 2016
Life sciences Predictions 2016
 
"Curious Learning: using a mobile platform for early literacy education as a ...
"Curious Learning: using a mobile platform for early literacy education as a ..."Curious Learning: using a mobile platform for early literacy education as a ...
"Curious Learning: using a mobile platform for early literacy education as a ...
 
2. 10 factores humanos que causan accidentes
2. 10 factores humanos que  causan accidentes2. 10 factores humanos que  causan accidentes
2. 10 factores humanos que causan accidentes
 
Law and morality
Law and moralityLaw and morality
Law and morality
 

Increasing the Efficiency of Workflows: Use Cases in the Life Sciences

  • 1. Sandra Gesing • Center for Research Computing • sandra.gesing@nd.edu • 19 August 2016 • Increasing the Efficiency of Workflows: Use Cases in the Life Sciences
  • 2. University of Notre Dame • In the middle of nowhere in northern Indiana (1.5 h from Chicago) • 4 undergraduate colleges • ~35 research institutes and centers • ~12,000 students
  • 3. Center for Research Computing • Software development and profiling • Cyberinfrastructure/science gateway development • Computational scientist support • Collaborative research/grant development • System administration/prototype architectures • Computational resources: 25,000+ cores • Storage resources: 3 PB • National resources (e.g., XSEDE) • ~40 researchers, research programmers, HPC specialists • CRC and OIT building • http://crc.nd.edu • CRC HPC Center (old Union Station)
  • 4. Life Sciences • Genomics • Proteomics • Metabolomics • Immunomics • Systems biology • Molecular simulations • Docking • Epidemiology • … • Black Swallowtail – larvae and butterfly
  • 5. The Genomics Boom • February 16, 2001: biotech company Celera • February 15, 2001: The Human Genome Project
  • 6. The Genomics Boom • Craig Venter (left) and Francis Collins (right)
  • 7. Big Data • Explosion in the quantity, variety and complexity of data • Questions can be answered that were impossible to even ask about 10 years ago • Costs far reduced (e.g., Human Genome Project: 15 years, ~$2 billion; today: ~3 days, $1000)
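The cost comparison on slide 7 can be turned into rough reduction factors. This is a minimal sketch using only the approximate figures quoted on the slide; the function name `reduction_factor` and the 365.25-day year are assumptions made for illustration.

```python
def reduction_factor(before: float, after: float) -> float:
    """Return how many times smaller `after` is than `before`."""
    return before / after


# Approximate figures as quoted on the slide.
HGP_COST_USD = 2_000_000_000   # Human Genome Project, ~$2 billion
HGP_DAYS = 15 * 365.25         # ~15 years (365.25-day year is an assumption)
TODAY_COST_USD = 1_000         # ~$1000 per genome
TODAY_DAYS = 3                 # ~3 days

cost_factor = reduction_factor(HGP_COST_USD, TODAY_COST_USD)
time_factor = reduction_factor(HGP_DAYS, TODAY_DAYS)

if __name__ == "__main__":
    print(f"Cost reduced by a factor of about {cost_factor:,.0f}")
    print(f"Time reduced by a factor of about {time_factor:,.0f}")
```

Because the inputs are rounded estimates, the resulting factors are orders of magnitude rather than precise values.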
  • 8. Big Data • http://www.genome.gov/images/content/cost_per_genome_oct2015.jpg
  • 9. Analysis of Data • 12181 acatttctac caacagtgga tgaggttgtt ggtctatgtt ctcaccaaat ttggtgttgt 12241 cagtctttta aattttaacc tttagagaag agtcatacag tcaatagcct tttttagctt 12301 gaccatccta atagatacac agtggtgtct cactgtgatt ttaatttgca ttttcctgct 12361 gactaattat gttgagcttg ttaccattta gacaacttca ttagagaagt gtctaatatt 12421 taggtgactt gcctgttttt ttttaattgg gatcttaatt tttttaaatt attgatttgt 12481 aggagctatt tatatattct ggatacaagt tctttatcag atacacagtt tgtgactatt 12541 ttcttataag tctgtggttt ttatattaat gtttttattg atgactgttt tttacaattg 12601 tggttaagta tacatgacat aaaacggatt atcttaacca ttttaaaatg taaaattcga 12661 tggcattaag tacatccaca atattgtgca actatcacca ctatcatact ccaaaagggc 12721 atccaatacc cattaagctg tcactcccca atctcccatt ttcccacccc tgacaatcaa 12781 taacccattt tctgtctcta tggatttgcc tgttctggat attcatatta atagaatcaa • Slide copied from: Stuart Owen, "Workflows with Taverna" • A sequence of connected steps in a defined order based on their control and data dependencies
  • 10. Analysis of Data • 12181 acatttctac caacagtgga tgaggttgtt ggtctatgtt ctcaccaaat ttggtgttgt 12241 cagtctttta aattttaacc tttagagaag agtcatacag tcaatagcct tttttagctt 12301 gaccatccta atagatacac agtggtgtct cactgtgatt ttaatttgca ttttcctgct 12361 gactaattat gttgagcttg ttaccattta gacaacttca ttagagaagt gtctaatatt 12421 taggtgactt gcctgttttt ttttaattgg gatcttaatt tttttaaatt attgatttgt 12481 aggagctatt tatatattct ggatacaagt tctttatcag atacacagtt tgtgactatt 12541 ttcttataag tctgtggttt ttatattaat gtttttattg atgactgttt tttacaattg 12601 tggttaagta tacatgacat aaaacggatt atcttaacca ttttaaaatg taaaattcga 12661 tggcattaag tacatccaca atattgtgca actatcacca ctatcatact ccaaaagggc 12721 atccaatacc cattaagctg tcactcccca atctcccatt ttcccacccc tgacaatcaa 12781 taacccattt tctgtctcta tggatttgcc tgttctggat attcatatta atagaatcaa • Slide copied from: Stuart Owen, "Workflows with Taverna" • A sequence of connected steps in a defined order based on their control and data dependencies • Workflows!
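The definition on slides 9 and 10 (connected steps run in an order fixed by their dependencies) can be illustrated with a small sketch. This is not how Taverna or any other engine named in the talk works internally; it is a hypothetical toy runner built on Python's standard `graphlib` module (Python 3.9+), and every step name below (`raw`, `clean`, `gc_content`, `length`, `report`) is invented for illustration.

```python
from graphlib import TopologicalSorter


def run_workflow(steps, dependencies, initial):
    """Run each step after all the steps it depends on have produced output.

    steps:        maps a step name to a callable
    dependencies: maps a step name to the ordered list of names it consumes
    initial:      maps input names to the data supplied by the user
    """
    results = dict(initial)
    # static_order() yields every node only after all of its predecessors.
    for name in TopologicalSorter(dependencies).static_order():
        if name in steps:
            inputs = [results[dep] for dep in dependencies.get(name, [])]
            results[name] = steps[name](*inputs)
    return results


# Hypothetical steps for a tiny sequence-analysis workflow.
STEPS = {
    # Drop offsets and spaces, keep only nucleotide letters.
    "clean": lambda raw: "".join(ch for ch in raw.upper() if ch in "ACGT"),
    "gc_content": lambda seq: (seq.count("G") + seq.count("C")) / len(seq),
    "length": len,
    "report": lambda gc, n: f"{n} bases, GC fraction {gc:.2f}",
}

# Data dependencies: each step lists the results it needs, in argument order.
DEPENDENCIES = {
    "clean": ["raw"],
    "gc_content": ["clean"],
    "length": ["clean"],
    "report": ["gc_content", "length"],
}

if __name__ == "__main__":
    # First line of the sequence fragment shown on the slide.
    fragment = ("12181 acatttctac caacagtgga tgaggttgtt "
                "ggtctatgtt ctcaccaaat ttggtgttgt")
    print(run_workflow(STEPS, DEPENDENCIES, {"raw": fragment})["report"])
```

Here `gc_content` and `length` depend only on `clean`, so a real engine could run them in parallel; this sketch simply runs them one after the other in a valid order.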
  • 11. Workflows • Different workflow concepts • Different workflow languages • Different workflow constructs. Shown: Taverna, WorkWays
  • 12. Workflow Editors • Different technologies (workbenches, web-based) • Different look-and-feel
  • 13. An Award-Winning Solution
  • 14. State of the Art: Data and compute-intensive problems; High-speed networks; Users generally not IT specialists; Tools and workflow engines; Web-based agile frameworks; Distributed data and computing infrastructures
  • 15. Challenge for Developers: Data and compute-intensive problems; High-speed networks; Tools and workflow engines; Web-based agile frameworks; Distributed data and computing infrastructures; Users generally not IT specialists. Need for intuitive and efficient workflows!
  • 16. Challenge for Developers: Data and compute-intensive problems; High-speed networks; Tools and workflow engines; Web-based agile frameworks; Distributed data and computing infrastructures; Users generally not IT specialists
  • 17. Usability: "After all, usability really just means making sure that something works well: that a person … can use the thing - whether it's a Web site, a fighter jet, or a revolving door - for its intended purpose without getting hopelessly frustrated." (Steve Krug in "Don't make me think!: A Common Sense Approach to Web Usability", 2005)
  • 18. Efficiency • Time • Computational resources • Money
  • 19. Workflow Enhancements • Logical level: Meta-workflows. Herres-Pawlis, S., Hoffmann, A., Rösener, T., Krüger, J., Grunzke, R., and Gesing, S. "Multi-layer Meta-metaworkflows for the Evaluation of Solvent and Dispersion Effects in Transition Metal Systems Using the MoSGrid Science Gateways", Science Gateways (IWSG), 2015 7th International Workshop on, pp. 47-52, 3-5 June 2015, IEEE Xplore, doi: 10.1109/IWSG.2015.13 • System level: Combination of strengths of workflow systems. Hazekamp, N., Sarro, J., Choudhury, O., Gesing, S., Emrich, S., and Thain, D. "Scaling Up Bioinformatics Workflows with Dynamic Job Expansion: A Case Study Using Galaxy and Makeflow", e-Science (e-Science), 2015 IEEE 11th International Conference on, pp. 332-341, Aug. 31 2015-Sept. 4 2015 • Prediction: Model for optimization of tasks and threads. Choudhury, O., Rajan, D., Hazekamp, N., Gesing, S., Thain, D., and Emrich, S. "Balancing Thread-level and Task-level Parallelism for Data-Intensive Workloads on Clusters and Clouds", Cluster Computing (CLUSTER), 2015 IEEE International Conference on, pp. 390-393, 8-11 Sept. 2015, doi: 10.1109/CLUSTER.2015.60
  • 20. MoSGrid Science Gateway (Molecular Simulation Grid) • Science gateway integrated with underlying compute and data management infrastructure • Distributed workflow management • Data repository • Metadata management
  • 21. MoSGrid Science Gateway: User Interface: WS-PGRADE, Liferay; DCI Resources; Middleware Layer: UNICORE, XtreemFS; High-Level Middleware Service Layer: gUSE
  • 22. Quantum Chemistry
  • 23. Quantum Chemistry
  • 24. Molecular Dynamics
  • 25. Docking
  • 26. Science Case: Polymerisation catalysts
  • 27. Translation into Workflows
  • 28. Translation into Workflows
  • 29. Meta-Workflows
  • 30. Translation into Meta-Workflows
  • 31. Scaling Up Workflows: # Machines; # Cores; Data Partitioning; Save
  • 32. Scaling Up Workflows: Galaxy
  • 33. Scaling Up Workflows: Simple workflow in Galaxy. Problem: as the input size increases, so does the execution time
  • 34. Scaling Up Workflows: Workflow with parallelism added in Galaxy. Problem: tools must be updated with every change in parallelism, and the approach relies on the scientist
  • 35. Scaling Up Workflows: Workflow dynamically expanded behind Galaxy
  • 36. Scaling Up Workflows
  • 37. Scaling Up Workflows
  • 38. Scaling Up Workflows: Makeflow • Task structure: OUTPUTS : INPUTS, followed by COMMAND on the next line • Directed Acyclic Graph (DAG) • Programmatically generated
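The "programmatically generated" point can be illustrated with a minimal Python sketch, assuming a split/process/merge pattern. The tool names (`split_fasta`, `align`, `merge_results`) and all file names are hypothetical placeholders, not tools named in the talk. Each emitted rule lists its output files, a colon, its input files, and then a tab-indented command line, in the Make-like style Makeflow uses.

```python
def makeflow_rule(outputs, inputs, command):
    """Format one rule: 'outputs : inputs', then a tab-indented command."""
    return f"{' '.join(outputs)} : {' '.join(inputs)}\n\t{command}\n"


def build_workflow(input_file, n_parts):
    """Return Makeflow text for a split -> process -> merge DAG."""
    parts = [f"part.{i}.fa" for i in range(n_parts)]
    results = [f"result.{i}.out" for i in range(n_parts)]
    rules = []
    # One rule splits the input into n_parts chunks (hypothetical tool).
    rules.append(makeflow_rule(
        parts, ["split_fasta", input_file],
        f"./split_fasta {input_file} {n_parts}"))
    # One independent rule per chunk; these have no mutual dependencies,
    # so a workflow engine is free to run them in parallel.
    for part, result in zip(parts, results):
        rules.append(makeflow_rule(
            [result], ["align", part],
            f"./align {part} > {result}"))
    # A final rule depends on every per-chunk result and merges them.
    rules.append(makeflow_rule(
        ["merged.out"], ["merge_results"] + results,
        f"./merge_results {' '.join(results)} > merged.out"))
    return "\n".join(rules)


if __name__ == "__main__":
    print(build_workflow("reads.fa", 3))
```

Changing the degree of parallelism then only means calling `build_workflow` with a different `n_parts`; the tools themselves stay untouched.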
  • 39. Scaling Up Workflows
  • 40. Scaling Up Workflows
  • 41. Scaling Up Workflows: Job sandbox; log file creation for cleanup
  • 42. Scaling Up Workflows: Dynamic Job Expansion • Work Queue: we utilized 100s of cores from a Condor pool • Cleaning the sandbox using knowledge of intermediates and logging • Explored methods to transmit needed environments such as executables and Java • 61.5X speed-up on a 32 GB dataset utilizing these methods
  • 43. Thread-level and Task-level Parallelism • Develop predictive performance models for an application domain • Achieve acceptable performance the first time • Optimize resource utilization: execution time, memory usage
  • 44. Thread-level and Task-level Parallelism • WorkQueue master-worker framework • Sun Grid Engine (SGE) batch system
  • 45. Thread-level and Task-level Parallelism 1. Application-level model for time: $T(R, Q, N) = \beta_1 \frac{RQ}{N} + \beta_2$ 2. Application-level model for memory: $M(R, N) = \gamma_1 R + \gamma_2 N$ 3. System-level model for time: $T_{Total} = \eta_1 \frac{QK}{D} + \eta_2 \left( \frac{Q}{B} + \frac{RKN}{BC} \right) + \eta_3 \, T\left(R, \frac{Q}{K}, N\right) \cdot \frac{KN}{MC} + \eta_4 \frac{O}{B} + \eta_5 \frac{OK}{D}$ 4. System-level model for memory: $M_{Master}(R, Q) = \phi_1 R + \phi_2 Q$
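To make the structure of these models concrete, here is a minimal Python sketch that transcribes the two application-level models and the system-level time model as functions, then scans candidate values of K and N for the lowest predicted total time. Every coefficient and parameter value below is a placeholder chosen for illustration, not a fitted value from the study, and the symbols are used exactly as they appear in the formulas without further interpretation.

```python
from itertools import product


def app_time(R, Q, N, beta1, beta2):
    """Application-level time model: T(R, Q, N) = beta1 * R*Q/N + beta2."""
    return beta1 * R * Q / N + beta2


def app_memory(R, N, gamma1, gamma2):
    """Application-level memory model: M(R, N) = gamma1*R + gamma2*N."""
    return gamma1 * R + gamma2 * N


def total_time(R, Q, O, K, N, D, B, C, M, eta, beta):
    """System-level time model T_Total, written term by term."""
    eta1, eta2, eta3, eta4, eta5 = eta
    return (
        eta1 * Q * K / D
        + eta2 * (Q / B + R * K * N / (B * C))
        + eta3 * app_time(R, Q / K, N, *beta) * K * N / (M * C)
        + eta4 * O / B
        + eta5 * O * K / D
    )


def best_configuration(Ks, Ns, **fixed):
    """Exhaustively scan (K, N) pairs; return the pair with the lowest T_Total."""
    return min(product(Ks, Ns),
               key=lambda kn: total_time(K=kn[0], N=kn[1], **fixed))


# Placeholder inputs (assumed for illustration only, not from the talk).
fixed = dict(R=1.0, Q=500.0, O=50.0, D=100.0, B=10.0, C=8.0, M=45.0,
             eta=(1.0, 1.0, 1.0, 1.0, 1.0), beta=(2.0, 0.5))
candidate_Ks = [45, 90, 180, 360]
candidate_Ns = [1, 2, 4, 8]
k_star, n_star = best_configuration(candidate_Ks, candidate_Ns, **fixed)
print(k_star, n_star)
```

An exhaustive scan is used here only because the candidate grid is tiny; the slides do not state how the optimum was searched for.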
  • 46. Thread-level and Task-level Parallelism: Data collection: 7 data points each for R, Q, and N, giving 343 data points. Training: regression on the training data yields the model's regression coefficients. Testing: accuracy test on the testing data, reported as MAPE
  • 47. Thread-level and Task-level Parallelism: Avg. MAPE = 3.1 (MAPE = Mean Absolute Percentage Error)
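The train/test procedure on slides 46 and 47 can be illustrated with a minimal Python sketch, assuming ordinary least squares on the single feature x = RQ/N from the application-level time model. The measurements are synthetic (generated from made-up coefficients plus noise), and the grid values and the alternating train/test split are assumptions, not the study's setup.

```python
import random


def fit_line(xs, ys):
    """Closed-form least squares for y = b1*x + b2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b2 = mean_y - b1 * mean_x
    return b1, b2


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent."""
    return 100.0 * sum(abs((a - p) / a)
                       for a, p in zip(actual, predicted)) / len(actual)


def synthetic_runs(seed=0):
    """7 levels each of R, Q, N -> 7*7*7 = 343 synthetic (x, time) points."""
    rng = random.Random(seed)
    levels = [1, 2, 3, 4, 5, 6, 7]  # assumed grid, not the study's values
    runs = []
    for R in levels:
        for Q in levels:
            for N in levels:
                x = R * Q / N
                t = 2.0 * x + 5.0                  # made-up "true" model
                t *= 1 + rng.uniform(-0.05, 0.05)  # up to 5% synthetic noise
                runs.append((x, t))
    return runs


runs = synthetic_runs()
train, test = runs[0::2], runs[1::2]  # alternate points into train/test
b1, b2 = fit_line([x for x, _ in train], [t for _, t in train])
test_mape = mape([t for _, t in test], [b1 * x + b2 for x, _ in test])
print(len(runs), round(b1, 2), round(b2, 2), round(test_mape, 2))
```

The MAPE is computed only on points the fit never saw, which is what makes it a measure of predictive accuracy rather than of goodness of fit.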
  • 48. Thread-level and Task-level Parallelism: For the given dataset, K* = 90, N* = 4
  • 49. Result
      | # Cores/Task | # Tasks | Predicted Time (min) | Speedup | Estimated EC2 Cost ($) | Estimated Azure Cost ($) |
      | 1            | 360     | 70                   | 6.6     | 50.4                   | 64.8                     |
      | 2            | 180     | 38                   | 12.3    | 25.2                   | 32.4                     |
      | 4            | 90      | 24                   | 19.5    | 18.9                   | 32.4                     |
      | 8            | 45      | 27                   | 17.3    | 18.9                   | 32.4                     |
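A few relationships in these results can be checked with simple arithmetic. The Python sketch below uses only the numbers from the result slide; the variable names are illustrative. It checks whether every configuration uses the same total number of cores, derives the baseline time implied by each row (predicted time multiplied by speedup), and picks the configuration with the lowest predicted time.

```python
# (cores per task, tasks, predicted time in min, speedup, EC2 $, Azure $)
rows = [
    (1, 360, 70, 6.6, 50.4, 64.8),
    (2, 180, 38, 12.3, 25.2, 32.4),
    (4, 90, 24, 19.5, 18.9, 32.4),
    (8, 45, 27, 17.3, 18.9, 32.4),
]

# Total cores used by each configuration: cores per task * number of tasks.
total_cores = {cores * tasks for cores, tasks, *_ in rows}

# Baseline time implied by each row: predicted time * speedup (minutes).
implied_baseline = [round(time * speedup, 1)
                    for _, _, time, speedup, *_ in rows]

# Configuration with the lowest predicted time.
fastest = min(rows, key=lambda r: r[2])

print(total_cores)
print(implied_baseline)
print(fastest[:2])
```

If K is read as the number of tasks and N as the number of cores per task, the fastest row agrees with the K* = 90, N* = 4 reported on slide 48.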
  • 50. WORKS Workshop: http://works.cs.cardiff.ac.uk/
  • 51. WORKS Workshop: Keynote Speaker
  • 52. Acknowledgements • Logical level: Richard Grunzke, Sonja Herres-Pawlis, Alexander Hoffmann, Jens Krüger, MTA SZTAKI, University of Westminster • System level and prediction: Nicholas Hazekamp, Olivia Choudhury, Douglas Thain, Scott Emrich, Notre Dame Bioinformatics Lab, The Cooperative Computing Lab, University of Notre Dame
  • 53. Science Gateways Community Institute (01 August 2016 – 31 July 2021) 1. Incubator: shared expertise in business and sustainability planning, cybersecurity, user interface design, and software engineering practices 2. Extended Developer Support: expert developers for up to one year 3. Scientific Software Collaborative: open-source, extensible framework for gateway design, integration, and services 4. Community Engagement and Exchange: forum for communication and shared experiences 5. Workforce Development: training programs and helping universities form gateway support groups. http://sciencegateways.org/
  • 54. Sandra Gesing, sandra.gesing@nd.edu