Blastn + Jupyter on Docker
Examples from Bioinformatics
Samantha & Lynn Langit
Jupyter - Inspired by Mathematica
Thanks, Stephen Wolfram
If you can SEE it (your data and code), you can work with it better
@lynnlangit
Next terminal <- a better Python REPL
• Fernando Perez in 2001
• IPython (interactive)
• Modeled on Mathematica Notebooks
• IP[y]: Notebook -> in a browser
• 2014: IPython Notebook -> Jupyter Notebook
Enter Jupyter Notebooks
Jupyter Notebooks support the ML lifecycle
1. Collect Data
• Retrieve Files
• Query SQL Databases
• Call Web Services
• “Scrape” Web Pages
2. Prepare Data
• Explore Data
• Validate Data
• Clean Data
• Features / Data
3. Train Model
• Prepare Training Set
• Experiment
• Test Model
• Visualize
4. Evaluate Model
• Test Performance
• Compare Models
• Validate Model
• Visualize
5. Deploy Model
• Export Model File
• Prepare Job
• Deploy Container
• Re-package Model
Execute code blocks:
- Python, R… code
- SQL queries
- Shell commands
Write Documentation:
- Markdown language
Visualize Data
- Viz tools…
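As a rough illustration of the "execute code blocks" idea, the sketch below mixes Python, a SQL query, and a shell-style command the way a single notebook cell might. The table name `reads`, its columns, and the sample values are made up for the example, and an in-memory SQLite database stands in for a real data source.

```python
import sqlite3
import subprocess
import sys

# "Collect" step: query a SQL database.
# Assumption: an in-memory SQLite table stands in for a real source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reads (sample TEXT, length INTEGER)")
conn.executemany(
    "INSERT INTO reads VALUES (?, ?)",
    [("s1", 150), ("s1", 148), ("s2", 75)],
)
rows = conn.execute(
    "SELECT sample, AVG(length) FROM reads GROUP BY sample ORDER BY sample"
).fetchall()
conn.close()

# Shell step: run an external command and capture its output.
# The Python interpreter itself is used so the example is portable.
result = subprocess.run(
    [sys.executable, "-c", "print('hello from a subprocess')"],
    capture_output=True,
    text=True,
    check=True,
)
shell_output = result.stdout.strip()

print(rows)
print(shell_output)
```

In a real notebook the SQL and shell steps would more likely be separate cells, with Markdown cells between them explaining each step.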
Jupyter Visualizations – so many possibilities
Notebook Customizations
• Multiple runtimes / languages
• Share output
• Code or equations: LaTeX math
• Comments: Markdown, wiki-like
• Graphics: visualizations, charting, results
LIVE DOCUMENTATION
Reproducible Research
Example: Jupyter locally
Mathematica evolved…
Jupyter Notebook
• Market leader
• Started for single-user use
• Academic community
• GitHub integration
• Added JupyterHub for collaboration
Zeppelin Notebook
• Started for collaboration
• Enterprise
• Security
Vendor Notebook
• Databricks for Apache Spark
• Jupyter-like, but proprietary format
Running Notebooks
• Desktop: install and run
• Local Server: can use JupyterHub for groups
• Cloud: large number of options
• Docker: start a container
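For the Docker option, a minimal sketch of how the container start command could be assembled is shown below. The image name comes from the Docker Hub link in the Editor's Notes. The port mapping assumes the notebook server inside the container listens on 8888 (the port mentioned in the notes for a local install), which may not match the actual image. The helper `docker_run_command` is hypothetical and only builds the command; it does not run Docker.

```python
import shlex


def docker_run_command(image, host_port=8888, container_port=8888):
    """Build (but do not execute) a `docker run` command that publishes the notebook port."""
    # --rm removes the container when it exits; -p maps host port to container port.
    return ["docker", "run", "--rm", "-p", f"{host_port}:{container_port}", image]


# Assumption: the container serves Jupyter on port 8888.
cmd = docker_run_command("lynnlangit/blastn-jupyter-docker")
print(shlex.join(cmd))
```

The printed command can be pasted into a terminal, or the list can be passed to `subprocess.run` on a machine where Docker is installed.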
Extending, Refactoring Open Notebooks
• Write functions in one notebook
• Link to another notebook
• Write extensions (nbextensions.com)
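One way to picture "write functions in one notebook, link to another": a notebook file is a JSON document, so its code cells can be read and reused elsewhere. The sketch below is only an illustration under that assumption. The helper `code_cells`, the sample notebook content, and the `gc_content` function are all made up for the example, and running cell source with `exec` is suitable only for notebooks you trust.

```python
import json


def code_cells(notebook_json):
    """Return the source text of every code cell in a notebook JSON string."""
    nb = json.loads(notebook_json)
    sources = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            src = cell.get("source", "")
            # The notebook format stores source as a string or a list of lines.
            sources.append("".join(src) if isinstance(src, list) else src)
    return sources


# A tiny stand-in for a "helpers" notebook saved on disk.
example = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Helpers"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": [
                "def gc_content(seq):\n",
                "    return (seq.count('G') + seq.count('C')) / len(seq)",
            ],
        },
    ],
})

# Load the helper functions into a separate namespace and call one of them.
namespace = {}
for src in code_cells(example):
    exec(src, namespace)

print(namespace["gc_content"]("GATTACA"))
```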
Up the bar
Personalized medicine via genomic analysis
Reproducible Research – Experiments as Code
What is Blastn?
BLAST (Basic Local Alignment Search Tool) finds regions of similarity between biological sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of the matches. blastn is the BLAST program for nucleotide-to-nucleotide searches.
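To make "local alignment" concrete, here is a small sketch of the classic Smith-Waterman scoring recurrence for two short nucleotide strings. This is not how BLAST is implemented: BLAST uses a faster seed-and-extend heuristic and adds statistical significance estimates. The scoring values (match, mismatch, gap) are arbitrary choices for the example.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score between a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Extend the diagonal with a match or mismatch.
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below 0, which is what makes the alignment local.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# The shared region "ACG" scores 3 matches x 2 points.
print(local_alignment_score("TTACGTTT", "GGACGGG"))
```

Resetting negative scores to zero lets the best-scoring region start and end anywhere in either sequence, instead of forcing an end-to-end alignment.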
Cloud-based Jupyter PaaS
• AWS SageMaker
• Azure Notebooks
• Google Colab
Tools for Jupyter
• Binder for GitHub
• Point to your GitHub Repo
• Jupyter Notebooks
• requirements.txt
• It builds a Docker image
• You can run your Notebooks
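The Binder steps above can be sketched as building a launch link for a repository. The `/v2/gh/<owner>/<repo>/<ref>` URL pattern is how mybinder.org addresses GitHub repositories. The `filepath` query parameter for opening a specific notebook is an assumption here and may differ between Binder versions. The helper `binder_url` and the repository names are hypothetical.

```python
from urllib.parse import quote


def binder_url(owner, repo, ref="HEAD", notebook=None):
    """Build a mybinder.org launch URL for a GitHub repository."""
    url = (
        "https://mybinder.org/v2/gh/"
        f"{quote(owner)}/{quote(repo)}/{quote(ref, safe='')}"
    )
    if notebook:
        # Assumption: `filepath` opens a specific notebook after launch.
        url += "?filepath=" + quote(notebook)
    return url


# Hypothetical repository containing notebooks and a requirements.txt.
print(binder_url("example-user", "example-repo"))
print(binder_url("example-user", "example-repo", notebook="notebooks/demo.ipynb"))
```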
Example: Binder
Example - GT-Scan2
Jupyter for Genomics Research
Future of Jupyter for Research
Academic Institutions and Research Labs
• UC Berkeley, Davis, San Diego
• Cal Poly San Luis Obispo
• Clemson University
• CU Boulder
• U of Illinois, Minnesota, Missouri, Rochester, Texas
• MIT
• Michigan State U
• Texas A&M


Editor's Notes

  1. History talk from Cristian Prieto (NDC Oslo 2016) -- https://vimeo.com/223984769 http://blog.fperez.org/2012/01/ipython-notebook-historical.html
  2. Local install: pip install "ipython[all]" or pip install jupyter; alternatively, use Anaconda, which installs Jupyter notebooks by default. Only Python 2 is installed by default; other runtimes, such as an R kernel, can be installed separately. You can also use Docker (the 2.1 GB image contains all libraries), Azure Notebooks, or AWS SageMaker Notebooks. Start the server and work in a local browser (no database; it uses local .json files): IPython notebook -> localhost:8888/tree. Markdown cells use GitHub-flavored Markdown by default. https://dwhsys.com/2017/03/25/apache-zeppelin-vs-jupyter-notebook/
  3. https://github.com/ipython-contrib/jupyter_contrib_nbextensions Install with: pip install jupyter_contrib_nbextensions -OR- conda install -c conda-forge jupyter_contrib_nbextensions
  4. https://github.com/Microsoft/Elevation/blob/master/notebooks/aggregation.ipynb https://www.microsoft.com/en-us/research/project/crispr/
  5. https://blast.ncbi.nlm.nih.gov/Blast.cgi
  6. https://hub.docker.com/r/lynnlangit/blastn-jupyter-docker/
  7. https://medium.com/@lynnlangit/aws-sagemaker-for-bioinformatics-b8e8a96479d8 Jupyter on GCE VM -- https://towardsdatascience.com/running-jupyter-notebook-in-google-cloud-platform-in-15-min-61e16da34d52
  8. https://mybinder.org/ lets you run notebooks stored in GitHub -ALSO- https://nbviewer.jupyter.org/ renders notebooks stored in GitHub as static, read-only pages
  9. http://jupyterhub-tutorial.readthedocs.io/en/latest/ https://github.com/jupyterhub/jupyterhub-tutorial/blob/master/JupyterHub.pdf http://jupyterhub.readthedocs.io/en/latest/gallery-jhub-deployments.html