Building Notebook-based AI Pipelines with Elyra and Kubeflow

Notebook-based AI Pipelines with Elyra and Kubeflow
Nick Pentreath
Principal Engineer, IBM
@MLnick

About
DEG / Nov 18, 2020 / © 2020 IBM Corporation
– @MLnick on Twitter, GitHub, LinkedIn
– Principal Engineer, IBM CODAIT (Center for Open-Source Data & AI Technologies)
– Machine Learning & AI
– Apache Spark committer & PMC
– Author of Machine Learning with Spark
– Various conferences & meetups

Improving the Enterprise AI Lifecycle in Open Source
– CODAIT aims to make AI solutions dramatically easier to create, deploy, and manage in the enterprise.
– We contribute to and advocate for the open-source technologies that are foundational to IBM’s AI offerings.
– 30+ open-source developers!
Center for Open Source Data & AI Technologies
codait.org
CODAIT
Open Source @ IBM

Agenda
– Machine learning workflow
– JupyterLab & Elyra
– Demo
– Conclusion

Machine Learning Workflow
Data → Analyze → Process → Train → Deploy → Predict & Maintain

Workflow spans teams …
Predict & Maintain
– Data Engineers
– Data Scientists & Researchers
– Machine Learning & Production Engineers

… and tools
Data Engineers, Data Scientists & Researchers, Machine Learning & Production Engineers
Data formats
• CSV, SQL
• JSON, Parquet, AVRO
• Binary (image, audio)
• …
Analysis & data viz
• ggplot
• dplyr
• matplotlib
• Pandas
• SparkSQL
• …
Pre-processing & pipelines
• dplyr
• pandas
• scikit-learn
• SparkSQL / SparkML
• …
Frameworks
• R, scikit-learn
• SparkML
• TensorFlow
• PyTorch
• LightGBM, XGBoost
• …
Formats & mechanisms
• Variety of formats
• Containers
• …

Iteration & Experimentation
Data Scientists & Researchers
Load → Clean → Explore → Interpret → Refine

Iteration & Experimentation
Data → Analyze → Process → Train → Deploy
Data Scientists & Researchers
Extract features → Pre-process → Train → Evaluate → Refine
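The inner loop on this slide (extract features, pre-process, train, evaluate, refine) might be sketched as follows. This is a toy illustration only: the data, the single-threshold "model", and every function name are hypothetical and come neither from the talk nor from any library. It also evaluates on the training data, which a real experiment would avoid.

```python
# Toy sketch of the extract -> pre-process -> train -> evaluate -> refine loop.
# All names and data here are hypothetical, for illustration only.

RAW = [
    ("short", 0),
    ("tiny", 0),
    ("a much longer sentence", 1),
    ("another fairly long example", 1),
    ("mid size", 0),
    ("quite a long text here", 1),
]


def extract_features(text):
    """Extract a single numeric feature: the text length."""
    return float(len(text))


def preprocess(values):
    """Min-max scale the features into [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def train(threshold):
    """'Train' a trivial model: predict 1 when the feature exceeds threshold."""
    return lambda x: 1 if x > threshold else 0


def evaluate(model, xs, ys):
    """Return the accuracy of the model on the given data."""
    correct = sum(model(x) == y for x, y in zip(xs, ys))
    return correct / len(ys)


def refine(xs, ys, candidates):
    """Try each candidate setting and keep the first best-scoring one."""
    best = None
    for threshold in candidates:
        score = evaluate(train(threshold), xs, ys)
        if best is None or score > best[1]:
            best = (threshold, score)
    return best


features = preprocess([extract_features(text) for text, _ in RAW])
labels = [label for _, label in RAW]
best_threshold, best_score = refine(features, labels, [0.1, 0.3, 0.5, 0.7, 0.9])
print(best_threshold, best_score)
```

Each function stands in for one box on the slide; in a notebook, each would typically be one or more cells that are re-run as the approach is refined.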

Interactive Notebooks
Notebooks have become the de facto standard for content-rich, interactive & iterative work
* Logos trademarks of their respective projects

Elyra Overview
Elyra is a set of AI-centric extensions to JupyterLab Notebooks

Elyra Key Features
– Visual Pipeline Editor
Visual editor for building AI pipelines,
enabling the conversion of multiple
notebooks into batch jobs or workflows.
– Notebooks as batch jobs
– Python script execution
– Automated Table of Contents
– Code Snippets
– Git integration
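The idea of turning several notebooks into a workflow can be sketched as a dependency graph whose nodes are notebooks. The example below is a minimal illustration using only the Python standard library; the notebook file names and the dictionary layout are hypothetical and do not reflect Elyra's own pipeline file format or how the Visual Pipeline Editor stores pipelines.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical pipeline: each notebook maps to the notebooks it depends on.
# This dictionary is an illustration, not Elyra's pipeline file format.
pipeline = {
    "load_data.ipynb": set(),
    "clean_data.ipynb": {"load_data.ipynb"},
    "train_model.ipynb": {"clean_data.ipynb"},
    "evaluate_model.ipynb": {"train_model.ipynb"},
}


def execution_order(graph):
    """Return the notebooks in an order that respects every dependency."""
    return list(TopologicalSorter(graph).static_order())


order = execution_order(pipeline)
for step, notebook in enumerate(order, start=1):
    print(f"{step}. {notebook}")
```

A workflow engine would run each notebook as its own job in this order, passing data between steps through shared storage.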

Elyra Key Features
– Notebooks as batch jobs
Extends the notebook UI to simplify the submission of notebooks as a batch job for model training
– Code Snippets
– Git integration
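Outside the Elyra UI, running a notebook as a non-interactive job can be approximated with the `jupyter nbconvert` command line. The sketch below only builds the command; it is an illustration under that assumption, not a description of how Elyra submits batch jobs, and the file names are hypothetical.

```python
import shlex


def batch_command(notebook, output):
    """Build a `jupyter nbconvert` command that executes a notebook headlessly."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",   # write the result as a notebook
        "--execute",          # run all cells before writing
        "--output", output,   # name of the executed copy
        notebook,
    ]


# Hypothetical file names, for illustration only.
cmd = batch_command("train_model.ipynb", "train_model_output.ipynb")
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Building the command as a list keeps the step easy to hand to a scheduler or container entry point without shell-quoting issues.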

Elyra Key Features
– Python script execution
Edit and execute Python scripts against local or cloud-based resources
– Code Snippets
– Git integration

Elyra Key Features
– Automated Table of Contents
Generate & navigate table of contents from notebooks & Python scripts
– Code Snippets
– Git integration

Elyra Key Features
– Code Snippets
Easy creation and insertion of reusable
code snippets for various languages
– Git integration

Elyra Key Features
– Code Snippets
– Git integration
Track project changes and share among
teammates

Getting started with Elyra
1. Try Elyra from Binder
ibm.biz/elyra-demo
2. Run Elyra from Docker
ibm.biz/elyra-docker-installation
3. Install Elyra on your local machine
ibm.biz/elyra-installation

Start using Elyra today!
Getting started with Elyra
ibm.biz/elyra-installation
Elyra on GitHub
github.com/elyra-ai/elyra
Elyra Notebook projects on GitHub
github.com/CODAIT/flight-delay-notebooks
github.com/CODAIT/covid-notebooks
Contributing to the projects
• Star and fork, submit bug reports, suggest improvements,
help with code reviews, join our community meetings
ibm.biz/elyra-demo
gitter.im/elyra-ai/community

Thank you
codait.org
twitter.com/codait_org
github.com/CODAIT
developer.ibm.com
Check out the Data Asset Exchange
https://ibm.biz/data-exchange
Sign up for IBM Cloud
https://ibm.biz/Bdqkfg

Feedback
Your feedback is important to us.
Don’t forget to rate
and review the sessions.
