AI and Machine Learning have become must-haves for almost all industries and companies. H2O.ai's goal is to help companies all over the world use Machine Learning.
H2O.ai's open-source toolset, which includes packages for R, Python and Spark, offers products that accelerate data preparation, help with ML model building and make deployment easier and platform agnostic.
H2O.ai presentation at the 2nd Virtual PyData Piraeus meetup
1. Make any company an AI company!
Greg Fousas
Customer Data Scientist
6 Jun 2020
2. Who am I?
https://www.linkedin.com/in/greg-fousas-04252135/
I am a Data Scientist with work experience that spans more than 20 projects, 15
brands, 5 industries and 5 countries, and still counting! I studied Production
Engineering and Management, I have an MSc in Operational Research from the
University of Edinburgh and I studied a bit of Cognitive Science. Recently I
completed a self-driving car nanodegree. I also run an amateur online Python
course entitled “your 10 minutes of Python per day”, and I am a huge cinema and
basketball fan.
3. H2O.ai Snapshot
We are Established
Founded in Silicon Valley 2012
Funding: $147M | Series D
Investors: Goldman Sachs, Ping An, Wells Fargo, NVIDIA, Nexus Ventures
We Make World-class AI Platforms
H2O Open Source Machine Learning
H2O Driverless AI: Automatic Machine Learning
H2O Q: AI platform for business users
We are Global
Mountain View, NYC, London, Paris, Ottawa, Prague, Chennai, Singapore
220+ 1K
20K 180K
Universities
Companies Using H2O Open Source
Meetup Members
Experts
We are Passionate about Customers
4X customers, 2 years, all industries, all continents
Aetna/CVS, Allergan, AT&T, CapitalOne, CBA, Citi, Coca Cola, Bradesco, Dish,
Disney, Franklin Templeton, Genentech, Kaiser Permanente, Lego, Merck, Pepsi,
Reckitt Benckiser, Roche
4. H2O.ai: AI and ML Platforms
Open Source
In-memory, distributed machine learning algorithms with H2O Flow GUI
H2O AI open source engine integration with Spark
• 100% open source – Apache V2 Licensed
• Enterprise support subscriptions
• Interface using R, Python on H2O Flow
H2O Driverless AI
• Automatic feature engineering, machine learning and interpretability
• Fully automated machine learning from ingest to deployment
• User licenses on a per seat basis annually
• GUI-based interface for end-to-end data science
H2O Q
• A new and innovative platform to make your own AI apps
• Enterprise commercial software
• Easy and intuitive platform to have AI answer your question
5. H2O Open Source AI Platform
Rapid Model Deployment
• Highly portable models deployed in Java (POJO) and Model Object Optimized (MOJO)
• Automated and streamlined scoring service deployment with REST API
Scalability and Performance
• Distributed in-memory computing platform
• Distributed algorithms
• Fine-grain MapReduce
Cloud Integration
GPU Enablement
Big Data Ecosystem
Open Source
Flexible Interface
Smart and Fast Algorithms
H2O Flow
100% open source
6. A typical predictive model building process
Data Preparation
Data Cleansing
Variable creation
Feature engineering
Model building
• Modelling technique selection
• Variables selection
• Cross validation
• Hyper parameter tuning
• Ensemble
Visualisation
Output
• Test the model
• Understand the model
Model Productionisation
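The cross validation and hyper parameter tuning bullets on this slide can be illustrated with a small sketch in plain Python. It is not taken from the talk: the toy "model" (a training mean shrunk towards zero), the function names and the sample numbers are all hypothetical, chosen only to show how a grid of candidate values is scored on held-out folds.

```python
from statistics import mean


def k_fold_indices(n_rows, k):
    """Split row indices 0..n_rows-1 into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n_rows // k + (1 if i < n_rows % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fit_shrunk_mean(y_train, shrinkage):
    """Toy model (hypothetical): a constant prediction, the training mean shrunk towards zero."""
    return sum(y_train) / (len(y_train) + shrinkage)


def cv_mse(y, k, shrinkage):
    """Average held-out mean squared error across k folds."""
    scores = []
    for fold in k_fold_indices(len(y), k):
        held_out = set(fold)
        y_train = [v for i, v in enumerate(y) if i not in held_out]
        pred = fit_shrunk_mean(y_train, shrinkage)
        scores.append(mean((y[i] - pred) ** 2 for i in fold))
    return mean(scores)


def tune(y, k, grid):
    """Grid search: return the shrinkage value with the lowest cross-validated error."""
    return min(grid, key=lambda s: cv_mse(y, k, s))


# Made-up target values, for illustration only.
y = [10.0, 12.0, 9.0, 11.0, 10.5, 9.5]
best = tune(y, k=3, grid=[0.0, 1.0, 10.0])
print("best shrinkage:", best)
```

The same loop structure (split, fit on the training folds, score on the held-out fold, pick the best setting) is what automated tools run at scale over many algorithms and parameters.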
11. Make any company an AI company!
Data preparation
datatable
● R: https://github.com/Rdatatable/data.table, Cheat sheet
● Python: https://github.com/h2oai/datatable
● Benchmark
DEMO: https://www.kaggle.com/kernels/scriptcontent/13259895/download
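For readers without access to the demo notebook, here is a minimal sketch of what data preparation with the Python datatable package can look like: build a Frame, filter rows, and aggregate by group. It is not the code from the demo. The column names, the sample values and the helper name mean_amount_by_city are hypothetical, and the import is guarded because datatable is an optional install.

```python
try:
    import datatable as dt
    from datatable import f, by
except ImportError:  # datatable may not be installed in every environment
    dt = None


def mean_amount_by_city(rows):
    """Return {city: mean amount} over rows with a positive amount, using datatable."""
    frame = dt.Frame(rows)                                 # dict of columns -> Frame
    positive = frame[f.amount > 0, :]                      # row filter
    grouped = positive[:, dt.mean(f.amount), by(f.city)]   # group-by aggregation
    cities, means = grouped.to_list()                      # one Python list per column
    return dict(zip(cities, means))


# Made-up sample data, for illustration only.
sample = {
    "city": ["Athens", "Piraeus", "Athens", "Piraeus"],
    "amount": [10.0, 5.0, 20.0, -1.0],
}

if dt is not None:
    print(mean_amount_by_city(sample))
```

The `frame[rows, columns, by(...)]` indexing style mirrors the R data.table syntax listed on this slide, which is why the two packages are presented side by side.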
14. Make any company an AI company!
Model Building
AutoML (documentation: http://docs.h2o.ai/h2o/latest-stable/h2o-docs/automl.html)
R, Python, Flow
DEMO
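As a rough idea of what the Python side of the AutoML demo might involve, the sketch below starts H2O, loads a CSV, and lets AutoML train and rank a handful of models. It assumes a classification problem and a local H2O installation (which needs Java). The helper names, the default max_models and seed values, and any file or column name passed in are hypothetical and not taken from the talk.

```python
def predictor_columns(columns, target):
    """All columns except the target are used as predictors."""
    return [c for c in columns if c != target]


def run_automl(csv_path, target, max_models=10, seed=1):
    """Train H2O AutoML on a CSV file and return (leaderboard, best model).

    The imports live inside the function so this file can be loaded
    even on a machine where h2o is not installed.
    """
    import h2o
    from h2o.automl import H2OAutoML

    h2o.init()                                         # start or connect to a local H2O cluster
    frame = h2o.import_file(csv_path)
    frame[target] = frame[target].asfactor()           # assumption: classification target

    aml = H2OAutoML(max_models=max_models, seed=seed)
    aml.train(
        x=predictor_columns(frame.columns, target),
        y=target,
        training_frame=frame,
    )
    return aml.leaderboard, aml.leader
```

A call such as `leaderboard, best = run_automl("train.csv", "label")` (hypothetical file and column) would return the ranked models and the top one; the same workflow is available from R and from the Flow GUI.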
17. Make any company an AI company!
Model Building
AutoML (documentation: http://docs.h2o.ai/h2o/latest-stable/h2o-docs/automl.html)
R, Python, Flow, Sparkling Water
Model Deployment - Productionisation
R ↔ Python ↔ Sparkling Water
Other: http://docs.h2o.ai/h2o/latest-stable/h2o-docs/productionizing.html
DEMO
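One common route to platform-agnostic deployment is exporting a trained model as a MOJO. The sketch below is an assumption about what such a step could look like, not the demo shown in the talk. The helper name export_for_production and the output directory are hypothetical; download_mojo is the method H2O models expose for this export.

```python
def export_for_production(model, out_dir):
    """Export a trained H2O model as a MOJO zip and return the written file path.

    get_genmodel_jar=True also saves h2o-genmodel.jar next to the MOJO,
    which a Java scoring service needs on its classpath.
    """
    return model.download_mojo(path=out_dir, get_genmodel_jar=True)
```

With a trained model in hand, for example the AutoML leader, `export_for_production(best_model, "./deploy")` would write the artifact to a hypothetical `./deploy` folder. The scoring side then needs only a JVM and the genmodel jar, not a running H2O cluster.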
18. Make any company an AI company!
Resources
Training environment: Aquarium - http://aquarium.h2o.ai
Documentation: http://docs.h2o.ai/h2o/latest-stable/h2o-docs/welcome.html
Tutorials:
● http://docs.h2o.ai/h2o-tutorials/latest-stable/index.html
● https://github.com/h2oai/h2o-tutorials
Videos: https://www.youtube.com/user/0xdata/videos
19. A typical predictive model building process
Data Preparation
Data Cleansing
Variable creation
Feature engineering
Model building
• Modelling technique selection
• Variables selection
• Cross validation
• Hyper parameter tuning
• Ensemble
Visualisation
Output
• Test the model
• Understand the model
Model Productionisation
Driverless AI
23. Make any company an AI company!
DRIVERLESS AI
Demo
21-day trial license - https://www.h2o.ai/download
Open documentation - http://docs.h2o.ai/driverless-ai/latest-stable/docs/userguide/index.html
MLI book - https://www.h2o.ai/oreilly-mli-booklet-2019/
25. Why Driverless AI for Enterprise AI Adoption
Time: Time to Insights Slow
• Time for a Data Scientist to Build a Model: Months
Talent: Lack of AI Talent
• ~100 Data Science Experts in the World
Trust: Lack of Trust in AI
• Explainable AI?
Data is a Team Sport