Analytics in the Cloud
Natalino Busa - Head of Data Science
2 Natalino Busa - @natbusa
Head of Applied Data Science at Teradata
Distributed Computing, Machine Learning, Statistics, Big/Fast Data, Streaming Computing
On most networks: @natbusa
3 Natalino Busa - @natbusa
Let’s define Cloud Services
8 Natalino Busa - @natbusa
Analytics in the cloud: stacking layers
Bare Metal: Physical Machines
IaaS: Virtual Resources
CaaS: Containers
dPaaS: Datastores, Data Engines
iPaaS: Tools Integration, Flows & Processes
DAaaS: Data Analytics as a Service (Watson Services, Azure ML, Google Cloud ML, BigML)
9 Natalino Busa - @natbusa
Analytics in the cloud: today’s talk
Bare Metal: Physical Machines
IaaS: Virtual Resources
CaaS: Containers
dPaaS: Datastores, Data Engines
iPaaS: Tools Integration, Flows & Processes
DAaaS: Data Analytics as a Service
10 Natalino Busa - @natbusa
“we live in an age of open source datacenters, so
we can stack all these things together and we
have open source from the ground to ceiling.”
Sam Ramji, CEO of Cloud Foundry
https://www.youtube.com/watch?v=7oCSFcUW-Qk
11 Natalino Busa - @natbusa
Containers vs VMs
12 Natalino Busa - @natbusa
Techs based on Containers
YARN
13 Natalino Busa - @natbusa
Containers as a Service
For example: Amazon ECS
https://aws.amazon.com/ecs/
14 Natalino Busa - @natbusa
CaaS: 6 offerings
https://www.linux.com/news/5-container-service-tools-you-should-know-about
- Project Magnum
- Amazon ECS
- Docker DataCenter
- Google Container Engine
15 Natalino Busa - @natbusa
Most new PaaS solutions are containerized
16 Natalino Busa - @natbusa
PaaS: Big Data SQL Queries
- Batch Oriented, Large Aggregations
- Interactive Queries, Data Exploration
- Interactive Queries, Machine Learning, Streaming: Micro-batching
- Interactive Queries, Machine Learning, Streaming: Event-driven
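The two streaming styles above differ in when results are produced. A minimal sketch in plain Python, not tied to any particular engine: `micro_batches` and `event_driven` are hypothetical helper names, and the batch size and alert threshold are arbitrary assumptions.

```python
from typing import Callable, Iterable, Iterator, List


def micro_batches(events: Iterable[int], batch_size: int) -> Iterator[List[int]]:
    """Group a stream into small batches of at most batch_size events."""
    batch: List[int] = []
    for event in events:
        batch.append(event)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final, possibly smaller, batch
        yield batch


def event_driven(events: Iterable[int], handler: Callable[[int], bool]) -> Iterator[bool]:
    """Apply the handler to every event individually, as it arrives."""
    for event in events:
        yield handler(event)


stream = [3, 1, 4, 1, 5, 9, 2]

# Micro-batching: one result per batch of up to 3 events.
batch_sums = [sum(batch) for batch in micro_batches(stream, batch_size=3)]

# Event-driven: one result per event (here, an alert flag for values >= 5).
alerts = list(event_driven(stream, lambda value: value >= 5))
```

Micro-batching trades latency for throughput (a result waits until its batch is full), while the event-driven style reacts to each event on its own.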
17 Natalino Busa - @natbusa
Advanced Analytics: models and algorithms
18 Natalino Busa - @natbusa
PaaS: Advanced Analytics
Graph analytics:
- Cluster items
- Extract similarities
- Detect patterns
19 Natalino Busa - @natbusa
PaaS: Advanced Analytics
Text analytics:
- Sentiment Analysis
- Language Detection
- Summarization
- Entity extraction
20 Natalino Busa - @natbusa
PaaS: Advanced Analytics
Machine Learning:
- Classification
- Regression
- Clustering
- Forecasting
- Anomaly detection
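As an illustration of the last item, anomaly detection can be sketched with a simple z-score rule. This is one basic technique chosen for brevity, not a method named in the talk; the function name `zscore_anomalies`, the threshold of 2.0, and the sample readings are all assumptions.

```python
from statistics import mean, pstdev
from typing import List


def zscore_anomalies(values: List[float], threshold: float = 2.0) -> List[float]:
    """Return the values lying more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values identical: nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical sensor readings with one obvious outlier.
readings = [10.0, 11.0, 9.0, 10.0, 12.0, 10.0, 11.0, 50.0]
anomalies = zscore_anomalies(readings)
```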
21 Natalino Busa - @natbusa
PaaS: Advanced Analytics
AI and Deep Learning
- Unstructured Data
- Object Detection
- Natural Language Processing
- Video Summarization
- Speech Recognition
22 Natalino Busa - @natbusa
PaaS: Advanced Analytics
SQL + Graph + Text + Machine Learning + Voice/Image/Video
23 Natalino Busa - @natbusa
dPaaS: Machine (deep) Learning
... these are just a few examples ...
24 Natalino Busa - @natbusa
Analytics Everywhere
Public Cloud, Managed Cloud, Private Cloud, Private Infra
25 Natalino Busa - @natbusa
iPaaS: Components for Analytics in the Cloud
- Object Stores
- NoSQL
- SQL: Relational, Transactional DB
- SQL: Big Data, Data Warehousing
- Machine Learning
- Streaming Computing
31 Natalino Busa - @natbusa
iPaaS, dPaaS:
Object Stores: HDFS, GlusterFS, CephFS, NFS, Swift, Nova, Cassandra, Redis, S3 (AWS), Storage (GCP), ...
SQL (Relational, Transactional DB): MySQL, PostgreSQL, MariaDB, Oracle (AWS MP)
SQL (Big Data, Data Warehousing): Hive, Presto, Spark SQL, Impala, Redshift (AWS), BigQuery (GCP), Big SQL (IBM), Teradata (AWS MP), SAP HANA (AWS MP), Vertica (AWS MP)
NoSQL: Cassandra, Redis, HBase, Accumulo, Neo4J, ElasticSearch, MongoDB, Couchbase, BigTable (GCP), DynamoDB
Machine Learning: Spark ML, H2O, Flink, Aerosolve, Theano, TensorFlow, XGBoost, Azure ML, AWS ML, Google ML, IBM Watson
Streaming Computing: Heron (Storm), NiFi, Spark Streaming, Flink, Kafka Streams, Logstash, StreamSQL, Google DataFlow (GCP)
32 Natalino Busa - @natbusa
iPaaS: Selecting your Analytical Stack
Flexible. Powerful.
- Combinations for this example:
8 * 3 * 4 * 8 * 7 * 7 = 37632
Right tool for the right job
- Fit for purpose
- Multi-Genre Analytics
Hard to maintain and upgrade:
- Extended Skills and Know-how
- Component upgrades must be compatible
Hard to configure:
- whether on cloud, bare metal, or VMs
- complex stacks with many tools and services
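The combination count quoted on this slide is the product of the number of options in each component category. A quick check, assuming only the factors as printed (the slide does not say which category each factor belongs to):

```python
from math import prod

# Factors exactly as quoted on the slide; the mapping of each factor
# to a component category is not stated there.
options_per_category = [8, 3, 4, 8, 7, 7]

# Picking one option per category multiplies the choices together.
combinations = prod(options_per_category)
```

Adding one more option to any single category multiplies the total again, which is why the configuration space grows so quickly.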
33 Natalino Busa - @natbusa
iPaaS: Deploy & Manage your own Analytics
How to simplify? Select a bundle!
34 Natalino Busa - @natbusa
iPaaS: bundled recipes & stacks
Select a recipe:
- Hortonworks Data Platform
- Cloudera Data Platform
- Reactive Platform
- Smack Stack
- Pancake Stack
- ELK Stack
- Select your own
35 Natalino Busa - @natbusa
iPaaS: my favorite analytical stacks
Stack (components) | Object Stores | NoSQL | SQL: Big Data, Data Warehousing | Machine Learning | Streaming Computing
All Hadoop (5) | HDFS | HBase | Hive | Spark | Storm
Smack stack (2) | Cassandra | Cassandra | Spark | Spark | Spark
Elastic (5) | HDFS | ElasticSearch | Hive | H2O | Kafka
Data Science (8) | HDFS | ElasticSearch | Hive, Presto | Spark, H2O, TensorFlow | Flink
Real Time (2) | Cassandra | Cassandra | Flink | Flink | Flink
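The number in parentheses next to each stack appears to be the count of distinct components it uses; that reading is an assumption, but it matches every row. A small sketch that recomputes it from the stacks listed above (the variable names are illustrative):

```python
# Components per stack, in category order: object store, NoSQL,
# big data SQL, machine learning, streaming.
stacks = {
    "All Hadoop": ["HDFS", "HBase", "Hive", "Spark", "Storm"],
    "Smack stack": ["Cassandra", "Cassandra", "Spark", "Spark", "Spark"],
    "Elastic": ["HDFS", "ElasticSearch", "Hive", "H2O", "Kafka"],
    "Data Science": ["HDFS", "ElasticSearch", "Hive", "Presto",
                     "Spark", "H2O", "TensorFlow", "Flink"],
    "Real Time": ["Cassandra", "Cassandra", "Flink", "Flink", "Flink"],
}

# A component reused across categories is counted once.
distinct_counts = {name: len(set(components)) for name, components in stacks.items()}
```

Fewer distinct components means fewer systems to install, upgrade, and keep compatible, which is the appeal of the two-component stacks.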
36 Natalino Busa - @natbusa
dPaaS: Managed Analytics
This is hard! Can we access it as a service?
37 Natalino Busa - @natbusa
dPaaS: Managed Hadoop & Spark
- HDInsight: Hadoop, Spark, and R as services
- Managed Spark Clusters, BigInsight (Hadoop)
- DataFlow and DataProc: Flink, Spark and Hadoop Clusters as a Service
- EMR: Hadoop components a la carte
38 Natalino Busa - @natbusa
PaaS: Analytical clusters
Ephemeral:
- Create then Dispose
- Clusters are Short-Lived
- Data Exploration
- Isolated, Personal
- Simple Access Management
- Interactive Analytics
vs
Permanent:
- Clusters are Long Lived
- Scheduled Operations
- Production ETL
- Co-Ordinated
- Complex Access Management
- Batch Analytics
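The "create then dispose" lifecycle of an ephemeral cluster can be sketched as a context manager. This is an in-memory stand-in only: `ephemeral_cluster`, `active_clusters`, and `log` are hypothetical names, and no real cloud API is called.

```python
from contextlib import contextmanager
from typing import Dict, Iterator, List

# In-memory stand-in for a cloud provider's cluster registry.
active_clusters: Dict[str, int] = {}
log: List[str] = []


@contextmanager
def ephemeral_cluster(name: str, nodes: int) -> Iterator[str]:
    """Create a cluster, hand it to the caller, and always dispose of it afterwards."""
    active_clusters[name] = nodes
    log.append(f"created {name}")
    try:
        yield name
    finally:
        # Disposal runs even if the analysis inside the block fails.
        del active_clusters[name]
        log.append(f"disposed {name}")


with ephemeral_cluster("exploration", nodes=3) as cluster:
    log.append(f"querying on {cluster}")
```

A permanent cluster, by contrast, would be created once and outlive any single job, so its lifecycle is managed by operations rather than by the analysis code.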
39 Natalino Busa - @natbusa
DAaaS: Microsoft’s Cortana and ML Studio
40 Natalino Busa - @natbusa
DAaaS: IBM Watson
41 Natalino Busa - @natbusa
DAaaS: Google ML and AI as a service
Cloud Computing for Deep Neural Networks
> Train, Score, Data
AI and ML models for:
● Speech (audio)
● Language (text)
● Vision (images/video)
42 Natalino Busa - @natbusa
Summary
• Analytics in the Cloud:
The dawn of a new computing era
• iPaaS, dPaaS:
complexity vs flexibility, it’s a tradeoff
• Computing clusters:
Ephemeral and Persistent
43 Natalino Busa - @natbusa
Head of Applied Data Science at Teradata
Distributed Computing, Machine Learning, Statistics, Big/Fast Data, Streaming Computing
LinkedIn and Twitter: natbusa

 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 

Analytics in the cloud

  • 1. Analytics in the Cloud Natalino Busa - Head of Data Science
  • 2. Natalino Busa - @natbusa. Head of Applied Data Science at Teradata. Distributed computing, Machine Learning, Statistics, Big/Fast Data, Streaming Computing. On most networks: @natbusa
  • 3. Let’s define Cloud Services
  • 4. Analytics in the cloud: stacking layers. Bare Metal: Physical Machines
  • 5. Analytics in the cloud: stacking layers. Bare Metal: Physical Machines; IAAS: Virtual Resources
  • 6. Analytics in the cloud: stacking layers. Bare Metal: Physical Machines; IAAS: Virtual Resources; CAAS: Containers
  • 7. Analytics in the cloud: stacking layers. Bare Metal: Physical Machines; IAAS: Virtual Resources; CAAS: Containers; dPAAS: Datastores, Data Engines; iPAAS: Tools Integration, Flows & Processes
  • 8. Analytics in the cloud: stacking layers. Bare Metal: Physical Machines; IAAS: Virtual Resources; CAAS: Containers; dPAAS: Datastores, Data Engines; iPAAS: Tools Integration, Flows & Processes; DAAAS: Data Analytics as a Service (Watson Services, Azure ML, Google Cloud ML, BigML)
  • 9. Analytics in the cloud: today’s talk. Bare Metal: Physical Machines; IAAS: Virtual Resources; CAAS: Containers; dPAAS: Datastores, Data Engines; iPAAS: Tools Integration, Flows & Processes; DAAAS: Data Analytics as a Service
  • 10. “we live in an age of open source datacenters, so we can stack all these things together and we have open source from the ground to ceiling.” Sam Ramji, CEO of Cloud Foundry. https://www.youtube.com/watch?v=7oCSFcUW-Qk
  • 11. Containers vs VMs
  • 12. Techs based on Containers: YARN
  • 13. Containers as a Service. For example: Amazon ECS. https://aws.amazon.com/ecs/
  • 14. CaaS: 6 offerings. Project Magnum, Amazon ECS, Docker DataCenter, Google Container Engine. https://www.linux.com/news/5-container-service-tools-you-should-know-about
  • 15. Most new PaaS solutions are containerized
  • 16. PaaS: Big Data. SQL Queries, Batch Oriented, Large Aggregations; Interactive Queries, Data Exploration; Interactive Queries, Machine Learning, Streaming: Micro-batching; Interactive Queries, Machine Learning, Streaming: Event-driven
  • 17. Advanced Analytics: models and algorithms
  • 18. PaaS: Advanced Analytics. Graph analytics: cluster items, extract similarities, detect patterns
  • 19. PaaS: Advanced Analytics. Text analytics: sentiment analysis, language detection, summarization, entity extraction
  • 20. PaaS: Advanced Analytics. Machine Learning: classification, regression, clustering, forecasting, anomaly detection
  • 21. PaaS: Advanced Analytics. AI and Deep Learning: unstructured data, object detection, natural language processing, video summarization, speech recognition
  • 22. PaaS: Advanced Analytics. SQL + Graph + Text + Machine Learning + Voice/Image/Video
  • 23. dPaaS: Machine (deep) Learning … these are just a few examples ...
  • 24. Analytics Everywhere: Public Cloud, Managed Cloud, Private Cloud, Private Infra
  • 25. iPaaS: Components for Analytics in the Cloud. SQL: Big Data, Data Warehousing; NoSQL; Machine Learning; Object Stores; Streaming Computing; SQL: Relational, Transactional DB
  • 26. iPaaS, dPaaS. Object Stores: HDFS, GlusterFS, CephFS, NFS, Swift, Nova, Cassandra, Redis, S3 (AWS), Storage (GCP), ...
  • 27. iPaaS, dPaaS. Object Stores: HDFS, GlusterFS, CephFS, NFS, Swift, Nova, Cassandra, Redis, S3 (AWS), Storage (GCP), ...; NoSQL: Cassandra, Redis, HBase, Accumulo, Neo4J, ElasticSearch, MongoDB, Couchbase, BigTable (GCP), DynamoDB
  • 28. iPaaS, dPaaS. Object Stores: HDFS, GlusterFS, CephFS, NFS, Swift, Nova, Cassandra, Redis, S3 (AWS), Storage (GCP), ...; SQL (Relational, Transactional DB): MySQL, PostgreSQL, MariaDB, Oracle (AWS MP); NoSQL: Cassandra, Redis, HBase, Accumulo, Neo4J, ElasticSearch, MongoDB, Couchbase, BigTable (GCP), DynamoDB
  • 29. iPaaS, dPaaS. Object Stores: HDFS, GlusterFS, CephFS, NFS, Swift, Nova, Cassandra, Redis, S3 (AWS), Storage (GCP), ...; SQL (Relational, Transactional DB): MySQL, PostgreSQL, MariaDB, Oracle (AWS MP); SQL (Big Data, Data Warehousing): Hive, Presto, Spark SQL, Impala, Redshift (AWS), BigQuery (GCP), Big SQL (IBM), Teradata (AWS MP), SAP Hana (AWS MP), Vertica (AWS MP); NoSQL: Cassandra, Redis, HBase, Accumulo, Neo4J, ElasticSearch, MongoDB, Couchbase, BigTable (GCP), DynamoDB
  • 30. iPaaS, dPaaS. Object Stores: HDFS, GlusterFS, CephFS, NFS, Swift, Nova, Cassandra, Redis, S3 (AWS), Storage (GCP), ...; SQL (Relational, Transactional DB): MySQL, PostgreSQL, MariaDB, Oracle (AWS MP); SQL (Big Data, Data Warehousing): Hive, Presto, Spark SQL, Impala, Redshift (AWS), BigQuery (GCP), Big SQL (IBM), Teradata (AWS MP), SAP Hana (AWS MP), Vertica (AWS MP); NoSQL: Cassandra, Redis, HBase, Accumulo, Neo4J, ElasticSearch, MongoDB, Couchbase, BigTable (GCP), DynamoDB; Machine Learning: Spark ML, H2O, Flink, Aerosolve, Theano, TensorFlow, XGBoost, Azure ML, AWS ML, Google ML, IBM Watson
  • 31. iPaaS, dPaaS. Object Stores: HDFS, GlusterFS, CephFS, NFS, Swift, Nova, Cassandra, Redis, S3 (AWS), Storage (GCP), ...; SQL (Relational, Transactional DB): MySQL, PostgreSQL, MariaDB, Oracle (AWS MP); SQL (Big Data, Data Warehousing): Hive, Presto, Spark SQL, Impala, Redshift (AWS), BigQuery (GCP), Big SQL (IBM), Teradata (AWS MP), SAP Hana (AWS MP), Vertica (AWS MP); NoSQL: Cassandra, Redis, HBase, Accumulo, Neo4J, ElasticSearch, MongoDB, Couchbase, BigTable (GCP), DynamoDB; Machine Learning: Spark ML, H2O, Flink, Aerosolve, Theano, TensorFlow, XGBoost, Azure ML, AWS ML, Google ML, IBM Watson; Streaming Computing: Heron (Storm), NiFi, Spark Streaming, Flink, Kafka Streams, Logstash, StreamSQL, Google DataFlow (GCP)
  • 32. iPaaS: Selecting your Analytical Stack. Flexible and powerful: combinations for this example: 8 * 3 * 4 * 8 * 7 * 7 = 37632. Right tool for the right job: fit for purpose, multi-genre analytics. Hard to maintain and upgrade: requires extended skills and know-how; component upgrades must be compatible. Hard to configure: no matter whether cloud, bare metal, or VMs; complex stacks with many tools and services
  • 33. iPaaS: Deploy & Manage your own Analytics. How to simplify? Select a bundle!
  • 34. iPaaS: bundled recipes & stacks. Select a recipe: Hortonworks Data Platform, Cloudera Data Platform, Reactive Platform, Smack Stack, Pancake Stack, ELK Stack, or select your own
  • 35. iPaaS: my favorite analytical stacks, listed as Object Stores / NoSQL / SQL: Big Data, Data Warehousing / Machine Learning / Streaming Computing. All Hadoop (5): HDFS / HBase / Hive / Spark / Storm. Smack stack (2): Cassandra / Cassandra / Spark / Spark / Spark. Elastic (5): HDFS / ElasticSearch / Hive / H2O / Kafka. Data Science (8): HDFS / ElasticSearch / Hive, Presto / Spark, H2O, TensorFlow / Flink. Real Time (2): Cassandra / Cassandra / Flink / Flink / Flink
  • 36. dPaaS: Managed Analytics. This is hard! Can we access it as a service?
  • 37. dPaaS: Managed Hadoop & Spark. HDInsight: Hadoop, Spark, and R as services. Managed Spark Clusters, BigInsight (Hadoop). DataFlow and DataProc: Flink, Spark and Hadoop Clusters as a Service. EMR: Hadoop components a la carte
  • 38. PaaS: Analytical clusters. Ephemeral: create then dispose; clusters are short-lived; data exploration; isolated, personal; simple access management; interactive analytics. Versus Permanent: clusters are long-lived; scheduled operations; production ETL; co-ordinated; complex access management; batch analytics
  • 39. DAaaS: Microsoft’s Cortana and ML Studio
  • 40. DAaaS: IBM Watson
  • 41. DAaaS: Google ML and AI as a service. Cloud Computing for Deep Neural Networks > Train, Score, Data. AI and ML models for: Speech (audio), Language (text), Vision (images/video)
  • 42. Summary. Analytics in the Cloud: the dawn of a new computing era. iPaaS, dPaaS: complexity vs flexibility, it’s a tradeoff. Computing clusters: Ephemeral and Persistent
  • 43. Natalino Busa - @natbusa. Head of Applied Data Science at Teradata. Distributed computing, Machine Learning, Statistics, Big/Fast Data, Streaming Computing. Linkedin and Twitter: natbusa
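
The arithmetic on slide 32 and the parenthesized numbers on slide 35 can be checked with a short sketch. It rests on two assumptions that the slides do not state explicitly: that the six factors on slide 32 are per-category option counts multiplied together, and that each number in parentheses on slide 35 is the count of distinct components in that stack. The names `factors`, `stacks`, and `distinct_components` are illustrative only.

```python
from math import prod

# Factors exactly as printed on slide 32 (assumed: one per component category).
factors = [8, 3, 4, 8, 7, 7]
combinations = prod(factors)

# Stacks from slide 35. Each inner list is one category column, in the order:
# object store, NoSQL, big-data SQL, machine learning, streaming.
stacks = {
    "All Hadoop": [["HDFS"], ["HBase"], ["Hive"], ["Spark"], ["Storm"]],
    "Smack stack": [["Cassandra"], ["Cassandra"], ["Spark"], ["Spark"], ["Spark"]],
    "Elastic": [["HDFS"], ["ElasticSearch"], ["Hive"], ["H2O"], ["Kafka"]],
    "Data Science": [
        ["HDFS"],
        ["ElasticSearch"],
        ["Hive", "Presto"],
        ["Spark", "H2O", "TensorFlow"],
        ["Flink"],
    ],
    "Real Time": [["Cassandra"], ["Cassandra"], ["Flink"], ["Flink"], ["Flink"]],
}


def distinct_components(columns):
    """Return the set of unique tools used across all category columns."""
    return {tool for column in columns for tool in column}


# Assumed reading of the "(n)" labels: number of distinct tools to operate.
counts = {name: len(distinct_components(cols)) for name, cols in stacks.items()}

if __name__ == "__main__":
    print("combinations:", combinations)
    for name, n in counts.items():
        print(f"{name}: {n} distinct components")
```

Under that reading, a stack that reuses one tool across several categories (Cassandra plus Spark, or Cassandra plus Flink) has far fewer moving parts to maintain than a stack that picks a different tool per category, which is the tradeoff the summary slide describes.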