Put your data to work with Big Data
services from AWS and Qubole
Speakers
Dharmesh (Dash) Desai, Technology Evangelist, Qubole
Rahul Bhartia, Solutions Architect, Amazon Web Services
Data is growing
1.7 MB of new data will be created every second for every human being on the planet by 2020
http://www.whizpr.be/upload/medialab/21/company/Media_Presentation_2012_DigiUniverseFINAL1.pdf
58% compound annual growth rate, surpassing $1 billion by 2020, forecasted for the Hadoop market
http://www.ap-institute.com/big-data-articles/big-data-what-is-hadoop-%E2%80%93-an-explanation-for-absolutely-anyone.aspx
http://www.marketanalysis.com/?p=279
<0.5% of all data is ever analyzed and used at the moment
http://www.technologyreview.com/news/514346/the-data-made-me-do-it/
Big Data is for everyone
The market for Big Data technologies is growing more than six times faster than the information technology market as a whole, and the companies that use their data well win.
Why AWS for Big Data?
Immediately Available
Broad and Deep Capabilities
Trusted and Secure
Scalable
Collect, Store, Analyze, and Visualize
It’s easy to get data to AWS, store it securely, and analyze it with the engine of your choice, without any long-term commitment or vendor lock-in.
Collect
Import/Export
Snowball
Direct Connect
VM Import/Export
Store
Amazon S3
EMR
Amazon Glacier
Amazon Redshift
DynamoDB
Analyze
Amazon Kinesis
Lambda
EMR
EC2
Aurora
AWS provides the most complete platform for Big Data
What can you do with Big Data on AWS?
Big Data Repositories
Clickstream Analysis
ETL Offload
Machine Learning
Online Ad Serving
BI Applications
Migrating your Big Data to the Cloud
Qubole cloud advantage
• More efficient
• Faster time-to-value
• Resiliency and fault tolerance
• Elasticity
• Lower TCO
• Flexible
Data as a differentiator
Guided search
Big Data deployments are difficult
Where Big Data falls short
Rigid and inflexible infrastructure
Non-adaptive software services
Highly specialized systems
Difficult to build and operate
Qubole Confidential
6-18 months to implement
27% succeed
13% achieve full-scale production
57% cite skills gap as a major inhibitor
Source:
https://www.capgemini-consulting.com/resource-file-access/resource/pdf/cracking_the_data_conundrum-big_data_pov_13-1-15_v2.pdf
http://www.gartner.com/newsroom/id/3051717
Simplicity on the Cloud
SaaS self-service platform
Qubole Advantage
Data Analysts - Visualize data with Qubole’s Notebook
Data Analysts - Build queries using Qubole’s SmartQuery
Data Engineers - Build workflows and schedule jobs for automation
Data Engineers - Analyze data with Qubole’s Workbench
Data Admins - Use Control Panel to manage clusters
Data Admins - Use Qubole to manage roles, groups and users
Data Admins - Use Qubole to monitor Cluster Usage
Data Admins - Use Qubole for detailed Cluster Usage
Data Admins - Use Qubole’s built-in Ganglia Monitoring
Scalability on the Cloud
Provisioning, Management, Autoscaling
Qubole Advantage
On-premises HDFS cluster
• Compute & storage live together
• Compute & storage scale together
• Provisioned for peak capacity
• Cluster must be persistently on
[Diagram: on-premises cluster nodes each combine compute and storage (C+S); cloud cluster nodes are compute-only (C), with data stored in Amazon S3]
Compute and storage separated on the Cloud
Auto-scales back up when batch jobs start
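The contrast between a cluster provisioned for peak and one that scales with demand can be made concrete with some node-hour arithmetic. The sketch below is illustrative only: the function names and the demand profile are assumptions made up for this example, not figures from the presentation.

```python
def node_hours_fixed(peak_nodes: int, hours: int = 24) -> int:
    """Node-hours for a cluster sized for peak and kept on for every hour."""
    return peak_nodes * hours


def node_hours_elastic(hourly_demand: list[int], min_nodes: int) -> int:
    """Node-hours when the cluster tracks demand but never drops below min_nodes."""
    return sum(max(min_nodes, demand) for demand in hourly_demand)


# Hypothetical day: 2 nodes suffice for 18 quiet hours,
# and 20 nodes are needed during a 6-hour batch window.
demand = [2] * 18 + [20] * 6

fixed = node_hours_fixed(peak_nodes=max(demand))
elastic = node_hours_elastic(demand, min_nodes=2)
print(fixed, elastic)  # 480 156
```

Under these assumed numbers the always-on cluster consumes 480 node-hours a day, while the elastic cluster consumes 156. The gap depends entirely on how spiky the workload is.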
Take advantage of the scale of the Cloud
Unlimited compute capacity
3:30 p.m.: Downscaling
7:00 p.m.: Min cluster size
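The scale-up and scale-down behavior on this slide can be sketched as a simple sizing rule. This is a minimal illustration of the general idea, not Qubole's actual autoscaling algorithm. The function name, its parameters, and the numbers are all hypothetical.

```python
import math


def target_cluster_size(pending_tasks: int, tasks_per_node: int,
                        min_nodes: int, max_nodes: int) -> int:
    """Return the node count needed for the pending work,
    clamped to the range [min_nodes, max_nodes]."""
    if tasks_per_node <= 0:
        raise ValueError("tasks_per_node must be positive")
    needed = math.ceil(pending_tasks / tasks_per_node)
    return max(min_nodes, min(max_nodes, needed))


# Hypothetical settings: each node handles 10 tasks, cluster bounded to 2..50 nodes.
print(target_cluster_size(400, 10, 2, 50))  # batch jobs start: 40
print(target_cluster_size(35, 10, 2, 50))   # load falls, cluster downscales: 4
print(target_cluster_size(0, 10, 2, 50))    # idle, cluster stays at the minimum: 2
```

A real policy would also need to account for node start-up time and for draining running tasks before removing a node, which this sketch ignores.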
Take advantage of the scale of the Cloud
Instance type flexibility
40+ instance types
Integration with AWS reserved instances
37 different instance types used
On-premises to the Cloud
Qubole’s Hadoop Migration Service
Cloud migration use cases
Migrate workload to the cloud
Any on-premises Hadoop distro
Data consistency and unified data visibility between on-premises and cloud
Pain
Maxed-out on-prem cluster
Requirements
Data in sync during migration
Decommission on-prem workload
24x7 data replication, no data loss
No downtime
Solution
Data → Cloud
Apps/data pipelines → QDS
Cloud migration use cases
Workload burst out to the cloud
Any on-premises Hadoop distro
Data consistency and unified data visibility between on-premises and cloud
Pain
Workload spikes → Can’t be processed on-prem
Requirements
24x7 data replication, no data loss
No downtime
Bi-directional replication
Solution
Sync on-prem data → Cloud
Results → On-prem
Workloads → QDS
Cloud migration use cases
Move test/dev environment to the Cloud
Any on-prem Hadoop distro
Data consistency and unified data visibility between on-prem and Cloud
Pain
Shared cluster → Production
Requirements
Periodic replication → No data loss
No downtime
Development → Limit
Solution
Free on-prem resources
Apps/data pipelines → QDS
Data subset → Cloud
Qubole’s Hadoop Migration Service
Hadoop
Spark
Presto
Hive
HBase
S3
Cloudera CDH
Hortonworks HDP
MapR
Journey to the Cloud
Qubole Case Study: MediaMath
Use cases
Build customer profiles
Simplify attribution insights
Segment audiences
Strength in numbers
Each record = financial transaction
180B impression opportunities a day
3+M peak qps
3+TB of data/day (compressed)
Non-trivial challenges
Transforming semi-structured data
Repeatable data pipelines
Upfront investment & commitment
“We needed something that was reliable and easy to learn, setup, use and put into production without the risk and high expectations that comes with committing millions of dollars in upfront investment. Qubole was that thing.”
Marc Rosen, Sr. Director, Data Analytics
The solution – Qubole
Analytics: Spark/Hive (with Amazon Redshift connector)
Qubole at MediaMath
Product: Hive
Engineering: Spark/Hive
Business Analysts: SmartQuery
Data Science: Spark (Scala)
Don’t have to worry about this anymore!!!
Thank you!
Questions & Answers
Dharmesh (Dash) Desai
Technology Evangelist, Qubole
Email: dharmesh@qubole.com
Twitter: @iamontheinet #BigDataQube
LinkedIn: www.linkedin.com/in/dharmeshdashdesai

Webinar | From Zero to 1 Million with Google Cloud Platform and DataStax
 
Hybrid data lake on google cloud with alluxio and dataproc
Hybrid data lake on google cloud  with alluxio and dataprocHybrid data lake on google cloud  with alluxio and dataproc
Hybrid data lake on google cloud with alluxio and dataproc
 
Gimel and PayPal Notebooks @ TDWI Leadership Summit Orlando
Gimel and PayPal Notebooks @ TDWI Leadership Summit OrlandoGimel and PayPal Notebooks @ TDWI Leadership Summit Orlando
Gimel and PayPal Notebooks @ TDWI Leadership Summit Orlando
 
Enabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSetsEnabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSets
 

Unlocking Self-Service Big Data Analytics on AWS

  • 1. Put your data to work with Big Data services from AWS and Qubole
  • 2. Speakers: Dharmesh (Dash) Desai, Technology Evangelist, Qubole; Rahul Bhartia, Solutions Architect, Amazon Web Services
  • 3. Data is growing. 1.7 MB of new data will be created every second for every human being on the planet by 2020 (http://www.whizpr.be/upload/medialab/21/company/Media_Presentation_2012_DigiUniverseFINAL1.pdf). 58%: compound annual growth rate forecasted for the Hadoop market, surpassing $1 billion by 2020 (http://www.ap-institute.com/big-data-articles/big-data-what-is-hadoop-%E2%80%93-an-explanation-for-absolutely-anyone.aspx, http://www.marketanalysis.com/?p=279). Less than 0.5% of all data is ever analyzed and used at the moment (http://www.technologyreview.com/news/514346/the-data-made-me-do-it/).
  • 4. Big Data is for everyone: The market for Big Data technologies is growing more than six times faster than the information technology market as a whole, and those companies who use their data well win.
  • 5. Why AWS for Big Data? Immediately Available; Broad and Deep Capabilities; Trusted and Secure; Scalable
  • 6. Collect, Store, Analyze, and Visualize: It’s easy to get data to AWS, store it securely, and analyze it with the engine of your choice, without any long-term commitment or vendor lock-in. Collect: Import/Export, Snowball, Direct Connect, VM Import/Export. Store: Amazon S3, EMR, Amazon Glacier, Amazon Redshift, DynamoDB. Analyze: Amazon Kinesis, Lambda, EMR, EC2, Aurora.
  • 7. AWS provides the most complete platform for Big Data. What can you do with Big Data on AWS? Big Data Repositories; Clickstream Analysis; ETL Offload; Machine Learning; Online Ad Serving; BI Applications
  • 8. Migrating your Big Data to the Cloud
  • 9. Qubole cloud advantage: More efficient; Faster time-to-value; Resiliency and fault tolerance; Elasticity; Lower TCO; Flexible
  • 10. Data as a differentiator: Guided search
  • 11. Big Data deployments are difficult. Where Big Data falls short: Rigid and inflexible infrastructure; Non-adaptive software services; Highly specialized systems; Difficult to build and operate
  • 12. Big Data deployments are difficult. Where Big Data falls short: 6-18 months to implement; 27% succeed; 13% achieve full-scale production; 57% cite skills gap as a major inhibitor. Source: https://www.capgemini-consulting.com/resource-file-access/resource/pdf/cracking_the_data_conundrum-big_data_pov_13-1-15_v2.pdf http://www.gartner.com/newsroom/id/3051717 (Qubole Confidential)
  • 13. Qubole Advantage: Simplicity on the Cloud. SaaS self-service platform
  • 14. Data Analysts - Visualize data with Qubole’s Notebook
  • 15. Data Analysts - Build queries using Qubole’s SmartQuery
  • 16. Data Engineers - Build workflows and schedule jobs for automation
  • 17. Data Engineers - Analyze data with Qubole’s Workbench
  • 18. Data Admins - Use Control Panel to manage clusters
  • 19. Data Admins - Use Qubole to manage roles, groups and users
  • 20. Data Admins - Use Qubole to monitor Cluster Usage
  • 21. Data Admins - Use Qubole for detailed Cluster Usage
  • 22. Data Admins - Use Qubole’s built-in Ganglia Monitoring
  • 23. Qubole Advantage: Scalability on the Cloud. Provisioning, Management, Autoscaling
  • 24. On-premises HDFS cluster: Compute & storage live together; Compute & storage scale together; Provisioned for peak capacity; Cluster must be persistently on. [Diagram: grid of nodes, each combining compute and storage (C+S)]
  • 25. Compute and storage separated on the Cloud. [Diagram: clusters of compute-only (C) nodes around Amazon S3]
  • 26. Take advantage of the scale of the Cloud: Unlimited compute capacity. [Chart: 3:30 p.m., downscaling; 7:00 p.m., min cluster size; auto-scales back up when batch jobs start]
  • 27. Take advantage of the scale of the Cloud: Instance type flexibility. 40+ instance types; 37 different instance types used; integration with AWS reserved instances
  • 28. On-premises to the Cloud: Qubole’s Hadoop Migration Service
  • 29. Cloud migration use cases: Migrate workload to the cloud. Any on-premises Hadoop distro. Data consistency and unified data visibility between on-premises and cloud. Pain: Maxed-out on-prem cluster. Requirements: Data in sync during migration; Decommission on-prem workload; 24x7 data replication, no data loss; No downtime
  • 30. Cloud migration use cases: Migrate workload to the cloud. Any on-premises Hadoop distro. Data consistency and unified data visibility between on-premises and cloud. Pain: Maxed-out on-prem cluster. Requirements: Data in sync during migration; Decommission on-prem workload; 24x7 data replication, no data loss; No downtime. Solution: Data → Cloud; Apps/data pipelines → QDS
  • 31. Cloud migration use cases: Workload burst out to the cloud. Any on-premises Hadoop distro. Data consistency and unified data visibility between on-premises and cloud. Pain: Workload spikes → Can’t be processed on-prem. Requirements: 24x7 data replication, no data loss; No downtime; Bi-directional replication
  • 32. Cloud migration use cases: Workload burst out to the cloud. Any on-premises Hadoop distro. Data consistency and unified data visibility between on-premises and cloud. Pain: Workload spikes → Can’t be processed on-prem. Requirements: 24x7 data replication, no data loss; No downtime; Bi-directional replication. Solution: Sync On-Prem Data → Cloud; Results → On-Prem; Workloads → QDS
  • 33. Cloud migration use cases: Move test/dev environment to the Cloud. Any on-prem Hadoop distro. Data consistency and unified data visibility between on-prem and Cloud. Pain: Shared cluster → Production; Development → Limit. Requirements: Periodic replication → No data loss; No downtime
  • 34. Cloud migration use cases: Move test/dev environment to the Cloud. Any on-prem Hadoop distro. Data consistency and unified data visibility between on-prem and Cloud. Pain: Shared cluster → Production; Development → Limit. Requirements: Periodic replication → No data loss; No downtime. Solution: Free on-prem resources; Apps/data pipelines → QDS; Data subset → Cloud
  • 35. Qubole’s Hadoop Migration Service: Hadoop, Spark, Presto, Hive, HBase, S3, Cloudera CDH, Hortonworks HDP, MapR
  • 36. Journey to the Cloud: Qubole Case Study, MediaMath
  • 37. Qubole case study: Use cases. Build customer profiles; Simplify attribution insights; Segment audiences
  • 38. Qubole case study: Strength in numbers. 180B impression opportunities a day; 3+M peak qps; 3+TB of data/day (compressed). Each record = financial transaction
  • 39. Qubole case study: Non-trivial challenges. Transforming semi-structured data; Repeatable data pipelines; Upfront investment & commitment
  • 40. Qubole case study: The solution – Qubole. “We needed something that was reliable and easy to learn, setup, use and put into production without the risk and high expectations that comes with committing millions of dollars in upfront investment. Qubole was that thing.” Marc Rosen, Sr. Director, Data Analytics
  • 41. Qubole case study: Qubole at MediaMath. Analytics: Spark/Hive (with Amazon Redshift connector); Product: Hive; Engineering: Spark/Hive; Business Analysts: SmartQuery; Data Science: Spark (Scala)
  • 42. Qubole case study: Don’t have to worry about this anymore!!!
  • 43. Thank you! Dharmesh (Dash) Desai, @iamontheinet, dharmesh@qubole.com
  • 44. Questions & Answers. Dharmesh (Dash) Desai, Technology Evangelist, Qubole. Email: dharmesh@qubole.com; Twitter: @iamontheinet #BigDataQube; LinkedIn: www.linkedin.com/in/dharmesh dashdesai
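
Slides 24 to 26 contrast an on-premises cluster that is provisioned for peak capacity and always on with a cloud cluster that shrinks to a minimum size and scales back up when batch jobs start. The sketch below is one rough way to picture that difference in node-hours. It is not Qubole's autoscaling algorithm: the function names, the hourly task counts, the `tasks_per_node` figure, and the node limits are all hypothetical values chosen for illustration.

```python
from math import ceil


def nodes_needed(pending_tasks, tasks_per_node, min_nodes, max_nodes):
    """Cluster size for the current load, clamped to [min_nodes, max_nodes]."""
    wanted = ceil(pending_tasks / tasks_per_node)
    return max(min_nodes, min(max_nodes, wanted))


def node_hours(hourly_tasks, tasks_per_node, min_nodes, max_nodes):
    """Total node-hours when the cluster is resized once per hour."""
    return sum(
        nodes_needed(t, tasks_per_node, min_nodes, max_nodes)
        for t in hourly_tasks
    )


# Hypothetical workload: pending tasks in each of six consecutive hours.
hourly_tasks = [400, 400, 50, 0, 0, 300]

# Elastic cluster: drops to the minimum size when idle, grows for batch jobs.
autoscaled = node_hours(hourly_tasks, tasks_per_node=10, min_nodes=2, max_nodes=40)

# Fixed cluster: sized for the busiest hour and left running the whole time.
peak_nodes = max(nodes_needed(t, 10, 2, 40) for t in hourly_tasks)
fixed = peak_nodes * len(hourly_tasks)

print(f"autoscaled: {autoscaled} node-hours, fixed: {fixed} node-hours")
```

With these made-up numbers the elastic cluster uses 119 node-hours against 240 for the peak-sized one. The gap depends entirely on how spiky the workload is, so the figures say nothing about real savings.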