SlideShare a Scribd company logo
1 of 10
© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Querying and analyzing data in Amazon
S3
Roy Hasson - Business Development Manager, Amazon Athena
Radhika Ravirala, EMR Solutions Architect
October 2017
Amazon Athena is an interactive query service
that makes it easy to analyze data directly from
Amazon S3 using Standard SQL
Value Proposition
• Decouple storage from compute
• Serverless – No infrastructure or resources to manage
• Pay only for data scanned
• Schema on read – Same data, many views
• Secure – IAM for authentication; Encryption at rest & in transit
• Standard compliant and open storage formats
• Built on powerful community supported OSS solutions
Familiar Technologies Under the Covers
Used for SQL Queries
In-memory distributed query engine
ANSI-SQL compatible with extensions
Used for DDL functionality
Complex data types
Multitude of formats
Supports data partitioning
Glue automates the undifferentiated heavy-lifting of ETL
 Cataloging data sources
 Identifying data formats and data types
 Generating Extract, Transform, Load code
 Executing ETL jobs; managing dependencies
 Secured by IAM policies
 Handling errors
 Managing and scaling resources
Data Catalog
 Hive metastore compatible metadata repository of data sources.
 Crawls data source to infer table, data type, partition format.
Job Execution
 Runs jobs in Spark containers – automatic scaling based on SLA.
 Glue is serverless - only pay for the resources you consume.
Job Authoring
 Generates Python code to move data from source to destination.
 Edit with your favorite IDE; share code snippets using Git.
Glue Components
Basic Concepts
Retail Data
Ops Data
Marketing Data
Relational
Databases
Flat Files
Deep Integration with
AWS Data Sources
Amazon RDS,
Aurora
Amazon
Redshift
Amazon
Athena
Amazon S3
Flat Files
© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
S3
Data Catalog
AthenaEMR Redshift
Spectrum
Amazon ML / MXNet
RDS
QuickSight
Kinesis
Database
Migration
Service
Glue
Amazon Analytics End to End Architecture
IAM
Other
Sources
Getting Started
http://amazonathenahandson.s3-website-
us-east-1.amazonaws.com/

More Related Content

What's hot

Big Data and Analytics on Amazon Web Services: Building A Business-Friendly P...
Big Data and Analytics on Amazon Web Services: Building A Business-Friendly P...Big Data and Analytics on Amazon Web Services: Building A Business-Friendly P...
Big Data and Analytics on Amazon Web Services: Building A Business-Friendly P...Amazon Web Services
 
Generating insights from IoT data using Amazon Machine Learning
Generating insights from IoT data using Amazon Machine LearningGenerating insights from IoT data using Amazon Machine Learning
Generating insights from IoT data using Amazon Machine LearningAmazon Web Services
 
Fintech Pace Security on AWS: The Customer Perspective
Fintech Pace Security on AWS: The Customer PerspectiveFintech Pace Security on AWS: The Customer Perspective
Fintech Pace Security on AWS: The Customer PerspectiveAmazon Web Services
 
(ARC304) Designing for SaaS: Next-Generation Software Delivery Models on AWS ...
(ARC304) Designing for SaaS: Next-Generation Software Delivery Models on AWS ...(ARC304) Designing for SaaS: Next-Generation Software Delivery Models on AWS ...
(ARC304) Designing for SaaS: Next-Generation Software Delivery Models on AWS ...Amazon Web Services
 
Importance of global certifications
Importance of global certificationsImportance of global certifications
Importance of global certificationsAnjani Phuyal
 
Modern data warehouse
Modern data warehouseModern data warehouse
Modern data warehouseElena Lopez
 
Build 2017 - P4062 - Delivering world-class game experiences using Microsoft ...
Build 2017 - P4062 - Delivering world-class game experiences using Microsoft ...Build 2017 - P4062 - Delivering world-class game experiences using Microsoft ...
Build 2017 - P4062 - Delivering world-class game experiences using Microsoft ...Windows Developer
 
Customer Keynote: PIXNET Media Inc.- Business Intelligent and Analysis: Empir...
Customer Keynote: PIXNET Media Inc.- Business Intelligent and Analysis: Empir...Customer Keynote: PIXNET Media Inc.- Business Intelligent and Analysis: Empir...
Customer Keynote: PIXNET Media Inc.- Business Intelligent and Analysis: Empir...Amazon Web Services
 
Azure plug & play architecture
Azure   plug & play architectureAzure   plug & play architecture
Azure plug & play architectureSteef-Jan Wiggers
 
AWS Financial Services Cloud Symposium | Hong Kong - Keynote
AWS Financial Services Cloud Symposium | Hong Kong - KeynoteAWS Financial Services Cloud Symposium | Hong Kong - Keynote
AWS Financial Services Cloud Symposium | Hong Kong - KeynoteAmazon Web Services
 
Big Data LDN 2018: DATABASE FOR THE INSTANT EXPERIENCE
Big Data LDN 2018: DATABASE FOR THE INSTANT EXPERIENCEBig Data LDN 2018: DATABASE FOR THE INSTANT EXPERIENCE
Big Data LDN 2018: DATABASE FOR THE INSTANT EXPERIENCEMatt Stubbs
 
Data Structure and Types
Data Structure and TypesData Structure and Types
Data Structure and TypesAnjani Phuyal
 
Data Technology Platform @ RueLaLa.com
Data Technology Platform @ RueLaLa.comData Technology Platform @ RueLaLa.com
Data Technology Platform @ RueLaLa.comSaurabh Chadha
 
Mining Information from Data on Cloud
Mining Information from Data on CloudMining Information from Data on Cloud
Mining Information from Data on CloudAmazon Web Services
 
Big Data Analytics & Architecture
Big Data Analytics & ArchitectureBig Data Analytics & Architecture
Big Data Analytics & ArchitectureAnjani Phuyal
 
Modern Data architecture Design
Modern Data architecture DesignModern Data architecture Design
Modern Data architecture DesignKujambu Murugesan
 
Big Data LDN 2018: BIG DATA TOO SLOW? SPRINKLE IN SOME NOSQL
Big Data LDN 2018: BIG DATA TOO SLOW? SPRINKLE IN SOME NOSQLBig Data LDN 2018: BIG DATA TOO SLOW? SPRINKLE IN SOME NOSQL
Big Data LDN 2018: BIG DATA TOO SLOW? SPRINKLE IN SOME NOSQLMatt Stubbs
 
Making Better Decisions Using BigData and Analytics
Making Better Decisions Using BigData and AnalyticsMaking Better Decisions Using BigData and Analytics
Making Better Decisions Using BigData and AnalyticsBoaz Ziniman
 

What's hot (20)

Big Data and Analytics on Amazon Web Services: Building A Business-Friendly P...
Big Data and Analytics on Amazon Web Services: Building A Business-Friendly P...Big Data and Analytics on Amazon Web Services: Building A Business-Friendly P...
Big Data and Analytics on Amazon Web Services: Building A Business-Friendly P...
 
Generating insights from IoT data using Amazon Machine Learning
Generating insights from IoT data using Amazon Machine LearningGenerating insights from IoT data using Amazon Machine Learning
Generating insights from IoT data using Amazon Machine Learning
 
Fintech Pace Security on AWS: The Customer Perspective
Fintech Pace Security on AWS: The Customer PerspectiveFintech Pace Security on AWS: The Customer Perspective
Fintech Pace Security on AWS: The Customer Perspective
 
(ARC304) Designing for SaaS: Next-Generation Software Delivery Models on AWS ...
(ARC304) Designing for SaaS: Next-Generation Software Delivery Models on AWS ...(ARC304) Designing for SaaS: Next-Generation Software Delivery Models on AWS ...
(ARC304) Designing for SaaS: Next-Generation Software Delivery Models on AWS ...
 
Importance of global certifications
Importance of global certificationsImportance of global certifications
Importance of global certifications
 
Modern data warehouse
Modern data warehouseModern data warehouse
Modern data warehouse
 
Build 2017 - P4062 - Delivering world-class game experiences using Microsoft ...
Build 2017 - P4062 - Delivering world-class game experiences using Microsoft ...Build 2017 - P4062 - Delivering world-class game experiences using Microsoft ...
Build 2017 - P4062 - Delivering world-class game experiences using Microsoft ...
 
Big Data Analytics on AWS
Big Data Analytics on AWSBig Data Analytics on AWS
Big Data Analytics on AWS
 
Customer Keynote: PIXNET Media Inc.- Business Intelligent and Analysis: Empir...
Customer Keynote: PIXNET Media Inc.- Business Intelligent and Analysis: Empir...Customer Keynote: PIXNET Media Inc.- Business Intelligent and Analysis: Empir...
Customer Keynote: PIXNET Media Inc.- Business Intelligent and Analysis: Empir...
 
Azure plug & play architecture
Azure   plug & play architectureAzure   plug & play architecture
Azure plug & play architecture
 
AWS Financial Services Cloud Symposium | Hong Kong - Keynote
AWS Financial Services Cloud Symposium | Hong Kong - KeynoteAWS Financial Services Cloud Symposium | Hong Kong - Keynote
AWS Financial Services Cloud Symposium | Hong Kong - Keynote
 
Big Data LDN 2018: DATABASE FOR THE INSTANT EXPERIENCE
Big Data LDN 2018: DATABASE FOR THE INSTANT EXPERIENCEBig Data LDN 2018: DATABASE FOR THE INSTANT EXPERIENCE
Big Data LDN 2018: DATABASE FOR THE INSTANT EXPERIENCE
 
Data Structure and Types
Data Structure and TypesData Structure and Types
Data Structure and Types
 
Data Technology Platform @ RueLaLa.com
Data Technology Platform @ RueLaLa.comData Technology Platform @ RueLaLa.com
Data Technology Platform @ RueLaLa.com
 
Mining Information from Data on Cloud
Mining Information from Data on CloudMining Information from Data on Cloud
Mining Information from Data on Cloud
 
Big Data Analytics & Architecture
Big Data Analytics & ArchitectureBig Data Analytics & Architecture
Big Data Analytics & Architecture
 
Modern Data architecture Design
Modern Data architecture DesignModern Data architecture Design
Modern Data architecture Design
 
Big Data LDN 2018: BIG DATA TOO SLOW? SPRINKLE IN SOME NOSQL
Big Data LDN 2018: BIG DATA TOO SLOW? SPRINKLE IN SOME NOSQLBig Data LDN 2018: BIG DATA TOO SLOW? SPRINKLE IN SOME NOSQL
Big Data LDN 2018: BIG DATA TOO SLOW? SPRINKLE IN SOME NOSQL
 
Amazon Simpledb
Amazon Simpledb Amazon Simpledb
Amazon Simpledb
 
Making Better Decisions Using BigData and Analytics
Making Better Decisions Using BigData and AnalyticsMaking Better Decisions Using BigData and Analytics
Making Better Decisions Using BigData and Analytics
 

Viewers also liked

Big Data Expo 2015 - Teradata Big Data : Just use it!
Big Data Expo 2015 - Teradata Big Data : Just use it!Big Data Expo 2015 - Teradata Big Data : Just use it!
Big Data Expo 2015 - Teradata Big Data : Just use it!BigDataExpo
 
Silicon Valley Grade IT and Cloud Maturity Assessment for Startup Ecosystem i...
Silicon Valley Grade IT and Cloud Maturity Assessment for Startup Ecosystem i...Silicon Valley Grade IT and Cloud Maturity Assessment for Startup Ecosystem i...
Silicon Valley Grade IT and Cloud Maturity Assessment for Startup Ecosystem i...Engin Deveci, Ph.D.
 
AWSome Day - Milan, July 24th 2014
AWSome Day - Milan, July 24th 2014AWSome Day - Milan, July 24th 2014
AWSome Day - Milan, July 24th 2014Amazon Web Services
 
2011_Herbstcampus_Rapid_Cloud_Development_with_Spring_Roo
2011_Herbstcampus_Rapid_Cloud_Development_with_Spring_Roo2011_Herbstcampus_Rapid_Cloud_Development_with_Spring_Roo
2011_Herbstcampus_Rapid_Cloud_Development_with_Spring_RooKai Wähner
 
Pre-Con Ed: Discover the New CA App Experience Analytics 16.3 - The Omnichann...
Pre-Con Ed: Discover the New CA App Experience Analytics 16.3 - The Omnichann...Pre-Con Ed: Discover the New CA App Experience Analytics 16.3 - The Omnichann...
Pre-Con Ed: Discover the New CA App Experience Analytics 16.3 - The Omnichann...CA Technologies
 
Things you should know about Scalability!
Things you should know about Scalability!Things you should know about Scalability!
Things you should know about Scalability!Robert Mederer
 
App infrastructure &_integration_keynote_final
App infrastructure &_integration_keynote_finalApp infrastructure &_integration_keynote_final
App infrastructure &_integration_keynote_finaleileendohertysmith
 
Vasilis Bankov & Calin Iliescu AEGON
Vasilis Bankov & Calin Iliescu AEGONVasilis Bankov & Calin Iliescu AEGON
Vasilis Bankov & Calin Iliescu AEGONBigDataExpo
 
Stephenson big data utrecht 2017
Stephenson   big data utrecht 2017Stephenson   big data utrecht 2017
Stephenson big data utrecht 2017BigDataExpo
 
Azure Large Scale Deployments - Tales from the Trenches
Azure Large Scale Deployments - Tales from the TrenchesAzure Large Scale Deployments - Tales from the Trenches
Azure Large Scale Deployments - Tales from the TrenchesAaron Saikovski
 
GoAzure 2015 Azure AD for Developers
GoAzure 2015 Azure AD for DevelopersGoAzure 2015 Azure AD for Developers
GoAzure 2015 Azure AD for Developerskekekekenta
 
Modernes Rechenzentrum - Future Decoded
Modernes Rechenzentrum - Future DecodedModernes Rechenzentrum - Future Decoded
Modernes Rechenzentrum - Future DecodedMicrosoft Österreich
 
Dino Product Overview
Dino Product OverviewDino Product Overview
Dino Product OverviewPim Brokken
 
Running Business Critical Workloads on AWS
Running Business Critical Workloads on AWS Running Business Critical Workloads on AWS
Running Business Critical Workloads on AWS Amazon Web Services
 
Loggly - Case Study - Stanley Black & Decker Transforms Work with Support fro...
Loggly - Case Study - Stanley Black & Decker Transforms Work with Support fro...Loggly - Case Study - Stanley Black & Decker Transforms Work with Support fro...
Loggly - Case Study - Stanley Black & Decker Transforms Work with Support fro...SolarWinds Loggly
 

Viewers also liked (20)

stagerapport2.3
stagerapport2.3stagerapport2.3
stagerapport2.3
 
Big Data Expo 2015 - Teradata Big Data : Just use it!
Big Data Expo 2015 - Teradata Big Data : Just use it!Big Data Expo 2015 - Teradata Big Data : Just use it!
Big Data Expo 2015 - Teradata Big Data : Just use it!
 
Silicon Valley Grade IT and Cloud Maturity Assessment for Startup Ecosystem i...
Silicon Valley Grade IT and Cloud Maturity Assessment for Startup Ecosystem i...Silicon Valley Grade IT and Cloud Maturity Assessment for Startup Ecosystem i...
Silicon Valley Grade IT and Cloud Maturity Assessment for Startup Ecosystem i...
 
okspring3x
okspring3xokspring3x
okspring3x
 
AWSome Day - Milan, July 24th 2014
AWSome Day - Milan, July 24th 2014AWSome Day - Milan, July 24th 2014
AWSome Day - Milan, July 24th 2014
 
2011_Herbstcampus_Rapid_Cloud_Development_with_Spring_Roo
2011_Herbstcampus_Rapid_Cloud_Development_with_Spring_Roo2011_Herbstcampus_Rapid_Cloud_Development_with_Spring_Roo
2011_Herbstcampus_Rapid_Cloud_Development_with_Spring_Roo
 
Pre-Con Ed: Discover the New CA App Experience Analytics 16.3 - The Omnichann...
Pre-Con Ed: Discover the New CA App Experience Analytics 16.3 - The Omnichann...Pre-Con Ed: Discover the New CA App Experience Analytics 16.3 - The Omnichann...
Pre-Con Ed: Discover the New CA App Experience Analytics 16.3 - The Omnichann...
 
Things you should know about Scalability!
Things you should know about Scalability!Things you should know about Scalability!
Things you should know about Scalability!
 
App infrastructure &_integration_keynote_final
App infrastructure &_integration_keynote_finalApp infrastructure &_integration_keynote_final
App infrastructure &_integration_keynote_final
 
Vasilis Bankov & Calin Iliescu AEGON
Vasilis Bankov & Calin Iliescu AEGONVasilis Bankov & Calin Iliescu AEGON
Vasilis Bankov & Calin Iliescu AEGON
 
Introduction to QC
Introduction to QCIntroduction to QC
Introduction to QC
 
Go Serverless with AWS Lambda and Apex
Go Serverless with AWS Lambda and ApexGo Serverless with AWS Lambda and Apex
Go Serverless with AWS Lambda and Apex
 
Stephenson big data utrecht 2017
Stephenson   big data utrecht 2017Stephenson   big data utrecht 2017
Stephenson big data utrecht 2017
 
Azure Large Scale Deployments - Tales from the Trenches
Azure Large Scale Deployments - Tales from the TrenchesAzure Large Scale Deployments - Tales from the Trenches
Azure Large Scale Deployments - Tales from the Trenches
 
GoAzure 2015 Azure AD for Developers
GoAzure 2015 Azure AD for DevelopersGoAzure 2015 Azure AD for Developers
GoAzure 2015 Azure AD for Developers
 
Oracle Cloud Café IOT 12 avril 2016
Oracle Cloud Café IOT 12 avril 2016Oracle Cloud Café IOT 12 avril 2016
Oracle Cloud Café IOT 12 avril 2016
 
Modernes Rechenzentrum - Future Decoded
Modernes Rechenzentrum - Future DecodedModernes Rechenzentrum - Future Decoded
Modernes Rechenzentrum - Future Decoded
 
Dino Product Overview
Dino Product OverviewDino Product Overview
Dino Product Overview
 
Running Business Critical Workloads on AWS
Running Business Critical Workloads on AWS Running Business Critical Workloads on AWS
Running Business Critical Workloads on AWS
 
Loggly - Case Study - Stanley Black & Decker Transforms Work with Support fro...
Loggly - Case Study - Stanley Black & Decker Transforms Work with Support fro...Loggly - Case Study - Stanley Black & Decker Transforms Work with Support fro...
Loggly - Case Study - Stanley Black & Decker Transforms Work with Support fro...
 

Similar to Workshop 2: Building a streaming data platform on AWS

Scalable Data Analytics - DevDay Austin 2017 Day 2
Scalable Data Analytics - DevDay Austin 2017 Day 2Scalable Data Analytics - DevDay Austin 2017 Day 2
Scalable Data Analytics - DevDay Austin 2017 Day 2Amazon Web Services
 
AWS March 2016 Webinar Series Building Your Data Lake on AWS
AWS March 2016 Webinar Series Building Your Data Lake on AWS AWS March 2016 Webinar Series Building Your Data Lake on AWS
AWS March 2016 Webinar Series Building Your Data Lake on AWS Amazon Web Services
 
(BDT317) Building A Data Lake On AWS
(BDT317) Building A Data Lake On AWS(BDT317) Building A Data Lake On AWS
(BDT317) Building A Data Lake On AWSAmazon Web Services
 
Fast Track to Your Data Lake on AWS
Fast Track to Your Data Lake on AWSFast Track to Your Data Lake on AWS
Fast Track to Your Data Lake on AWSAmazon Web Services
 
Module 1 - CP Datalake on AWS
Module 1 - CP Datalake on AWSModule 1 - CP Datalake on AWS
Module 1 - CP Datalake on AWSLam Le
 
Using Data Lakes: Data Analytics Week SF
Using Data Lakes: Data Analytics Week SFUsing Data Lakes: Data Analytics Week SF
Using Data Lakes: Data Analytics Week SFAmazon Web Services
 
Building Data Lakes in the AWS Cloud
Building Data Lakes in the AWS CloudBuilding Data Lakes in the AWS Cloud
Building Data Lakes in the AWS CloudAmazon Web Services
 
Track 6 Session 1_進入 AI 領域的第一步驟_資料平台的建置.pptx
Track 6 Session 1_進入 AI 領域的第一步驟_資料平台的建置.pptxTrack 6 Session 1_進入 AI 領域的第一步驟_資料平台的建置.pptx
Track 6 Session 1_進入 AI 領域的第一步驟_資料平台的建置.pptxAmazon Web Services
 
Building Serverless Analytics Solutions with Amazon QuickSight (ANT391) - AWS...
Building Serverless Analytics Solutions with Amazon QuickSight (ANT391) - AWS...Building Serverless Analytics Solutions with Amazon QuickSight (ANT391) - AWS...
Building Serverless Analytics Solutions with Amazon QuickSight (ANT391) - AWS...Amazon Web Services
 
AWS CLOUD 2017 - Amazon Athena 및 Glue를 통한 빠른 데이터 질의 및 처리 기능 소개 (김상필 솔루션즈 아키텍트)
AWS CLOUD 2017 - Amazon Athena 및 Glue를 통한 빠른 데이터 질의 및 처리 기능 소개 (김상필 솔루션즈 아키텍트)AWS CLOUD 2017 - Amazon Athena 및 Glue를 통한 빠른 데이터 질의 및 처리 기능 소개 (김상필 솔루션즈 아키텍트)
AWS CLOUD 2017 - Amazon Athena 및 Glue를 통한 빠른 데이터 질의 및 처리 기능 소개 (김상필 솔루션즈 아키텍트)Amazon Web Services Korea
 
ABD202_Best Practices for Building Serverless Big Data Applications
ABD202_Best Practices for Building Serverless Big Data ApplicationsABD202_Best Practices for Building Serverless Big Data Applications
ABD202_Best Practices for Building Serverless Big Data ApplicationsAmazon Web Services
 
Deploying a Data Lake in AWS - AWS Summit Tel Aviv 2017
Deploying a Data Lake in AWS - AWS Summit Tel Aviv 2017Deploying a Data Lake in AWS - AWS Summit Tel Aviv 2017
Deploying a Data Lake in AWS - AWS Summit Tel Aviv 2017Amazon Web Services
 
From Data Collection to Actionable Insights in 60 Seconds: AWS Developer Work...
From Data Collection to Actionable Insights in 60 Seconds: AWS Developer Work...From Data Collection to Actionable Insights in 60 Seconds: AWS Developer Work...
From Data Collection to Actionable Insights in 60 Seconds: AWS Developer Work...Amazon Web Services
 
Building Data Lakes and Analytics on AWS
Building Data Lakes and Analytics on AWSBuilding Data Lakes and Analytics on AWS
Building Data Lakes and Analytics on AWSAmazon Web Services
 
From Data Analysis to Smarter Apps - AWS Summit Tel Aviv 2017
From Data Analysis to Smarter Apps - AWS Summit Tel Aviv 2017From Data Analysis to Smarter Apps - AWS Summit Tel Aviv 2017
From Data Analysis to Smarter Apps - AWS Summit Tel Aviv 2017Amazon Web Services
 
Building Your Data Lake on AWS - Level 200
Building Your Data Lake on AWS - Level 200Building Your Data Lake on AWS - Level 200
Building Your Data Lake on AWS - Level 200Amazon Web Services
 

Similar to Workshop 2: Building a streaming data platform on AWS (20)

Scalable Data Analytics - DevDay Austin 2017 Day 2
Scalable Data Analytics - DevDay Austin 2017 Day 2Scalable Data Analytics - DevDay Austin 2017 Day 2
Scalable Data Analytics - DevDay Austin 2017 Day 2
 
AWS March 2016 Webinar Series Building Your Data Lake on AWS
AWS March 2016 Webinar Series Building Your Data Lake on AWS AWS March 2016 Webinar Series Building Your Data Lake on AWS
AWS March 2016 Webinar Series Building Your Data Lake on AWS
 
(BDT317) Building A Data Lake On AWS
(BDT317) Building A Data Lake On AWS(BDT317) Building A Data Lake On AWS
(BDT317) Building A Data Lake On AWS
 
Fast Track to Your Data Lake on AWS
Fast Track to Your Data Lake on AWSFast Track to Your Data Lake on AWS
Fast Track to Your Data Lake on AWS
 
Module 1 - CP Datalake on AWS
Module 1 - CP Datalake on AWSModule 1 - CP Datalake on AWS
Module 1 - CP Datalake on AWS
 
Using Data Lakes: Data Analytics Week SF
Using Data Lakes: Data Analytics Week SFUsing Data Lakes: Data Analytics Week SF
Using Data Lakes: Data Analytics Week SF
 
Building Data Lakes in the AWS Cloud
Building Data Lakes in the AWS CloudBuilding Data Lakes in the AWS Cloud
Building Data Lakes in the AWS Cloud
 
Track 6 Session 1_進入 AI 領域的第一步驟_資料平台的建置.pptx
Track 6 Session 1_進入 AI 領域的第一步驟_資料平台的建置.pptxTrack 6 Session 1_進入 AI 領域的第一步驟_資料平台的建置.pptx
Track 6 Session 1_進入 AI 領域的第一步驟_資料平台的建置.pptx
 
Using Data Lakes
Using Data Lakes Using Data Lakes
Using Data Lakes
 
Building Serverless Analytics Solutions with Amazon QuickSight (ANT391) - AWS...
Building Serverless Analytics Solutions with Amazon QuickSight (ANT391) - AWS...Building Serverless Analytics Solutions with Amazon QuickSight (ANT391) - AWS...
Building Serverless Analytics Solutions with Amazon QuickSight (ANT391) - AWS...
 
AWS CLOUD 2017 - Amazon Athena 및 Glue를 통한 빠른 데이터 질의 및 처리 기능 소개 (김상필 솔루션즈 아키텍트)
AWS CLOUD 2017 - Amazon Athena 및 Glue를 통한 빠른 데이터 질의 및 처리 기능 소개 (김상필 솔루션즈 아키텍트)AWS CLOUD 2017 - Amazon Athena 및 Glue를 통한 빠른 데이터 질의 및 처리 기능 소개 (김상필 솔루션즈 아키텍트)
AWS CLOUD 2017 - Amazon Athena 및 Glue를 통한 빠른 데이터 질의 및 처리 기능 소개 (김상필 솔루션즈 아키텍트)
 
ABD202_Best Practices for Building Serverless Big Data Applications
ABD202_Best Practices for Building Serverless Big Data ApplicationsABD202_Best Practices for Building Serverless Big Data Applications
ABD202_Best Practices for Building Serverless Big Data Applications
 
Deploying a Data Lake in AWS - AWS Summit Tel Aviv 2017
Deploying a Data Lake in AWS - AWS Summit Tel Aviv 2017Deploying a Data Lake in AWS - AWS Summit Tel Aviv 2017
Deploying a Data Lake in AWS - AWS Summit Tel Aviv 2017
 
Using Data Lakes
Using Data LakesUsing Data Lakes
Using Data Lakes
 
Introduction to Amazon Athena
Introduction to Amazon AthenaIntroduction to Amazon Athena
Introduction to Amazon Athena
 
From Data Collection to Actionable Insights in 60 Seconds: AWS Developer Work...
From Data Collection to Actionable Insights in 60 Seconds: AWS Developer Work...From Data Collection to Actionable Insights in 60 Seconds: AWS Developer Work...
From Data Collection to Actionable Insights in 60 Seconds: AWS Developer Work...
 
Building Data Lakes and Analytics on AWS
Building Data Lakes and Analytics on AWSBuilding Data Lakes and Analytics on AWS
Building Data Lakes and Analytics on AWS
 
From Data Analysis to Smarter Apps - AWS Summit Tel Aviv 2017
From Data Analysis to Smarter Apps - AWS Summit Tel Aviv 2017From Data Analysis to Smarter Apps - AWS Summit Tel Aviv 2017
From Data Analysis to Smarter Apps - AWS Summit Tel Aviv 2017
 
Securing Your Big Data on AWS
Securing Your Big Data on AWSSecuring Your Big Data on AWS
Securing Your Big Data on AWS
 
Building Your Data Lake on AWS - Level 200
Building Your Data Lake on AWS - Level 200Building Your Data Lake on AWS - Level 200
Building Your Data Lake on AWS - Level 200
 

More from Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateAmazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSAmazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Amazon Web Services
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Amazon Web Services
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsAmazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareAmazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSAmazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAmazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareAmazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

More from Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Workshop 2: Building a streaming data platform on AWS

  • 1. © 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Querying and analyzing data in Amazon S3 Roy Hasson - Business Development Manager, Amazon Athena Radhika Ravirala, EMR Solutions Architect October 2017
  • 2. Amazon Athena is an interactive query service that makes it easy to analyze data directly from Amazon S3 using Standard SQL
  • 3. Value Proposition • Decouple storage from compute • Serverless – No infrastructure or resources to manage • Pay only for data scanned • Schema on read – Same data, many views • Secure – IAM for authentication; Encryption at rest & in transit • Standard compliant and open storage formats • Built on powerful community supported OSS solutions
  • 4. Familiar Technologies Under the Covers Used for SQL Queries In-memory distributed query engine ANSI-SQL compatible with extensions Used for DDL functionality Complex data types Multitude of formats Supports data partitioning
  • 5. Glue automates the undifferentiated heavy-lifting of ETL  Cataloging data sources  Identifying data formats and data types  Generating Extract, Transform, Load code  Executing ETL jobs; managing dependencies  Secured by IAM policies  Handling errors  Managing and scaling resources
  • 6. Data Catalog  Hive metastore compatible metadata repository of data sources.  Crawls data source to infer table, data type, partition format. Job Execution  Runs jobs in Spark containers – automatic scaling based on SLA.  Glue is serverless - only pay for the resources you consume. Job Authoring  Generates Python code to move data from source to destination.  Edit with your favorite IDE; share code snippets using Git. Glue Components
  • 7. Basic Concepts Retail Data Ops Data Marketing Data Relational Databases Flat Files
  • 8. Deep Integration with AWS Data Sources Amazon RDS, Aurora Amazon Redshift Amazon Athena Amazon S3 Flat Files
  • 9. © 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. S3 Data Catalog AthenaEMR Redshift Spectrum Amazon ML / MXNet RDS QuickSight Kinesis Database Migration Service Glue Amazon Analytics End to End Architecture IAM Other Sources