1. Data Storage & Database Services in AWS and Azure: A Comparison
As delivered in tech meetup #01, organized by www.edYoda.com on 22nd April 2018 at zekeLabs, Bangalore
3. Agenda
Comparative study of services in the data and big data paradigm on AWS and Azure cloud, including:
• Data storage services
• Database services
5. Data services
[Diagram: data services arranged by pipeline stage: Collect, Store, Analyze/Process, Visualization.
Labels recoverable from the diagram: Blob, Data Box, StorSimple, HDInsight, Data Lake Analytics, SQL Data Warehouse, Power BI, Cosmos DB, Redis Cache, Database, Data Factory, Amazon, Event Hubs, Stream Analytics, Azure Functions, Data Lake, Data Sync.]
7. Object based storage services
AWS: S3
▶ Supports replication based on storage class
▶ 4 storage classes:
▶ Standard
▶ Standard-Infrequent Access (Standard-IA)
▶ One Zone-Infrequent Access (One Zone-IA)
▶ Reduced Redundancy
▶ Versioning supported
Azure: Block Blobs
▶ Replication depends on the storage account
▶ 2 storage classes (access tiers):
▶ Hot
▶ Cool
▶ Replication strategy:
▶ Locally redundant storage (LRS)
▶ Zone-redundant storage (ZRS)
▶ Geo-redundant storage (GRS)
▶ Read-access geo-redundant storage (RA-GRS)
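As a minimal sketch of how a storage class is chosen per object on S3, the helper below builds the arguments for boto3's `put_object` call. The function names and any bucket or key passed in are hypothetical, and boto3 is imported lazily so the argument builder can be tried without AWS credentials. On Azure, by contrast, the redundancy option is set on the storage account rather than per object.

```python
# Storage classes named on the slide, mapped to the values the S3 API expects.
S3_STORAGE_CLASSES = {
    "Standard": "STANDARD",
    "Standard-IA": "STANDARD_IA",
    "One Zone-IA": "ONEZONE_IA",
    "Reduced Redundancy": "REDUCED_REDUNDANCY",
}


def build_put_object_args(bucket, key, body, storage_class="Standard"):
    """Return keyword arguments for S3 put_object with the chosen class."""
    if storage_class not in S3_STORAGE_CLASSES:
        raise ValueError(f"unknown storage class: {storage_class!r}")
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "StorageClass": S3_STORAGE_CLASSES[storage_class],
    }


def upload(bucket, key, body, storage_class="Standard"):
    """Upload one object; needs boto3 and AWS credentials (not run here)."""
    import boto3  # imported lazily so the builder above works offline

    s3 = boto3.client("s3")
    return s3.put_object(**build_put_object_args(bucket, key, body, storage_class))
```

Keeping the argument builder separate from the upload call makes the class mapping easy to check without touching a real bucket.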
8. Block based storage services
AWS: EBS
▶ Volume for OS and Data disks
▶ Max Size 16 TB
▶ Replication within AZ
Azure: Page Blobs
▶ Volume for OS and Data disks
▶ Max Size 8 TB
▶ Replication depends on the redundancy strategy
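The per-volume limits determine how many volumes a large dataset needs. A rough sketch, assuming the maximums quoted on the slide (16 TB for EBS, 8 TB for page blobs) and ignoring per-instance attachment limits; the function name is hypothetical:

```python
import math

# Per-volume maximums as quoted on the slide, in TB.
MAX_VOLUME_TB = {"AWS EBS": 16, "Azure Page Blob": 8}


def volumes_needed(required_tb, service):
    """Smallest number of maximum-size volumes that can hold required_tb."""
    if required_tb <= 0:
        raise ValueError("required_tb must be positive")
    return math.ceil(required_tb / MAX_VOLUME_TB[service])


for service in MAX_VOLUME_TB:
    print(service, volumes_needed(40, service))
```

For a 40 TB dataset this works out to 3 EBS volumes or 5 page blobs.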
9. Shared File storage services
AWS: Elastic File System EFS
▶ Uses NFS v4
▶ Max Size: Unlimited (Automatically scales)
▶ Multiple on-premises servers as well as cloud servers can access them simultaneously
Azure: Azure Files
▶ Uses SMB 3.0 and HTTPS
▶ Max Size 5 TB
▶ Multiple on-premises servers as well as cloud servers can access them simultaneously
10. Extra storage services
AWS
▶ Glacier
▶ Lifecycle management rules
▶ Multiple types of EBS volumes
▶ Simple Queue Service (SQS)
Azure
▶ Archive storage tier in Blob storage
▶ Data lifecycle management using Data Factory
▶ Storage account:
▶ Standard
▶ Premium
▶ Queue
▶ Table
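A lifecycle management rule on S3 is a declarative document. The sketch below builds one in the shape accepted by boto3's `put_bucket_lifecycle_configuration`. The function name, rule ID, prefix, and day thresholds are illustrative assumptions, not values from the talk.

```python
def build_lifecycle_config(prefix, ia_after_days=30, glacier_after_days=90,
                           expire_after_days=365):
    """Build an S3 lifecycle configuration that tiers objects down over time."""
    if not ia_after_days < glacier_after_days < expire_after_days:
        raise ValueError("day thresholds must be strictly increasing")
    return {
        "Rules": [
            {
                "ID": f"tier-down-{prefix.strip('/') or 'all'}",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                # Move to cheaper storage first, then to Glacier.
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
                # Finally delete the objects.
                "Expiration": {"Days": expire_after_days},
            }
        ]
    }


config = build_lifecycle_config("logs/")
# With boto3 and credentials, this could be applied roughly as:
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-bucket", LifecycleConfiguration=config)
```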
12. Database services in AWS
▶ SQL Server on EC2: SQL Server on a virtual machine
▶ Relational Database Service (RDS): managed database compatible with MySQL, PostgreSQL, Oracle, etc.
▶ Redshift: elastic data warehouse as a service at petabyte scale
▶ DynamoDB: highly distributed NoSQL database for any scale
▶ ElastiCache: powers applications with high-throughput, low-latency data access
▶ Glue: machine-learning-enabled ETL tool
13. Database services in Azure
▶ SQL Server (IaaS): SQL Server on a virtual machine
▶ Azure Databases: managed databases compatible with SQL Server, MySQL, and PostgreSQL
▶ SQL Data Warehouse: elastic data warehouse as a service with enterprise-class features
▶ SQL Server Stretch Database: dynamically stretches on-premises SQL Server databases to Azure
▶ Azure Cosmos DB: globally distributed, multi-model database for any scale
▶ Table Storage: NoSQL key-value store for semi-structured datasets
15. AWS: Legacy Advantages
▶ Pioneer in IaaS.
▶ Amazon Aurora: a cloud-native, in-house, fully managed database compatible with MySQL and PostgreSQL.
▶ Redshift data warehouse.
▶ Third-party compatibility with no baggage
16. Managed Database
AWS: RDS
▶ Supports: MySQL, MSSQL, PostgreSQL, MariaDB, Oracle, Aurora.
▶ Multi-AZ replication
▶ Cross-region read replicas
▶ Aurora [High performance and throughput]
Azure: SQL Databases
▶ Supports: MSSQL, MySQL, PostgreSQL, MariaDB
▶ Other versions are supported on virtual machines (IaaS)
▶ Stretch databases
▶ Elastic Database pools
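Read replicas only help if the application sends its read traffic to them. A minimal sketch of application-side read/write splitting, with a hypothetical `ReadWriteRouter` class, made-up endpoint names, and a deliberately naive check for read-only statements:

```python
import itertools


class ReadWriteRouter:
    """Send writes to the primary and spread reads across read replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary when no replicas are configured.
        self._readers = itertools.cycle(replicas or [primary])

    def endpoint_for(self, sql):
        # Naive classification: only plain SELECT statements count as reads.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._readers) if is_read else self.primary


router = ReadWriteRouter(
    primary="primary.example.internal",  # hypothetical endpoints
    replicas=["replica-1.example.internal", "replica-2.example.internal"],
)
print(router.endpoint_for("SELECT * FROM orders"))
print(router.endpoint_for("INSERT INTO orders VALUES (1)"))
```

Round-robin is the simplest policy; a real deployment would also need to account for replica lag before serving reads that must see recent writes.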
17. Non-Relational Database
AWS: DynamoDB
▶ Automatically Scales
▶ 6x cross-region replication
Azure: CosmosDB
▶ Multi-model, globally distributed
▶ Low-latency NoSQL database
18. Data warehouses
AWS: Redshift
▶ Based on PostgreSQL
▶ Columnar storage
▶ ELT techniques used
▶ High query performance
▶ Integrates well with other AWS services
Azure: SQL Data Warehouse
▶ Based on MSSQL
▶ Columnar storage
▶ Unlimited scale
▶ Can be paused when no queries are expected
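Both warehouses store data by column rather than by row. A toy illustration of why that suits analytic queries, using made-up sales records and a hypothetical `to_columns` helper: an aggregate over one column only needs that column's values.

```python
# Row-oriented layout: every record carries all of its fields together.
rows = [
    {"order_id": 1, "region": "south", "amount": 120.0},
    {"order_id": 2, "region": "north", "amount": 80.0},
    {"order_id": 3, "region": "south", "amount": 50.0},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {name: [] for name in records[0]}
    for record in records:
        for name, value in record.items():
            columns[name].append(value)
    return columns


columns = to_columns(rows)

# Row layout: summing one field still walks every whole record.
total_from_rows = sum(record["amount"] for record in rows)

# Column layout: the same sum reads a single list of values.
total_from_columns = sum(columns["amount"])

print(total_from_rows, total_from_columns)
```

Real columnar engines add compression and on-disk layout on top of this idea, which the sketch does not attempt to model.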
19. ETL Tools
AWS: Glue
▶ Based on Spark
▶ Enabled with Machine Learning
▶ Crawlers
▶ Uses the Glue Data Catalog
▶ Auto-generates Spark code
Azure: Data Factory
▶ Orchestrates data pipeline activities
▶ Comprises activities and data stores
▶ Triggers
▶ May use HDInsight, Spark, Cosmos DB, etc.
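The pipeline idea behind both tools can be sketched in plain Python: activities read from a source data store, transform the records, and write to a sink. The names below (`Activity`, `run_pipeline`, the in-memory stores) are hypothetical and do not correspond to the Glue or Data Factory APIs.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Activity:
    """One pipeline step: read a store, transform, write another store."""
    name: str
    source: str
    sink: str
    transform: Callable[[List[dict]], List[dict]]


def run_pipeline(activities, stores):
    """Run activities in order against in-memory data stores."""
    for activity in activities:
        records = stores[activity.source]
        stores[activity.sink] = activity.transform(records)
    return stores


stores = {"raw": [{"name": " Asha ", "age": "31"}, {"name": "Ravi", "age": ""}]}

pipeline = [
    # Drop records with a missing age.
    Activity("filter", "raw", "valid",
             lambda recs: [r for r in recs if r["age"]]),
    # Trim names and cast age to an integer.
    Activity("clean", "valid", "curated",
             lambda recs: [{"name": r["name"].strip(), "age": int(r["age"])}
                           for r in recs]),
]

result = run_pipeline(pipeline, stores)
print(result["curated"])
```

In a managed service, a trigger (a schedule or an event) would start the run, and each activity would execute on a separate compute service rather than in-process.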
20. Other Related Services
AWS
▶ Database Migration Service (DMS)
▶ Schema Conversion Tool (SCT)
▶ RDS Cluster
Azure
▶ SQL Data Sync
▶ SQL Server Management Studio (SSMS)
▶ Database pools
21. Other factors to select your Cloud
▶ Pricing
▶ Legacy software in use
▶ Availability of skills
▶ Client requirements
▶ Customer support
▶ Third party tool integration
▶ Integration with Existing Infrastructure
22. Challenges in Moving to Cloud
▶ Making the correct choice: SaaS, IaaS, PaaS
▶ Loss of Control
▶ Vendor Lock-in
▶ Security and Compliance
▶ Availability and Reliability
▶ Performance and Bandwidth Cost
▶ Integration with Existing Infrastructure
▶ Lack of Skills, Knowledge and Expertise
23. Thank You