A short introduction to the different options for ETL & ELT in the Cloud with Microsoft Azure. This is a small accompanying set of slides for my presentations and blogs on this topic.
1. Microsoft ETL in the Cloud
Microsoft Azure
Cloud Data Platform
Mark Kromer
Microsoft Azure Cloud Data Architect
@kromerbigdata
@mssqldude
2. What is ETL?
• Acronym for “Extract, Transform and Load”
• Classic form of data movement, aggregation, summarization, cleansing and loading a Data Warehouse
• More loosely defined as data management processes that clean, move and aggregate data
• Formal ETL processes are typically scheduled (e.g. hourly, nightly, monthly)
• Not real-time, although micro-batch ETL systems are quite common
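The three stages above can be sketched in a few lines of Python. This is a minimal illustration only, not tied to any Azure service: the sample CSV, the `sales_by_region` table and the function names are all hypothetical, and an in-memory SQLite database stands in for the Data Warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw extract; in practice this would come from a source system.
RAW_CSV = """order_id,region,amount
1,East,10.50
2,West,
3,east,4.50
4,West,20.00
"""

def extract(text):
    """Extract: read source rows into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: cleanse (skip rows with no amount, normalize region names)
    and aggregate amounts by region."""
    totals = {}
    for row in rows:
        if not row["amount"]:
            continue  # cleansing: drop incomplete rows
        region = row["region"].strip().title()
        totals[region] = totals.get(region, 0.0) + float(row["amount"])
    return totals

def load(totals, conn):
    """Load: write the aggregated rows into the (stand-in) warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_by_region "
        "(region TEXT PRIMARY KEY, total REAL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO sales_by_region VALUES (?, ?)",
        sorted(totals.items()),
    )
    conn.commit()

def run_etl(text, conn):
    """One scheduled batch run: extract, then transform, then load."""
    load(transform(extract(text)), conn)

conn = sqlite3.connect(":memory:")
run_etl(RAW_CSV, conn)
result = conn.execute(
    "SELECT region, total FROM sales_by_region ORDER BY region"
).fetchall()
print(result)
```

A scheduler (hourly, nightly, and so on) would call `run_etl` once per batch; a micro-batch system would simply call it more often on smaller inputs.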
3. Classic Enterprise ETL in the Cloud with Azure
Microsoft and ISV Marketplace common offerings (Examples)
• SSIS: Spin up a SQL Server VM image from the Azure Portal to run SSIS in the cloud via Azure IaaS
• Informatica: An enterprise-grade ETL product suite that offers an Azure VM in the ISV Marketplace
• Attunity: A Microsoft partner with Azure ISV Marketplace offerings including change data capture (CDC). Attunity Compose can provide additional ETL/ELT capabilities.
4. ELT in the Cloud with Azure Data Factory
ADF provides Extract, Transform and Load in the Cloud
• ADF relies on external execution engines like SQL Server, Hadoop and Azure ML
• Provides very easy Copy Activities to get started quickly
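To give a feel for what a Copy Activity looks like, here is a rough sketch that builds a pipeline definition as a Python dictionary and serializes it to JSON. The property names and type strings (`typeProperties`, `BlobSource`, `SqlSink`) are assumptions about the general shape of an ADF pipeline definition and should be checked against the current ADF documentation; the pipeline, activity and dataset names are hypothetical.

```python
import json

def build_copy_pipeline(name, source_dataset, sink_dataset):
    """Return a dict shaped like a pipeline definition with one Copy activity.

    All property names and type strings below are assumptions for
    illustration; they are not validated against the ADF schema.
    """
    return {
        "name": name,
        "properties": {
            "activities": [
                {
                    "name": "CopySourceToSink",  # hypothetical activity name
                    "type": "Copy",
                    "inputs": [{"name": source_dataset}],
                    "outputs": [{"name": sink_dataset}],
                    "typeProperties": {
                        "source": {"type": "BlobSource"},  # assumed source type
                        "sink": {"type": "SqlSink"},       # assumed sink type
                    },
                }
            ]
        },
    }

# Hypothetical pipeline and dataset names.
pipeline = build_copy_pipeline(
    "DailyCopyPipeline", "RawBlobDataset", "StagingSqlDataset"
)
pipeline_json = json.dumps(pipeline, indent=2)
print(pipeline_json)
```

The point of the sketch is that the copy itself is declarative: the definition names a source and a sink, and the heavier transformation work is left to the external engines listed above.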
5. Azure ML as an ETL Tool
Transforming Data is a common task for Data Scientists and Data Engineers
• AML has a fully Cloud / Web based UI with basic SQL Transformations
• AML’s core capability is training and scoring data via ML models, but you don’t need to include those advanced analytics in your “data flow”
• Schedule ETL activities via ADF
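As an illustration of the kind of basic SQL transformation mentioned above, the sketch below runs a cleanse-and-aggregate query against an in-memory SQLite table using Python's `sqlite3` module. It does not call Azure ML; the SQLite dialect, the table name `t1` and the sample rows are assumptions chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical input dataset, exposed to the query as table t1.
conn.execute("CREATE TABLE t1 (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?)",
    [("acme", 100.0), ("acme", 50.0), ("globex", None), ("globex", 25.0)],
)

# Transformation: drop rows with missing amounts, normalize the customer
# name to upper case, then count and sum the orders per customer.
TRANSFORM_SQL = """
SELECT UPPER(customer) AS customer,
       COUNT(*)        AS orders,
       SUM(amount)     AS total_amount
FROM t1
WHERE amount IS NOT NULL
GROUP BY UPPER(customer)
ORDER BY 1
"""

rows = conn.execute(TRANSFORM_SQL).fetchall()
print(rows)
```

No model training or scoring is involved here; the query alone does the cleansing and aggregation, which is all a simple "data flow" needs.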
Data Transformations