Grant Fritchey | www.ScaryDBA.com
www.ScaryDBA.com
Introducing
Azure SQL Data Warehouse
Grant Fritchey
grant@scarydba.com
Goals
 Understand the basic infrastructure and architecture behind Azure SQL
Data Warehouse
 Learn different methods of design, querying, and data migration in
order to begin an implementation of Azure SQL Data Warehouse
 Investigate the tooling available in support of automation and
monitoring around Azure SQL Data Warehouse
Get in touch Grant Fritchey
scarydba.com
grant@scarydba.com
@gfritchey
Azure SQL Data Warehouse
 Analytics Platform System (APS)
 Not simply a database
» Massively parallel computing platform
 Platform as a Service (PaaS)
 Pay for what you use
» Pay for when you use it
 Connectivity dependent
 Just a database
ARCHITECTURE
Azure SQL Data Warehouse
Azure SQL Data Warehouse
 Built on a combination of Azure SQL Database and Analytics Platform
System (APS)
 DBMS = Azure SQL Database
 Processing = APS
 Storage = Azure BLOB Storage
 Default storage is through columnstore
 It’s still SQL Server at its core
Architecture diagram: Massively Parallel Processing Engine (APS)
 Application
 Control Node: coordinates data movement and workload management
 Compute Nodes: provide processing mechanisms in parallel or individually
 Blob Storage: Read Access Geo-Redundant Storage (RA-GRS) stores
multi-terabyte data across Azure geo regions
Table Architecture
 Clustered columnstore by default
 Each “table” consists of 60 tables
 Tables consist of segments
» At least 100k rows per compressed row group improves performance
» About 1 million rows (1,048,576) per row group is the maximum
 Columnstore storage
» Compressed columnstore segments
» Delta store (standard clustered index)
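To put rough numbers on this slide, the sketch below estimates how a table's rows spread over the 60 distributions and how many row groups each distribution would need. It assumes a perfectly even spread and treats 1,048,576 rows as the row group maximum; the function name is hypothetical and nothing here talks to the service.

```python
import math

DISTRIBUTIONS = 60             # each "table" is stored as 60 tables
MAX_ROWGROUP_ROWS = 1_048_576  # the "1 million rows" row group maximum

def rowgroup_estimate(total_rows: int) -> dict:
    """Estimate rows and row groups per distribution, assuming an even spread."""
    rows_per_distribution = math.ceil(total_rows / DISTRIBUTIONS)
    rowgroups_per_distribution = math.ceil(rows_per_distribution / MAX_ROWGROUP_ROWS)
    return {
        "rows_per_distribution": rows_per_distribution,
        "rowgroups_per_distribution": rowgroups_per_distribution,
    }

# A 600 million row table: about 10 million rows in each distribution.
print(rowgroup_estimate(600_000_000))
```

A real table is rarely spread this evenly, so treat the result as a lower bound on the number of row groups.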
Protection Features
 Locally Redundant Storage
 Geo-Redundant Storage
 Automated backups
» Every 8 hours
» Kept for 7 days
 Transparent Data Encryption
Security
 SQL Server logins
 Azure Active Directory
 Manage Resource Groups
 Firewall
 Built-in Auditing
DATABASE DESIGN
Azure SQL Data Warehouse
Actually, Table Design
 Define table distribution
 Partitioning
 Statistics
 General Tips
 Unsupported
Table Distribution
 Each table consists of 60 tables
» 60 distributions
 Round-robin
» One, then the next
 Hash
 For best performance, pick the distribution method yourself rather than relying on the default
Round-Robin Distribution
 Starting out
 No join key to other tables
 No good hash candidate
 Joins against this table aren’t significant
 Staging or temporary table
Hash Distribution
 Ensure
» No updates to the distribution column
» Even data distribution
» Minimal data movement
 Suggestions for Hash key
» Highly selective data
» Minimal nulls and duplicates
» Avoid date columns
» Avoid columns with fewer than 60 distinct values
» Foreign key columns
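The hash key suggestions above can be checked on sample data before a table is built. The sketch below is an illustration only: it uses MD5 as a stand-in for the service's internal hash function (an assumption, the real function is not described in this deck), and the function names are hypothetical. It reports how many of the 60 distributions a candidate key would use and how much larger the fullest one is than an even share.

```python
import hashlib
from collections import Counter

DISTRIBUTIONS = 60

def distribution_for(value) -> int:
    """Map a key value to one of 60 buckets (stand-in for the real hash)."""
    digest = hashlib.md5(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % DISTRIBUTIONS

def distribution_profile(values: list) -> dict:
    """Summarize how evenly a candidate hash key spreads its rows."""
    counts = Counter(distribution_for(v) for v in values)
    even_share = len(values) / DISTRIBUTIONS
    return {
        "distributions_used": len(counts),
        "skew": max(counts.values()) / even_share,  # 1.0 means perfectly even
    }

# Hypothetical sample data: a high-cardinality key and a five-value key.
customer_keys = list(range(100_000))
region_codes = [i % 5 for i in range(100_000)]

print(distribution_profile(customer_keys))
print(distribution_profile(region_codes))
```

A key with only five distinct values can never use more than five distributions, which is the reason for the "fewer than 60 values" warning.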
Ensuring Index Quality
 Avoid memory pressure when building indexes
» Balance memory with concurrency
 Avoid high volume DML operations
» Deleted rows are only marked as deleted until the table is rebuilt
» Inserts are added to a delta row group
» Updates are a logical delete followed by an insert (into a delta row group)
» Large DML operations behave differently: a load of at least 102,400 rows per
distribution (6.144 million rows in one operation across 60 distributions)
goes directly to compressed storage
 Avoid small or trickle load operations
» Very small data loads always go to a delta row group
 Be cautious with the number of partitions
» Each partition is a new table
» Each table is 60 tables
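The 102,400 rows per distribution threshold above can be turned into a quick check for a planned load. This is a minimal sketch that assumes the rows spread evenly over the 60 distributions and ignores partitions; the function name is hypothetical.

```python
DISTRIBUTIONS = 60
DIRECT_COMPRESS_MIN_ROWS = 102_400  # per distribution, from the slide above

def load_destination(total_rows: int, distributions: int = DISTRIBUTIONS) -> str:
    """Say where a single load would land, assuming an even spread of rows."""
    rows_per_distribution = total_rows // distributions
    if rows_per_distribution >= DIRECT_COMPRESS_MIN_ROWS:
        return "compressed row group"
    return "delta row group"

print(load_destination(6_144_000))  # exactly 102,400 rows per distribution
print(load_destination(50_000))     # a trickle load
```

With partitions, the same rows are divided further, so the batch size needed to reach compressed storage grows with the partition count.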
Table Tips
 Row Store
» < 60 million rows
» Frequent updates
» Small dimension tables
 Columnstore
» > 60 million rows
» Infrequent updates
» Fact tables & large dimension tables
Partitioning
 Aim for at least 60 million rows per partition to see benefits
 There can be too many partitions
 Partitioning can prevent row groups from reaching 1 million rows
 Partitioning can cause rows to go to a delta row group instead of a
compressed row group
 Partition elimination must occur to see benefits
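The 60 million figure follows from the 60 distributions: every partition is split 60 ways, so 60 million rows per partition is about 1 million rows per distribution in that partition. A small sketch of that arithmetic, assuming an even spread and using hypothetical function names:

```python
DISTRIBUTIONS = 60
TARGET_ROWS_PER_ROWGROUP = 1_000_000  # the "1 million rows per group" goal

def rows_per_partition_slice(total_rows: int, partitions: int) -> float:
    """Rows that land in one partition of one distribution, assuming an even spread."""
    return total_rows / (partitions * DISTRIBUTIONS)

def max_useful_partitions(total_rows: int) -> int:
    """Largest partition count that still leaves about 1 million rows per slice."""
    return max(1, total_rows // (DISTRIBUTIONS * TARGET_ROWS_PER_ROWGROUP))

# A hypothetical 3.6 billion row fact table.
print(max_useful_partitions(3_600_000_000))
print(rows_per_partition_slice(3_600_000_000, 36))    # 36 monthly partitions
print(rows_per_partition_slice(3_600_000_000, 1095))  # 1,095 daily partitions
```

With daily partitions each slice holds fewer than 102,400 rows, which is the case where rows stay in delta row groups.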
Statistics
 No automatic creation
 No automatic update
 Microsoft suggests creating statistics on every column as a starting point
» I don’t agree, but this is a better choice than no statistics
 Multi-column statistics supported
» Histogram is still only on first column
 Syntax is the same
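Because nothing is created automatically here, the statements have to be scripted. The sketch below only generates standard CREATE STATISTICS text for a list of columns; the table, the columns, and the stats_ naming pattern are all hypothetical, and the output would still need to be run through whatever client you use.

```python
from typing import List

def create_statistics_statements(table: str, columns: List[str]) -> List[str]:
    """Build one single-column CREATE STATISTICS statement per column."""
    statements = []
    for column in columns:
        # Hypothetical naming convention: stats_<schema>_<table>_<column>
        stats_name = f"stats_{table.replace('.', '_')}_{column}"
        statements.append(f"CREATE STATISTICS {stats_name} ON {table} ({column});")
    return statements

for statement in create_statistics_statements("dbo.FactSales", ["CustomerKey", "OrderDateKey"]):
    print(statement)
```

Passing only the columns used in joins and filters, instead of every column, is one way to act on the disagreement noted above.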
General Tips
 Denormalization is actually viable
 Use the smallest viable data types
 Heap tables for transient data
Unsupported
 Currently (these things change)
» Identity
» Primary key, foreign key, unique and check constraints
» Unique indexes
» Computed columns
» Sparse columns
» User-Defined types
» Sequence
» Triggers
» Indexed views
» Synonyms
And Memory
 Connection group setting
 More memory and more processing as ADW size increases
 Still only 30 connections
 Fundamental to data loads as well as querying
D-SQL
Azure SQL Data Warehouse
New & Different
 CREATE TABLE AS SELECT
 GROUP BY differences
 Labels
 Stored procedure limitations
 View limitations
 General Notes
CREATE TABLE AS SELECT
 Must define distribution
 Uses parallel processing
 Uses
» Copy a table
» Change structure on a table
» Replace ANSI derived tables (unsupported)
» External data import
GROUP BY
 Unsupported
» ROLLUP
» GROUPING SETS
» CUBE
Labels
 Mark a query
 Useful for troubleshooting
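A label is attached with OPTION (LABEL = '...') and then shows up in the label column of sys.dm_pdw_exec_requests. The sketch below just builds both pieces of text; the helper names, the query, and the label are hypothetical, and the selected columns are a small subset chosen for illustration.

```python
def with_label(query: str, label: str) -> str:
    """Append an OPTION (LABEL = ...) clause to a query that has no OPTION clause yet."""
    safe_label = label.replace("'", "''")  # escape single quotes for T-SQL
    return f"{query.rstrip().rstrip(';')} OPTION (LABEL = '{safe_label}');"

def find_by_label(label: str) -> str:
    """Build a lookup against sys.dm_pdw_exec_requests for a labelled query."""
    safe_label = label.replace("'", "''")
    return (
        "SELECT request_id, status, total_elapsed_time, command "
        "FROM sys.dm_pdw_exec_requests "
        f"WHERE [label] = '{safe_label}';"
    )

print(with_label("SELECT COUNT(*) FROM dbo.FactSales;", "nightly_count"))
print(find_by_label("nightly_count"))
```

The request_id returned by the lookup is what the other monitoring views, such as sys.dm_pdw_request_steps, are keyed on.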
Stored procedure limitations
 Unsupported
» Temporary stored procedures
» Numbered stored procedures
» Extended stored procedures
» CLR stored procedures
» Encryption
» Replication
» Table-valued parameters
» Read-only parameters
» Default parameters
» Execution contexts
» RETURN statement
View Limitations
 No schema binding
 No data manipulation through view
 No temporary tables
 No support for EXPAND/NOEXPAND
 No indexed views
General Notes
 Cursors are not supported
» Use WHILE
 Transaction isolation level is limited to READ UNCOMMITTED
 No SELECT or UPDATE for variable assignment
» Instead
SET @i = (SELECT COUNT(*) FROM dbo.[Table]);
DATA IMPORT MECHANISMS
Azure SQL Data Warehouse
Import Processes
 Azure Data Factory
 SSIS
 Polybase
 3rd Party
Azure Data Factory
 Currently single core through control node
» Can use Polybase
 Reads from
» Azure blob storage
» Azure SQL Database
» On-premises SQL Server
» SQL Server VM in Azure
 Requires local software installation for on-premises servers and VMs
 Second slowest method (unless Polybase is used)
SSIS
 Single core through control node only
 Include retry logic
 Increase timeout, radically
 Use “all or nothing” load processing
 Parallel loads from multiple SSIS can help
 Slowest method according to Microsoft
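The retry and "all or nothing" advice is a general pattern and not specific to SSIS. As a language-neutral illustration (this is not SSIS code), the sketch below retries a whole load step with exponential backoff; load_step stands for any hypothetical callable that either loads everything or raises.

```python
import time

def run_with_retry(load_step, attempts=3, base_delay_seconds=1.0, sleep=time.sleep):
    """Run an all-or-nothing load step, retrying the whole step on failure."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return load_step()
        except Exception as error:  # a real job would catch only transient errors
            last_error = error
            if attempt < attempts:
                # Wait 1x, 2x, 4x ... the base delay before the next attempt.
                sleep(base_delay_seconds * 2 ** (attempt - 1))
    raise last_error
```

Retrying the whole step only works safely if a failed attempt leaves nothing behind, which is where loading into a staging table first helps.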
Polybase
 Supports delimited files and Hadoop
 Supports compressed files
» Gzip, zlib, Snappy
 A single compressed file is read by a single reader; for better performance,
split the data into multiple compressed files, scaled to the DWU level
 Compressed files load slower, but upload faster
 Single operation
 Load speed increases with scale
» Readers increase
» Writers increase
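Since one compressed file gets one reader, a large extract is usually split before upload. The sketch below splits a delimited text file into several Gzip files by dealing lines out in turn. It assumes one record per line with no header row and no line breaks inside fields; the function name and the part_NNN.csv.gz naming are hypothetical, and how many files to produce for a given DWU level is left to you.

```python
import gzip
from pathlib import Path

def split_to_gzip(source_path, output_dir, file_count):
    """Split a delimited text file into file_count Gzip files, line by line."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    paths = [output_dir / f"part_{i:03d}.csv.gz" for i in range(file_count)]
    writers = [gzip.open(p, "wt", encoding="utf-8", newline="") for p in paths]
    try:
        with open(source_path, "r", encoding="utf-8", newline="") as source:
            for line_number, line in enumerate(source):
                # Deal lines out in turn so the files end up similar in size.
                writers[line_number % file_count].write(line)
    finally:
        for writer in writers:
            writer.close()
    return paths
```

The resulting files would then be uploaded to blob storage and read together through one external table.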
3rd Party
Data Loading Tips
 Network bandwidth must be considered unless the load is all done
within Azure
» ExpressRoute, a paid service, can help
 Memory affects columnstore, so use more memory for load processes
 Fixed-length file format is not currently supported by Polybase
 Remember, it’s all a balancing act between upload speed and import
speed
 Load in chunks of at least 100k rows to get data into compressed
columnstore segments
TOOLING
Azure SQL Data Warehouse
Available Tools
 Azure Portal
 Visual Studio
 SQL Server Management Studio
 PowerShell
MAINTENANCE
Azure SQL Data Warehouse
SQL Server
 Index Maintenance
» But not for defragmentation
 Statistics maintenance
 Monitoring
 Backups
» Managed for you, just monitor
Statistics
 No automatic creation
 No automatic update
» Update after data loads
» Update after data modification
» If either of the above doesn’t change data distribution, don’t update the
statistics
 Target columns
» JOIN
» GROUP BY
» ORDER BY
» WHERE
» HAVING
 Syntax is the same as SQL Server
DBCC SHOW_STATISTICS()
 Limits
» No undocumented features
» No stats_stream
» Square brackets not supported
» Cannot use column names to identify stats
— Must use the stats name
Monitoring
 Portal
 Dynamic Management Views
» sys.pdw_loader_backup_runs
» sys.dm_pdw_exec_sessions
» sys.dm_pdw_exec_requests
» sys.dm_pdw_request_steps
» sys.dm_pdw_sql_requests
» sys.dm_pdw_dms_workers
» sys.dm_pdw_waits
 DBCC
» PDW_SHOWEXECUTIONPLAN
» PDW_SHOWSPACEUSED
Microsoft Marketing Slide
Resources
 Microsoft Documentation
 Azure Data Platform Learning Resources
 Grant Fritchey
 Columnstore Architecture
 Troubleshooting
 Creating Artificial Key Values
Most useful docs
 https://azure.microsoft.com/en-us/documentation/articles/sql-data-warehouse-best-practices/
 https://azure.microsoft.com/en-us/documentation/articles/sql-data-warehouse-tables-index/#causes-of-poor-columnstore-index-quality
 https://azure.microsoft.com/en-us/documentation/articles/sql-data-warehouse-tables-distribute/
52

More Related Content

What's hot

Get started with Microsoft SQL Polybase
Get started with Microsoft SQL PolybaseGet started with Microsoft SQL Polybase
Get started with Microsoft SQL Polybase
Henk van der Valk
 
Webinar: Get On-Demand Education Anytime, Anywhere with Coursera and DataStax
Webinar: Get On-Demand Education Anytime, Anywhere with Coursera and DataStaxWebinar: Get On-Demand Education Anytime, Anywhere with Coursera and DataStax
Webinar: Get On-Demand Education Anytime, Anywhere with Coursera and DataStax
DataStax
 

What's hot (20)

Cortana Analytics Workshop: Azure Data Lake
Cortana Analytics Workshop: Azure Data LakeCortana Analytics Workshop: Azure Data Lake
Cortana Analytics Workshop: Azure Data Lake
 
Azure Lowlands: An intro to Azure Data Lake
Azure Lowlands: An intro to Azure Data LakeAzure Lowlands: An intro to Azure Data Lake
Azure Lowlands: An intro to Azure Data Lake
 
Azure Data Lake and Azure Data Lake Analytics
Azure Data Lake and Azure Data Lake AnalyticsAzure Data Lake and Azure Data Lake Analytics
Azure Data Lake and Azure Data Lake Analytics
 
Snowflake Best Practices for Elastic Data Warehousing
Snowflake Best Practices for Elastic Data WarehousingSnowflake Best Practices for Elastic Data Warehousing
Snowflake Best Practices for Elastic Data Warehousing
 
Changing the game with cloud dw
Changing the game with cloud dwChanging the game with cloud dw
Changing the game with cloud dw
 
A lap around Azure Data Factory
A lap around Azure Data FactoryA lap around Azure Data Factory
A lap around Azure Data Factory
 
Introduction to Azure Data Lake
Introduction to Azure Data LakeIntroduction to Azure Data Lake
Introduction to Azure Data Lake
 
Webinar: Buckle Up: The Future of the Distributed Database is Here - DataStax...
Webinar: Buckle Up: The Future of the Distributed Database is Here - DataStax...Webinar: Buckle Up: The Future of the Distributed Database is Here - DataStax...
Webinar: Buckle Up: The Future of the Distributed Database is Here - DataStax...
 
J1 T1 3 - Azure Data Lake store & analytics 101 - Kenneth M. Nielsen
J1 T1 3 - Azure Data Lake store & analytics 101 - Kenneth M. NielsenJ1 T1 3 - Azure Data Lake store & analytics 101 - Kenneth M. Nielsen
J1 T1 3 - Azure Data Lake store & analytics 101 - Kenneth M. Nielsen
 
Running cost effective big data workloads with Azure Synapse and Azure Data L...
Running cost effective big data workloads with Azure Synapse and Azure Data L...Running cost effective big data workloads with Azure Synapse and Azure Data L...
Running cost effective big data workloads with Azure Synapse and Azure Data L...
 
SQL Server 2016 - Stretch DB
SQL Server 2016 - Stretch DB SQL Server 2016 - Stretch DB
SQL Server 2016 - Stretch DB
 
Get started with Microsoft SQL Polybase
Get started with Microsoft SQL PolybaseGet started with Microsoft SQL Polybase
Get started with Microsoft SQL Polybase
 
Architecting a datalake
Architecting a datalakeArchitecting a datalake
Architecting a datalake
 
Microsoft Azure Data Warehouse Overview
Microsoft Azure Data Warehouse OverviewMicrosoft Azure Data Warehouse Overview
Microsoft Azure Data Warehouse Overview
 
Webinar: Get On-Demand Education Anytime, Anywhere with Coursera and DataStax
Webinar: Get On-Demand Education Anytime, Anywhere with Coursera and DataStaxWebinar: Get On-Demand Education Anytime, Anywhere with Coursera and DataStax
Webinar: Get On-Demand Education Anytime, Anywhere with Coursera and DataStax
 
Integration Monday - Analysing StackExchange data with Azure Data Lake
Integration Monday - Analysing StackExchange data with Azure Data LakeIntegration Monday - Analysing StackExchange data with Azure Data Lake
Integration Monday - Analysing StackExchange data with Azure Data Lake
 
HA/DR options with SQL Server in Azure and hybrid
HA/DR options with SQL Server in Azure and hybridHA/DR options with SQL Server in Azure and hybrid
HA/DR options with SQL Server in Azure and hybrid
 
Modern data warehouse
Modern data warehouseModern data warehouse
Modern data warehouse
 
Technical overview of Azure Cosmos DB
Technical overview of Azure Cosmos DBTechnical overview of Azure Cosmos DB
Technical overview of Azure Cosmos DB
 
Azure Data Factory
Azure Data FactoryAzure Data Factory
Azure Data Factory
 

Similar to Introducing Azure SQL Data Warehouse

Data Privacy at Scale
Data Privacy at ScaleData Privacy at Scale
Data Privacy at Scale
DataWorks Summit
 

Similar to Introducing Azure SQL Data Warehouse (20)

SQL Server Optimization Checklist
SQL Server Optimization ChecklistSQL Server Optimization Checklist
SQL Server Optimization Checklist
 
Top Tips for Better T-SQL
Top Tips for Better T-SQLTop Tips for Better T-SQL
Top Tips for Better T-SQL
 
Azure SQL Database for the Earthed DBA
Azure SQL Database for the Earthed DBAAzure SQL Database for the Earthed DBA
Azure SQL Database for the Earthed DBA
 
Query Tuning Azure SQL Databases
Query Tuning Azure SQL DatabasesQuery Tuning Azure SQL Databases
Query Tuning Azure SQL Databases
 
Changing Your Habits: Tips to Tune Your T-SQL
Changing Your Habits: Tips to Tune Your T-SQLChanging Your Habits: Tips to Tune Your T-SQL
Changing Your Habits: Tips to Tune Your T-SQL
 
Statistics and the Query Optimizer
Statistics and the Query OptimizerStatistics and the Query Optimizer
Statistics and the Query Optimizer
 
Grant Fritchey - Query Tuning In Azure SQL Database
Grant Fritchey - Query Tuning In Azure SQL DatabaseGrant Fritchey - Query Tuning In Azure SQL Database
Grant Fritchey - Query Tuning In Azure SQL Database
 
Statistics And the Query Optimizer
Statistics And the Query OptimizerStatistics And the Query Optimizer
Statistics And the Query Optimizer
 
Data Modeling IoT and Time Series data in NoSQL
Data Modeling IoT and Time Series data in NoSQLData Modeling IoT and Time Series data in NoSQL
Data Modeling IoT and Time Series data in NoSQL
 
Distributing Queries the Citus Way | PostgresConf US 2018 | Marco Slot
Distributing Queries the Citus Way | PostgresConf US 2018 | Marco SlotDistributing Queries the Citus Way | PostgresConf US 2018 | Marco Slot
Distributing Queries the Citus Way | PostgresConf US 2018 | Marco Slot
 
Masterclass - Redshift
Masterclass - RedshiftMasterclass - Redshift
Masterclass - Redshift
 
Building a Real-time Stream Processing Pipeline - Kinesis Data Firehose, Amaz...
Building a Real-time Stream Processing Pipeline - Kinesis Data Firehose, Amaz...Building a Real-time Stream Processing Pipeline - Kinesis Data Firehose, Amaz...
Building a Real-time Stream Processing Pipeline - Kinesis Data Firehose, Amaz...
 
Accelerating Business Intelligence Solutions with Microsoft Azure pass
Accelerating Business Intelligence Solutions with Microsoft Azure   passAccelerating Business Intelligence Solutions with Microsoft Azure   pass
Accelerating Business Intelligence Solutions with Microsoft Azure pass
 
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
 
Data Privacy at Scale
Data Privacy at ScaleData Privacy at Scale
Data Privacy at Scale
 
AWS Webcast - Attunity Couchsurfing
AWS Webcast - Attunity CouchsurfingAWS Webcast - Attunity Couchsurfing
AWS Webcast - Attunity Couchsurfing
 
Recommendations for Building Machine Learning Software
Recommendations for Building Machine Learning SoftwareRecommendations for Building Machine Learning Software
Recommendations for Building Machine Learning Software
 
Why new hardware may not make SQL Server faster
Why new hardware may not make SQL Server fasterWhy new hardware may not make SQL Server faster
Why new hardware may not make SQL Server faster
 
Interactive Analytics with the Starburst Presto + Alluxio stack for the Cloud
Interactive Analytics with the Starburst Presto + Alluxio stack for the CloudInteractive Analytics with the Starburst Presto + Alluxio stack for the Cloud
Interactive Analytics with the Starburst Presto + Alluxio stack for the Cloud
 
Why Standards-Based Drivers Offer Better API Integration
Why Standards-Based Drivers Offer Better API IntegrationWhy Standards-Based Drivers Offer Better API Integration
Why Standards-Based Drivers Offer Better API Integration
 

More from Grant Fritchey

More from Grant Fritchey (20)

Migrating To PostgreSQL
Migrating To PostgreSQLMigrating To PostgreSQL
Migrating To PostgreSQL
 
PostgreSQL Performance Problems: Monitoring and Alerting
PostgreSQL Performance Problems: Monitoring and AlertingPostgreSQL Performance Problems: Monitoring and Alerting
PostgreSQL Performance Problems: Monitoring and Alerting
 
Automating Database Deployments Using Azure DevOps
Automating Database Deployments Using Azure DevOpsAutomating Database Deployments Using Azure DevOps
Automating Database Deployments Using Azure DevOps
 
Learn To Effectively Use Extended Events_Techorama.pdf
Learn To Effectively Use Extended Events_Techorama.pdfLearn To Effectively Use Extended Events_Techorama.pdf
Learn To Effectively Use Extended Events_Techorama.pdf
 
Using Query Store to Understand and Control Query Performance
Using Query Store to Understand and Control Query PerformanceUsing Query Store to Understand and Control Query Performance
Using Query Store to Understand and Control Query Performance
 
You Should Be Standing Here: Learn How To Present a Session
You Should Be Standing Here: Learn How To Present a SessionYou Should Be Standing Here: Learn How To Present a Session
You Should Be Standing Here: Learn How To Present a Session
 
Redgate Community Circle: Tools For SQL Server Performance Tuning
Redgate Community Circle: Tools For SQL Server Performance TuningRedgate Community Circle: Tools For SQL Server Performance Tuning
Redgate Community Circle: Tools For SQL Server Performance Tuning
 
10 Steps To Global Data Compliance
10 Steps To Global Data Compliance10 Steps To Global Data Compliance
10 Steps To Global Data Compliance
 
Time to Use the Columnstore Index
Time to Use the Columnstore IndexTime to Use the Columnstore Index
Time to Use the Columnstore Index
 
Introduction to SQL Server in Containers
Introduction to SQL Server in ContainersIntroduction to SQL Server in Containers
Introduction to SQL Server in Containers
 
DevOps for the DBA
DevOps for the DBADevOps for the DBA
DevOps for the DBA
 
SQL Injection: How It Works, How to Stop It
SQL Injection: How It Works, How to Stop ItSQL Injection: How It Works, How to Stop It
SQL Injection: How It Works, How to Stop It
 
Privacy and Protection in the World of Database DevOps
Privacy and Protection in the World of Database DevOpsPrivacy and Protection in the World of Database DevOps
Privacy and Protection in the World of Database DevOps
 
SQL Server Tools for Query Tuning
SQL Server Tools for Query TuningSQL Server Tools for Query Tuning
SQL Server Tools for Query Tuning
 
Extending DevOps to SQL Server
Extending DevOps to SQL ServerExtending DevOps to SQL Server
Extending DevOps to SQL Server
 
Introducing Azure Databases
Introducing Azure DatabasesIntroducing Azure Databases
Introducing Azure Databases
 
Statistis, Row Counts, Execution Plans and Query Tuning
Statistis, Row Counts, Execution Plans and Query TuningStatistis, Row Counts, Execution Plans and Query Tuning
Statistis, Row Counts, Execution Plans and Query Tuning
 
Understanding Your Servers, All Your Servers
Understanding Your Servers, All Your ServersUnderstanding Your Servers, All Your Servers
Understanding Your Servers, All Your Servers
 
The Query Store SQL Tuning
The Query Store SQL TuningThe Query Store SQL Tuning
The Query Store SQL Tuning
 
Performance Tuning Azure SQL Database
Performance Tuning Azure SQL DatabasePerformance Tuning Azure SQL Database
Performance Tuning Azure SQL Database
 

Recently uploaded

%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
masabamasaba
 
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
masabamasaba
 
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
VictoriaMetrics
 
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Medical / Health Care (+971588192166) Mifepristone and Misoprostol tablets 200mg
 
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
masabamasaba
 
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
masabamasaba
 
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
chiefasafspells
 

Recently uploaded (20)

%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
 
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
 
Architecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the pastArchitecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the past
 
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
 
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
 
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
 
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
 
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
 
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
 
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
 
WSO2Con204 - Hard Rock Presentation - Keynote
WSO2Con204 - Hard Rock Presentation - KeynoteWSO2Con204 - Hard Rock Presentation - Keynote
WSO2Con204 - Hard Rock Presentation - Keynote
 
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
 
WSO2CON 2024 Slides - Open Source to SaaS
WSO2CON 2024 Slides - Open Source to SaaSWSO2CON 2024 Slides - Open Source to SaaS
WSO2CON 2024 Slides - Open Source to SaaS
 
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
 
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
 
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
 

Introducing Azure SQL Data Warehouse

  • 1. Grant Fritchey | www.ScaryDBA.com www.ScaryDBA.com Introducing Azure SQL Data Warehouse Grant Fritchey grant@scarydba.com
  • 2. Grant Fritchey | www.ScaryDBA.com Goals  Understand the basic infrastructure and architecture behindAzure SQL Data Warehouse  Learn different methods of design, querying, and data migration in order to begin an implementation ofAzure SQL Data Warehouse  Investigate the tooling available in support of automation and monitoring around Azure SQL Data Warehouse
  • 3. Grant Fritchey | www.ScaryDBA.com Get in touch Grant Fritchey scarydba.com grant@scarydba.com @gfritchey
  • 4. Grant Fritchey | www.ScaryDBA.com Azure SQL Data Warehouse  Analytics Platform System (APS)  Not simply a database » Massively parallel computing platform  Platform as a Service (PaaS)  Pay for what you use » Pay for when you use it  Connectivity dependent  Just a database 4
  • 5. Grant Fritchey | www.ScaryDBA.com ARCHITECTURE AzureSQL DataWarehouse 5
  • 6. Grant Fritchey | www.ScaryDBA.com Azure SQL Data Warehouse  Built on a combination ofAzure SQL Database and Analytics Platform System(APS)  DBMS = Azure SQL Database  Processing = APS  Storage = Azure BLOB Storage  Default storage is through columnstore  It’s still SQL Server at it’s core 6
  • 7. Architecture diagram
      ▪ Application
      ▪ Massively Parallel Processing Engine (APS)
          » Control Node: coordinates data movement and workload management
          » Compute Nodes: provide processing mechanisms in parallel or individually
      ▪ Blob Storage
          » Read Access Geo-Redundant Storage: RA-GRS stores multi-terabyte data across Azure geo regions
  • 8. Table Architecture
      ▪ Clustered columnstore by default
      ▪ Each “table” consists of 60 tables
      ▪ Tables consist of segments
          » 100k rows per compressed row group improves performance
          » 1 million rows per group is the max
      ▪ Columnstore storage
          » Compressed columnstore segments
          » Delta store (standard clustered index)
  • 9. Protection Features
      ▪ Locally Redundant Storage
      ▪ Geo-Redundant Storage
      ▪ Automated backups
          » Every 8 hours
          » Kept for 7 days
      ▪ Transparent Data Encryption
  • 10. Security
      ▪ SQL Server logins
      ▪ Azure Active Directory
      ▪ Manage Resource Groups
      ▪ Firewall
      ▪ Built-in Auditing
  • 12. DATABASE DESIGN: Azure SQL Data Warehouse
  • 13. Actually, Table Design
      ▪ Define table distribution
      ▪ Partitioning
      ▪ Statistics
      ▪ General Tips
      ▪ Unsupported
  • 14. Table Distribution
      ▪ Each table consists of 60 tables
          » 60 distributions
      ▪ Round-robin
          » One, then the next
      ▪ Hash
      ▪ For best performance, pick the distribution method
  • 15. Round-Robin Distribution
      ▪ Starting out
      ▪ No join key to other tables
      ▪ No good hash candidate
      ▪ Joins against this table aren’t significant
      ▪ Staging or temporary table
  • 16. Hash Distribution
      ▪ Ensure
          » No updates
          » Even data distribution
          » Minimal data movement
      ▪ Suggestions for the hash key
          » Highly selective data
          » Minimal nulls and duplicates
          » Avoid dates
          » Avoid columns with fewer than 60 distinct values
          » Foreign key columns
  • 17. Ensuring Index Quality
      ▪ Avoid memory pressure when building indexes
          » Balance memory with concurrency
      ▪ Avoid high-volume DML operations
          » Deleted rows are not removed until the table is rebuilt
          » Inserts are added to the delta group
          » Updates are a logical delete followed by an insert (delta group)
          » This differs from large DML operations: 102,400 rows per distribution, or 6.144 million rows in one operation, go directly to compressed columnstore storage
      ▪ Avoid small or trickle load operations
          » Very small data loads always go to the delta group
      ▪ Be cautious with the number of partitions
          » Each partition is a new table
          » Each table is 60 tables
  • 18. Table Tips
      ▪ Row Store
          » < 60 million rows
          » Frequent updates
          » Small dimension tables
      ▪ Columnstore
          » > 60 million rows
          » Infrequent updates
          » Fact tables & large dimension tables
  • 19. Partitioning
      ▪ 60 million rows per partition to see benefits
      ▪ There can be too many partitions
      ▪ Partitioning can prevent row groups from reaching 1 million rows
      ▪ Partitioning can cause rows to go to a delta row group instead of a compressed row group
      ▪ Partition elimination must occur to see benefits
  • 20. Statistics
      ▪ No automatic creation
      ▪ No automatic update
      ▪ Microsoft suggests creating statistics on every column as a starting point
          » I don’t agree, but this is a better choice than no statistics
      ▪ Multi-column statistics supported
          » Histogram is still only on the first column
      ▪ Syntax is the same
  • 21. General Tips
      ▪ Denormalization is actually viable
      ▪ Use minimum viable data size
      ▪ Heap tables for transient data
  • 22. Unsupported
      ▪ Currently (these things change)
          » Identity
          » Primary key, foreign key, unique and check constraints
          » Unique indexes
          » Computed columns
          » Sparse columns
          » User-defined types
          » Sequence
          » Triggers
          » Indexed views
          » Synonyms
  • 23. And Memory
      ▪ Connection group setting
      ▪ More memory and more processing as ADW size increases
      ▪ Still only 30 connections
      ▪ Fundamental to data loads as well as querying
  • 25. D-SQL: Azure SQL Data Warehouse
  • 26. New & Different
      ▪ CREATE TABLE AS SELECT
      ▪ GROUP BY differences
      ▪ Labels
      ▪ Stored procedure limitations
      ▪ View limitations
      ▪ General Notes
  • 27. CREATE TABLE AS SELECT
      ▪ Must define distribution
      ▪ Uses parallel processing
      ▪ Use cases
          » Copy a table
          » Change structure on a table
          » Replace ANSI derived tables (unsupported)
          » External data import
  • 28. GROUP BY
      ▪ Unsupported
          » ROLLUP
          » GROUPING SETS
          » CUBE
  • 29. Labels
      ▪ Mark a query
      ▪ Useful for troubleshooting
  • 30. Stored Procedure Limitations
      ▪ Unsupported
          » Temporary stored procedures
          » Numbered stored procedures
          » Extended stored procedures
          » CLR stored procedures
          » Encryption
          » Replication
          » Table-valued parameters
          » Read-only parameters
          » Default parameters
          » Execution contexts
          » RETURN statement
  • 31. View Limitations
      ▪ Schema binding
      ▪ No data manipulation through view
      ▪ No temporary tables
      ▪ No support for EXPAND/NOEXPAND
      ▪ No indexed views
  • 32. General Notes
      ▪ Cursors are not supported
          » Use WHILE
      ▪ Transaction isolation level is limited to READ UNCOMMITTED
      ▪ No SELECT or UPDATE for variable assignment
          » Instead: SET @i = (SELECT count(*) FROM dbo.Table)
  • 33. DATA IMPORT MECHANISMS: Azure SQL Data Warehouse
  • 34. Import Processes
      ▪ Azure Data Factory
      ▪ SSIS
      ▪ PolyBase
      ▪ 3rd Party
  • 35. Azure Data Factory
      ▪ Currently single core through the control node
          » Can use PolyBase
      ▪ Reads from
          » Azure Blob Storage
          » Azure SQL Database
          » On-premises SQL Server
          » SQL Server VM in Azure
      ▪ Requires local software installation for on-premises servers and VMs
      ▪ Second slowest method (unless PolyBase is used)
  • 36. SSIS
      ▪ Single core through the control node only
      ▪ Include retry logic
      ▪ Increase the timeout, radically
      ▪ Use “all or nothing” load processing
      ▪ Parallel loads from multiple SSIS packages can help
      ▪ Slowest method according to Microsoft
  • 37. PolyBase
      ▪ Supports delimited files and Hadoop
      ▪ Supports compressed files
          » Gzip, zlib, Snappy
      ▪ Single compressed file per reader; for better performance, use multiple compressed files, scaled for DWU
      ▪ Compressed files load slower, but upload faster
      ▪ Single operation
      ▪ Load speed increases with scale
          » Readers increase
          » Writers increase
  • 38. 3rd Party
  • 39. Data Loading Tips
      ▪ Network bandwidth must be considered unless the load is all done within Azure
          » ExpressRoute, paid access, can help
      ▪ Memory affects columnstore, so use more memory for load processes
      ▪ Fixed-length file format is not currently supported by PolyBase
      ▪ Remember, it’s all a balancing act between upload speed & import speed
      ▪ Load in 100k-row chunks to get data onto compressed segments in columnstore
  • 40. TOOLING: Azure SQL Data Warehouse
  • 41. Available Tools
      ▪ Azure Portal
      ▪ Visual Studio
      ▪ SQL Server Management Studio
      ▪ PowerShell
  • 43. MAINTENANCE: Azure SQL Data Warehouse
  • 44. SQL Server
      ▪ Index maintenance
          » But not for defragmentation
      ▪ Statistics maintenance
      ▪ Monitoring
      ▪ Backups
          » Managed for you, just monitor
  • 45. Statistics
      ▪ No automatic creation
      ▪ No automatic update
          » Update after data loads
          » Update after data modification
          » If neither of the above changes the data distribution, don’t update the statistics
      ▪ Target columns used in
          » JOIN
          » GROUP BY
          » ORDER BY
          » WHERE
          » HAVING
      ▪ Syntax is the same as SQL Server
  • 46. DBCC SHOW_STATISTICS()
      ▪ Limits
          » No undocumented features
          » No stats_stream
          » Square brackets not supported
          » Cannot use column names to identify stats; must use the stats name
  • 47. Monitoring
      ▪ Portal
      ▪ Dynamic Management Views
          » sys.pdw_loader_backup_runs
          » sys.dm_pdw_exec_sessions
          » sys.dm_pdw_exec_requests
          » sys.dm_pdw_request_steps
          » sys.dm_pdw_sql_requests
          » sys.dm_pdw_dms_workers
          » sys.dm_pdw_waits
      ▪ DBCC
          » PDW_SHOWEXECUTIONPLAN
          » PDW_SHOWSPACEUSED
  • 48. Microsoft Marketing Slide
  • 49. Resources
      ▪ Microsoft Documentation
      ▪ Azure Data Platform Learning Resources
      ▪ Grant Fritchey
      ▪ Columnstore Architecture
      ▪ Troubleshooting
      ▪ Creating Artificial Key Values
  • 50. Goals
      ▪ Understand the basic infrastructure and architecture behind Azure SQL Data Warehouse
      ▪ Learn different methods of design, querying, and data migration in order to begin an implementation of Azure SQL Data Warehouse
      ▪ Investigate the tooling available in support of automation and monitoring around Azure SQL Data Warehouse
  • 51. Get in touch: Grant Fritchey
      ▪ scarydba.com
      ▪ grant@scarydba.com
      ▪ @gfritchey
  • 52. Most useful docs
      ▪ https://azure.microsoft.com/en-us/documentation/articles/sql-data-warehouse-best-practices/
      ▪ https://azure.microsoft.com/en-us/documentation/articles/sql-data-warehouse-tables-index/#causes-of-poor-columnstore-index-quality
      ▪ https://azure.microsoft.com/en-us/documentation/articles/sql-data-warehouse-tables-distribute/
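
The distribution and load-threshold figures on the Table Distribution, Hash Distribution, and Ensuring Index Quality slides can be checked with a small numeric sketch. This is an illustration only, not how the service actually places rows: `zlib.crc32` stands in for the real hash function (which the deck does not describe), the helper names `round_robin_distribute`, `hash_distribute`, and `direct_load_threshold` are hypothetical, and the partition multiplier assumes rows spread perfectly evenly. It shows why a hash key with fewer than 60 distinct values leaves distributions unused, and where the 6.144 million row figure comes from.

```python
import zlib
from collections import Counter

DISTRIBUTIONS = 60               # each table is spread over 60 distributions
ROWS_PER_DISTRIBUTION = 102_400  # per-distribution threshold quoted in the deck


def round_robin_distribute(rows):
    """Count rows per distribution when rows are dealt out one after the next."""
    return Counter(i % DISTRIBUTIONS for i, _ in enumerate(rows))


def hash_distribute(keys):
    """Count rows per distribution when placement depends on the key's hash.

    crc32 is a stand-in hash chosen for this sketch, not the service's own.
    """
    return Counter(zlib.crc32(str(k).encode()) % DISTRIBUTIONS for k in keys)


def direct_load_threshold(partitions=1):
    """Rows a single load needs so every distribution (and partition) reaches
    the threshold, assuming a perfectly even spread (an assumption)."""
    return ROWS_PER_DISTRIBUTION * DISTRIBUTIONS * partitions


if __name__ == "__main__":
    rows = range(60_000)

    rr = round_robin_distribute(rows)
    print("round-robin min/max rows per distribution:",
          min(rr.values()), max(rr.values()))

    # A key with only 12 distinct values can touch at most 12 distributions.
    low_cardinality = hash_distribute(i % 12 for i in rows)
    print("distributions used by a 12-value hash key:", len(low_cardinality))

    print("rows needed for a direct load:", direct_load_threshold())
    print("rows needed with 4 partitions:", direct_load_threshold(partitions=4))
```

The same arithmetic explains the caution about partitions: because each partition is itself split across 60 distributions, every added partition raises the number of rows a load needs before row groups can fill.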