Bigger and More Complex
Is Not Always Better
Meinrad Weiss
Senior Cloud Solution Architect
"Modern" data warehouse/data lake architectures are often brimming with layers and services.
• Such systems can manage and analyze petabytes of data.
• But all this comes at a price (complexity, latency, stability),
and not every project is well served by this approach.
This talk traces the journey from a technology-infatuated solution to an environment tailored to
the users' needs.
It shows the upsides and downsides of massively parallel systems and
aims to sharpen the audience's sense for capturing real customer requirements.
Agenda/Goal
• 10 years ago: selling hardware
• 5 years ago: transition to the leasing business
− Challenge -> How to maintain the leased hardware?
− Solution -> Put sensors on everything and use the IoT data for predictive maintenance
• Trend now:
Manufacturing companies
are becoming data companies
Manufacturing Companies Are Becoming Data Companies:
Architecture Overview for Information Management
[Architecture diagram. Recoverable labels, grouping approximate:
 Sources: IoT, non-structured data, CRM, BW, ERP, …
 Ingestion: IoT Hub, Stream Analytics, Blob Storage, PolyBase, ADF (Azure Data Factory), Theobald, Theobald IS & SSIS, Logic/API/App
 Storage and processing: Azure DW, Data Lake Store, Data Lake Analysis, HD Insights (Spark), Azure Machine Learning, SSAS, Azure SQL DB, local SQL
 Consumers and tools: data scientist (Visual Studio, R-Studio, MSR), Local IT, Market Reach, Product Mgr., Reporting & Analysis (Excel, MS Access, xxx Apps, browser, API)
 Connection types: Push, Pull, SQL connect; Direct Query or Process; SQL or API]
ADW:
+ No size limit
+ Scalability
+ Polybase
- Less compatible
- Concurrency limit
- No Row level security
- No Ref. Integrity
SQL DB / SSAS:
+ Row level security
+ High concurrency
+ High T-SQL compatibility
- Limited DB size
Azure SQL Data Warehouse performance advantage
Overview
SQL Data Warehouse’s industry leading price-performance
comes from leveraging the Azure ecosystem and core SQL
Server engine improvements to produce massive gains in
performance.
These benefits require no customer configuration and are
provided out-of-the-box for every data warehouse
• Gen2 adaptive caching – using non-volatile memory
solid-state drives (NVMe) to increase the I/O bandwidth
available to queries.
• Azure FPGA-accelerated networking enhancements –
to move data at rates of up to 1GB/sec per node to
improve queries
• Instant data movement – leverages multi-core
parallelism in underlying SQL Servers to move data
efficiently between compute nodes.
• SQL Query Optimizer – ongoing investments in
distributed query optimization
Logical overview
Mapping Compute in SQLDW – DW200 (2 * 30 = 60)
[Figure: the 60 distributions (01 to 60) mapped onto 2 compute nodes, 30 distributions per node.]
Mapping Compute in SQLDW – DW300 (3 * 20 = 60)
[Figure: the 60 distributions (01 to 60) mapped onto 3 compute nodes, 20 distributions per node.]
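The two mapping slides boil down to simple arithmetic: the number of distributions stays at 60, and scaling only changes how many of them each compute node serves. The following Python sketch illustrates that under stated assumptions. The function name `map_distributions` and the contiguous block layout are hypothetical and chosen for illustration only; the slides do not say which distribution lands on which node, and the sketch only accepts node counts that divide 60 evenly.

```python
# Sketch: spread a fixed pool of 60 distributions over a number of compute nodes.
TOTAL_DISTRIBUTIONS = 60  # fixed count shown on the slides (2 * 30 = 3 * 20 = 60)


def map_distributions(compute_nodes: int) -> dict[int, list[int]]:
    """Assign distributions 1..60 to nodes in contiguous blocks (assumed layout)."""
    if compute_nodes < 1 or TOTAL_DISTRIBUTIONS % compute_nodes != 0:
        raise ValueError("this sketch only supports node counts that divide 60")
    per_node = TOTAL_DISTRIBUTIONS // compute_nodes
    return {
        node: list(range(node * per_node + 1, (node + 1) * per_node + 1))
        for node in range(compute_nodes)
    }


dw200 = map_distributions(2)  # two compute nodes, as on the DW200 slide
dw300 = map_distributions(3)  # three compute nodes, as on the DW300 slide
print(len(dw200[0]), len(dw300[0]))  # 30 20
```

Scaling from DW200 to DW300 therefore does not re-split the data; each node simply takes over fewer of the same 60 distributions.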
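CREATE TABLE [Sales].[Order]      -- ORDER is a reserved word, so the name is bracketed
(
    [OrderId] INT NOT NULL,
    [Date]    DATE NOT NULL,
    [Name]    VARCHAR(2),
    [Country] VARCHAR(2)
)
WITH
(
    CLUSTERED COLUMNSTORE INDEX,
    -- Choose exactly one distribution option:
    --   HASH([OrderId]) | ROUND_ROBIN | REPLICATE
    DISTRIBUTION = HASH([OrderId])
);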
Round-robin distributed
Distributes table rows evenly across all distributions at
random.
Hash distributed
Distributes table rows across the Compute nodes by
using a deterministic hash function to assign each row
to one distribution.
Replicated
Full copy of table accessible on each Compute node.
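To make the three options concrete, here is a small Python sketch of how rows could be assigned under each strategy. It is an illustration under assumptions, not the engine's implementation: `zlib.crc32` stands in for the unspecified deterministic hash function, the round-robin variant deals rows out in turn rather than at random, and all function names are hypothetical.

```python
import zlib
from itertools import cycle

DISTRIBUTIONS = 60  # fixed number of distributions


def round_robin(rows):
    """Deal rows to distributions in turn, ignoring their content."""
    targets = cycle(range(DISTRIBUTIONS))
    return [(next(targets), row) for row in rows]


def hash_distributed(rows, key):
    """Rows with the same key value always land in the same distribution."""
    return [
        (zlib.crc32(str(row[key]).encode()) % DISTRIBUTIONS, row)  # stand-in hash
        for row in rows
    ]


def replicated(rows, compute_nodes):
    """Every compute node holds a full copy of the table."""
    return {node: list(rows) for node in range(compute_nodes)}


orders = [{"OrderId": i % 50, "Country": "CH"} for i in range(120)]
rr = round_robin(orders)
hd = hash_distributed(orders, "OrderId")
rep = replicated(orders, compute_nodes=2)
```

With hash distribution, a join or aggregation on `OrderId` can be answered per distribution because matching rows sit together; with round robin the rows are spread evenly but matching rows may have to be moved first.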
Tables – Distributions
Architecture Overview for Information Management
• Slow load and query performance
• Limited number of concurrent
queries (improved with ADW Gen2)
Design Decisions/Table Distributions
All tables are currently round-robin distributed.
There are two main reasons for this:
• Tables loaded through Theobald use a drop-and-create approach, so they are created
with default settings, which means a round-robin distribution
• The data is loaded via the head node
An optimal data distribution would allow all compute nodes to be used equally during the most frequent
operations.
In our case, these operations are inserting and reading data. At the moment, the most intensive reads are the
ones on the Salesdetails view.
• These reads fetch a month of XXXX data and join it with
the Date, TerritoryHierarchy and Regions tables
to identify the data that needs to be sent to the spoke.
• Distributing by region, or even by company code, would not give enough distinct values to use
all nodes evenly.
• Distributing by month would reduce data movement but would not provide any load balancing
during querying, as all data would come from the same node.
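The trade-off described above can be checked with a small experiment. The Python sketch below counts how many of the 60 distributions actually receive rows for different candidate keys. It is an illustration under assumptions: `zlib.crc32` is a stand-in for the real hash function, and the sample values (four region codes, one month, 10,000 order IDs) are invented for the example.

```python
import zlib
from collections import Counter

DISTRIBUTIONS = 60


def load_per_distribution(values):
    """Count rows per distribution when hashing on the given key values."""
    return Counter(zlib.crc32(str(v).encode()) % DISTRIBUTIONS for v in values)


# Hypothetical sample data: 10,000 rows per candidate key.
regions = [f"E{i % 4 + 1}" for i in range(10_000)]  # only 4 distinct values
one_month = ["2019-05"] * 10_000                    # a query for a single month
order_ids = range(10_000)                           # high-cardinality key

for name, values in [("region", regions), ("month", one_month), ("order id", order_ids)]:
    load = load_per_distribution(values)
    print(f"{name}: {len(load)} of {DISTRIBUTIONS} distributions hold rows, "
          f"largest share {max(load.values())} rows")
```

A key with few distinct values can never occupy more distributions than it has values, and a single month always maps to exactly one distribution, which matches the two concerns raised in the bullets.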
Data lake platform Hub – spoke concept using ADW
[Diagram. Recoverable labels:
 Sources: SAP BW, SAP ERP, SAP CRM, SAP ByD, …
 Load tools: Theobald, ADF
 Hub: Azure SQL DWH
 Read-only spokes per region: E1, E2, E3, E4
 Read-write spokes for local applications: E1W, E2W, E3W, E4W
 Arrow types: Data Reference (Elastic Query), Data Movement]
SQL Server Hyperscale & data virtualization
[Diagram labels: Azure Blob Storage, Hadoop, Azure Data Lake Storage, MySQL, PostgreSQL, MariaDB, SQL Server in Azure, Azure SQL Data Warehouse, Azure Cosmos DB]
SQL Server PaaS offerings
• SQL Database (PaaS): Singleton, Elastic Pool, Managed Instance
• SQL Server in a VM
[Availability labels on the slide: General availability, General availability, Preview]
System | Current | Near Future | + 1 Y Future
SAP BW | 1 TB    | 4 TB        | 8 TB
Data lake platform Hub – spoke concept using ADW
• Sub-optimal availability
• Long load times
• Bad query performance using distributed queries
SLA of the individual services (as labeled in the diagram): 99.9 %, 99.99 %, 99.99 %
Resulting SLA of the combined paths: max 99.89 % and 99.98 %
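The combined figures follow from multiplying the availabilities of services that all have to be up at the same time. The Python sketch below reproduces the two numbers. It assumes a serial dependency with independent failures; which service carries which SLA (99.9 % for the hub, 99.99 % for each spoke database) is a reading of the diagram, not something the slide states in text, and `composite_sla` is a hypothetical helper name.

```python
from math import prod


def composite_sla(*slas_percent: float) -> float:
    """Availability of a chain in which every listed service must be up."""
    return prod(s / 100 for s in slas_percent) * 100


# Assumed pairing: a 99.9 % service combined with a 99.99 % service,
# and two 99.99 % services combined with each other.
hub_plus_spoke = composite_sla(99.9, 99.99)
two_spokes = composite_sla(99.99, 99.99)
print(f"{hub_plus_spoke:.2f} % and {two_spokes:.2f} %")  # 99.89 % and 99.98 %
```

Every additional service in the chain lowers the achievable availability, which is one argument for the consolidation discussed later in the deck.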
• Most metadata is copied to
each of the spokes
• Spoke E1 accounts for 60% of
all transactions
AdventureWorks Sample mapped to referenced project
- Views could replace multiple copy operations
[Diagram. Recoverable labels: Theobald loads SAP CRM, SAP ERP, SAP ByD, … into one SQL Server instance containing a read-only spoke for all regions, region views, and read-write spokes for local applications.]
Azure SQL Database
[Diagram: each database (DB1 … DBn) runs in its own SQL Server instance with its own CPU, memory (DB buffer cache, procedure cache, log cache) and files.]
Spoke Database on Azure SQL Server
Hub and spoke objects are stored in different SQL Server instances
- Access via network (external tables)
-- e.g. in [AdventureWorksDW2017_US]
CREATE EXTERNAL TABLE [dbo].[DimScenario]
(
[ScenarioKey] [int] NOT NULL,
[ScenarioName] [nvarchar](50) NULL
)
WITH
(
DATA_SOURCE = [AdventureWorksDW2017]
,SCHEMA_NAME = N'Mart_US'
,OBJECT_NAME = N'DimScenario'
)
Execution Plans
(Test 9: Spoke query accessing hub and spoke objects)
[Execution plan screenshot: Azure SQL Database]
Azure SQL Server Managed Instance
[Diagram: one SQL Server instance (CPU, memory with DB buffer cache, procedure cache and log cache, files) hosts all databases DB1, DB2, … DBn.]
Data from all databases shares the same memory space
Access via simple three-part naming is possible: db.schema.object
Spoke Database on SQL Server Managed Instance
All objects are stored in the same SQL Server instance
- Three-part naming can be used to access objects in another database
-- e.g. in [AdventureWorksDW2017_US]
CREATE VIEW [dbo].[DimScenario]
AS
SELECT *
FROM [AdventureWorksDW2017].[Mart_US].[DimScenario];
-- or, with an explicit column list, e.g. in [AdventureWorksDW2017_US]
CREATE VIEW [dbo].[DimScenario]
AS
SELECT [ScenarioKey]
      ,[ScenarioName]
FROM [AdventureWorksDW2017].[Mart_US].[DimScenario];
Execution Plans
(Test 9: Spoke query accessing hub and spoke objects)
[Execution plan screenshot: Azure SQL Database Managed Instance]
Layers and objects (via Sample AdventureWorksDW)
• Schema: dbo
• Schemas: Mart_DE, Mart_US
• Most views will be 1:1 mappings
• Some views will filter data
Test Queries (1)
(Title / SQL / Result Set/Test)
• Select some attributes with filter
  SELECT LastName, FirstName
  FROM dbo.dimCustomer
  WHERE LastName = 'Adams'
    AND NumberChildrenAtHome = 3;
  Result: 2 Columns/6 Rows
• Aggregate single table
  select sum([SalesAmount])
  from [dbo].[FactInternetSales]
  Result: 1 Column/1 Row
• Aggregate single table with simple filter
  select sum([SalesAmount])
  from [dbo].[FactInternetSales]
  where [SalesOrderLineNumber] > 1
  Result: 1 Column/1 Row
• Aggregate single table with filter
  with FilterKriteria
  as
  ( select 1 as [MinSalesOrderLineNumber])
  select sum([SalesAmount])
  from [dbo].[FactInternetSales]
  cross join FilterKriteria
  where [SalesOrderLineNumber] > [MinSalesOrderLineNumber]
  Result: 1 Column/1 Row
  Test: Filter push down
• Inner database join and aggregate
  select [EnglishProductName], sum(SalesAmount)
  from [dbo].[FactInternetSales] [S]
  inner join [dbo].[DimProduct] [P]
  on [S].[ProductKey] = [P].[ProductKey]
  group by [P].[EnglishProductName]
  order by 2 desc
  Result: 2 Columns/130 Rows
  Test: Join push down
Test Queries (2)
(Title / SQL / Result Set/Test)
• Transfer data to spoke
  select *
  into [dbo].[LocalFactInternetSales]
  from [dbo].[FactInternetSales]
  Result: 27 Columns/21’344 Rows
  Test: Transfer speed of data between databases
• Cross database join with aggregate
  select [OrderDateKey], [Gender], sum([SalesAmount])
  from [dbo].[LocalFactInternetSales] as [S]
  inner join [dbo].[DimCustomer] as [C]
  on [S].[CustomerKey] = [C].[CustomerKey]
  group by [OrderDateKey], [Gender]
  Result: 3 Columns/1761 Rows
  Test: Handling of cross database joins
[Diagram: the SQL Server instance with read-only spoke, read-write spokes and region views, marking where the hub query and the spoke query run.]
Performance Tests (1)
[Bar chart "Test package one", value axis 0 to 60. Tests: Select some attributes with filter; Aggregate single table; Aggregate single table with simple filter; Aggregate single table with filter; Inner database join and aggregate. Series: Azure SQL Server Hub, Azure SQL Server Spoke, Managed Instance Hub, Managed Instance Spoke.]
[Bar chart "Test package two", value axis 0 to 400. Tests: Transfer data to spoke; Cross database join with aggregate. Series: Azure SQL Server Hub, Azure SQL Server Spoke, Managed Instance Hub, Managed Instance Spoke.]
Performance Tests (2)
• Part of today's ETL performance problem
• Issue if a spoke uses local spoke data and remote hub data in one query
New architecture
From a system with:
- 1 Azure SQL DWH
- 9 Azure SQL DBs
to
- 1 Azure SQL MI (SLA 99.99%)
Future Option
SQL Server Hyperscale & data virtualization
Hyperscale service tier for up to 100 TB
• Support for up to 100 TB of database size
• Higher overall performance due to higher log throughput and faster
transaction commit times regardless
of data volumes
• Nearly instantaneous database backups
(snapshots of Azure Blob storage)
• Fast database restores
(based on file snapshots)
• Rapid read scale out
• Rapid Scale up
SQL Server Data Virtualization
• Allows the data to stay in its original location while being
virtualized in a SQL Server instance
• There it can be queried like any other table in SQL Server.
Conclusion KISS (Keep It Simple [not Stupid])
• All used services are excellent services
• Azure SQL Data Warehouse
• Azure SQL Database
• Azure SQL MI
• (SQL in a VM) -> Don’t take PaaS as a religion
• Theobald
• Technical implementation details can make the difference
• Transparent does not necessarily mean fast!
• My personal advice
• Try to use as few and as “simple” services as possible (but not fewer)
• For each service you use, you should have a sound line of argument for why you use it
• POCs help you to understand the different technologies
• There is no free lunch
• E.g. with databases like Azure SQL Data Warehouse or Cosmos DB you get “endless scale” but
you must deal with data distributions/partitions
© Copyright Microsoft Corporation. All rights reserved.

Source: Azure Days 2019, "Grösser und Komplexer ist nicht immer besser" (Meinrad Weiss)

 
Continuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
Continuing Bonds Through AI: A Hermeneutic Reflection on ThanabotsContinuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
Continuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
 
Microsoft BitLocker Bypass Attack Method.pdf
Microsoft BitLocker Bypass Attack Method.pdfMicrosoft BitLocker Bypass Attack Method.pdf
Microsoft BitLocker Bypass Attack Method.pdf
 
Stronger Together: Developing an Organizational Strategy for Accessible Desig...
Stronger Together: Developing an Organizational Strategy for Accessible Desig...Stronger Together: Developing an Organizational Strategy for Accessible Desig...
Stronger Together: Developing an Organizational Strategy for Accessible Desig...
 
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptxHarnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
 
CORS (Kitworks Team Study 양다윗 발표자료 240510)
CORS (Kitworks Team Study 양다윗 발표자료 240510)CORS (Kitworks Team Study 양다윗 발표자료 240510)
CORS (Kitworks Team Study 양다윗 발표자료 240510)
 
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
 
Event-Driven Architecture Masterclass: Challenges in Stream Processing
Event-Driven Architecture Masterclass: Challenges in Stream ProcessingEvent-Driven Architecture Masterclass: Challenges in Stream Processing
Event-Driven Architecture Masterclass: Challenges in Stream Processing
 
ADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptxADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptx
 
Introduction to FIDO Authentication and Passkeys.pptx
Introduction to FIDO Authentication and Passkeys.pptxIntroduction to FIDO Authentication and Passkeys.pptx
Introduction to FIDO Authentication and Passkeys.pptx
 
Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...
Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...
Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...
 
Tales from a Passkey Provider Progress from Awareness to Implementation.pptx
Tales from a Passkey Provider  Progress from Awareness to Implementation.pptxTales from a Passkey Provider  Progress from Awareness to Implementation.pptx
Tales from a Passkey Provider Progress from Awareness to Implementation.pptx
 
Simplifying Mobile A11y Presentation.pptx
Simplifying Mobile A11y Presentation.pptxSimplifying Mobile A11y Presentation.pptx
Simplifying Mobile A11y Presentation.pptx
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
How to Check CNIC Information Online with Pakdata cf
How to Check CNIC Information Online with Pakdata cfHow to Check CNIC Information Online with Pakdata cf
How to Check CNIC Information Online with Pakdata cf
 
الأمن السيبراني - ما لا يسع للمستخدم جهله
الأمن السيبراني - ما لا يسع للمستخدم جهلهالأمن السيبراني - ما لا يسع للمستخدم جهله
الأمن السيبراني - ما لا يسع للمستخدم جهله
 
JavaScript Usage Statistics 2024 - The Ultimate Guide
JavaScript Usage Statistics 2024 - The Ultimate GuideJavaScript Usage Statistics 2024 - The Ultimate Guide
JavaScript Usage Statistics 2024 - The Ultimate Guide
 
The Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightThe Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and Insight
 
Working together SRE & Platform Engineering
Working together SRE & Platform EngineeringWorking together SRE & Platform Engineering
Working together SRE & Platform Engineering
 

Azure Days 2019: Grösser und Komplexer ist nicht immer besser (Meinrad Weiss)

  • 1. Bigger and more complex is not always better. Meinrad Weiss, Senior Cloud Solution Architect
  • 2. Agenda/Goal: "Modern" data warehouse and data lake architectures often bristle with layers and services. • Such systems can manage and analyze petabytes of data. • All of this comes at a price (complexity, latency, stability), and not every project is happy with this approach. The talk traces the journey from a technology-infatuated solution to an environment tailored to user needs. It shows the bright and dark sides of massively parallel systems and aims to sharpen awareness of the customer's real requirements.
  • 3. Manufacturing companies are becoming data companies: • 10 years ago: selling hardware • 5 years ago: transition to the leasing business - Challenge: how to maintain the leased hardware? - Solution: put sensors on everything and run predictive maintenance on IoT • Trend now: manufacturing companies are becoming data companies
  • 4. Architecture Overview for Information Management [architecture diagram]. Components shown: IoT, non-structured data, CRM, BW, ERP, …; IoT Hub, Stream Analytics, Blob Storage, Polybase, Azure DW; ADF (Azure Data Factory), Logic/API/App, Theobald, Theobald IS & SSIS; Data Lake Store, Data Lake Analysis, HD Insights (Spark), Azure Machine Learning, SSAS, Azure SQL DB, local SQL databases; Visual Studio, R-Studio, MSR, Excel, MS Access, xxx Apps, browser, API. Roles: data scientist, local IT, Market Reach, product manager, reporting & analysis. Paths marked 1 and 2 on the diagram: ADF (1) / SSIS (2), Direct Query (1) / Process (2), SQL (1) / API (2). Legend: push, pull, SQL connect.
      ADW: + No size limit + Scalability + Polybase - Less compatible - Concurrency limit - No row-level security - No referential integrity
      SQL DB / SSAS: + Row-level security + High concurrency + High T-SQL compatibility - Limited DB size
  • 5. Architecture Overview for Information Management (diagram and ADW vs. SQL DB / SSAS comparison repeated from slide 4)
  • 6. Azure SQL Data Warehouse performance advantage. Overview: SQL Data Warehouse's industry-leading price-performance comes from leveraging the Azure ecosystem and core SQL Server engine improvements to produce massive gains in performance. These benefits require no customer configuration and are provided out of the box for every data warehouse. • Gen2 adaptive caching: uses non-volatile memory solid-state drives (NVMe) to increase the I/O bandwidth available to queries. • Azure FPGA-accelerated networking enhancements: move data at rates of up to 1 GB/sec per node to improve queries. • Instant data movement: leverages multi-core parallelism in the underlying SQL Servers to move data efficiently between compute nodes. • SQL Query Optimizer: ongoing investments in distributed query optimization.
  • 8. Mapping Compute in SQLDW (2 * 30 = 60), DW200 [diagram: the 60 distributions, numbered 01 to 60, mapped onto 2 compute nodes with 30 distributions each]
  • 9. Mapping Compute in SQLDW (3 * 20 = 60), DW300 [diagram: the 60 distributions, numbered 01 to 60, mapped onto 3 compute nodes with 20 distributions each]
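The arithmetic on slides 8 and 9 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the contiguous block mapping of distributions to nodes is an assumption, since the slides show only the totals (2 * 30 and 3 * 20).

```python
TOTAL_DISTRIBUTIONS = 60  # fixed number of distributions shown on the slides


def distributions_per_node(compute_nodes: int) -> int:
    """Return how many of the 60 distributions each compute node serves."""
    if compute_nodes <= 0 or TOTAL_DISTRIBUTIONS % compute_nodes != 0:
        raise ValueError("compute_nodes must be a positive divisor of 60")
    return TOTAL_DISTRIBUTIONS // compute_nodes


def node_for_distribution(distribution: int, compute_nodes: int) -> int:
    """Map distribution 1..60 to a 0-based node index.

    Assumption: distributions are assigned to nodes in contiguous blocks.
    """
    if not 1 <= distribution <= TOTAL_DISTRIBUTIONS:
        raise ValueError("distribution must be between 1 and 60")
    return (distribution - 1) // distributions_per_node(compute_nodes)


if __name__ == "__main__":
    for label, nodes in (("DW200", 2), ("DW300", 3)):
        print(label, nodes, "nodes x", distributions_per_node(nodes), "distributions")
```

Scaling up therefore does not create more distributions; it only spreads the same 60 over more nodes, which is why the table distribution key matters at every service level.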
  • 10. Tables: Distributions
      CREATE TABLE [Sales].[Order] ( [OrderId] INT NOT NULL, [Date] DATE NOT NULL, [Name] VARCHAR(2), [Country] VARCHAR(2) ) WITH ( CLUSTERED COLUMNSTORE INDEX, DISTRIBUTION = HASH([OrderId]) | ROUND_ROBIN | REPLICATE );  -- choose exactly one DISTRIBUTION option
      • Round-robin distributed: distributes table rows evenly across all distributions at random.
      • Hash distributed: distributes table rows across the compute nodes by using a deterministic hash function to assign each row to one distribution.
      • Replicated: a full copy of the table is accessible on each compute node.
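To make the difference between the distribution options concrete, here is a minimal Python simulation. It is a sketch, not the engine's behavior: `zlib.crc32` is a stand-in for the undisclosed hash function, the round-robin assignment is sequential rather than random, and all names are hypothetical.

```python
import zlib
from collections import Counter
from itertools import cycle

DISTRIBUTIONS = 60


def hash_distribution(key) -> int:
    """Deterministically map a key to one of the 60 distributions.

    crc32 is only a stand-in for the real (undocumented here) hash function.
    """
    return zlib.crc32(str(key).encode("utf-8")) % DISTRIBUTIONS


def round_robin_distribution(row_count: int) -> list:
    """Assign rows to distributions in turn, ignoring their content."""
    slots = cycle(range(DISTRIBUTIONS))
    return [next(slots) for _ in range(row_count)]


order_ids = range(1, 601)
hash_counts = Counter(hash_distribution(o) for o in order_ids)
rr_counts = Counter(round_robin_distribution(600))

# The same OrderId always lands in the same distribution under HASH,
# so joins on OrderId can stay local; round robin spreads rows evenly
# but gives no such guarantee.
```

A replicated table needs no assignment logic at all, since every compute node holds a full copy.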
  • 13. Architecture Overview for Information Management (diagram and ADW vs. SQL DB / SSAS comparison repeated from slide 4), with the observed problems: • Slow load and query performance • Limited number of concurrent queries (improved with ADW Gen2)
  • 14. Design Decisions/Table Distributions. All tables are currently round-robin distributed. There are two main reasons for this:
      • Tables loaded through Theobald use a drop-and-create approach, so they are created with default settings, which means a round-robin distribution
      • They are loaded via the head node
      • An optimal data distribution would use all compute nodes equally during the most frequent operations. In our case, these are inserting and reading data. At the moment, the most intensive reads are those on the Salesdetails view.
      • These reads fetch a month of XXXX data and join it with the Date, TerritoryHierarchy and Regions tables to identify the data that needs to be sent to the spoke.
      • Distributing by region, or even by company code, would not give enough distinct values to use all nodes evenly.
      • Distributing by month would reduce data movement but would not provide any load balancing during querying, as all data would come from the same node.
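The cardinality argument on slide 14 can be illustrated with a small simulation. This is a hedged sketch: `zlib.crc32` stands in for the real hash function, the region codes E1 to E4 are borrowed from the spoke names later in the deck, and the row counts are made up for illustration.

```python
import zlib
from collections import Counter

DISTRIBUTIONS = 60


def rows_per_distribution(keys) -> Counter:
    """Count how many rows each distribution receives for a hash key column."""
    return Counter(zlib.crc32(str(k).encode("utf-8")) % DISTRIBUTIONS for k in keys)


# Hypothetical data: 4,000 rows keyed by region vs. by a unique order id.
regions = ["E1", "E2", "E3", "E4"] * 1000
order_ids = range(4000)

by_region = rows_per_distribution(regions)
by_order = rows_per_distribution(order_ids)

print("distributions used when hashing by region:  ", len(by_region))
print("distributions used when hashing by order id:", len(by_order))
```

With only four distinct key values, at most four of the 60 distributions ever receive data, so most compute capacity sits idle regardless of how many nodes are provisioned.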
  • 15. Architecture Overview for Information Management (diagram and ADW vs. SQL DB / SSAS comparison repeated from slide 4), with the observed problems: • Slow load and query performance • Limited number of concurrent queries (improved with ADW Gen2)
  • 16. Architecture Overview for Information Management (diagram and ADW vs. SQL DB / SSAS comparison repeated from slide 4)
  • 17. Data lake platform: hub-and-spoke concept using ADW [diagram]. Sources: SAP BW, SAP ERP, SAP CRM, SAP ByD, …; loading tools: Theobald, ADF; hub: Azure SQL DWH; read-only spokes per region: E1, E2, E3, E4; read-write spokes for local applications: E1W, E2W, E3W, E4W. Legend: data reference (Elastic Query), data movement.
  • 18. [Diagram of data platform options] Azure Blob Storage, Hadoop, Azure Data Lake Storage, MySQL, PostgreSQL, MariaDB, SQL Server in Azure, Azure SQL Data Warehouse, Azure Cosmos DB, SQL Server Hyperscale & Data virtualization
  • 19. SQL Server PaaS offerings: SQL Database (PaaS) as Singleton, Elastic Pool or Managed Instance; SQL Server in a VM. Status labels on the slide: general availability (twice), preview. Sizing table (column headers as extracted: System, Current, Near Future + 1 Y, Future): SAP BW, 1 TB, 4 TB, 8 TB
  • 20. Data lake platform: hub-and-spoke concept using ADW (diagram repeated from slide 17), with the observed problems: • Sub-optimal availability • Long load times • Bad query performance using distributed queries
  • 21. Data lake platform: hub-and-spoke concept using ADW (diagram repeated from slide 17) • Sub-optimal availability • Long load times • Bad query performance using distributed queries. SLA figures annotated on the diagram: 99.9 %, 99.99 %, 99.99 %, max 99.89 %, 99.98 %
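The "max 99.89 %" and "99.98 %" figures on slide 21 are consistent with multiplying the SLAs of services that must all be available at once. The slide does not state the formula, so the following Python sketch rests on that assumption; the function name is hypothetical and the pairing of figures to services is one possible reading.

```python
from math import prod


def composite_sla(*slas_percent: float) -> float:
    """Availability (in percent) of a chain of services that must all be up.

    Assumption: failures are independent, so availabilities multiply.
    """
    return prod(s / 100 for s in slas_percent) * 100


# A 99.9 % service in series with a 99.99 % service:
hub_plus_spoke = composite_sla(99.9, 99.99)

# Two 99.99 % services in series:
two_services = composite_sla(99.99, 99.99)

print(f"{hub_plus_spoke:.2f} %")
print(f"{two_services:.2f} %")
```

Every additional service in the chain can only lower the end-to-end figure, which is the availability argument for consolidating onto fewer services.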
  • 22. Data lake platform: hub-and-spoke concept using ADW (diagram repeated from slide 17) • Sub-optimal availability • Long load times • Bad query performance using distributed queries • Most metadata is copied to each of the spokes • Spoke E1 represents 60% of all transactions
  • 23. AdventureWorks sample mapped to the referenced project: views could replace multiple copy operations [diagram: Theobald loads SAP CRM, SAP ERP, SAP ByD, … into one SQL Server instance containing a read-only spoke for all regions (region views) and a read-write spoke for local applications]
  • 24. Data lake platform: hub-and-spoke concept using ADW (diagram repeated from slide 17), with the observed problems: • Sub-optimal availability • Long load times • Bad query performance using distributed queries
  • 26. Spoke Database on Azure SQL Server [diagram: two SQL Server instances, each with its own CPU, memory (DB buffer cache, procedure cache, log cache) and files, hosting DB1 and DBn; read-only spoke for all regions with region views; read-write spoke for local applications]. Hub and spoke objects are stored in different SQL Server instances, so access goes over the network (external tables):
      -- e.g. in [AdventureWorksDW2017_US]
      CREATE EXTERNAL TABLE [dbo].[DimScenario] (
          [ScenarioKey] [int] NOT NULL,
          [ScenarioName] [nvarchar](50) NULL
      ) WITH (
          DATA_SOURCE = [AdventureWorksDW2017],
          SCHEMA_NAME = N'Mart_US',
          OBJECT_NAME = N'DimScenario'
      );
  • 27. Execution Plans (Test 9: spoke query accessing hub and spoke objects): Azure SQL Database
  • 28. Azure SQL Server Managed Instance [diagram: one SQL Server instance with CPU, memory (DB buffer cache, procedure cache, log cache) and files, hosting DB1, DB2, … DBn]. Data from all databases shares the same memory space. Access via simple three-part naming is possible: db.schema.object
  • 29. Spoke Database on SQL Server Managed Instance [diagram: one SQL Server instance with CPU, memory (DB buffer cache, procedure cache, log cache) and files, hosting DB1, DB2, … DBn; read-only spoke for all regions with region views; read-write spoke for local applications]. All objects are stored in the same SQL Server instance, so three-part naming can be used to access objects in another database:
      -- e.g. in [AdventureWorksDW2017_US]
      CREATE VIEW [dbo].[DimScenario] AS
      SELECT * FROM [AdventureWorksDW2017].[Mart_US].[DimScenario];
      or
      -- e.g. in [AdventureWorksDW2017_US]
      CREATE VIEW [dbo].[DimScenario] AS
      SELECT [ScenarioKey], [ScenarioName]
      FROM [AdventureWorksDW2017].[Mart_US].[DimScenario];
  • 30. Execution Plans (Test 9: spoke query accessing hub and spoke objects): Azure SQL Database Managed Instance
  • 31. Layers and objects (via sample AdventureWorksDW): schema .dbo; schemas .Mart_DE and .Mart_US. Most views will be 1:1 mappings; some views will filter data.
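Since most spoke views are 1:1 mappings onto hub objects, their DDL could be generated rather than written by hand. The following Python sketch shows one way to do that; the helper name, its parameters and the optional filter are hypothetical, and only the database, schema and table names come from the slides.

```python
def spoke_view_ddl(hub_db, hub_schema, table, columns, spoke_schema="dbo", where=None):
    """Build a CREATE VIEW statement that maps a spoke view onto a hub table.

    Uses three-part naming (db.schema.object), which assumes hub and spoke
    databases live in the same SQL Server instance.
    """
    column_list = ", ".join(f"[{c}]" for c in columns)
    ddl = (
        f"CREATE VIEW [{spoke_schema}].[{table}] AS\n"
        f"SELECT {column_list}\n"
        f"FROM [{hub_db}].[{hub_schema}].[{table}]"
    )
    if where:
        # Optional predicate for the views that filter data.
        ddl += f"\nWHERE {where}"
    return ddl + ";"


ddl = spoke_view_ddl(
    "AdventureWorksDW2017", "Mart_US", "DimScenario",
    ["ScenarioKey", "ScenarioName"],
)
print(ddl)
```

Listing the columns explicitly, instead of using `SELECT *`, keeps the view definition stable when columns are added to the hub table.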
  • 32. Test Queries (1), listed as title, SQL, result set/test:
      • Select some attributes with filter: SELECT LastName, FirstName FROM dbo.dimCustomer WHERE LastName = 'Adams' AND NumberChildrenAtHome = 3; Result: 2 columns/6 rows
      • Aggregate single table: select sum([SalesAmount]) from [dbo].[FactInternetSales] Result: 1 column/1 row
      • Aggregate single table with simple filter: select sum([SalesAmount]) from [dbo].[FactInternetSales] where [SalesOrderLineNumber] > 1 Result: 1 column/1 row
      • Aggregate single table with filter: with FilterKriteria as ( select 1 as [MinSalesOrderLineNumber]) select sum([SalesAmount]) from [dbo].[FactInternetSales] cross join FilterKriteria where [SalesOrderLineNumber] > [MinSalesOrderLineNumber] Result: 1 column/1 row. Tests filter push-down
      • Inner database join and aggregate: select [EnglishProductName], sum(SalesAmount) from [dbo].[FactInternetSales] [S] inner join [dbo].[DimProduct] [P] on [S].[ProductKey] = [P].[ProductKey] group by [P].[EnglishProductName] order by 2 desc Result: 2 columns/130 rows. Tests join push-down
  • 33. Test Queries (2), listed as title, SQL, result set/test [diagram: SQL Server instance with read-only spoke for all regions (region views) and read-write spoke for local applications; hub query and spoke query marked]:
      • Transfer data to spoke: select * into [dbo].[LocalFactInternetSales] from [dbo].[FactInternetSales] Result: 27 columns/21,344 rows. Tests the transfer speed of data between databases
      • Cross database join with aggregate: select [OrderDateKey], [Gender], sum([SalesAmount]) from [dbo].[LocalFactInternetSales] as [S] inner join [dbo].[DimCustomer] as [C] on [S].[CustomerKey] = [C].[CustomerKey] group by [OrderDateKey], [Gender] Result: 3 columns/1,761 rows. Tests the handling of cross-database joins
  • 34. Performance Tests (1), test package one [bar chart, axis 0 to 60, unit not stated]. Tests: Select some attributes with filter; Aggregate single table; Aggregate single table with simple filter; Aggregate single table with filter; Inner database join and aggregate. Series: Azure SQL Server Hub, Azure SQL Server Spoke, Managed Instance Hub, Managed Instance Spoke
  • 35. Performance Tests (2), test package two [bar chart, axis 0 to 400, unit not stated]. Tests: Transfer data to spoke; Cross database join with aggregate. Series: Azure SQL Server Hub, Azure SQL Server Spoke, Managed Instance Hub, Managed Instance Spoke. Annotations: part of today's ETL performance problem; an issue arises if a spoke uses local spoke data and remote hub data in one query
  • 36. New architecture [diagram: one SQL Server instance with a read-only spoke for all regions (region views) and a read-write spoke for local applications]. From a system with 1 Azure SQL DWH and 9 Azure SQL DBs to 1 Azure SQL MI (SLA 99.99%)
  • 37. [Diagram of data platform options, repeated from slide 18] Azure Blob Storage, Hadoop, Azure Data Lake Storage, MySQL, PostgreSQL, MariaDB, SQL Server in Azure, Azure SQL Data Warehouse, Azure Cosmos DB, SQL Server Hyperscale & Data virtualization. Marked as: future option
  • 38. Hyperscale service tier for up to 100 TB • Support for up to 100 TB of database size • Higher overall performance due to higher log throughput and faster transaction commit times regardless of data volumes • Nearly instantaneous database backups (snapshots of Azure Blob storage) • Fast database restores (based on file snapshots) • Rapid read scale out • Rapid Scale up
  • 39. SQL Server Data Virtualization • Allows the data to stay in its original location while it is virtualized in a SQL Server instance • There it can be queried like any other table in SQL Server.
  • 40. Conclusion: KISS (Keep It Simple [not Stupid])
      • All services used are excellent services: Azure SQL Data Warehouse, Azure SQL Database, Azure SQL MI, (SQL in a VM) -> don't treat PaaS as a religion, Theobald
      • Technical implementation details can make the difference
          - Transparent does not necessarily mean fast!
      • My personal advice
          - Try to use as few and as "simple" services as possible (but not fewer)
          - For each service you use, you should have a good chain of arguments for why you use it
          - PoCs help you understand the different technologies
      • There is no free lunch
          - E.g. with databases like Azure SQL Data Warehouse or Cosmos DB you get "endless scale", but you must deal with data distributions/partitions
  • 41. © Copyright Microsoft Corporation. All rights reserved.