Best Practices for Implementing
Enterprise BI Solution
Teo Lachev, Prologika
teo.lachev@prologika.com
Why BI projects fail
• 70-80% of corporate BI projects fail (Gartner http://bit.ly/YRi028)
• Top reasons
 Poor communication between IT and Business
 Failure to ask the right questions
 Other reasons from my experience
 Business doesn’t know about BI
 Inexperience and lack of technical knowledge
 “When all you have is a hammer…”
 Data inaccuracy
 Performance degradation with large datasets
Agenda
• Share best practices and lessons learned
 BI architecture
 Data warehouse design
 ETL
 Semantic layer
 Presentation layer

• Assumptions

 Experience with Microsoft BI and database design

• Microsoft case study

 Records Management Firm Saves $1 Million
http://bit.ly/15exUpM
 Most performance practices around biggish data
Ground rules
• Ask questions
• Turn cellphones off
• Tweet away (@tlachev #BestBI)
About me
• Consultant, author, and mentor with focus on Microsoft BI
• Owner of Prologika – BI consulting and training
company based in Atlanta (www.prologika.com)
• Microsoft SQL Server MVP for 10 years
• Leader of Atlanta BI group (atlantabi.sqlpass.org)
Use a phased approach
• Identify critical success factors
• Break project into phases
• Phase 1
• Most important
• Scope it relatively small
• Sets foundation
• Business process to model
• First conformed dimensions
• A few fact tables
Use classic BI solution architecture
• Data Sources
 Data is extracted from data sources, transformed, and loaded into DW (ETL with Integration Services)
• Data Warehouse
 Data is stored in dimensional schema consisting of dimension and fact tables
 Diagram label: transactional reporting
• Semantic Layer (Multidimensional OR Tabular)
 Great performance
 Business calculations
 Single version of truth
 Client support
 Security
 Isolation
 Diagram label: historical & trend reporting
• Presentation Layer
 Standard reporting, ad-hoc reporting, dashboards
 Clients: ad-hoc reports, operational reports, dashboards, third-party tools
Keep it simple!
Teo’s insight: Remove complexity until it cannot be simplified anymore
[Diagram labels: Europe, NA, Asia]
Consider active-active clustering
[Diagram: cluster with a database server and an SSAS server]
Check your environment
• I/O
 BACKUP DATABASE [ContosoRetailDW] TO DISK='NUL';

 Or use tools such as IOMeter or CrystalMark
 I/O should be above 500 MB/sec

• Network speed

 select * from <some fact table>
(consider discarding query results)
 Num rows/sec = row count/execution time (sec)
 Aim for > 100K rows/sec

• Virtualization

 Disk pass-through enabled
 Dedicated resources
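A hedged sketch of the two checks above. Backing up to the NUL device reads the database and discards the output, so the throughput SQL Server reports at the end approximates read speed. COPY_ONLY and STATS are additions that are not on the slide, and the row count and timing in the second statement are made-up numbers that only illustrate the arithmetic.

```sql
-- I/O check: the backup is written to the NUL device and discarded.
-- COPY_ONLY (an addition to the slide's command) keeps this test backup
-- from resetting the differential base of the regular backup schedule.
BACKUP DATABASE [ContosoRetailDW]
TO DISK = 'NUL'
WITH COPY_ONLY, STATS = 10;
-- The completion message reports pages processed, elapsed seconds, and MB/sec.
-- Compare the MB/sec figure with the 500 MB/sec guideline.

-- Network check arithmetic with hypothetical numbers:
-- 20,000,000 rows returned to the client in 160 seconds.
SELECT 20000000 / 160.0 AS RowsPerSec;   -- 125,000 rows/sec, above the 100K target
```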
Agenda
BI architecture
Data warehouse design
ETL
Semantic Layer
Presentation layer
Star schema is your best friend
• Your dimensional model is the foundation
• Design it with the end user in mind
• Avoid normalization
• Avoid summarized tables
• Use smart keys (YYYYMMDD) or [date] keys for Date tables
• Use referential integrity

Teo’s insight: The fact that Tabular supports more
flexible relationships doesn’t mean that star
schema is obsolete - just the opposite.
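A minimal sketch of the smart-key and referential-integrity advice above. The table and column names (dbo.DimDate, dbo.FactSales and their columns) are hypothetical and chosen only for illustration.

```sql
-- Hypothetical star-schema fragment: a Date dimension keyed by a YYYYMMDD
-- integer, and a fact table that references it through a foreign key.
CREATE TABLE dbo.DimDate (
    DateKey      int      NOT NULL PRIMARY KEY,   -- e.g. 20130315
    FullDate     date     NOT NULL,
    CalendarYear smallint NOT NULL,
    MonthNumber  tinyint  NOT NULL
);

CREATE TABLE dbo.FactSales (
    DateKey     int   NOT NULL
        CONSTRAINT FK_FactSales_DimDate REFERENCES dbo.DimDate (DateKey),
    ProductKey  int   NOT NULL,
    SalesAmount money NOT NULL
);

-- Derive the smart key from a date: style 112 formats a date as yyyymmdd.
DECLARE @d date = '2013-03-15';
SELECT CONVERT(int, CONVERT(char(8), @d, 112)) AS DateKey;   -- 20130315
```

An integer smart key stays readable in queries and partition boundaries, which is one reason it is often preferred over an opaque surrogate key for the Date table.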
Optimize physical storage
• Set database recovery to Simple
• Index considerations
 Clustered index key on DateKey column in fact tables
 Other indexes as needed
• File groups
 File group per each large table
 Files on different drives
 Avoid using Primary file group
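The storage settings above might look as follows. This is a sketch under assumptions: the filegroup name, file path, file sizes, and the dbo.FactSales table are hypothetical and would need to match the actual environment.

```sql
-- Assumed names: filegroup FG_FactSales, file E:\Data\FactSales_01.ndf,
-- and a fact table dbo.FactSales with a DateKey column.
ALTER DATABASE [ContosoRetailDW] SET RECOVERY SIMPLE;

-- Dedicated filegroup for a large fact table, with its file on a separate drive.
ALTER DATABASE [ContosoRetailDW] ADD FILEGROUP FG_FactSales;

ALTER DATABASE [ContosoRetailDW]
ADD FILE (
    NAME = N'FactSales_01',
    FILENAME = N'E:\Data\FactSales_01.ndf',
    SIZE = 10GB,
    FILEGROWTH = 1GB
) TO FILEGROUP FG_FactSales;

-- Clustered index on DateKey, created on the dedicated filegroup.
-- Building the clustered index there places the table's data on that filegroup.
CREATE CLUSTERED INDEX CIX_FactSales_DateKey
ON dbo.FactSales (DateKey)
ON FG_FactSales;
```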
Use partitioning
• Partition large tables (above 50 GB)
 Partition switching
 Better manageability
 Partition elimination when querying data

Good read: “Partitioned Table and Index Strategies Using SQL
Server 2008” whitepaper by Ron Talmage
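A hedged sketch of table partitioning on a YYYYMMDD date key. The function, scheme, and table names are hypothetical, and all partitions are mapped to one filegroup only to keep the example short; a real design would likely spread them across dedicated filegroups.

```sql
-- Hypothetical yearly partitioning on an integer YYYYMMDD DateKey.
-- RANGE RIGHT: each boundary value is the first key of its partition.
CREATE PARTITION FUNCTION pfDateKey (int)
AS RANGE RIGHT FOR VALUES (20110101, 20120101, 20130101);

-- All partitions on one filegroup for brevity (an assumption of this sketch).
CREATE PARTITION SCHEME psDateKey
AS PARTITION pfDateKey ALL TO ([PRIMARY]);

CREATE TABLE dbo.FactSalesPartitioned (
    DateKey     int   NOT NULL,
    ProductKey  int   NOT NULL,
    SalesAmount money NOT NULL
) ON psDateKey (DateKey);

-- A filter on the partitioning column lets the optimizer skip other partitions.
SELECT SUM(SalesAmount) AS Sales2012
FROM dbo.FactSalesPartitioned
WHERE DateKey >= 20120101 AND DateKey < 20130101;

-- Row counts per partition.
SELECT $PARTITION.pfDateKey(DateKey) AS PartitionNumber, COUNT(*) AS RowCnt
FROM dbo.FactSalesPartitioned
GROUP BY $PARTITION.pfDateKey(DateKey);
```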
Use compression
• Consider page compression above 1 TB
• 50-80% saving in disk space
• To estimate storage savings:

 Use SSMS Data Compression Wizard
 sp_estimate_data_compression_savings stored procedure

EXEC sp_estimate_data_compression_savings 'dbo', 'FactResellerSales', 1, NULL, 'PAGE'

Good read: “Data Compression: Strategy, Capacity Planning and
Best Practices” whitepaper by Sanjay Mishra
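For readability, the same estimate can be written with named parameters, followed by the rebuild that would apply page compression if the estimate looks worthwhile. This is a sketch; the table name is taken from the example above, and whether compression pays off depends on the workload.

```sql
-- Same estimate as above, with named parameters.
EXEC sp_estimate_data_compression_savings
    @schema_name      = 'dbo',
    @object_name      = 'FactResellerSales',
    @index_id         = 1,
    @partition_number = NULL,
    @data_compression = 'PAGE';

-- Apply page compression to the table (heap or clustered index).
-- Non-clustered indexes are compressed separately with ALTER INDEX ... REBUILD.
ALTER TABLE dbo.FactResellerSales
REBUILD WITH (DATA_COMPRESSION = PAGE);
```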
Agenda
BI architecture
Data warehouse design
ETL
Semantic Layer
Presentation layer
Consider merge design pattern
• More efficient than SSIS transforms
• More flexible than SSIS lookups
• Easier to maintain
[Diagram flow]
1. Data Sources (LOB systems, files): incremental extraction into the Staging Database
2. Staging Database: a query (select a,b from st1 inner join st2 where...) fills a work table
3. A stored procedure with a T-SQL merge statement loads the work table into the dimension or fact table in the Data Warehouse
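The merge step could be sketched as below. All object names (staging.wrkProduct, dbo.DimProduct, the columns, and the procedure name) are hypothetical, and the sketch overwrites changed attributes in place; it is one possible shape of the pattern, not the author's actual procedure.

```sql
-- Hypothetical names: staging.wrkProduct is the work table filled by the
-- staging query; dbo.DimProduct is the target dimension.
CREATE PROCEDURE dbo.uspMergeDimProduct
AS
BEGIN
    SET NOCOUNT ON;

    MERGE dbo.DimProduct AS tgt
    USING staging.wrkProduct AS src
        ON tgt.ProductCode = src.ProductCode            -- business key
    -- Update only rows whose attributes changed
    -- (assumes the compared columns are NOT NULL).
    WHEN MATCHED AND (tgt.ProductName <> src.ProductName
                   OR tgt.ListPrice   <> src.ListPrice) THEN
        UPDATE SET tgt.ProductName = src.ProductName,
                   tgt.ListPrice   = src.ListPrice
    -- Insert rows that exist in staging but not yet in the dimension.
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (ProductCode, ProductName, ListPrice)
        VALUES (src.ProductCode, src.ProductName, src.ListPrice);
END;
```

Keeping the set-based logic in one statement is what makes this approach easier to maintain than a chain of lookup and conditional-split transforms.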
Consider Operational Data Store
• ODS advantages
• Offloads transactional data
• Maintains data history
• Smarter “staging” database
Start_Date | End_Date   | Store    | Product
1/1/2010   | 5/1/2010   | Atlanta  | Mountain Bike 1
5/2/2010   | 3/8/2012   | Atlanta  | Mountain Bike 2
3/9/2012   | 12/31/9999 | Norcross | Mountain Bike 2
…
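To illustrate how such a history table can be queried, here is a small sketch. The table name dbo.OdsProductStore is hypothetical, and the rows are the three shown in the example above; the sketch assumes they track the history of a single item.

```sql
-- Hypothetical ODS table mirroring the columns in the example above.
CREATE TABLE dbo.OdsProductStore (
    Start_Date date         NOT NULL,
    End_Date   date         NOT NULL,    -- 9999-12-31 marks the current row
    Store      varchar(50)  NOT NULL,
    Product    varchar(100) NOT NULL
);

INSERT INTO dbo.OdsProductStore (Start_Date, End_Date, Store, Product)
VALUES ('2010-01-01', '2010-05-01', 'Atlanta',  'Mountain Bike 1'),
       ('2010-05-02', '2012-03-08', 'Atlanta',  'Mountain Bike 2'),
       ('2012-03-09', '9999-12-31', 'Norcross', 'Mountain Bike 2');

-- State as of a given date: the row whose validity range contains that date.
DECLARE @AsOf date = '2011-06-15';

SELECT Store, Product
FROM dbo.OdsProductStore
WHERE @AsOf BETWEEN Start_Date AND End_Date;    -- Atlanta, Mountain Bike 2
```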
Index considerations
• Eliminate read locks
 Indexes: ALLOW_PAGE_LOCKS = OFF and ALLOW_ROW_LOCKS = OFF
 View hints WITH (NOLOCK) or SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
• Drop non-clustered indexes and constraints
 With massive updates (10% or more)
 Enables minimally logged load
• Consider COLUMNSTORE indexes when queries aggregate data
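A hedged sketch of these index settings. The table and index names are hypothetical. Note one swap: the sketch disables and rebuilds a non-clustered index instead of dropping and recreating it, which has a similar effect during the load while keeping the index definition in place.

```sql
-- Hypothetical table and index names.
-- Read without taking shared locks (dirty reads become possible).
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

-- Turn off row and page locks on all indexes of the fact table,
-- so locking happens at the table level.
ALTER INDEX ALL ON dbo.FactSales
SET (ALLOW_ROW_LOCKS = OFF, ALLOW_PAGE_LOCKS = OFF);

-- Before a massive load: disable a non-clustered index, load, then rebuild.
ALTER INDEX IX_FactSales_ProductKey ON dbo.FactSales DISABLE;
-- ... bulk load goes here ...
ALTER INDEX IX_FactSales_ProductKey ON dbo.FactSales REBUILD;

-- Columnstore index for aggregate queries (SQL Server 2012 syntax).
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_FactSales
ON dbo.FactSales (DateKey, ProductKey, SalesAmount);
```

In SQL Server 2012 a table with a non-clustered columnstore index becomes read-only, so this option is usually paired with partition switching or with dropping and recreating the index around each load.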
Take advantage of partitioning
• Consider partition switching
 Fast incremental load
 Parallel partition load
 Faster updates

• Use Manage Partition Wizard to generate
 Switch in/out scripts
 Staging table
 Sliding window
For parallel partition load, change the table lock escalation:
ALTER TABLE … SET (LOCK_ESCALATION = AUTO)
To find the table lock escalation:
SELECT lock_escalation_desc FROM sys.tables WHERE name = '<table name>'
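A sketch of a switch-in load, under stated assumptions: dbo.FactSalesPartitioned is a hypothetical table partitioned by a hypothetical function pfDateKey (RANGE RIGHT, last boundary 20130101), and dbo.FactSales_Stage is a staging table with the same columns, indexes, and filegroup as the target partition.

```sql
-- The staging table needs a CHECK constraint that matches the range of the
-- target partition (here: the last partition, DateKey >= 20130101).
ALTER TABLE dbo.FactSales_Stage WITH CHECK
ADD CONSTRAINT CK_FactSales_Stage_2013 CHECK (DateKey >= 20130101);

-- Metadata-only move of the staged rows into the fact table.
-- The target partition must be empty before the switch.
ALTER TABLE dbo.FactSales_Stage
SWITCH TO dbo.FactSalesPartitioned
PARTITION $PARTITION.pfDateKey(20130101);

-- Allow lock escalation to the partition level instead of the whole table,
-- so parallel loads into different partitions do not block each other.
ALTER TABLE dbo.FactSalesPartitioned SET (LOCK_ESCALATION = AUTO);

-- Verify the setting.
SELECT name, lock_escalation_desc
FROM sys.tables
WHERE name = 'FactSalesPartitioned';
```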
Optimize big joins
• Use the query hint OPTION (HASH JOIN) or OPTION (LOOP JOIN)
http://bit.ly/108HuHR
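A minimal illustration of a query-level join hint. The table names are hypothetical. Because a hint overrides the optimizer's own choice, it is worth comparing execution plans and timings with and without it.

```sql
-- Hypothetical fact-to-dimension join with a query-level join hint.
SELECT d.CalendarYear, SUM(f.SalesAmount) AS SalesAmount
FROM dbo.FactSales AS f
INNER JOIN dbo.DimDate AS d
    ON d.DateKey = f.DateKey
GROUP BY d.CalendarYear
OPTION (HASH JOIN);    -- alternatively OPTION (LOOP JOIN)
```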
Agenda
BI architecture
Data warehouse design
ETL
Semantic Layer
Presentation layer
BI Semantic Layer
• Clients: third-party BI applications, Reporting Services reports, PowerPivot applications, Excel workbooks, SharePoint dashboards & scorecards
• Query languages: MDX, DAX
• Multidimensional: MOLAP or ROLAP storage
• Tabular: xVelocity (VertiPaq) or DirectQuery storage
• Data sources shown in the diagram include files and OData feeds
Choose semantic layer wisely
• Decision checkpoints
 Data volumes
 Complexity

• Scenarios for considering Multidimensional
 Data warehousing
 Large data volumes
 Complex models

• Scenarios for considering Tabular
 Promoting PowerPivot models to organizational models
 Rapid development for simple models
 Transactional reporting? (be careful)
Optimize Multidimensional
• Don’t be afraid of biggish data
• Avoid complex scope assignments
• Centralize business logic
• Consider fast storage
• Consider single cube
Tabular Considerations
• Improve your design experience http://bit.ly/106iKjt
 Small dataset during dev
 Disable automatic calculation
• Remove unnecessary columns
• Be careful about transactional reporting
• No cross-fact table support
• Performance degradation with big data - http://bit.ly/136h60U
[Diagram: Dim Date, Fact Orders, Fact Receipts]
Partition when it makes sense
• Partition large measure groups (above 100 million rows)
 Mostly a management technique
 Useful for incremental processing
 Partition slice: ~50 million rows
• Automate with partition generator http://bit.ly/partitiongenerator
• Use SQL views to wrap tables
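As a sketch of the view-wrapping advice, the cube can be bound to a view instead of the table, and each partition can use a query with a date-key filter. The object names are hypothetical, and the 2012 range is only an example of a partition slice.

```sql
-- Hypothetical view that wraps the fact table for the cube's data source view.
CREATE VIEW dbo.vFactSales
AS
SELECT DateKey, ProductKey, SalesAmount
FROM dbo.FactSales;
GO

-- Query binding for one cube partition (assumed here to cover year 2012).
SELECT DateKey, ProductKey, SalesAmount
FROM dbo.vFactSales
WHERE DateKey BETWEEN 20120101 AND 20121231;
```

The view gives a layer of isolation: the underlying table can be renamed or restructured without touching the cube definition. GO is the batch separator used by SSMS and sqlcmd; CREATE VIEW has to be the first statement in its batch.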
When to use self-service BI?
• Know your end users
 Power users
 Financial analysts

• When does self-service BI make sense?
 Waiting for organizational BI to happen
 Ideate and promote lateral thinking
• Consider 80/20 rule
 80% organizational BI
 20% self-service BI
Agenda
BI architecture
Data warehouse design
ETL
Semantic Layer
Presentation layer
Dashboards
“A dashboard is a visual display of the most important information needed to achieve one or more objectives; consolidated and arranged on a single screen so the information can be monitored at a glance.”
Stephen Few, “Information Dashboard Design”
[Image from “Information Dashboard Design”]
PerformancePoint in real life
Power View in real life
Excel Services in SharePoint 2013
Consider your dashboard options
Technology | Pros | Cons
PerformancePoint | Designed for scorecards and KPIs; supporting views (reports, Excel spreadsheets, PP reports); decomposition tree; customizable | BI pro-oriented; no “wow” effect
Power View | Highly interactive; easy to implement; end user-oriented | No extensibility; no mobile support yet (but promised); currently requires Silverlight (MS working on HTML5)
Excel Services | Use Excel pivot reports; easy to implement; reports updatable in SP 2013 | Reports not updatable in SP 2010; no “wow” effect
Reporting Services reports | Highly customizable; rich visualizations | Require report experience; reports not updatable; drillthrough requires new reports
Summary

• I shared proven practices and tips from past experience
• Keep things simple but have sound design
• How to contact me:
 Email: teo.lachev@prologika.com
 Web: www.prologika.com
 Blog: http://prologika.com/cs/blogs/
 Newsletter: http://prologika.com/Newsroom/News.aspx
