Best Practices for Implementing Enterprise BI Solution
Teo Lachev, Prologika
teo.lachev@prologika.com
Why BI projects fail
• 70-80% of corporate BI projects fail (Gartner, http://bit.ly/YRi028)
• Top reasons
  - Poor communication between IT and business
  - Failure to ask the right questions
  - Other reasons from my experience
    - Business doesn’t know about BI
    - Inexperience and lack of technical knowledge
    - “When all you have is a hammer…”
    - Data inaccuracy
    - Performance degradation with large datasets
Agenda
• Share best practices and lessons learned
  - BI architecture
  - Data warehouse design
  - ETL
  - Semantic layer
  - Presentation layer
• Assumptions
  - Experience with Microsoft BI and database design
• Microsoft case study
  - Records Management Firm Saves $1 Million (http://bit.ly/15exUpM)
  - Most performance practices are around biggish data
Ground rules
• Ask questions
• Turn cellphones off
• Tweet away (@tlachev #BestBI)
About me
• Consultant, author, and mentor with focus on Microsoft BI
• Owner of Prologika, a BI consulting and training company based in Atlanta (www.prologika.com)
• Microsoft SQL Server MVP for 10 years
• Leader of Atlanta BI group (atlantabi.sqlpass.org)
Use a phased approach
• Identify critical success factors
• Break the project into phases
• Phase 1
  - Most important
  - Scope it relatively small
  - Sets the foundation
    - Business process to model
    - First conformed dimensions
    - A few fact tables
Use classic BI solution architecture
[Diagram of the solution layers listed below]
• Data Sources
  - Data is extracted from data sources, transformed, and loaded into the DW by ETL (Integration Services)
• Data Warehouse
  - Data is stored in a dimensional schema consisting of dimension and fact tables
  - Transactional reporting
• Semantic Layer (Multidimensional OR Tabular)
  - Historical & trend reporting
  - Great performance
  - Business calculations
  - Single version of truth
  - Client support
  - Security
  - Isolation
• Presentation Layer
  - Standard reporting
  - Ad-hoc reporting
  - Dashboards
  - Clients: ad-hoc reports, operational reports, dashboards, third-party tools
Keep it simple!
[Diagram: Europe, NA, and Asia regions, shown twice]
Teo’s insight: Remove complexity until it cannot be simplified anymore
Consider active-active clustering
[Diagram: a cluster containing a database server and an SSAS server]
Check your environment
• I/O
  - BACKUP DATABASE [ContosoRetailDW] TO DISK='NUL';
  - Or use tools such as IOMeter or CrystalMark
  - I/O throughput should be above 500 MB/sec
• Network speed
  - select * from <some fact table> (consider discarding query results)
  - Num rows/sec = row count / execution time (sec)
  - Aim for > 100K rows/sec
• Virtualization
  - Disk pass-through enabled
  - Dedicated resources
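The network speed check above is simple arithmetic. A minimal sketch follows, assuming you have already timed the select yourself; the function names and the sample numbers are hypothetical, only the formula and the 100K rows/sec target come from the slide.

```python
# Sketch of the "Num rows/sec = row count / execution time" check.
# Function names are illustrative; the 100K target is the slide's guideline.

TARGET_ROWS_PER_SEC = 100_000


def rows_per_second(row_count: int, execution_seconds: float) -> float:
    """Return throughput in rows per second for a timed query."""
    if execution_seconds <= 0:
        raise ValueError("execution_seconds must be positive")
    return row_count / execution_seconds


def meets_target(row_count: int, execution_seconds: float,
                 target: float = TARGET_ROWS_PER_SEC) -> bool:
    """True when measured throughput is above the target."""
    return rows_per_second(row_count, execution_seconds) > target
```

For example, 12 million rows returned in 80 seconds is 150,000 rows/sec, which clears the target; 3 million rows in 60 seconds (50,000 rows/sec) does not.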
Star schema is your best friend
• Your dimensional model is the foundation
• Design it with the end user in mind
• Avoid normalization
• Avoid summarized tables
• Use smart integer keys (YYYYMMDD) or keys of the date data type for Date tables
• Use referential integrity

Teo’s insight: The fact that Tabular supports more flexible relationships doesn’t mean that star schema is obsolete - just the opposite.
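To illustrate the smart key idea, here is a small sketch of converting between a calendar date and a YYYYMMDD integer key. The helper names are hypothetical; in a warehouse the same arithmetic would normally live in the ETL or in the Date table itself.

```python
from datetime import date


def to_date_key(d: date) -> int:
    """Encode a date as a YYYYMMDD integer, e.g. 2013-03-09 -> 20130309."""
    return d.year * 10000 + d.month * 100 + d.day


def from_date_key(key: int) -> date:
    """Decode a YYYYMMDD integer back to a date.

    Raises ValueError for keys that are not real calendar dates.
    """
    year, rest = divmod(key, 10000)
    month, day = divmod(rest, 100)
    return date(year, month, day)
```

A key built this way stays human-readable and sorts in date order, which is the usual argument for it over an opaque surrogate key.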
Optimize physical storage
• Set the database recovery model to Simple
• Index considerations
  - Clustered index on the DateKey column in fact tables
  - Other indexes as needed
• File groups
  - One file group per large table
  - Files on different drives
  - Avoid using the Primary file group
Use partitioning
• Partition large tables (above 50 GB)
  - Partition switching
  - Better manageability
  - Partition elimination when querying data

Good read: “Partitioned Table and Index Strategies Using SQL Server 2008” whitepaper by Ron Talmage
Use compression
• Consider page compression above 1 TB
• 50-80% saving in disk space
• To estimate storage savings:
  - Use the SSMS Data Compression Wizard
  - Use the sp_estimate_data_compression_savings stored procedure

EXEC sp_estimate_data_compression_savings 'dbo', 'FactResellerSales', 1, NULL, 'PAGE'

Good read: “Data Compression: Strategy, Capacity Planning and Best Practices” whitepaper by Sanjay Mishra
Consider merge design pattern
• More efficient than SSIS transforms
• More flexible than SSIS lookups
• Easier to maintain

[Diagram: Data Sources (LOB, files) → incremental extraction → Staging Database → query such as "select a,b from st1 inner join st2 where..." → work table → stored procedure with T-SQL merge statement → dimension or fact table in the Data Warehouse]
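The matched / not matched logic behind the merge step can be sketched outside the database. The snippet below is only an illustration in Python of what the merge does to the target table, not the stored procedure itself; the function name, the dictionary layout, and the key column are assumptions made for the example.

```python
from typing import Dict, List, Tuple


def merge_into(target: Dict[int, dict], work_rows: List[dict],
               key: str) -> Tuple[int, int]:
    """Upsert work-table rows into the target, keyed by `key`.

    Mirrors the two common merge branches: update when the key already
    exists and the row changed, insert when the key is new.
    Returns (inserted, updated) counts.
    """
    inserted = updated = 0
    for row in work_rows:
        k = row[key]
        if k in target:
            if target[k] != row:          # matched and changed: update
                target[k] = dict(row)
                updated += 1
        else:                             # not matched: insert
            target[k] = dict(row)
            inserted += 1
    return inserted, updated
```

Running the same work table through twice changes nothing the second time, which is the property that makes the pattern safe for reruns of an incremental load.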
Consider Operational Data Store
• ODS advantages
  - Offloads transactional data
  - Maintains data history
  - Smarter “staging” database

Start_Date  End_Date    Store     Product
1/1/2010    5/1/2010    Atlanta   Mountain Bike 1
5/2/2010    3/8/2012    Atlanta   Mountain Bike 2
3/9/2012    12/31/9999  Norcross  Mountain Bike 2
…
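A history table like the one above answers "what was true on a given day?". Here is a minimal sketch of that lookup using the three sample rows, assuming they describe one tracked entity (its key column is not shown on the slide) and that dates are month/day/year; the type and function names are hypothetical.

```python
from datetime import date
from typing import List, NamedTuple, Optional


class HistoryRow(NamedTuple):
    start_date: date
    end_date: date   # 12/31/9999 marks the current (open-ended) row
    store: str
    product: str


# The three sample rows from the slide.
HISTORY = [
    HistoryRow(date(2010, 1, 1), date(2010, 5, 1), "Atlanta", "Mountain Bike 1"),
    HistoryRow(date(2010, 5, 2), date(2012, 3, 8), "Atlanta", "Mountain Bike 2"),
    HistoryRow(date(2012, 3, 9), date(9999, 12, 31), "Norcross", "Mountain Bike 2"),
]


def as_of(rows: List[HistoryRow], day: date) -> Optional[HistoryRow]:
    """Return the row whose [start_date, end_date] range contains `day`."""
    for row in rows:
        if row.start_date <= day <= row.end_date:
            return row
    return None
```

Because the ranges do not overlap, at most one row matches any day, and a day before the first Start_Date returns nothing.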
Index considerations
• Eliminate read locks
  - Indexes: ALLOW_PAGE_LOCKS = OFF and ALLOW_ROW_LOCKS = OFF
  - View hints WITH (NOLOCK) or SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
• Drop non-clustered indexes and constraints
  - With massive updates (10% or more)
  - Enables minimally logged load
• Consider COLUMNSTORE indexes when queries aggregate data
Take advantage of partitioning
• Consider partition switching
  - Fast incremental load
  - Parallel partition load
  - Faster updates
• Use Manage Partition Wizard to generate
  - Switch in/out scripts
  - Staging table
  - Sliding window

For parallel partition load, change the table lock escalation:
ALTER TABLE … SET (LOCK_ESCALATION = AUTO)
To find the table lock escalation:
SELECT lock_escalation_desc FROM sys.tables WHERE name = '<table name>'
Optimize big joins
• Consider the query hints OPTION (HASH JOIN) or OPTION (LOOP JOIN)
  http://bit.ly/108HuHR
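BI Semantic Layer
[Diagram of the layers listed below]
• Clients: third-party BI applications, Reporting Services reports, PowerPivot applications, Excel workbooks, SharePoint dashboards & scorecards
• Query languages: MDX, DAX
• Models: Multidimensional (MDX) or Tabular (DAX)
• Storage: MOLAP and ROLAP (Multidimensional); xVelocity (VertiPaq) and DirectQuery (Tabular)
• Data sources: files, OData feeds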
Choose semantic layer wisely
• Decision checkpoints
  - Data volumes
  - Complexity
• Scenarios for considering Multidimensional
  - Data warehousing
  - Large data volumes
  - Complex models
• Scenarios for considering Tabular
  - Promoting PowerPivot models to organizational models
  - Rapid development for simple models
  - Transactional reporting? (be careful)
Optimize Multidimensional
• Don’t be afraid of biggish data
• Avoid complex scope assignments
• Centralize business logic
• Consider fast storage
• Consider single cube
Tabular Considerations
• Improve your design experience (http://bit.ly/106iKjt)
  - Small dataset during dev
  - Disable automatic calculation
• Remove unnecessary columns
• Be careful about transactional reporting
  - No cross-fact table support
  - Performance degradation with big data (http://bit.ly/136h60U)

[Diagram: Dim Date, Fact Orders, Fact Receipts]
Partition when it makes sense
• Partition large measure groups (above 100 million rows)
  - Mostly a management technique
  - Useful for incremental processing
  - Partition slice: ~50 million rows
• Automate with partition generator (http://bit.ly/partitiongenerator)
• Use SQL views to wrap tables
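As a rough worked example of the sizing guideline, the sketch below estimates how many partitions a measure group would need. It assumes the figures above are row counts and that nothing at or below the 100 million threshold is partitioned; the constant and function names are hypothetical.

```python
import math

PARTITION_THRESHOLD_ROWS = 100_000_000  # partition only above this size
TARGET_SLICE_ROWS = 50_000_000          # approximate rows per partition


def partition_count(total_rows: int) -> int:
    """Estimate the number of partitions for a measure group."""
    if total_rows < 0:
        raise ValueError("total_rows must be non-negative")
    if total_rows <= PARTITION_THRESHOLD_ROWS:
        return 1  # small enough to keep as a single partition
    return math.ceil(total_rows / TARGET_SLICE_ROWS)
```

A 620 million row measure group, for instance, works out to 13 partitions of roughly 50 million rows each. In practice the split usually follows a date boundary such as month or year rather than an exact row count.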
When to use self-service BI?
• Know your end users
  - Power users
  - Financial analysts
• When does self-service BI make sense?
  - Waiting for organizational BI to happen
  - Ideate and promote lateral thinking
• Consider the 80/20 rule
  - 80% organizational BI
  - 20% self-service BI
Dashboards
“A dashboard is a visual display of the most important information needed to achieve one or more objectives; consolidated and arranged on a single screen so the information can be monitored at a glance.”
Stephen Few, “Information Dashboard Design” book

[Figure: example from the “Information Dashboard Design” book]
PerformancePoint in real life
Power View in real life
Excel Services in SharePoint 2013
Consider your dashboard options

Technology | Pros | Cons
PerformancePoint | Designed for scorecards and KPIs; supporting views (reports, Excel spreadsheets, PP reports); decomposition tree; customizable | BI pro-oriented; no “wow” effect
Power View | Highly interactive; easy to implement; end user-oriented | No extensibility; no mobile support yet (but promised); currently requires Silverlight (MS working on HTML5)
Excel Services | Use Excel pivot reports; easy to implement; reports updatable in SP 2013 | Reports not updatable in SP 2010; no “wow” effect
Reporting Services reports | Highly customizable; rich visualizations | Require report experience; reports not updatable; drillthrough requires new reports
Summary
• I shared proven practices and tips from past experience
• Keep things simple but have sound design
• How to contact me:
  - Email: teo.lachev@prologika.com
  - Web: www.prologika.com
  - Blog: http://prologika.com/cs/blogs/
  - Newsletter: http://prologika.com/Newsroom/News.aspx
