How would you go about creating an enterprise data and analytics architecture for an electric utility that 1) will be relevant in the long run, 2) will be easy to implement and 3) will start bringing value to the organization fairly quickly? What will be the components? Who will be the users? The operation of an electric utility will change significantly by 2025. How will you future-proof the architecture?
4. Data Warehouse and Data Lake (Architecture Overview)

Source systems feeding the architecture:
• DMS & OMS
• Customer data
• Smart meter data (metadata, readings)
• Asset data (location, configuration)
• Financial data
• Data historian
• SCADA
• DG metadata and DG generation data
• HR data
• Weather data
• Misc. sensor head ends
• Security data
• Transmission planning data
• Maintenance data
• Demand response data
• Transmission OE and dispatch data
• EMS
• Transmission market data
• IT asset ops data
• IT support data
• Project documents
• Marketing & sales data
• Email & chat logs
• Facility data
• Fleet data
• Catch-all: other applications

Data Translation Layer (MDMS/EI): sits between the source systems and both destinations, applying the enterprise nomenclature.

Destinations:
• Data Warehouse (ONLY the data required for high-performing production reporting): enterprise nomenclature; usual EDW process and structure; serves production reports.
• Data Lake (ALL data): enterprise nomenclature; discovery and indexing/tagging; serves projects and data explorers (engineers, data scientists).

Copyright enSustain
5. Possible Point-To-Point Exceptions

Purpose-oriented connections. Examples:
o Historian facilitates connection with SCADA
o The EMS-SCADA connection is latency-sensitive

Applications requiring access to only one system:
o DMS applications running off of DMS data
o Historian applications running off of Historian data
7. The Approach for Implementation

Main challenges and their solutions:
• Siloed data
o Solution, part 1: a standard data model
o Solution, part 2: a user view of unified data
• Lack of analytics ideas
o Solution: close partnership between IT and the business
• Lack of budget
o Solution, part 1: tax each new project
o Solution, part 2: take baby steps
8. Necessary Condition for Success

Do NOT touch the existing, working systems first:
• At the beginning, implement the new mechanism ONLY to serve the new requirements.
• Keep the existing connections working and unaffected.
• Eventually, the business will deem some of the existing connections not required.
• The rest of the existing connections can be converted as part of application maintenance/overhaul/upgrade, but not in the beginning phase of the initiative.

Scope the smallest possible piece and do it well:
• Do not try to implement all the necessary new components at once.
• Good quality on a small scope is better than mediocre quality on a large scope.
• It might require more overhead, but it is often worth it.
9. Possible Steps for Implementation
A new data
connectivity
requirement
comes in
Identify the
source system
Define the
enterprise
nomenclature
for the source
system to align
with industry
standard
Load
MDMS/EI with
the dictionary
Configure EI to
act as the data
virtualization
layer for the
source system
Release for
production
use with
appropriate
support
mechanism
Milestone: One project is now using this new mechanism for one source system
Repeat 1 for every new
data connectivity
request
As more source systems
are brought into the
scope, resolve
discrepancies, if any
arises
The virtualization layer
might experience
performance issue as
data load increases
Research and Plan the
Data Lake
For every new data source
implementation for the
virtualization, implement
the corresponding ETL for
Data Lake
Open the Data lake to
users that prefer getting
their data from the Data
Lake (delayed but faster)
over virtualization
Implement Data Lake
Analytics (say ML based
on Spark) for a single use
case
Copyright enSustain
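The "one ETL per virtualized source" step above can be sketched as a small job that pulls records from a source, applies the same enterprise-nomenclature mapping the virtualization layer uses, and lands the result in a dated lake path. This is an illustrative sketch: the field names, path layout, and the `extract` stand-in are all hypothetical, not part of any real MDMS/EI tool.

```python
# Sketch of the per-source ETL into the Data Lake. Each run extracts
# from the source system, renames fields to the enterprise nomenclature
# (the same mapping the virtualization layer uses), and writes a dated
# partition. Field names and paths are hypothetical.
import json
import os
import tempfile

MAPPING = {"mtr_id": "meter_id", "rdg_kwh": "energy_kwh"}  # from MDMS/EI

def extract(source: str) -> list[dict]:
    # Stand-in for a real source-system pull (JDBC, API, file drop, ...).
    return [{"mtr_id": "M1", "rdg_kwh": 4.2}, {"mtr_id": "M2", "rdg_kwh": 3.7}]

def etl_to_lake(source: str, lake_root: str, run_date: str) -> str:
    """Extract, standardize, and land one dated partition; return its path."""
    records = [{MAPPING[k]: v for k, v in r.items() if k in MAPPING}
               for r in extract(source)]
    out_dir = os.path.join(lake_root, "enterprise", source, run_date)
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, "part-0000.json")
    with open(path, "w") as f:
        for r in records:
            f.write(json.dumps(r) + "\n")
    return path

lake_root = tempfile.mkdtemp()  # stand-in for the real lake storage root
p = etl_to_lake("smart_meter", lake_root, "2015-06-01")
```

Because the ETL reuses the virtualization layer's dictionary, data reached through either path carries identical field names, which is what makes "delayed but faster" lake access interchangeable with live virtualized access.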
10. MDMS, EI, Data Virtualization, Data Warehouse
11. Skip This Section

Most utilities already use these systems and are familiar with them; hence, we will not discuss them here. For specific questions, please contact prajesh@ensustain.com.
13. Why the Data Lake?

If the MDMS/EI layer virtualizes the data, then access to standardized data across the enterprise is already established. What additional value does the Data Lake bring?

• Some of the SOR (system-of-record) systems might not be capable of handling that many data requests.
• Access to some of the SOR systems might not be practical.
• Implementing data quality checks on virtualized data is hard (at the least, it would slow down queries).
• Data travels over the network: transfers are larger in a virtualized environment than in a Data Lake designed and used in a specific way.
• Bottom line: go for the Data Lake only if it is foreseen to be needed.

Data Lake: not the immediate need, but the eventual destination.
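The data-quality bullet above can be made concrete with a toy sketch: in a virtualized setup, a quality check has to run on every query, while in a lake the check runs once at load time and queries read pre-validated data. The check rule and records below are invented for illustration only.

```python
# Toy illustration of load-time vs. query-time quality checks.
# Virtualization re-pays the check cost on every query; a lake pays it
# once at load. The rule and records are invented examples.
def quality_check(record: dict) -> bool:
    return record["energy_kwh"] >= 0  # toy rule: no negative readings

SOURCE = [{"meter_id": i, "energy_kwh": v} for i, v in enumerate([4.2, -1.0, 3.7])]

# Virtualized path: every query must re-filter the raw source.
def query_virtualized() -> list[dict]:
    return [r for r in SOURCE if quality_check(r)]

# Lake path: checked once at load; queries just read the clean copy.
LAKE = [r for r in SOURCE if quality_check(r)]  # load-time check

def query_lake() -> list[dict]:
    return LAKE

# Same answer either way, but the virtualized path repeats the per-record
# check on every query, which is exactly why it slows queries down.
assert query_virtualized() == query_lake()
```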
15. Data Lake: Getting Data Into the Lake

Ingestion paths from the data sources: data loader, streaming data, and manual data. Ingestion is governed by the data governance tagging tool and MDMS.

HDFS zones:
• Enterprise data
o Shared across the company based on security policy
o Fully managed and maintained
o Tight SLA
o 100% enterprise-taxonomy-based tagging
• User data
o Results of ad-hoc analyses
o Some maintenance/control/SLA
o Folksonomy-based tagging
• Project/group data
o Enterprise standards might be too restrictive to fulfill the requirements of the project
o Shared among a handful of users
o Medium maintenance/control/SLA
o Folksonomy + some governance
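The per-zone tagging policies above can be sketched as a small routing function: the enterprise zone accepts controlled taxonomy terms only, the project zone mixes folksonomy with a minimal governance rule, and the user zone is pure folksonomy. The taxonomy vocabulary, path layout, and governance rule here are hypothetical illustrations of the slide's policy, not a real tool's behavior.

```python
# Sketch of zone placement and tagging policy for incoming datasets.
# Zone names match the slide; the taxonomy terms and the "at least one
# taxonomy term" project rule are hypothetical.
ENTERPRISE_TAXONOMY = {"meter", "asset", "outage", "customer"}  # controlled terms

def land(dataset_name: str, zone: str, tags: set[str]) -> tuple[str, set[str]]:
    """Return the lake path and applied tags, enforcing the zone's policy."""
    if zone == "enterprise":
        # 100% enterprise-taxonomy-based tagging: reject free-form tags.
        if not tags <= ENTERPRISE_TAXONOMY:
            raise ValueError("enterprise zone allows taxonomy terms only")
    elif zone == "project":
        # Folksonomy + some governance: free tags allowed, but require at
        # least one taxonomy term so the dataset stays discoverable.
        if not tags & ENTERPRISE_TAXONOMY:
            raise ValueError("project zone needs at least one taxonomy term")
    # user zone: pure folksonomy, anything goes
    return f"/lake/{zone}/{dataset_name}", set(tags)

path, applied = land("meter_reads_2015", "enterprise", {"meter", "customer"})
```

A governed loader like this is where the "data governance tagging tool" plugs into the ingestion paths: every loader (batch, streaming, or manual) calls the same policy before writing.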
16. Hadoop Ecosystem Relevant To Utility

Components, grouped by functionality (the original color legend: data loading, job management, data governance, data reading, Map-Reduce, data storage, Data Lake, vendor solutions):
• Data storage: HDFS
• Job management: YARN, Oozie, Falcon
• Map-Reduce: Map-Reduce applications in Java
• Data loading: Sqoop, Spark Streaming, Storm
• Data reading: Hive, Spark SQL, Spark ML, Hadoop native client
• Data governance: Apache Atlas
• Vendor solutions: QueryIO, Waterline Data, Attivio
18. The Analytics Tool Landscape

Taxonomy 1: by usage mode
• Production: data write-back vs. read-only
• Project (semi-production): data write-back vs. read-only
• Ad-hoc: data write-back vs. read-only

Taxonomy 2: by deployment
• Managed (server-based)
• Unmanaged (desktop-based)

Taxonomy 3: by user skill set
• Coding-heavy
• Configuration-heavy
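Since the three taxonomies are orthogonal, each tool can be described by one flat record with one attribute per taxonomy, which makes governance questions simple queries over the catalog. The tool entries below are illustrative placements invented for the sketch, not claims about specific vendors.

```python
# The three taxonomies are orthogonal attributes of each tool, so a flat
# record per tool captures the whole landscape. Tool entries here are
# illustrative placements, not authoritative vendor claims.
from dataclasses import dataclass

@dataclass(frozen=True)
class AnalyticsTool:
    name: str
    usage: str          # taxonomy 1: "production" | "project" | "ad-hoc"
    write_back: bool    # taxonomy 1 sub-split: data write-back vs. read-only
    managed: bool       # taxonomy 2: server-based (True) vs. desktop-based
    coding_heavy: bool  # taxonomy 3: coding-heavy vs. configuration-heavy

catalog = [
    AnalyticsTool("server BI platform", "production", False, True, False),
    AnalyticsTool("notebook environment", "ad-hoc", True, False, True),
    AnalyticsTool("team dashboard", "project", False, True, False),
]

# Governance query: which tools may write data back in production?
risky = [t.name for t in catalog if t.usage == "production" and t.write_back]
print(risky)  # → []
```

Keeping the landscape as data rather than prose lets the IT/business partnership answer questions like "which unmanaged, write-back tools touch production?" mechanically.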