3. 3
Agenda
About Enterprise Products Partners
About the SCADA Infrastructure and Cyber Security Team
Where We Were
Where We Are
Where We Are Headed
What You Can Do Too
6. 6
How We Got Started
Recognizing the operational differences between OT and IT
Recognizing the technical similarities between OT and IT
Supporting the SCADA Systems before Splunk
Difficulties meeting SLAs (Regulatory)
7. 7
Splunk Enterprise at EPD
[Diagram: data sources feeding Splunk Enterprise: Alerts, Messages, Metrics, Changes, Scripts, Configurations, Log Files, Databases, Networks, Servers, Virtual Machines, Custom Applications, Security, Tickets, Web Servers]
• Infrastructure and Applications Ops
• Cyber Security
• Improving SLAs
10. 10
Cyber Security
Protecting Critical Infrastructure Against Threats
• Palo Alto project
• Supporting VPN environments
• Monitoring firewalls for alarming activity
• Monitoring of industrial protocols
11. 11
Improved SLAs
Adhering to PHMSA requirements with Splunk Enterprise
• Aware of issues within 30 seconds
• Rigorous escalations
• Prescriptive alerting
• Resolution in 4 minutes or less
13. 13
Top Takeaways
OT and IT are both similar and different
Best practices for managing operations, cyber security and SLAs with Splunk Enterprise
How you too can be a SCADA superhero with Splunk Enterprise
Overview of Combined Operations
Pipelines
51,000 miles of natural gas, NGL, crude oil, refined products and petrochemical pipelines
Storage (Salt Dome)
225 million barrels (MMBbls) of NGL, refined products and crude oil storage capacity
14 billion cubic feet (Bcf) of natural gas storage capacity
Natural Gas Processing
24 natural gas processing plants
Marine Services
63 tow boats
131 barges
Fractionation
22 NGL and propylene fractionators
Platforms
6 offshore hub platforms
NGL Import/Export Terminals
Houston Ship Channel Import Terminal offloading capacity – 14 MBbls/hr
Houston Ship Channel Export Terminal loading capacity – 7.5 MMBbls/mo
Before Splunk:
Control desks would notice inactivity, data not updating, or all data disappearing. They would talk to each other and to the shift supervisor, then call the NOC. The NOC called SCADA on-call, who escalated to SIG. The service level agreement is 5 minutes of downtime per PHMSA.
99% of data is datacenter based. 1% are workstations. Alarms and events.
Servers: Universal Forwarders, managed through deployment servers. Apps: Windows, WMI, AD. Not yet gathering data from the SCADA applications themselves, but monitoring failover, etc. of the SCADA applications through WMI. The plan is to test the primary SCADA systems (events, alarms, etc.). First priority: errors and warnings from the SCADA process. Will also add connectivity through DB Connect 2 to the SQL database (events, alarms, authentication, SCADA DB changes), plus communications statistics.
Infrastructure and Application Operations
The biggest piece is insight into when services fail and the IT systems don’t catch it. The IT org uses SCOM, and there are pieces of information that SCOM misses and Splunk catches. Never missed a significant issue with Splunk.
Worked through a number of issues in the new system. The system has to be brought back as quickly as possible, for example when an issue shuts down a control desk. They are a safety service solution, so issues have to be acted upon quickly.
Before Splunk:
Control desks would notice inactivity, data not updating, or all data disappearing. They would talk to each other and to the shift supervisor, then call the NOC. The NOC called SCADA on-call, who escalated to SIG. The service level agreement is 5 minutes of downtime per PHMSA.
After Splunk:
They know within 30 seconds. There is a rigorous escalation group, including after hours if nobody calls in and responds. Every time they have had an issue, they have been able to resolve it in 4 minutes or less.
Immediately alerted on issues.
Alerts are prescriptive alerts.
Email contains diagnosis, location, etc.
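The flow described in these notes (awareness within 30 seconds, a rigorous escalation group, prescriptive emails carrying diagnosis and location) can be sketched in Python. This is a minimal illustration, not EPD's implementation: the `Alert` fields, the function names, and the contact names in the usage note are hypothetical. Only the 30-second, 4-minute, and 5-minute figures come from the notes.

```python
from dataclasses import dataclass
from typing import List, Optional

DETECT_WITHIN_S = 30       # "aware of issues within 30 seconds" (from the notes)
RESOLVE_WITHIN_S = 4 * 60  # "resolution in 4 minutes or less" (from the notes)
SLA_DOWNTIME_S = 5 * 60    # 5 minutes of downtime per PHMSA (from the notes)


@dataclass
class Alert:
    """Hypothetical alert record; the field names are illustrative only."""
    host: str
    location: str
    diagnosis: str
    last_update_age_s: float  # seconds since the SCADA data last updated


def is_stale(age_s: float, threshold_s: float = DETECT_WITHIN_S) -> bool:
    """Flag data that has not updated within the detection window."""
    return age_s >= threshold_s


def email_body(alert: Alert) -> str:
    """Build a prescriptive message: what failed, where, and the diagnosis."""
    return (
        f"Host: {alert.host}\n"
        f"Location: {alert.location}\n"
        f"Diagnosis: {alert.diagnosis}\n"
        f"Data age: {alert.last_update_age_s:.0f} s"
    )


def escalate(contacts: List[str], acked_by: Optional[str]) -> List[str]:
    """Return the contacts notified, in order, stopping at whoever acknowledged.

    If nobody acknowledges, every tier in the chain is notified.
    """
    notified: List[str] = []
    for contact in contacts:
        notified.append(contact)
        if contact == acked_by:
            break
    return notified


def within_sla(detect_s: float, resolve_s: float) -> bool:
    """Check that detection plus resolution fits inside the downtime budget."""
    return detect_s + resolve_s <= SLA_DOWNTIME_S
```

For example, `escalate(["SCADA on-call", "SIG"], acked_by=None)` notifies both tiers because nobody responded. Assuming the 4 minutes is counted from detection (the notes do not say), `within_sla(DETECT_WITHIN_S, RESOLVE_WITHIN_S)` adds 30 s and 240 s to 270 s, which fits inside the 300 s budget.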
Security
Palo Alto Project
Looking for ways to do more with less. Palo Alto SCADA monitoring will deliver data to Splunk
Support VPN Environment
IDS
Also watching SCADA firewalls (between SCADA and Corporate LAN)
Various types of protocols cross the firewall.
Looking at types of protocol data:
Modbus
Fisher-ROC
Allen-Bradley (all 4 versions)
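One simple way to look at the types of protocol data crossing the SCADA firewall is to bucket firewall log records by destination port. The Python sketch below is an illustration only, not the method used at EPD. The record fields (`dest_port`, `action`) and the function names are hypothetical. Ports 502 (Modbus/TCP) and 44818 (EtherNet/IP, used by Allen-Bradley controllers) are the IANA-registered ports. The Fisher-ROC port is an assumed placeholder, since ROC devices use a configurable port, and should be replaced with the value the site actually uses.

```python
from collections import Counter
from typing import Any, Dict, Iterable, List, Mapping

# Port-to-protocol map. 502 and 44818 are IANA-registered; the ROC port is an
# assumption for illustration, because ROC ports are site-configurable.
PROTOCOL_PORTS: Dict[int, str] = {
    502: "Modbus",
    44818: "Allen-Bradley (EtherNet/IP)",
    4000: "Fisher-ROC (assumed port)",
}


def classify(dest_port: int, ports: Mapping[int, str] = PROTOCOL_PORTS) -> str:
    """Map a destination port to a protocol label, or "other" if unknown."""
    return ports.get(dest_port, "other")


def protocol_counts(records: Iterable[Mapping[str, Any]]) -> Counter:
    """Count firewall records per protocol label."""
    counts: Counter = Counter()
    for rec in records:
        counts[classify(int(rec["dest_port"]))] += 1
    return counts


def unexpected(records: Iterable[Mapping[str, Any]]) -> List[Mapping[str, Any]]:
    """Return allowed records whose port is not a known industrial protocol.

    These are candidates for the "alarming activity" review, not proof of it.
    """
    return [
        rec for rec in records
        if rec["action"] == "allow" and classify(int(rec["dest_port"])) == "other"
    ]
```

Port-based labeling is only a first pass: it cannot tell whether the traffic on a port really speaks that protocol, so deeper inspection would still depend on the firewall or IDS data delivered to Splunk.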
Examples of a few alerts we have set up, as well as an email.
Continue to get a handle on Cyber Security in ICS with Enterprise Security
Add host monitoring to correlate with OS level information to help understand system performance.
Also looking into “System” collections to show system status in Splunk
AppEnsure might be usable to look for critical data-processing bottlenecks or control points.