Presented @ 71st Annual Instrumentation and Automation Symposium for the Process Industries, College Station, Texas, January 26-27, 2016
The overall health and security of an industrial control system (ICS) network is currently determined by looking at the negative case. If the network infrastructure devices indicate that all devices are connected and communicating, then the network must be operating correctly. If the controllers indicate that they can communicate with the other devices in the system, then the system must be operating correctly. If the security system is not indicating any security events, then the system must be operating correctly. In each of these cases, the assumption is that the system is operating correctly as long as no device indicates an error or event. In reality, the actual health and security of the system can only be determined from positive conditions. The communication streams need to be measured to confirm that they are operating within limits derived from a desired set of conditions, such as rate and maximum latency. Many controllers track these factors for real-time communications, but they are often recorded only as averages rather than as high-fidelity measurements.
This paper presents an approach to analyzing the real-time network traffic performance of an ICS by measuring the jitter and latency associated with individual network traffic streams in the system. By using statistical and mathematical analysis of the high-fidelity jitter and latency data, a network reliability factor can be determined and used to indicate the health of those traffic streams. The author will present a method to combine the individual network reliability factors into a network reliability monitoring system. Lastly, the author will discuss how network reliability monitoring can be used to indicate potential security problems by observing the network traffic patterns.
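As a rough illustration of the per-stream measurement described above, the sketch below groups captured packets into streams, derives inter-arrival intervals, and computes jitter against an expected interval. The record layout `(timestamp_s, src, dst)`, the function names, and the `in_tolerance_fraction` score are assumptions made for this example; the score is a hypothetical stand-in and not the author's network reliability factor, which the abstract does not define.

```python
from collections import defaultdict

def stream_intervals(packets):
    """Group (timestamp_s, src, dst) records by stream; return inter-arrival intervals."""
    times = defaultdict(list)
    for ts, src, dst in packets:
        times[(src, dst)].append(ts)
    intervals = {}
    for key, stamps in times.items():
        stamps.sort()
        intervals[key] = [b - a for a, b in zip(stamps, stamps[1:])]
    return intervals

def jitter(intervals, expected_s):
    """Deviation of each measured interval from the expected interval."""
    return [iv - expected_s for iv in intervals]

def in_tolerance_fraction(intervals, expected_s, tolerance_s):
    """Hypothetical health score: share of intervals within +/- tolerance of expected."""
    if not intervals:
        return 0.0
    ok = sum(1 for iv in intervals if abs(iv - expected_s) <= tolerance_s)
    return ok / len(intervals)

# Synthetic capture (not measured data): one stream nominally sent every 1 ms,
# with a single packet arriving 0.5 ms late.
packets = [
    (0.0000, "plc", "io"),
    (0.0010, "plc", "io"),
    (0.0020, "plc", "io"),
    (0.0035, "plc", "io"),
    (0.0045, "plc", "io"),
]
per_stream = stream_intervals(packets)
plc_io = per_stream[("plc", "io")]
deviations = jitter(plc_io, expected_s=0.001)
score = in_tolerance_fraction(plc_io, expected_s=0.001, tolerance_s=0.0001)
```

Keeping every interval, rather than a running average, is what allows a single late packet to show up in the result at all.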
Network Reliability Monitoring Using Statistical Modeling and Data Analysis to Measure the Health and Security of ICS
1. Network Reliability Monitoring Using Statistical Modeling and Data Analysis to Measure the Health and Security of ICS
Jim Gilsinn
Kenexis
2. Jim Gilsinn
• Senior Investigator, Kenexis Consulting
– ICS Network & Security Assessments & Designs
– Developer, Dulcet Analytics, Reliability Monitoring Tool
• International Society of Automation (ISA)
– ISA99 Committee, Co-Chair (ISA/IEC 62443 Standard Series)
– ISA99-WG2, Co-Chair (ICS Security Program)
Kenexis
3. Overview
• Introduction
• Communications Method Affects Metrics
• Network Security Monitoring
• Communications in ICS/SCADA Networks
• What Can Network Reliability Monitoring Show?
• When & How to Test
• ICS/SCADA Performance Metrics
• MITM Example
• Summary
4. Introduction
• Determinism is one key requirement for ICS/SCADA
• Determinism can be affected by many factors:
– Individual device performance
– Network performance
– Intra- & inter-system interactions
– Security settings
• Some factors can be planned for
• Some factors need to be measured in place
• Network measurements need to be tailored for ICS/SCADA
5. Communications Method Affects Metrics
• Master/Slave
• Publish/Subscribe
• Report by Exception
6. What is NSM?
• “the collection, analysis, and escalation of indications and warnings to detect and respond to intrusions.”
• “a way to find intruders on your network and do something about them before they damage your enterprise.”
The Practice of Network Security Monitoring, Richard Bejtlich
7. When Won’t NSM Work?
• “…if you can’t observe the traffic that you care about, NSM will not work well.”
• “Node-to-node activity, though, is largely unobserved at the network level.”
The Practice of Network Security Monitoring, Richard Bejtlich
8. Example ICS/SCADA Network: Upper-Level Architecture
• Most Traffic Crosses Zone Boundaries
• Fewer ICS-Specific Protocols
• More Common Platforms
9. Example ICS/SCADA Network: Lower-Level Architecture
• Most Traffic Remains Within Zone
• Mostly ICS-Specific Protocols
• ICS-Specific Platforms
10. So… What Can You See?
[Figure; plot label: Expected Frequency]
• ~1 ms Mean Measured Packet Interval
• ±10 µs Jitter*
• Beat Pattern @ ~30 s
• Total Test ~65 s
*Jitter is Variation From Expected Frequency
11. So… What Can You See?
• OS & application operations
– Garbage collection
– Antivirus checks & updates
– On-screen operator commands
• Network anomalies
– Electromagnetic interference (EMI)
– Signal degradation
– Flaky connections
• Security-related incidents
12. When & How To Test
• Baseline Testing
– FAT, SAT, Commissioning
– After major changes
• Periodic Testing vs. Real-Time Testing
• Automated Testing & Analysis
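One way to picture automated testing against a baseline is sketched below: summary statistics are captured once (for example at FAT, SAT, or commissioning) and later captures are compared against them. The function names, the microsecond units, and the thresholds `mean_tol_us` and `stdev_factor` are all assumptions chosen for illustration, not values taken from the presentation.

```python
import statistics

def summarize(intervals_us):
    """Summary statistics for a list of packet intervals in microseconds."""
    return {
        "mean": statistics.mean(intervals_us),
        "min": min(intervals_us),
        "max": max(intervals_us),
        "stdev": statistics.pstdev(intervals_us),
    }

def compare_to_baseline(baseline, current, mean_tol_us=5.0, stdev_factor=2.0):
    """Return a list of findings; an empty list means nothing was flagged.

    The thresholds are hypothetical and would need tuning per system.
    """
    findings = []
    if abs(current["mean"] - baseline["mean"]) > mean_tol_us:
        findings.append("mean interval shifted")
    if current["stdev"] > stdev_factor * baseline["stdev"]:
        findings.append("jitter spread increased")
    return findings

# Synthetic data: a quiet baseline and a later capture with two disturbed packets.
baseline = summarize([1000, 1002, 998, 1001, 999])
current = summarize([1000, 1040, 960, 1001, 999])
findings = compare_to_baseline(baseline, current)
```

In this synthetic case the mean is unchanged at 1000 µs, so an average-only check would pass, while the spread check flags the change.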
13. ICS/SCADA Performance Metrics
• Easy
– Mean
– Minimum
– Maximum
• Medium
– Standard Deviation
• More Complex and/or Compute Intensive
– FFT
– Convolution
– Correlation
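The metric tiers above can be sketched roughly as follows. The easy and medium metrics use Python's `statistics` module; for the frequency-domain tier, a naive O(n²) discrete Fourier transform stands in for an FFT so the example needs no third-party library. The function names and the synthetic input (per-window mean intervals sampled every 0.5 s, carrying a 30 s periodic component) are assumptions for illustration only.

```python
import cmath
import math
import statistics

def basic_metrics(samples):
    """Easy and medium metrics: mean, minimum, maximum, standard deviation."""
    return {
        "mean": statistics.mean(samples),
        "min": min(samples),
        "max": max(samples),
        "stdev": statistics.pstdev(samples),
    }

def dominant_period(samples, sample_spacing_s):
    """Period in seconds of the strongest non-DC component (naive DFT, not an FFT)."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove the DC offset first
    best_k, best_mag = 0, 0.0
    for k in range(1, n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(centered))
        if abs(acc) > best_mag:
            best_k, best_mag = k, abs(acc)
    if best_k == 0:
        return None  # constant input: no periodic component found
    return n * sample_spacing_s / best_k

# Synthetic series: 120 samples spaced 0.5 s apart (60 s total), nominal
# 1000 us interval with a +/-10 us sinusoidal component of period 30 s.
spacing_s = 0.5
samples = [1000 + 10 * math.sin(2 * math.pi * (i * spacing_s) / 30)
           for i in range(120)]
metrics = basic_metrics(samples)
period = dominant_period(samples, spacing_s)
```

The mean of this series stays at about 1000 µs, so the periodic component is visible only in the spread and in the frequency-domain result. For a full-rate capture (tens of thousands of intervals) the naive transform would be too slow, and a real FFT routine such as `numpy.fft.rfft` would be the usual choice.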
15. Summary
• NSM is good
– If you are doing it, great
– If not, maybe you should
• NSM can’t detect everything, especially for ICS/SCADA networks
• There are ways to measure network reliability in the lower layers
– ICS/SCADA networks are particularly well suited to this
– Relatively simple metrics are good enough to start
• Testing can show more than just security events