IT Operations Analytics: Using Anomaly Detection (Unsupervised Machine Learning) to Distinguish Normal from Abnormal Behavior and Improve the Efficiency of SIEM Detection and Alerting
2. Who am I?
• Santisook Limpeeticharoenchot.
• Telecom Engineering, Business & Economics.
• 16 years ago: Network Engineer. Implemented network and security solutions for ISP/Telco, Bank, State Enterprise and Government.
• 8 years ago: Managed Service Network & Outsourcing, Sales & Business Development.
• 5 years ago: Started Machine Data Analytics.
• Current: Sales Director @ Stelligence Co., Ltd., an Operational Intelligence, Big Data & IT Operations Analytics company.
• Interested in: Big Data, Network & Security, Innovation & Entrepreneurship, Math, BizModel, StartupEcosystem, …
4. What are CIO priorities?
• SLA: Proactive alerting and troubleshooting
• User experience: Performance monitoring, trending and tuning
• Data security: Detect abnormal behaviors and data exfiltration
• Business intelligence and analytics: Understanding demographics, behaviors and patterns
5. Forces driving the need for Operations Analytics
• More data, more complexity, new technology and new attacks
• Dynamic environments, big impact, and a need for highly skilled resources
• Lack of complete visibility
• Need for actionable information
7. How to get visibility?
http://www.datacenterjournal.com/time-analytics-delivers-operations/
8. Big Data Anywhere
“89% of business leaders believe Big Data will revolutionize business
operations in the same way the Internet did”
“83% have pursued Big Data projects in order to seize a competitive
edge”
“Global Big Data and Analytics market will reach $125B in hardware,
software and services revenue this year”
“Banking, communications, media, utilities and wholesale trade
increased their use of Big Data analytics the most in the last 12
months”
10. Big Data Anywhere
BIG DATA "USE CASES" WITHIN BUSINESS
48% Customer Analytics
21% Operational Analytics
12% Fraud and Compliance
10% New Product & Service Innovation
10% Enterprise Data Warehouse Optimization
Source: Datameer, Big Data: A Competitive Weapon for the Enterprise.
11. ITOA is IT Operations' next big thing
The Hype Cycle places ITOA 'On the Rise' and expects it to accelerate into mainstream IT operations in the next few years, with the emergence of an entire category of IT Operations Analytics products and services.
Source: Hype Cycle for IT Infrastructure Availability and Performance Management, 2015
ITOA refers to a set of processes and technologies that:
• Help discover complex patterns in high volumes of IT system usage and performance data
• Help identify problems and system behaviors faster, so that issues can be rectified before they escalate
• Automate the process of collecting, organizing, analyzing, and identifying patterns in a highly distributed, diverse and continuously changing application data environment
• Ensure improved IT system performance and continuity
• Rely heavily on Big Data analytics
12. What is ITOA?
• Streamline data analysis, automate correlations, and increase productivity
• React quickly to events / data generated by infrastructure, software, services, user devices
• Optimize service levels and workload allocations
• Unleash innovation and create business value
13. Top Benefits Expected from ITOA
Users look to operations analytics to yield (Source: IDC):
• 67%: Improved IT staff productivity
• Better IT infrastructure utilization and optimization
• 53%: Improved infrastructure availability and reduced downtime
• 51%: Improved application code quality and defect reduction
• Better application performance service levels
14. ITOA is game changing.
Unleashing Innovation and Business Value with
“IT OPERATIONS ANALYTICS”
http://www.itoa-landscape.org/
15. Analytics Required in the Security Domain
Data breaches are detected late or go undetected.
• Move from descriptive to predictive analysis
• Top-N reports -> Unsupervised machine learning
• Static thresholds -> Dynamic thresholds
• Predefined correlation rules -> Automatic anomaly detection
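The move from a static to a dynamic threshold can be sketched as follows. This is a minimal illustration, not the method of any particular product: the function names, the rolling window of 8 samples, the multiplier k = 6 and the sample values are all assumptions made for the example. The dynamic limit is derived from the rolling median and the median absolute deviation (MAD) of each host's own recent history.

```python
from statistics import median

def static_alerts(values, limit):
    """Indexes where the value exceeds one fixed limit."""
    return [i for i, v in enumerate(values) if v > limit]

def dynamic_alerts(values, window=8, k=6.0):
    """Indexes where the value exceeds median + k * MAD of the previous `window` points."""
    alerts = []
    for i in range(window, len(values)):
        recent = values[i - window:i]
        center = median(recent)
        mad = median([abs(v - center) for v in recent])
        # Guard against a zero MAD on perfectly flat history.
        limit = center + k * max(mad, 1e-9)
        if values[i] > limit:
            alerts.append(i)
    return alerts

# Hypothetical request counts per interval for two hosts.
quiet_host = [100, 102, 98, 101, 99, 103, 97, 100, 400, 101, 99]
busy_host = [1000, 1010, 990, 1005, 995, 1008, 992, 1001, 1003, 998, 1002]

# A single static limit of 500 misses the spike on the quiet host
# and fires on every sample of the busy host.
static_quiet = static_alerts(quiet_host, 500)
static_busy = static_alerts(busy_host, 500)

# The dynamic limit adapts to each host's own baseline.
dynamic_quiet = dynamic_alerts(quiet_host)
dynamic_busy = dynamic_alerts(busy_host)
```

In this toy data the static rule produces no alert for the quiet host and eleven for the busy one, while the dynamic rule flags only the spike at index 8 on the quiet host.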
16. Major Issues in Detection and Response
Source: Analytics and Intelligence Survey 2014, a SANS Survey, written by David Shackleford, October 2014, p8
Source: Advanced Threat Detection with Machine-Generated Intelligence, Ponemon Institute, September 2015
20. Probabilistic Modeling and Analysis
• Not just the simple “Bell Curve” (average, stddev) that other techniques use
• Sophisticated machine-learning techniques best-fit the right statistical model for your data
• Bayesian distribution modeling, time-series decomposition, clustering, and correlation analysis
• Better models = better outlier detection = fewer false alarms
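Why the choice of model matters can be shown with a small worked example. The sketch below is an assumption-laden illustration, not the modeling used by any specific tool: it compares a bell-curve (Gaussian) model with a Poisson model on hypothetical low-volume error counts, using an assumed alarm level of 1%.

```python
import math
from statistics import mean, pstdev

def gaussian_tail(x, mu, sigma):
    """P(X >= x) under a normal model with mean mu and standard deviation sigma."""
    return 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2)))

def poisson_tail(k, lam):
    """P(X >= k) under a Poisson model with rate lam, for integer k >= 0."""
    cdf_below_k = sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k))
    return 1.0 - cdf_below_k

# Hypothetical history of errors per minute (small, skewed counts).
history = [1, 2, 0, 3, 2, 1, 4, 2, 3, 2, 1, 3]
observed = 5
alarm_level = 0.01  # assumed: alert when the observation has < 1% probability

mu = mean(history)
sigma = pstdev(history)

p_gauss = gaussian_tail(observed, mu, sigma)   # bell-curve view of the same data
p_poisson = poisson_tail(observed, mu)         # count-data view of the same data

gauss_alarm = p_gauss < alarm_level
poisson_alarm = p_poisson < alarm_level
```

For these counts the Gaussian model treats a value of 5 as rarer than the alarm level and raises an alert, while the Poisson model, which fits skewed count data better, does not. This is one concrete way a better-fitting model reduces false alarms.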
21. Models Matter
• Simple models miss real outliers
[Figure: observed values plotted against a Gaussian model; outliers have a <0.01% chance likelihood]
• Automatic models with “Detectors”:
Deviations in Counts or Values = “responsetime by host”, “count by error_type”
Rare Events = “rare by EventID”, “rare by process”
Unusual vs. peers = “sum(bytes) over client_ip”
22. Use Case 1:
Find metric deviations in time series
• Automatic periodicity detection
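One way to picture automatic periodicity handling is sketched below. It is only an illustration under stated assumptions: the period is estimated as the first positive local peak of the autocorrelation function, and a point is flagged when it differs from the median of the other points at the same phase by more than an assumed fixed margin. The function names, the synthetic series and the margin of 10 are hypothetical.

```python
import math
from statistics import median

def autocorrelation(values, lag):
    """Biased sample autocorrelation of the series at the given lag."""
    n = len(values)
    avg = sum(values) / n
    centered = [v - avg for v in values]
    denom = sum(c * c for c in centered)
    num = sum(centered[t] * centered[t + lag] for t in range(n - lag))
    return num / denom

def dominant_period(values, max_lag):
    """First lag >= 2 where the autocorrelation has a positive local peak, or None."""
    acf = [autocorrelation(values, lag) for lag in range(max_lag + 2)]
    for lag in range(2, max_lag + 1):
        if acf[lag] > 0 and acf[lag] > acf[lag - 1] and acf[lag] >= acf[lag + 1]:
            return lag
    return None

def seasonal_deviations(values, period, margin):
    """Indexes whose value differs from the same-phase median by more than margin."""
    flagged = []
    for t, v in enumerate(values):
        same_phase = [values[i] for i in range(t % period, len(values), period) if i != t]
        if abs(v - median(same_phase)) > margin:
            flagged.append(t)
    return flagged

# Synthetic metric with a 24-sample cycle and one injected spike at t = 100.
series = [10 + 5 * math.sin(2 * math.pi * t / 24) for t in range(240)]
series[100] += 20

period = dominant_period(series, max_lag=60)
anomalies = seasonal_deviations(series, period, margin=10)
```

Because each point is compared with its own phase of the cycle, the regular daily peaks are not flagged; only the injected spike is.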
23. Use Case 2:
Find Important IDS/IPS Events
Challenge:
How do you find the signs of advanced threats amid thousands of daily high-severity alerts?
• The difficulty of creating effective rules results in a high false positive rate
• Advanced Evasion Techniques (AETs) are well known to attackers
24. Use Case 2:
Find Important IDS/IPS Events
Solution:
Let machine learning filter out normal ‘noise’ and identify unusual counts, signatures, protocols and destinations by source
• Anomaly Detective generates a dozen or so alerts per week
• Accuracy & alert detail enable faster determination of threat level
“I like AD because I haven’t had to tune a single IDS rule since it was deployed.”
- Craig Merchant, Senior Security Architect, Oracle
25. Use Case 3:
Detect DNS Tunneling Activity
Challenge:
How do you detect DNS tunneling (C2, data exfiltration or other abuses of DNS)?
• Encrypted messages disguised as subdomains can contain control or data payloads
• Insufficient monitoring of DNS for ‘tunneling’ activity poses a significant risk
[Figure: “Deviations in Counts or Values” detector; calculated information content = 3126]
26. Use Case 3:
Detect DNS Tunneling Activity
“What impresses me about Anomaly Detective is its ability to automatically find anomalous behavior in machine data by relying on trends in the data itself instead of hard-coding rules.”
- Peter Davis, CTO, Turnberry Solutions
Solution:
Detect anomalies in DNS query subdomain characteristics
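The slides do not say how the information content of a subdomain is calculated. A common heuristic, used here purely as an assumption, is Shannon entropy per character multiplied by label length, summed per queried domain. The function names, domains and query strings below are hypothetical.

```python
import math
from collections import Counter, defaultdict

def shannon_entropy(text):
    """Shannon entropy of the characters in text, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    n = len(text)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def information_content(subdomain):
    """Rough information content of one subdomain label, in bits (assumed heuristic)."""
    return shannon_entropy(subdomain) * len(subdomain)

def score_by_domain(queries):
    """Sum the information content of all subdomains seen for each domain."""
    totals = defaultdict(float)
    for subdomain, domain in queries:
        totals[domain] += information_content(subdomain)
    return dict(totals)

# Hypothetical (subdomain, registered domain) pairs parsed from DNS logs.
queries = [
    ("www", "example.com"),
    ("mail", "example.com"),
    ("api", "example.com"),
    ("a9f3k2lq0z8x7c6v5b4n3m2p", "example.net"),
    ("q1w2e3r4t5y6u7i8o9p0asdf", "example.net"),
    ("zx81mnb27vcx93lkj45hgf60", "example.net"),
]

scores = score_by_domain(queries)
```

Long, random-looking subdomains carry far more bits than ordinary host names, so a domain used for tunneling accumulates a much higher score. That per-domain (or per-client) score is the kind of value an anomaly detector can then track for deviations.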
27. Use Case 4:
Rare Items as Anomalies
• Learn typical processes on each host
• Find rare processes that “start up and communicate”
Detector type: Rare Events
28. Use Case 4:
Rare Items as Anomalies
Finds an FTP process running for 3 hours on a system that doesn’t normally run it
Detector: “rare by process”
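The “rare by process” idea can be sketched as a per-host frequency profile. This is a simplified illustration, not the product's algorithm: the function names, the 1% rarity cutoff and the sample events are assumptions.

```python
from collections import Counter, defaultdict

def learn_profiles(events):
    """Count how often each process was seen on each host during a baseline period."""
    profiles = defaultdict(Counter)
    for host, process in events:
        profiles[host][process] += 1
    return profiles

def rare_processes(profiles, new_events, max_share=0.01):
    """Return (host, process) pairs whose share of that host's baseline is <= max_share."""
    rare = []
    for host, process in new_events:
        seen = profiles.get(host, Counter())
        total = sum(seen.values())
        share = seen[process] / total if total else 0.0
        if share <= max_share:
            rare.append((host, process))
    return rare

# Hypothetical baseline of (host, process) observations.
baseline = (
    [("db01", "sshd")] * 50 + [("db01", "postgres")] * 50 +
    [("web01", "nginx")] * 80 + [("web01", "sshd")] * 20 +
    [("ftp01", "ftp")] * 30
)
profiles = learn_profiles(baseline)

new_events = [("db01", "postgres"), ("db01", "ftp"), ("web01", "nginx"), ("ftp01", "ftp")]
flagged = rare_processes(profiles, new_events)
```

The FTP process is flagged on db01, where it has never been seen, but not on ftp01, where it is routine. Rarity is judged against each host's own history rather than a global list.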
29. Use Case 5:
Population / Peer Outliers
• Host sending 20,000 requests/hour
• Attempt to hack an IIS web server
Detector: “sum(bytes) over host” (Unusual vs. peers)
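A population or peer comparison can be sketched with a robust score against the other hosts. The example below is an illustrative assumption, not the product's implementation: it uses the modified z-score (0.6745 × deviation from the median / MAD) with a cutoff of 3.5, and the host addresses and request counts are made up.

```python
from statistics import median

def peer_outliers(totals, cutoff=3.5):
    """Hosts whose total is far above the peer median, using a MAD-based score."""
    values = list(totals.values())
    center = median(values)
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        # All peers look identical; this sketch reports nothing in that case.
        return []
    outliers = []
    for host, value in totals.items():
        score = 0.6745 * (value - center) / mad
        if score > cutoff:
            outliers.append(host)
    return outliers

# Hypothetical requests per hour, keyed by host.
requests_per_hour = {
    "10.0.0.1": 180,
    "10.0.0.2": 220,
    "10.0.0.3": 205,
    "10.0.0.4": 190,
    "10.0.0.5": 20000,
    "10.0.0.6": 210,
}

unusual_hosts = peer_outliers(requests_per_hour)
```

The median and MAD are used instead of the mean and standard deviation so that the outlier itself does not distort the baseline it is compared against.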
30. Adding Value to existing SIEM
• Better results than threshold-based searches
  • Example: “Unusual AD access”
    • SIEM: 148 notables/day
    • Anomaly Detection: 2 significant anomalies/week (roughly 500x reduction: 148 × 7 = 1,036 notables/week vs. 2)
  • Example: “Proxy Data Exfiltration”
    • SIEM: sum(bytes_out) > 10 MB => 50,000 notables/month
    • Anomaly Detection: 12 significant anomalies, including exfiltrations <10 MB
• More sophisticated anomaly detection
  • Example: DNS Tunneling, Malware Command & Control Activity
Value: Less time/effort for humans to triage
Value: Reduced risk by detecting APTs, malware and rogue users that would otherwise go unnoticed
31. Additional Security Applications
No. | Threat Indicator Category | Identify… | …By Finding Anomalies In
1 | Data Exfiltration | Credit card numbers, Electronic Health Records being stolen | Firewall Logs, Web Proxy Logs, Secure Web Gateway Logs, DNS Logs
2 | Malware Command & Control Activity | Infected systems beaconing | Web Proxy Logs, DNS Request Logs, Firewall Logs
3 | Suspicious Account Activity | New account creation, privilege changes | Server, Directory Logs, Audit Logs
4 | Unauthorized Login Attempts/Activity | Smart brute force attacks | Server, Directory Logs, Audit Logs
5 | Compromised Endpoints | Spreading malware internally | EDR/AV logs, Netflow records
6 | Suspicious Server Behaviors | New bit torrents, chat rooms, file services | Process starts, network connections
7 | Unusual IDS/IPS Events | Unusual security events from security tools | IDS/IPS/IDP/NGFW logs
8 | Unusual Network Activity | Launching DDoS attack, excessive DNS requests | Firewall Logs, Web Proxy Logs, Secure Web Gateway Logs, Netflow records, DPI Logs
9 | Abusive/Attacking IP Addresses | External data scrapers, internal snoopers | Firewall Logs, Web Proxy Logs, Secure Web Gateway Logs, Netflow records, DPI Logs
10 | Disabled/Interrupted Logging | Attempts to hide tracks | All types of log data
32. SANS: “Organizations need to understand their environment and what constitutes normal and abnormal behavior, train staff on how to use analytic tools and define the data they need to collect.” [1]
[1] Analytics and Intelligence Survey 2014, a SANS Survey, written by David Shackleford, October 2014, p8. http://www.sans.org/reading-room/whitepapers/analyst/analytics-intelligence-survey-2014-35507
[2] http://digital-forensics.sans.org/media/poster_2014_find_evil.pdf
33. Summary: Advantages of ITOA
• Reduces mean time to repair (MTTR) and avoids downtime
• Increases insight into the correlation of end-user interaction and business activity
• Reduces operations cost through the efficient use of skilled personnel
• Applies pattern- and statistics-based algorithms
• Helps in extracting meaningful information