Ashrith Barthur, Security Scientist, H2o.ai, at MLconf 2017

H2
O.ai
Machine Intelligence
Identifying Network Attack Behaviour
Ashrith Barthur, PhD
Principal Security Scientist, H2O.ai
Cybersecurity and Fraud

H2
O.ai
Classification of Attacks
• Short Term Goals
o DDoS - Layer 2 and above
o Physical Attacks
• Long Term Goals
o Network/Service Reconnaissance
o Enterprise Service attacks - attack on infrastructure
o Phishing, Spear Phishing (more focussed)
o Social Engineering - Out-of-loop
•

H2
O.ai
Anatomy of An Attack - Short Term
• Identify Target
• Identify Service of Attack
• Overwhelm the service
• Post-Attack Analysis
o Attack mechanism is simple.
o Variations occur in source of attack, protocols levels.
o Relatively short lived.
o Damage quantifiable.

H2
O.ai
Anatomy of An Attack - Long Term
• Identify Target
• Reconnaissance
• Identify Infrastructure Vulnerability / Or means of phishing
• Network Foothold
• Lateral movement and service compromises
• Data Exfiltration/ Network Squatting, or passive sniffing.

H2
O.ai
Anatomy of An Attack - Long Term (cont)
• Post-Attack Analysis (Usually an Illusion)
o Attack might still continue
o Variations can occur based on services, new vulnerabilities, new softwares,
unused access, network segments without VLANs, un-closed, outdated wall
sockets, etc.
o Usually very long term
o Damage assessment is not usually accurate.
•

H2
O.ai
Does This Mean We Focus on Long
Term Attacks?

H2
O.ai
Why?
• Short Term Attack and Long Term Attacks are usually used together.
• Short Term Attacks are used as:
o A means of Reconnaissance
o A method of shielding another attack, or breaking down some basic protection
before an attack is launched.
o It is also used to shield any detection of data exfiltration
• Therefore it is imperative that you look at all the attacks on your networks.
• And look at as connected in time and events.

H2
O.ai
Solutions do exist. Right?
1. Yes. Solutions do exist that correlate events
2. But are limited
3. They are purely rule-based, and mostly stateless.
4. Hardly capable of smartly identifying events related across
time. - A must for identifying long term attacks.

H2
O.ai
Looking at events on the network and
interconnecting them gives more
information (depth of Info)

H2
O.ai
“So You want us to get the NSA SoC?”

H2
O.ai
No, just build smarter Algorithms.
Use Machine Learning - or AI (fancy)

H2
O.ai
CSec Solution Evolution
Rule-based
Model
Feature-base
d Model
Pure Data Driven
Model

H2
O.ai
Features - (Used in Feature-based Model)
1. Features are meta data (Extracted from the data)
2. They help algorithms capture information from the data.
3. Feature engineering is a form of language translation: Between raw data
and the algorithm.
4. Build much better features for your supervised models.

H2
O.ai
Source of Features
1. Past Attack Behaviour
2. Enterprise Traffic Behaviour
3. Enterprise Application/User Profiles
a. Syslogs
b. ASA, IDS, IPS, Suricata,Bro, Argus, etc.

H2
O.ai
Features - Example
1. Average length of connection (too small, too large)
2. Average number of DNS requests (within network/outside network)
3. Average number of new domains
4. Change in MTU ratio vs. Windows/Mac/*Nix machine churn.
5. Packet Utilization - segmentation
6. Window Size
7. Arrival Jitter Variance

H2
O.ai
Features - Example
average tcp connect length
7 Days

H2
O.ai
Features: Advantages
1. Designed Features Highlight Transactional Behaviour
2. Features Continuously Track Network’s Transactional Behaviour
3. Rules Variables can only Identify Threshold Changes

H2
O.ai
Feature-based Model: Advantages
1. Uses AI - artificial intelligence
2. AI with features uses a consistent and objective approach
3. Quick classification
4. Low false positive rate - tweaked based on risk appetite.

H2
O.ai
Successfully Classified
1. DoS
2. Recon
3. Exploits
4. CnC Traffic

H2
O.ai
I will believe all this after you tell me
where you got your labels from!

H2
O.ai
Labeling
Different Methods of Labeling
1. Completely Manual
2. Assisted with Feature Growth and Unsupervised

H2
O.ai
Manual Labeling
Logs
Information
Analytical Inputs:
1. Behavioural Input
2. Univariate Alert score
3. Threat score
Suspicious
Not Suspicious

H2
O.ai
Super slow! Enterprise might not exist by
the time all traffic gets tagged.

H2
O.ai
Assisted Labeling
H2O Unsupervised
Algorithm
1. Features
SoC Analyst
Clustering Output Sampling
Clustering output labeling
Clustering Classification
Output
Logs/Pcap
1. Algo tuning

H2
O.ai
Model Deployment
Information from
rule-based system/Raw
Packet Capture
Not
Suspicious
H2O Machine
Learning Algorithm
Suspicious
1. Traffic logs
2. Pcap Info
3. Alert systems

H2
O.ai
Model Implementation
1. Currently used to assist IDS/IPS systems in decision
making.
2. Also used to alert SoC analysts to further investigate the
incident.

H2
O.ai
Pure Data-driven Model
Not a suspicious
transaction
H2O Machine
Learning - Complex
Algorithm
Suspicious
Transaction
Logs/Information

H2
O.ai
Thank You
Questions?

Ashrith Barthur, Security Scientist, H2o.ai, at MLconf 2017

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Viewers also liked

Viewers also liked (20)

Similar to Ashrith Barthur, Security Scientist, H2o.ai, at MLconf 2017

Similar to Ashrith Barthur, Security Scientist, H2o.ai, at MLconf 2017 (20)

More from MLconf

More from MLconf (20)

Recently uploaded

Recently uploaded (20)

Ashrith Barthur, Security Scientist, H2o.ai, at MLconf 2017