
Ashrith Barthur, Security Scientist, at MLconf 2017



Machine Learning Based Attack Vector Modeling for CyberSecurity:
Connections have behavioural patterns that are specific to protocols, loads, window sizes, bandwidth, and above all the type of traffic. A CDN enterprise behaves very differently from a cloud service company, and both differ from a corporation. The attack vectors and attack landscapes therefore differ across these environments as well. In this talk we discuss modeling different kinds of attacks and building a model that identifies them using ML.
The method is to build profiles from many variables that identify attacks of different kinds specifically yet robustly. The variables are specific to the business, the network profile, and the traffic, and they are both high-level (i.e. aggregate) and packet-level. This lets the machine learning models pick up on constant variations in traffic and identify these attacks. With H2O, the analysis does not have to end with research on the traffic and an "Oh, so this is what it was" moment. The models can be deployed as code alongside existing IDS and IPS, or as highly optimized, independent programs that handle high throughput, at a rate of 1.2 million decisions per second, making this one of the fastest implementations of ML to identify, defend, and protect critical infrastructure that is potentially under threat.
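The aggregate and packet-level variables mentioned above can be illustrated with a minimal sketch. It is not the speaker's implementation: the `Flow` record, its field names, and the choice of four features are assumptions made for illustration, loosely following the feature examples listed in the slides (connection length, DNS requests, new domains, window size).

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass
class Flow:
    """One connection record. All field names are hypothetical."""
    src: str
    duration_s: float           # connection length in seconds
    window_size: int            # TCP window size observed on the flow
    dns_domain: Optional[str]   # queried domain if the flow is a DNS request


def aggregate_features(flows, known_domains):
    """Summarise one time window of flows into aggregate features."""
    if not flows:
        raise ValueError("need at least one flow")
    # Distinct domains queried in this window.
    domains = {f.dns_domain for f in flows if f.dns_domain is not None}
    return {
        "avg_conn_length_s": mean(f.duration_s for f in flows),
        "avg_window_size": mean(f.window_size for f in flows),
        "dns_requests": sum(1 for f in flows if f.dns_domain is not None),
        # Domains not seen in earlier windows.
        "new_domains": len(domains - set(known_domains)),
    }
```

Computing such a dictionary per time window, for example per host or per network segment, gives a feature row that a supervised or unsupervised model could consume.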

Published in: Technology


  1. H2 Machine Intelligence: Identifying Network Attack Behaviour. Ashrith Barthur, PhD, Principal Security Scientist, Cybersecurity and Fraud
  2. Classification of Attacks
       • Short Term Goals
           o DDoS - Layer 2 and above
           o Physical Attacks
       • Long Term Goals
           o Network/Service Reconnaissance
           o Enterprise Service attacks - attack on infrastructure
           o Phishing, Spear Phishing (more focussed)
           o Social Engineering - Out-of-loop
  3. Anatomy of An Attack - Short Term
       • Identify Target
       • Identify Service of Attack
       • Overwhelm the service
       • Post-Attack Analysis
           o Attack mechanism is simple.
           o Variations occur in source of attack, protocol levels.
           o Relatively short lived.
           o Damage quantifiable.
  4. Anatomy of An Attack - Long Term
       • Identify Target
       • Reconnaissance
       • Identify Infrastructure Vulnerability, or means of phishing
       • Network Foothold
       • Lateral movement and service compromises
       • Data Exfiltration / Network Squatting, or passive sniffing.
  5. Anatomy of An Attack - Long Term (cont)
       • Post-Attack Analysis (Usually an Illusion)
           o Attack might still continue
           o Variations can occur based on services, new vulnerabilities, new software, unused access, network segments without VLANs, un-closed, outdated wall sockets, etc.
           o Usually very long term
           o Damage assessment is not usually accurate.
  6. Does This Mean We Focus on Long Term Attacks?
  7. Why?
       • Short Term Attacks and Long Term Attacks are usually used together.
       • Short Term Attacks are used as:
           o A means of Reconnaissance
           o A method of shielding another attack, or breaking down some basic protection before an attack is launched.
           o A way to shield data exfiltration from detection
       • Therefore it is imperative that you look at all the attacks on your networks.
       • And look at them as connected in time and events.
  8. Solutions do exist. Right?
       1. Yes. Solutions do exist that correlate events
       2. But they are limited
       3. They are purely rule-based, and mostly stateless.
       4. Hardly capable of smartly identifying events related across time - a must for identifying long term attacks.
  9. Looking at events on the network and interconnecting them gives more information (depth of Info)
  10. “So You want us to get the NSA SoC?”
  11. No, just build smarter Algorithms. Use Machine Learning - or AI (fancy)
  12. CSec Solution Evolution: Rule-based Model → Feature-based Model → Pure Data Driven Model
  13. Features - (Used in Feature-based Model)
       1. Features are metadata (extracted from the data)
       2. They help algorithms capture information from the data.
       3. Feature engineering is a form of language translation: between raw data and the algorithm.
       4. Build much better features for your supervised models.
  14. Source of Features
       1. Past Attack Behaviour
       2. Enterprise Traffic Behaviour
       3. Enterprise Application/User Profiles
           a. Syslogs
           b. ASA, IDS, IPS, Suricata, Bro, Argus, etc.
  15. Features - Example
       1. Average length of connection (too small, too large)
       2. Average number of DNS requests (within network/outside network)
       3. Average number of new domains
       4. Change in MTU ratio vs. Windows/Mac/*Nix machine churn.
       5. Packet Utilization - segmentation
       6. Window Size
       7. Arrival Jitter Variance
  16. Features - Example [Plot: average TCP connect length, 7 days]
  17. Features: Advantages
       1. Designed Features Highlight Transactional Behaviour
       2. Features Continuously Track Network’s Transactional Behaviour
       3. Rule Variables can only Identify Threshold Changes
  18. Feature-based Model: Advantages
       1. Uses AI - artificial intelligence
       2. AI with features uses a consistent and objective approach
       3. Quick classification
       4. Low false positive rate - tweaked based on risk appetite.
  19. Successfully Classified
       1. DoS
       2. Recon
       3. Exploits
       4. CnC Traffic
  20. I will believe all this after you tell me where you got your labels from!
  21. Labeling: Different Methods of Labeling
       1. Completely Manual
       2. Assisted with Feature Growth and Unsupervised
  22. Manual Labeling [Diagram. Logs/Information and analytical inputs (1. Behavioural Input 2. Univariate Alert score 3. Threat score) are labeled Suspicious or Not Suspicious]
  23. Super slow! Enterprise might not exist by the time all traffic gets tagged.
  24. Assisted Labeling [Diagram. Elements: Logs/Pcap, Features, H2O Unsupervised Algorithm, Clustering, Clustering Output, Sampling, SoC Analyst, Clustering output labeling, Algo tuning, Classification Output]
  25. Model Deployment [Diagram. Information from rule-based system / raw packet capture (1. Traffic logs 2. Pcap Info 3. Alert systems) feeds the H2O Machine Learning Algorithm, which outputs Suspicious or Not Suspicious]
  26. Model Implementation
       1. Currently used to assist IDS/IPS systems in decision making.
       2. Also used to alert SoC analysts to further investigate the incident.
  27. Pure Data-driven Model [Diagram. Logs/Information feed the H2O Machine Learning - Complex Algorithm, which outputs Suspicious Transaction or Not a suspicious transaction]
  28. Thank You. Questions?
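
The assisted labeling idea in slide 24 (cluster the traffic, sample the clustering output, let a SoC analyst label the samples) can be sketched roughly as follows. This is an illustration under stated assumptions, not the workflow the talk deployed: a small pure-Python k-means stands in for the H2O unsupervised algorithm, the `analyst` callback is a hypothetical stand-in for a human analyst, and propagating the majority label to each cluster is one possible design choice.

```python
import random
from collections import Counter


def _dist2(p, q):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    """Tiny k-means; returns one cluster index per point."""
    # Farthest-point initialisation: deterministic and spreads the centroids.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(_dist2(p, c) for c in centroids))
        )
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: _dist2(p, centroids[c])) for p in points
        ]
        # Move each centroid to the mean of its members (skip empty clusters).
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(d) / len(members) for d in zip(*members))
    return assignment


def assisted_labels(points, k, analyst, samples_per_cluster=2, seed=0):
    """Cluster, have the analyst label a few samples per cluster,
    then give every point the majority label of its cluster."""
    assignment = kmeans(points, k)
    rng = random.Random(seed)
    cluster_label = {}
    for c in sorted(set(assignment)):
        members = [p for p, a in zip(points, assignment) if a == c]
        sample = rng.sample(members, min(samples_per_cluster, len(members)))
        votes = Counter(analyst(p) for p in sample)
        cluster_label[c] = votes.most_common(1)[0][0]
    return [cluster_label[a] for a in assignment]
```

The analyst only reviews `samples_per_cluster` items per cluster instead of every record, which is the point of the slide's contrast with completely manual labeling. The resulting labels could then be used to train a supervised classifier.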