Detecting Malicious Code in TLS-Encrypted Traffic (Without Decryption)
2. What is this talk about?
This talk reviews the work of a group of Cisco researchers showing that traditional statistical and behavioral analysis methods can be used to detect and attribute malware that uses TLS to encrypt its communication channels, without decrypting or compromising the TLS session.
3. © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public
People
Blake Anderson – Technical Leader
PhD in Computer Science (Machine Learning)
Started at Cisco in 2015
David McGrew – Cisco Fellow
PhD in Physics (Chaos Theory)
Started at Cisco in 1998
BRKSEC-2809 3
4. Problem: Encrypted Malware Traffic
• TLS usage increasing (this is not bad!)
• String-matching solutions are less effective
• MITM problems
• Privacy, legal, deployment, expense, non-cooperating clients
5. Our Approach: Leverage All Data Features
Netflow data: SrcIP, DstIP, SrcPort, DstPort, Proto, #Bytes, #Packets
Intraflow data: Packet Lengths & Inter-arrival Times, Byte Distribution, …
TLS metadata: Extensions, Ciphersuites, SNI, Certificate Strings, …
DNS data: Names and TTLs of linked responses
HTTP data: Headers and File Magic for other flows from target
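The feature groups on this slide can be pictured as one enriched flow record. The sketch below is a hypothetical Python container, not the schema used by the authors or by joy; every field name is an assumption chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class FlowRecord:
    """Hypothetical enriched flow record; all field names are illustrative."""
    # NetFlow-style fields named on the slide.
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: int
    num_bytes: int = 0
    num_packets: int = 0
    # Intraflow data: packet lengths, inter-arrival times, byte distribution.
    packet_lengths: List[int] = field(default_factory=list)
    inter_arrival_ms: List[int] = field(default_factory=list)
    byte_counts: Dict[int, int] = field(default_factory=dict)
    # TLS metadata visible in the unencrypted handshake.
    offered_ciphersuites: List[int] = field(default_factory=list)
    extensions: List[int] = field(default_factory=list)
    sni: Optional[str] = None
    # Context from linked DNS responses and HTTP flows.
    dns_names: List[str] = field(default_factory=list)
    dns_ttls: List[int] = field(default_factory=list)
    http_headers: List[Tuple[str, str]] = field(default_factory=list)

    def key(self) -> Tuple[str, str, int, int, int]:
        """Return the classic 5-tuple that identifies the flow."""
        return (self.src_ip, self.dst_ip, self.src_port, self.dst_port, self.proto)
```

Keeping the 5-tuple separate from the optional enrichment fields lets a plain NetFlow record and an enhanced record share one type.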
6. Flow Monitoring
[Diagram: Observation (three points), Export, Collection, Analysis, Storage]
srcIP, dstIP, srcPort, dstPort, prot, startTime, stopTime, numBytes, numPackets
7. Enhanced Telemetry
[Diagram: Observation (three points), Export, Collection, Analysis, Storage]
srcIP, dstIP, srcPort, dstPort, prot, startTime, stopTime, numBytes, numPackets
[Diagram callout: New Data Features]
8. Solution Overview
[Diagram labels: ETTA data; flow record; labels; fingerprint rules; classifier description; Cisco Products]
Inference outputs: OS inference, Application inference, Malware Detection, Malware Family, Crypto Audit
Endpoint Context: OS, Applications, PMTU, RTT, Infection, …
9. Data Collection
• Metadata
• Packet lengths
• TLS
• DNS
• HTTP
[Diagram labels: Benign Records, Malware Records, Training/Storage, Classifier/Rules]
10. Malware / Benign
• ThreatGRID pcaps
• 5-minute sandbox runs
• Threat Score = 100/100
• Millions of pcaps
• ~5,000-15,000 new pcaps a day
• Hundreds of millions of flows
• Enterprise DMZ
• ~10-15 million flows per day
• ~500-user site
• IP addresses are anonymized
11. Exploring Threat Data Features at Scale
joy: https://github.com/cisco/joy
[Diagram of offline and online use of joy; labels: pcap, pcap2flow, json, exporter, collector]
Bill Hudson, Philip Perricone
12. Enhanced Telemetry Data Types
• SPLT: Sequence of Packet Lengths and Arrival Times
• Byte Distribution
• Byte Entropy
13. Sequence of Packet Lengths and Times
[Figure: client packets and server packets exchanged between src and dst, plotted against Time]
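A minimal sketch of how an SPLT sequence could be derived from observed packets. The input layout (millisecond timestamp, payload length, direction), the sign convention for direction, and the 50-packet cap are assumptions made for this illustration, not details taken from the slides or from joy.

```python
def splt(packets, max_packets=50):
    """Build a Sequence of Packet Lengths and Times for one flow.

    packets: list of (timestamp_ms, payload_length, direction) tuples,
             where direction is "out" (client to server) or "in".
    Returns (lengths, gaps_ms). Lengths are signed: positive for client
    packets, negative for server packets (an assumed convention).
    """
    lengths, gaps_ms = [], []
    prev_ts = None
    for ts_ms, length, direction in packets[:max_packets]:
        lengths.append(length if direction == "out" else -length)
        # The first packet has no predecessor, so its gap is 0.
        gaps_ms.append(0 if prev_ts is None else ts_ms - prev_ts)
        prev_ts = ts_ms
    return lengths, gaps_ms
```

Only sizes and timing are used, so the sequence can be computed for an encrypted flow without touching the payload.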
14. Byte Distribution
48 54 54 50 2f 31 2e 31 20 32 30 30 20 4f 4b
H T T P / 1 . 1 2 0 0 O K
[Slides 15 to 18 repeat this payload as an animation that counts byte values one at a time; after the first four bytes (H T T P) the counters read 0x48: 1, 0x50: 1, 0x54: 2]
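The counting shown in the animation, together with the byte entropy listed earlier among the enhanced telemetry types, can be sketched as follows. The function names are illustrative, and Shannon entropy in bits per byte is assumed as the entropy measure.

```python
import math


def byte_distribution(payload: bytes) -> list:
    """Count how often each of the 256 possible byte values occurs."""
    counts = [0] * 256
    for value in payload:
        counts[value] += 1
    return counts


def byte_entropy(counts) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0 to 8)."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c)
```

For the payload on the slide, `byte_distribution(b"HTTP/1.1 200 OK")` yields a count of 1 for 0x48 (H), 2 for 0x54 (T) and 1 for 0x50 (P). Dividing each count by the payload length gives the per-value probabilities used as features.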
19. TLS Handshake Protocol
Client → Server: ClientHello
Server → Client: ServerHello / Certificate
Client → Server: ClientKeyExchange / ChangeCipherSpec
Server → Client: ChangeCipherSpec
Client ↔ Server: Application Data
20. TLS Clients
• Most popular TLS clients:
• IE 8/11*, Tor, Opera
• Wide variety in the number of distinct ciphersuite offer vectors
• Some malware families used only one, while others used hundreds
• 2048-bit DHE_RSA was by far the most popular public key algorithm / size
21. TLS Servers
• Most popular servers (taken from the certificate subject):
• Bitcoin wallets (e.g., block.io)
• File sharing sites (e.g., dropbox.com)
• Advertising (e.g., criteo.com)
• Search engines (e.g., google.com, baidu.com)
• DGA-like certificate subjects were very common
• 0.7% (as opposed to 0.09% on an enterprise network) used self-signed certificates
• TLS_RSA_WITH_3DES_EDE_CBC_SHA was the most popular selected ciphersuite
• These features are heavily dependent on the malware family
22. HTTP Fields
• Retain original capitalization
• User-Agent vs user-agent vs User-agent vs USER-AGENT vs User-AgEnt
• Retain original ordering of header fields
• Absence or presence of HTTP fields gives information about the environment
• via, x-imforwards
• Common User-Agent values:
• Opera/9.50 (Windows NT 6.0; U; en)
• Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.1 (.NET CLR 3.5.30731)
• Mozilla/5.0 (Windows NT 6.3; WOW64; Trident/7.0; Touch; rv:11.0) like Gecko
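Retaining capitalization and ordering means the header block must not be normalized into a case-insensitive dictionary. The following is a minimal sketch under that assumption; the function names and the returned feature keys are hypothetical.

```python
def parse_http_headers(raw: bytes):
    """Return (name, value) pairs exactly as sent: original case and order."""
    head = raw.split(b"\r\n\r\n", 1)[0]
    fields = []
    for line in head.split(b"\r\n")[1:]:  # skip the request or status line
        name, sep, value = line.partition(b":")
        if sep:  # ignore lines that are not "name: value"
            fields.append((name.decode("latin-1"),
                           value.strip().decode("latin-1")))
    return fields


def header_features(fields):
    """Derive order and presence features from the parsed header list."""
    names = [name for name, _ in fields]
    lowered = {name.lower() for name in names}
    return {
        "field_order": names,  # keeps the sender's capitalization
        "has_via": "via" in lowered,
        "has_x_imforwards": "x-imforwards" in lowered,
    }
```

A list of pairs is used instead of a dict so that duplicate fields and their relative order survive parsing.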
23. DNS Information
• Common TTL values:
• 100 > 300 > 60 > 3600
• Number of IPs per request:
• 1 > 4 > 11 > 2 > 6
• Most common suffixes:
• com > net > pl > eu > org
• ~95% not found in the Alexa top-1,000,000 list
24. Server Name Indication
• Extension type: server_name (0x0000)
• Indication from the client about the hostname of the server
[Diagram: ClientHello layout: Record Headers, ClientHello, Random Nonce, [Session ID], Cipher suites, Compression Methods, Extensions, with server_name shown as a single extension]
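A minimal sketch of extracting the hostname from a server_name extension body. It assumes the extension type and body have already been split out of the ClientHello, and follows the ServerNameList layout of RFC 6066 (2-byte list length, then entries of 1-byte name type and 2-byte length-prefixed name); the function name is illustrative.

```python
import struct


def parse_sni(ext_type: int, ext_data: bytes):
    """Return the first host_name in a server_name extension, else None."""
    if ext_type != 0x0000 or len(ext_data) < 2:
        return None
    (list_len,) = struct.unpack_from("!H", ext_data, 0)
    pos = 2
    end = min(2 + list_len, len(ext_data))
    while pos + 3 <= end:
        name_type = ext_data[pos]
        (name_len,) = struct.unpack_from("!H", ext_data, pos + 1)
        pos += 3
        if pos + name_len > end:
            return None  # truncated or malformed entry
        if name_type == 0:  # 0 = host_name
            return ext_data[pos:pos + name_len].decode("ascii", errors="replace")
        pos += name_len
    return None
```

Malformed input returns None rather than raising, since a passive monitor should not fail on hostile or truncated handshakes.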
25. Certificate Information
• Check for self-signed certificates
• Issuer == Subject
• Check for certificates not within validity period
• Check for discrepancies between the SubjectAltName (optional certificate extension) and the server_name extension
[Diagram: Certificate message layout: Record Headers, Certificate, …, Cert Headers, Cert (Signature Algorithm, Issuer, Validity Period, Subject, Subject Public Key, Optional Info), Cert Sig Alg, Cert Signature]
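The three checks above can be sketched on an already-parsed certificate. The dictionary keys (`issuer`, `subject`, `not_before`, `not_after`, `subject_alt_names`) are hypothetical names for this illustration, and the wildcard handling is deliberately simplified.

```python
from datetime import datetime, timezone


def _name_matches(pattern: str, host: str) -> bool:
    """Exact match, or a single-label wildcard such as *.example.com."""
    pattern, host = pattern.lower(), host.lower()
    if pattern.startswith("*."):
        parts = host.split(".", 1)
        return len(parts) == 2 and parts[1] == pattern[2:]
    return pattern == host


def certificate_flags(cert: dict, sni, now=None) -> dict:
    """Compute the slide's three checks for one parsed certificate."""
    now = now or datetime.now(timezone.utc)
    san = cert.get("subject_alt_names", [])
    return {
        # The slide's heuristic: issuer equal to subject.
        "self_signed": cert["issuer"] == cert["subject"],
        "outside_validity": not (cert["not_before"] <= now <= cert["not_after"]),
        # Only flagged when both an SNI and a SubjectAltName list are present.
        "sni_san_mismatch": bool(sni) and bool(san)
        and not any(_name_matches(p, sni) for p in san),
    }
```

Passing `now` explicitly makes the validity check reproducible when replaying old captures.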
26. TLS Client Fingerprinting
[Diagram: ClientHello layout (Record Headers, ClientHello, Random Nonce, [Session ID], Cipher suites, Compression Methods, Extensions) with a subset of fields marked "Used for fingerprinting"]
OpenSSL versions: 0.9.8, 1.0.0, 1.0.1, 1.0.2
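One way to turn ClientHello contents into a fingerprint is to serialize the offered values in the order sent and hash the result. The extracted slide text does not say exactly which fields are marked, so using the cipher suite list and the extension type list here is an assumption, as are the string format and the truncated SHA-256 digest.

```python
import hashlib


def client_fingerprint(ciphersuites, extensions):
    """Return (canonical_string, short_digest) for a ClientHello.

    ciphersuites, extensions: lists of 16-bit code points in the order
    the client offered them. Order is kept on purpose, because different
    TLS libraries and versions offer the same values in different orders.
    """
    cs = "-".join(f"{code:04x}" for code in ciphersuites)
    ex = "-".join(f"{code:04x}" for code in extensions)
    canonical = f"{cs}|{ex}"
    digest = hashlib.sha256(canonical.encode("ascii")).hexdigest()[:16]
    return canonical, digest
```

A lookup table from digest to known library versions (for example the OpenSSL releases listed above) would then map a flow to a likely client.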
27. Classification Model on SPLT and BD
[Diagram labels: Length transition bins; Time transition bins; Feature Selection; Classifier; output in [0,1]]
28. Evaluating Algorithms
29. Test Setup
• Malware
• August 2015 to May 2016 pcaps from ThreatGRID
• TLS (443) traffic, > 100 in and out bytes
• 225,740 flows, telemetry enhanced with TLS extensions, ciphersuites, and public key lengths
• Benign
• Traffic taken from a large enterprise DMZ
• TLS (443) traffic, > 100 in and out bytes
• 225,000 flows, telemetry enhanced with TLS extensions, ciphersuites, and public key lengths
• 10-fold cross-validation
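For readers unfamiliar with 10-fold cross-validation, the index bookkeeping can be sketched as below. Shuffling with a fixed seed and dealing samples round-robin into folds are assumptions of this sketch; the slides do not describe how the folds were drawn.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

Each sample is held out exactly once, so every reported figure is computed on flows the model did not see during training.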
30. Intraflow Features
• Packet Lengths and Inter-arrival Times Modeled as 1st Order Markov Chains
• Probabilities of Byte Values in Payload
• Binary Vector of Offered Ciphersuites and Extensions, and Client’s Public Key Size
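Modeling a sequence as a first-order Markov chain amounts to binning the values, counting bin-to-bin transitions, and normalizing each row into probabilities. The sketch below shows that construction; the bin size and bin count are parameters whose values in the usage note are illustrative assumptions, not figures from the slides.

```python
def transition_features(values, bin_size, n_bins):
    """Flattened first-order Markov transition matrix for one sequence.

    values: non-negative numbers (packet lengths in bytes, or
            inter-arrival times in ms). Values past the last bin are
            clamped into it.
    Returns a list of n_bins * n_bins transition probabilities, row-major.
    """
    bins = [min(int(v // bin_size), n_bins - 1) for v in values]
    counts = [[0] * n_bins for _ in range(n_bins)]
    for current, following in zip(bins, bins[1:]):
        counts[current][following] += 1
    features = []
    for row in counts:
        total = sum(row)
        # Rows with no observed transitions stay all zero.
        features.extend(c / total if total else 0.0 for c in row)
    return features
```

For example, `transition_features([100, 200, 100, 1400], bin_size=150, n_bins=10)` produces 100 features. The same function can be applied to inter-arrival times with a time-based bin size, and the two vectors concatenated.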
31. Results
• L1-logistic regression
• Meta + SPLT + BD
• 0.01% FDR: 1.3%
• Total Accuracy: 98.9%
• L1-logistic regression
• Meta + SPLT + BD + TLS
• 0.01% FDR: 92.8%
• Total Accuracy: 99.6%
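L1-logistic regression is logistic regression with an L1 penalty on the weights, which tends to push uninformative weights toward zero and so doubles as feature selection. The following is a small pure-Python sketch using proximal gradient descent; it is not the authors' implementation, and the hyperparameters (`lam`, `lr`, `epochs`) are arbitrary assumptions.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(w, t):
    """Proximal operator of the L1 penalty: shrink w toward zero by t."""
    return math.copysign(max(abs(w) - t, 0.0), w)


def fit_l1_logreg(X, y, lam=0.05, lr=0.5, epochs=500):
    """Fit weights w and bias b by proximal gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_b += err / n
            for j in range(d):
                grad_w[j] += err * xi[j] / n
        b -= lr * grad_b  # the bias is not penalized
        w = [soft_threshold(wj - lr * gj, lr * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


def predict_proba(w, b, x):
    """Probability that x belongs to the positive (malware) class."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
```

The surviving nonzero weights indicate which features the model relies on, which is what makes this family of models straightforward to interpret.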
32. Results (without Schannel)
• L1-logistic regression
• Meta + SPLT + BD
• 0.01% FDR: 0.9%
• Total Accuracy: 98.5%
• L1-logistic regression
• Meta + SPLT + BD + TLS
• 0.01% FDR: 87.2%
• Total Accuracy: 99.6%
33. Per-Family Classification Results
Malware Family    Meta+SPLT+BD    Meta+SPLT+BD+TLS
Bergat*           100.0%          100.0%
Sality*           95.0%           97.7%
Dridex            16.5%           78.5%
Skeeyah           95.9%           98.6%
Virlock           100.0%          100.0%
34. Test Setup
• Malware
• August 2015 to May 2016 pcaps from ThreatGRID
• TLS (443) traffic, > 100 in and out bytes
• 13,542 flows with all available data (~63.2% of malicious TLS flows)
• Benign
• Traffic taken from a large enterprise DMZ
• TLS (443) traffic, > 100 in and out bytes
• 42,927 flows with all available data (~3.8% of benign TLS flows)
• 10-fold cross-validation
35. Intraflow Features
• Packet Lengths and Inter-arrival Times Modeled as 1st Order Markov Chains
• Probabilities of Byte Values in Payload
• Binary Vector of Offered Ciphersuites and Extensions, and Client’s Public Key Size
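The binary vector of offered ciphersuites and extensions can be sketched as a one-hot style encoding against a fixed vocabulary, with the key size appended. The vocabularies and the function name are assumptions for illustration; a real vocabulary would be built from the code points seen in the training data.

```python
def tls_feature_vector(offered_ciphersuites, offered_extensions,
                       key_bits, cs_vocab, ext_vocab):
    """Encode TLS handshake metadata as a fixed-length numeric vector.

    cs_vocab, ext_vocab: ordered lists of code points that define the
    vector positions (hypothetical vocabularies chosen by the caller).
    """
    offered_cs = set(offered_ciphersuites)
    offered_ext = set(offered_extensions)
    vector = [1 if code in offered_cs else 0 for code in cs_vocab]
    vector += [1 if code in offered_ext else 0 for code in ext_vocab]
    vector.append(key_bits)  # client's public key size in bits
    return vector
```

With `cs_vocab = [0x0005, 0x000a, 0x002f, 0xc02f]` and `ext_vocab = [0x0000, 0x000a, 0x0017]`, a client offering suites 0x002f and 0x000a, only the server_name extension, and a 2048-bit key encodes as `[0, 1, 1, 0, 1, 0, 0, 2048]`. This encoding discards offer order; order can be captured separately, for example by a fingerprint string.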
36. Contextual Flow Features
• DNS
• Alexa Lists
• Lengths of DN and FQDN
• Suffix
• TTL
• % Numerical Characters
• % Non-alphanumeric Chars
• HTTP
• Outbound/inbound header fields
• Content-Type
• User-Agent
• Accept-Language
• Server
• Code
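The DNS features in this list are simple string statistics over the name in a linked DNS response. A minimal sketch follows; the dictionary keys are hypothetical, the Alexa list is passed in as a plain set, and treating the last two labels as the registered domain is a simplification that ignores multi-label public suffixes such as co.uk.

```python
def dns_features(fqdn: str, ttl: int, alexa_top: set) -> dict:
    """Compute per-name DNS features for one response."""
    name = fqdn.rstrip(".").lower()
    labels = name.split(".")
    # Simplification: last two labels stand in for the registered domain.
    domain = ".".join(labels[-2:]) if len(labels) >= 2 else name
    length = len(name)
    digits = sum(ch.isdigit() for ch in name)
    non_alnum = sum(not ch.isalnum() for ch in name)  # dots count here
    return {
        "fqdn_len": length,
        "domain_len": len(domain),
        "suffix": labels[-1],
        "ttl": ttl,
        "pct_numeric": 100.0 * digits / length if length else 0.0,
        "pct_non_alnum": 100.0 * non_alnum / length if length else 0.0,
        "in_alexa": domain in alexa_top,
    }
```

Long names with a high share of digits and no Alexa entry are the pattern typical of the DGA-like names mentioned earlier in the deck.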
37. Results (10-fold cross-validation)
• L1-logistic regression
• Traditional Side Channel Features
• 0.00% FDR: 0.00%
• Total Accuracy: 93.1%
• L1-logistic regression
• All Available Features
• 0.00% FDR: 99.9%
• Total Accuracy: 99.9%
38. Detecting Encrypted Malware
Feature Set             Accuracy    0.00% FDR
SPLT+BD+TLS+HTTP+DNS    99.993%     99.978%
TLS                     94.836%     50.406%
DNS                     99.496%     94.654%
HTTP                    99.945%     98.996%
TLS+DNS                 99.883%     96.551%
TLS+HTTP                99.955%     99.660%
HTTP+DNS                99.985%     99.956%
SPLT+BD+TLS             99.933%     70.351%
SPLT+BD+TLS+DNS         99.968%     98.043%
SPLT+BD+TLS+HTTP        99.983%     99.956%
[Diagram labels: TLS, DNS, HTTP, SPLT+BD]
39. Interpretability
• Malware Features
• DNS Suffix: org
• DNS TTL: 3600
• TLS_RSA_WITH_RC4_128_SHA
• HTTP Field: location
• DNS Alexa: Not Found
• HTTP Server: nginx
• HTTP Code: 404
• Benign Features
• TLS Ext: extended_master_secret
• Content type: application/octet-stream
• TLS_DHE_RSA_WITH_DES_CBC_SHA
• HTTP Server: Microsoft-IIS/8.5
• DNS Alexa: top-1,000,000
• HTTP User-Agent: Microsoft-CryptoAPI/6.1
40. Caveats / Biases
• Datasets are relatively limited
• Evasion is possible
• Not impossible to mimic popular TLS/HTTP/DNS clients/servers/responses
• But mimicking these features requires an ongoing and non-trivial software engineering effort
• Can (hopefully) limit evasion attacks
• Identify most robust network features (several good existing algorithms)
• Include higher order interactions:
• TLS client / HTTP user agent(s)
• Ordering of offered ciphersuites and HTTP fields
41. Conclusions
• Malware’s use of TLS is increasing
• Our research complements current solutions
• Machine learning/rules applied to passively obtained network data features can detect malware communication
• https://github.com/cisco/joy
• Note: contact authors to obtain well-trained classifier parameters
[Chart: Percentage of malware traffic using TLS; axis ticks from 0.00% to 25.00%]
42. The preprint of the paper on which this presentation is based is available at:
https://arxiv.org/abs/1607.01639
Title: "Deciphering Malware's use of TLS (without Decryption)"
Authors: Blake Anderson, Subharthi Paul, David McGrew
Editor's Notes: (Currently Implemented in a Switch and/or NGA)