This document describes a machine-learning approach for detecting peer-to-peer (P2P) botnets using signal-processing techniques. In addition to regular network-behavior features, it extracts features based on discrete Fourier transforms and Shannon's entropy, and uses them to build detection models with supervised machine learning algorithms. The approach treats the communication between each pair of nodes as a signal and aims to uncover hidden patterns in botnet command-and-control traffic by analyzing it in the frequency domain rather than only in the time domain. Evaluation on real botnet and benign traffic datasets showed that the models could accurately detect the Storm, Waledac and Zeus P2P botnets.
Feature selection for detection of peer-to-peer botnet traffic
Machine-learning Approaches for P2P Botnet Detection using Signal-processing Techniques
Pratik Narang, Vansh Khurana, Chittaranjan Hota
Birla Institute of Technology & Science - Pilani, Hyderabad Campus
[Figure: Flowchart. Stages: Packet Validation & Filtering Module (valid packets / discarded packets); Conversation Creation Module; Feature Set Extraction Module (network-behavior based features, signal-processing based features); extracted features; classifiers (K-nn, REP trees, ANNs, SVMs); output: malicious conversation / benign conversation, P2P botnets identified.]
[Figure: Accuracy (y-axis, 0.82 to 1) of K-nn, REP trees, ANN and SVM.]
This work was supported by grants from the Department of Information Technology, Govt. of India
Dataset used
Name                          No. of conversations
Storm                         10,000
Waledac                       10,000
Zeus                          2,657
Clean (multiple P2P apps)     78,000
Results
Abstract
Motivation
• Bots tend to show regularity and periodicity in their C & C communication with other bot-peers
• We attempt to uncover these hidden patterns using signal-processing techniques and thereby detect P2P botnets
Approach
• Apart from regular 'network behavior' based features, we extract several 'signal-processing' based features
• The features are used to build detection models for P2P botnets
• We validate our approach using several supervised machine learning algorithms
Use of Entropy
• We quantify the entropy or randomness
present in a conversation’s payload sizes
• For the payload values in each conversation, we calculate the expected compression using Shannon's entropy
• Bot C & C communication is more uniform than benign Internet traffic
• Hence, higher compression should be achievable for bot conversations
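The entropy idea above can be sketched in a few lines of Python. The poster does not define "expected compression", so the normalisation used here (one minus the entropy divided by its maximum for a sequence of that length) is an assumption made for illustration. The function names and the sample payload sizes are hypothetical too.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy, in bits per symbol, of a sequence of discrete values."""
    if not values:
        return 0.0
    total = len(values)
    counts = Counter(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def expected_compression(payload_sizes):
    """Assumed definition: 1 - H / log2(n), where n is the number of packets.

    1.0 means every payload has the same size (highly compressible);
    0.0 means every payload size is distinct (no redundancy to exploit).
    """
    n = len(payload_sizes)
    if n < 2:
        return 0.0  # too short to estimate anything meaningful
    return 1.0 - shannon_entropy(payload_sizes) / math.log2(n)


# Hypothetical payload sizes (bytes) for one bot and one benign conversation.
bot_payloads = [120, 120, 120, 120, 120, 120, 120, 64]
benign_payloads = [60, 1460, 512, 88, 1320, 40, 730, 256]

print(expected_compression(bot_payloads))
print(expected_compression(benign_payloads))
```

Under this assumed definition, the more uniform conversation scores higher, which matches the intuition that bot traffic should compress better.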
Use of DFT
• C & C communications of bots follow certain timing patterns
• Model each conversation as a signal
• Calculate the DFT of the inter-arrival times and of the payload lengths of the packets in a conversation
• Sort the DFT values by magnitude
• After sorting, the first few DFT coefficients typically carry most of the signal's energy, so we select the top DFT coefficients as features
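The DFT steps above can be sketched as follows, using only the Python standard library. This is a minimal illustration and not the authors' implementation. The helper names and the `top_k` parameter are hypothetical. It also assumes that the full spectrum, including the DC term, is ranked as-is, which the poster does not specify.

```python
import cmath
import math


def inter_arrival_times(timestamps):
    """Gaps between consecutive packet timestamps of one conversation."""
    return [b - a for a, b in zip(timestamps, timestamps[1:])]


def dft(signal):
    """Naive O(n^2) discrete Fourier transform of a real-valued sequence."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


def top_dft_features(signal, top_k=2):
    """Return (magnitude, phase) pairs of the top_k largest-magnitude coefficients.

    Short signals are padded with (0.0, 0.0) so the feature vector has fixed length.
    """
    ranked = sorted(dft(signal), key=abs, reverse=True)
    features = [(abs(c), cmath.phase(c)) for c in ranked[:top_k]]
    features += [(0.0, 0.0)] * (top_k - len(features))
    return features


# Hypothetical conversation: packet timestamps (seconds) and payload lengths (bytes).
timestamps = [0.0, 1.0, 2.0, 3.1, 4.0, 5.0, 6.0, 7.1, 8.0]
payload_lengths = [120, 64, 120, 64, 120, 64, 120, 64]

iat_features = top_dft_features(inter_arrival_times(timestamps))
payload_features = top_dft_features(payload_lengths)
print(iat_features)
print(payload_features)
```

For long conversations, an FFT routine from a numerical library would be the practical choice. The naive transform is used here only to keep the sketch dependency-free.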
Extracted Features
Network-behavior based          Signal-processing based
Avg. payload (forward)          Compression for payload
Avg. payload (backward)         DFT payload – magnitude 1
Avg. packets sent (forward)     DFT payload – phase 1
Avg. packets sent (backward)    DFT payload – magnitude 2
Median inter-arrival time       DFT payload – phase 2
Variance in packet size         DFT inter-arrival time – magnitude 1
Duration of the conversation    DFT inter-arrival time – phase 1
…                               …
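As a rough illustration of the network-behavior column, the sketch below computes a subset of the listed features for one conversation. The packet representation (a `(timestamp, direction, payload_length)` tuple), the direction labels and the function name are assumptions made for this example. Payload length stands in for packet size. The "Avg. packets sent" features are left out because the poster does not say what the average is taken over.

```python
from statistics import mean, median, pvariance


def network_behavior_features(packets):
    """Compute per-conversation features from (timestamp, direction, payload_len) tuples.

    direction is assumed to be "fwd" or "bwd"; empty inputs yield 0.0 values.
    """
    packets = sorted(packets, key=lambda p: p[0])  # order by timestamp
    times = [t for t, _, _ in packets]
    sizes = [s for _, _, s in packets]
    fwd = [s for _, d, s in packets if d == "fwd"]
    bwd = [s for _, d, s in packets if d == "bwd"]
    gaps = [b - a for a, b in zip(times, times[1:])]
    return {
        "avg_payload_fwd": mean(fwd) if fwd else 0.0,
        "avg_payload_bwd": mean(bwd) if bwd else 0.0,
        "median_inter_arrival": median(gaps) if gaps else 0.0,
        "var_packet_size": pvariance(sizes) if sizes else 0.0,
        "duration": times[-1] - times[0] if times else 0.0,
    }


# Hypothetical three-packet conversation.
example = [(0.0, "fwd", 100), (1.0, "bwd", 200), (3.0, "fwd", 300)]
features = network_behavior_features(example)
print(features)
```

A feature vector for a classifier could then be formed by concatenating these values with the signal-processing based features for the same conversation.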
The distributed and decentralized nature of P2P botnets makes their detection a challenging task. Further, bot-masters continuously try to improve their botnets in
order to evade existing detection mechanisms. Thus, despite extensive prior research in this field, their detection remains an important area of research.
We propose a novel approach for the detection of Command & Control (C & C) communication of P2P botnets by converting the 'time-domain' network communications
of nodes to the 'frequency-domain'. We adopt a signal-processing based approach by treating the communication of each pair of nodes seen in the network traffic as a
'signal'. Apart from the regular 'network behavior' based features, we extract features based on Discrete Fourier Transforms and Shannon's Entropy to build supervised
machine learning models for the detection of P2P botnets.