Machine-learning Approaches for P2P Botnet Detection
using Signal-processing Techniques
Pratik Narang, Vansh Khurana, Chittaranjan Hota
Birla Institute of Technology & Science - Pilani, Hyderabad Campus
[Figure: Flowchart. Stages: Packet Validation & Filtering Module (valid packets, discarded packets); Conversation Creation Module; Feature Set Extraction Module (network-behavior based features, signal-processing based features); classifiers (K-nn, REP trees, ANNs, SVMs); output: malicious conversation, benign conversation; P2P botnets identified]
[Figure (Results): Accuracy of the K-nn, REP trees, ANN and SVM classifiers; vertical axis: Accuracy, 0.82 to 1]
This work was supported by grants from the Department of Information Technology, Govt. of India
Dataset used
Name                         No. of conversations
Storm                        10,000
Waledac                      10,000
Zeus                         2,657
Clean (multiple P2P apps)    78,000
Motivation
• Bots tend to show regularity and periodicity in their C & C communication with other bot-peers
• We attempt to uncover these hidden patterns with signal-processing techniques and thus detect P2P botnets
Approach
• Apart from regular ‘network behavior’ based features, we extract several ‘signal-processing’ based features
• The features are used to build detection models for P2P botnets
• We validate our approach using several supervised machine learning algorithms.
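As a rough illustration of how extracted feature vectors could feed a supervised classifier, here is a minimal k-nearest-neighbour sketch in plain Python. K-nn is one of the four classifiers the poster names, but this code is not the authors' implementation: the function name `knn_predict`, the two-dimensional toy feature vectors and the labels are all hypothetical, and a real pipeline would scale features and use a proper train/test split.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training points.

    Distance is plain Euclidean distance on the raw feature values
    (an assumption; features are not scaled here).
    """
    ranked = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data: (compression, normalised DFT magnitude) per conversation.
train_X = [(0.90, 0.10), (0.80, 0.20), (0.85, 0.15),
           (0.10, 0.90), (0.20, 0.80), (0.15, 0.70)]
train_y = ["bot", "bot", "bot", "benign", "benign", "benign"]

prediction = knn_predict(train_X, train_y, (0.82, 0.18), k=3)
```

The same feature vectors could be passed to any of the other listed model families; k-NN is used here only because it fits in a few lines.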
Use of Entropy
• We quantify the entropy or randomness present in a conversation’s payload sizes
• For the payload values in each conversation, calculate the Expected Compression using Shannon’s entropy theory
• Bot C & C communication is more uniform than benign Internet traffic
• Hence higher compression should be achieved for conversations of bots.
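The entropy idea above can be sketched as follows. The poster does not give the exact formula for "Expected Compression", so the definition used here is an assumption: the fraction of space an ideal entropy coder would save relative to storing each payload size in a fixed number of bits (`raw_bits`, a hypothetical parameter defaulting to 16). The function names and sample payload sequences are likewise illustrative.

```python
import math
from collections import Counter

def shannon_entropy(values):
    """Shannon entropy, in bits per symbol, of the empirical distribution."""
    total = len(values)
    if total == 0:
        return 0.0
    counts = Counter(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def expected_compression(payload_sizes, raw_bits=16):
    """Assumed definition: 1 - H / raw_bits.

    H is the entropy of the payload-size sequence; raw_bits is the cost of
    storing one payload size without compression (hypothetical baseline).
    """
    return 1.0 - shannon_entropy(payload_sizes) / raw_bits

# Hypothetical payload-size sequences for one bot and one benign conversation.
bot_payloads = [100, 100, 100, 100, 100, 100, 100, 200]
benign_payloads = [100, 220, 310, 45, 1460, 87, 512, 900]

bot_score = expected_compression(bot_payloads)
benign_score = expected_compression(benign_payloads)
```

Under this definition a more uniform sequence has lower entropy and therefore a higher compression score, which matches the direction of the argument in the bullets.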
Use of DFT
• C & C communications of bots follow certain timing patterns
• Model each conversation as a signal
• Calculate the DFT of the inter-arrival times and of the payload lengths of packets in a conversation
• Sort the DFT values by magnitude.
• For typical signals, the first few coefficients after sorting hold most of the energy, so we select the top DFT coefficients
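The DFT steps above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the authors' code: the DFT is computed naively in O(n²), the DC coefficient is kept in the ranking, and the helper names `dft`, `inter_arrival_times` and `top_dft_features` as well as the parameter `top` are hypothetical. The poster does not say how many coefficients are kept or whether the DC term is excluded.

```python
import cmath
import math

def dft(signal):
    """Naive discrete Fourier transform of a real-valued sequence."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]

def inter_arrival_times(timestamps):
    """Gaps between consecutive packet timestamps (assumed sorted)."""
    return [b - a for a, b in zip(timestamps, timestamps[1:])]

def top_dft_features(signal, top=2):
    """(magnitude, phase) of the `top` coefficients with the largest magnitude."""
    ranked = sorted(dft(signal), key=abs, reverse=True)[:top]
    return [(abs(c), cmath.phase(c)) for c in ranked]

# Hypothetical conversation: packet timestamps (seconds) and payload lengths.
timestamps = [0.0, 1.0, 2.0, 3.1, 4.0, 5.0]
payloads = [120, 120, 120, 124, 120, 120]

iat_features = top_dft_features(inter_arrival_times(timestamps), top=2)
payload_features = top_dft_features(payloads, top=2)
```

Each (magnitude, phase) pair corresponds to one "magnitude i / phase i" entry in the feature table. For real-valued input the coefficients come in conjugate pairs of equal magnitude, so a production version might rank only the first half of the spectrum.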
Extracted Features
Network-behavior based          Signal-processing based
Avg. payload (forward)          Compression for payload
Avg. payload (backward)         DFT payload – magnitude 1
Avg. packets sent (forward)     DFT payload – phase 1
Avg. packets sent (backward)    DFT payload – magnitude 2
Median inter-arrival time       DFT payload – phase 2
Variance in Packet size         DFT Inter-arrival time – magnitude 1
Duration of the conversation    DFT Inter-arrival time – phase 1
…                               …
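A few of the network-behavior features in the table could be computed along these lines. This is a sketch under assumptions the poster does not state: packets are dictionaries with hypothetical keys `ts`, `src` and `size`; "forward" means packets sent by the node that sent the first packet; and one `size` field stands in for both payload and packet size. The "Avg. packets sent" features are left out because their averaging window is not specified.

```python
from statistics import mean, median, pvariance

def network_behavior_features(packets):
    """Compute a subset of the network-behavior features for one conversation."""
    if not packets:
        raise ValueError("a conversation needs at least one packet")
    packets = sorted(packets, key=lambda p: p["ts"])
    initiator = packets[0]["src"]  # assumed definition of the forward direction
    forward = [p["size"] for p in packets if p["src"] == initiator]
    backward = [p["size"] for p in packets if p["src"] != initiator]
    times = [p["ts"] for p in packets]
    gaps = [b - a for a, b in zip(times, times[1:])]
    return {
        "avg_payload_forward": mean(forward) if forward else 0.0,
        "avg_payload_backward": mean(backward) if backward else 0.0,
        "median_inter_arrival": median(gaps) if gaps else 0.0,
        "variance_packet_size": pvariance([p["size"] for p in packets]),
        "duration": times[-1] - times[0],
    }

# Hypothetical three-packet conversation between nodes "A" and "B".
sample = [
    {"ts": 0.0, "src": "A", "size": 100},
    {"ts": 1.0, "src": "B", "size": 50},
    {"ts": 3.0, "src": "A", "size": 300},
]
features = network_behavior_features(sample)
```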
Abstract
The distributed and decentralized nature of P2P botnets makes their detection a challenging task. Further, bot-masters continuously improve their botnets in order to evade existing detection mechanisms. Thus, although this field has seen a lot of research, the detection of P2P botnets remains an important area of research.
We propose a novel approach for detecting the Command & Control (C & C) communication of P2P botnets by converting the 'time-domain' network communications of nodes to the 'frequency-domain'. We adopt a signal-processing based approach, treating the communication of each pair of nodes seen in the network traffic as a 'signal'. Apart from the regular 'network behavior' based features, we extract features based on Discrete Fourier Transforms and Shannon's Entropy to build supervised machine learning models for the detection of P2P botnets.
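Treating the communication of each pair of nodes as a signal first requires grouping packets into conversations. A minimal sketch, assuming packets are dictionaries with hypothetical keys `ts`, `src`, `dst` and `size`, and that a conversation is simply all traffic between an unordered pair of addresses (the poster does not describe timeouts or other splitting rules):

```python
from collections import defaultdict

def build_conversations(packets):
    """Group packets by unordered (node, node) pair, ordered by timestamp."""
    conversations = defaultdict(list)
    for pkt in packets:
        key = tuple(sorted((pkt["src"], pkt["dst"])))
        conversations[key].append(pkt)
    for pkts in conversations.values():
        pkts.sort(key=lambda p: p["ts"])
    return dict(conversations)

# Hypothetical capture with two node pairs.
capture = [
    {"ts": 2.0, "src": "10.0.0.2", "dst": "10.0.0.1", "size": 60},
    {"ts": 1.0, "src": "10.0.0.1", "dst": "10.0.0.2", "size": 120},
    {"ts": 1.5, "src": "10.0.0.1", "dst": "10.0.0.3", "size": 400},
]
conversations = build_conversations(capture)

# The payload-length "signal" for one pair of nodes.
signal = [p["size"] for p in conversations[("10.0.0.1", "10.0.0.2")]]
```

Both directions of traffic between two nodes land in the same conversation, so the resulting sequences of payload lengths and timestamps can be passed directly to the entropy and DFT feature extraction.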
