BITS Pilani
Hyderabad Campus
Pratik Narang, Jagan Mohan Reddy, Chittaranjan Hota
BITS Pilani, Hyderabad Campus
narangpratik@gmail.com
23rd August 2013
ACM Compute 2013, Vellore
Feature Selection for Detection of
Peer-to-Peer Botnet traffic
Outline
• Introduction
o P2P Networks
o P2P Botnets
• Work overview
• Related Work
• Our work
o Generating traffic
o Feature extraction & selection
o Evaluation of feature selection techniques
o Future scope of work
What is a P2P Network?
[Figure: peers A to H connected in a P2P overlay layer, mapped onto the native IP layer across autonomous systems AS1 to AS6]
Generic P2P Architecture
• Capability & Configuration
• Peer Role Selection
• Operating System
• NAT/Firewall Traversal
• Routing and Forwarding
• Neighbor Discovery
• Join/Leave
• Bootstrap
• Overlay Messaging API
• Content Storage
• Search API
Uses & Misuses
Traditional Botnets
Bot-Master
Peer-to-Peer Botnets
Bot-Master
Work overview
• Evaluation of 3 feature selection algorithms:
o Correlation-based Feature Selection
o Consistency-based Subset Evaluation
o Principal Component Analysis
• Models built with 3 machine learning algorithms:
o Naïve Bayes classifier
o Bayes Networks
o C4.5 Decision trees
• Performance evaluation for the detection of some
recent and well-known P2P botnets.
Related work
• Early work using feature selection algorithms [1] [2]
used the DARPA dataset, which is no longer suitable
for today’s security research.
• Early approaches for P2P botnet detection [3]
applied static, port-based analysis, which is easily
defeated by modern botnets.
• Recent work [4] [5] has employed machine learning
and data mining techniques for detection of P2P
botnets.
Our work
[Figure: processing pipeline, drawn as a stack with the input at the bottom]
• Machine Learning Algorithms: Bayes Network, Naïve Bayes, C4.5 Decision Trees
• Feature Selection: Correlation-based Feature Selection, Consistency-based Subset Evaluation, Principal Component Analysis
• Feature Extraction: source min. packet size, dest. TCP Push flag count, source avg. packet size, dest. total volume, duration, …
• Flow Extraction: <Source IP, Source port, Destination IP, Destination port, Protocol>
• Network captures: jNetPcap Library with Java module
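The flow extraction step can be sketched roughly as follows. This is a minimal Python illustration, not the authors' jNetPcap/Java module: the `Packet` record, the `flow_key` and `group_into_flows` helpers, and the choice to fold both directions of a conversation into one key are all assumptions made for the example.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    """Hypothetical parsed-packet record (not the jNetPcap data model)."""
    ts: float        # capture timestamp in seconds
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    proto: str       # e.g. "TCP" or "UDP"
    size: int        # packet size in bytes
    psh: bool        # True if the TCP Push flag is set


def flow_key(pkt):
    """Build a 5-tuple key that is the same for both directions.

    Assumption of this sketch: the two endpoints are sorted so that
    A->B and B->A packets land in the same flow.
    """
    a = (pkt.src_ip, pkt.src_port)
    b = (pkt.dst_ip, pkt.dst_port)
    lo, hi = sorted((a, b))
    return (lo[0], lo[1], hi[0], hi[1], pkt.proto)


def group_into_flows(packets):
    """Group packets by flow key, keeping each flow in timestamp order."""
    flows = defaultdict(list)
    for pkt in sorted(packets, key=lambda p: p.ts):
        flows[flow_key(pkt)].append(pkt)
    return dict(flows)


if __name__ == "__main__":
    demo = [
        Packet(0.0, "10.0.0.1", 5000, "10.0.0.2", 80, "TCP", 60, False),
        Packet(0.1, "10.0.0.2", 80, "10.0.0.1", 5000, "TCP", 1500, True),
        Packet(0.2, "10.0.0.3", 4000, "10.0.0.2", 53, "UDP", 80, False),
    ]
    for key, pkts in group_into_flows(demo).items():
        print(key, len(pkts))
```

Sorting the endpoints is only one way to obtain bi-directional flows; a real tool might instead treat the sender of the first packet as the source.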
Generating Traffic
[Figure: campus network testbed. Labelled elements: botnet traffic generation; Internet; firewall; core switch 6509; distribution switch 4500; access switch 2500; Info. Sec. Lab; Dist. Sys. Lab; Multimedia Lab; hostels wing; data collection for P2P and web traffic; anonymization (Anon tool); botnet detection module; content mgmt.; application servers; DB cluster; IDS; Ethernet]
Dataset
Data | Application | Number of flows
Benign data | HTTP, HTTPS, SMTP, FTP, POP | 30,000 flows
Benign data | P2P apps: eMule, BitTorrent, Mute, Gnutella etc. | 50,000 flows
Botnet data [4,5] | Zero Access | 720 flows
Botnet data [4,5] | SkyNet | 770 flows
Botnet data [4,5] | Waledac | 80,000 flows
Botnet data [4,5] | Storm | 220,000 flows
Feature Extraction &
Selection
• A ‘flow’ is defined by the 5-tuple:
• <Source IP, Source port, Dest. IP, Dest. port, Protocol>
• Features extracted from each flow:
• Packet count (bi-directional)
• Packet size (bytes) (min, max, mean and standard deviation)
(bi-directional)
• Total volume (bytes) (bi-directional)
• Inter-arrival times (min, max, mean and standard deviation)
(bi-directional)
• TCP Push flag count (bi-directional)
• Duration of the flow (no context of direction)
• TOTAL - 23 features extracted from each flow
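As a rough illustration of how the 23 per-flow features listed above add up (11 per direction plus the duration), here is a minimal Python sketch. The packet representation, the feature names, the use of population standard deviation, and the zero defaults for directions with too few packets are assumptions of this example, not details taken from the slides.

```python
from statistics import mean, pstdev

STAT_NAMES = ("min", "max", "mean", "std")


def _stats(values):
    """Return (min, max, mean, std); zeros if there are no values."""
    if not values:
        return (0.0, 0.0, 0.0, 0.0)
    return (float(min(values)), float(max(values)),
            float(mean(values)), float(pstdev(values)))


def _gaps(times):
    """Inter-arrival times between consecutive timestamps."""
    return [b - a for a, b in zip(times, times[1:])]


def extract_features(packets):
    """Compute per-flow features.

    packets: list of (timestamp, direction, size_bytes, psh_flag) tuples,
    where direction is "src" or "dst" (a hypothetical representation).
    """
    feats = {}
    for d in ("src", "dst"):
        pk = sorted(p for p in packets if p[1] == d)
        sizes = [p[2] for p in pk]
        times = [p[0] for p in pk]
        feats[f"{d}_pkt_count"] = len(pk)
        feats[f"{d}_volume"] = sum(sizes)
        feats[f"{d}_psh_count"] = sum(1 for p in pk if p[3])
        for name, value in zip(STAT_NAMES, _stats(sizes)):
            feats[f"{d}_size_{name}"] = value
        for name, value in zip(STAT_NAMES, _stats(_gaps(times))):
            feats[f"{d}_iat_{name}"] = value
    all_times = [p[0] for p in packets]
    # Duration has no direction: last packet time minus first packet time.
    feats["duration"] = max(all_times) - min(all_times) if all_times else 0.0
    return feats


if __name__ == "__main__":
    flow = [(0.0, "src", 100, False), (1.0, "dst", 300, True),
            (2.0, "src", 200, True), (4.0, "src", 300, False)]
    features = extract_features(flow)
    print(len(features), "features")
    for name, value in features.items():
        print(name, value)
```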
Feature Extraction &
Selection
• Three Feature Selection techniques used:
1. Correlation-based Feature Selection (CFS)
2. Consistency-based Subset Evaluation (CSE)
3. Principal Component Analysis (PCA)
• Evaluated with three algorithms:
1. Naïve Bayes
2. Bayes Network
3. C4.5 Decision Trees
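To give a feel for what Correlation-based Feature Selection optimizes, the sketch below computes the CFS merit of a feature subset: k * r_cf / sqrt(k + k(k-1) * r_ff), where r_cf is the mean feature-class correlation and r_ff the mean feature-feature correlation. Pearson correlation is used here as a stand-in; the slides do not say which correlation measure was used, and the function names are illustrative only. The search over subsets (best first search in the slides) is not shown.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists (0.0 if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def cfs_merit(columns, labels):
    """CFS merit of a feature subset.

    columns: list of feature columns (each a list of numbers)
    labels:  numeric class labels, e.g. 0 = benign, 1 = botnet
    """
    k = len(columns)
    if k == 0:
        return 0.0
    # Mean absolute feature-class correlation.
    r_cf = sum(abs(pearson(c, labels)) for c in columns) / k
    if k == 1:
        return r_cf
    # Mean absolute feature-feature correlation over all pairs.
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    r_ff = sum(abs(pearson(columns[i], columns[j])) for i, j in pairs) / len(pairs)
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)


if __name__ == "__main__":
    labels = [0, 0, 1, 1]
    f1 = [1, 2, 3, 4]
    f3 = [1, 3, 2, 4]
    print(cfs_merit([f1], labels))
    print(cfs_merit([f1, f1], labels))  # a redundant copy does not raise the merit
    print(cfs_merit([f1, f3], labels))
```

The merit rewards features that correlate with the class and penalizes features that correlate with each other, which is why a duplicated feature adds nothing.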
Feature Extraction &
Selection
Feature Selection | Search method | No. of features | Description
CFS | Best first search | 5 | source packet count, source min. packet size, source max. packet size, dest. max. packet size, source inter-arrival time std.
CSE | Best first search | 8 | source min. packet size, source max. packet size, dest. max. packet size, source avg. packet size, dest. avg. packet size, source max. inter-arrival time, flow duration, source volume
PCA | - | 12 | A linear combination of features
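Consistency-based subset evaluation prefers small subsets under which flows with identical feature values rarely carry different class labels. A minimal sketch of the usual inconsistency-rate measure follows; it assumes the features have already been discretized, and the function name and toy data are hypothetical rather than taken from the slides.

```python
from collections import Counter, defaultdict


def inconsistency_rate(rows, labels, subset):
    """Fraction of instances that disagree with the majority class of
    their feature-value pattern, restricted to the chosen features.

    rows:   list of tuples of discrete feature values
    labels: class label per row
    subset: indices of the features to keep
    """
    groups = defaultdict(Counter)
    for row, label in zip(rows, labels):
        pattern = tuple(row[i] for i in subset)
        groups[pattern][label] += 1
    # Within each pattern, everything except the majority class is inconsistent.
    inconsistent = sum(sum(c.values()) - max(c.values()) for c in groups.values())
    return inconsistent / len(rows)


if __name__ == "__main__":
    # Toy data: (packet-size bucket, duration bucket) per flow.
    rows = [("small", "short"), ("small", "long"),
            ("large", "short"), ("large", "long")]
    labels = ["bot", "benign", "benign", "benign"]
    print(inconsistency_rate(rows, labels, [0, 1]))
    print(inconsistency_rate(rows, labels, [0]))
```

A subset is considered as good as the full feature set when its inconsistency rate is no higher than that of the full set.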
Evaluation of Feature
Selection Techniques
[Figure: Accuracy in % by classification algorithm; bar labels listed by feature set in legend order]
Feature set | NaiveBayes | BayesNet | C4.5
Full | 85.2 | 97.08 | 98.23
CFS | 81.51 | 95.92 | 98.18
CSE | 80.24 | 96.2 | 98.23
PCA | 82.16 | 96.67 | 98.17
[Figure: Detection rate in % by classification algorithm; bar labels listed by feature set in legend order]
Feature set | NaiveBayes | BayesNet | C4.5
Full | 98.9 | 96.9 | 98.9
CFS | 95.2 | 95.3 | 98.9
CSE | 96.1 | 95.7 | 99
PCA | 95.4 | 96.2 | 98.9
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
$$\text{Detection rate} = \frac{TP}{TP + FN}$$
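The two evaluation metrics can be computed directly from confusion-matrix counts, where TP and TN are correctly classified botnet and benign flows and FP and FN are the misclassified ones. The sketch below is illustrative; the function names and the sample counts are made up for the example and are not results from the slides.

```python
def accuracy(tp, tn, fp, fn):
    """Share of all flows classified correctly: (TP + TN) / (TP + TN + FP + FN)."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def detection_rate(tp, fn):
    """Share of botnet flows that were detected: TP / (TP + FN)."""
    positives = tp + fn
    return tp / positives if positives else 0.0


if __name__ == "__main__":
    # Made-up counts for illustration only.
    tp, tn, fp, fn = 90, 80, 20, 10
    print(f"accuracy = {accuracy(tp, tn, fp, fn):.2%}")
    print(f"detection rate = {detection_rate(tp, fn):.2%}")
```

A classifier can have a high detection rate and a much lower accuracy when it raises many false positives, so the two numbers are best read together.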
Evaluation of Feature
Selection Techniques
[Figure: Normalized classification speed (0 to 1) by classification algorithm (NaiveBayes, BayesNet, C4.5) for the Full, CFS, CSE and PCA feature sets]
[Figure: Normalized build times (0 to 1) by classification algorithm (NaiveBayes, BayesNet, C4.5) for the Full, CFS, CSE and PCA feature sets]
Primary Observations
Future Scope
• Ensemble of classifiers
(Work in progress; paper submitted to I-CARE 2013)
• Close-to-real-time Detection Tool
(Work in progress)
• Space-efficient data structures
References
1. A. H. Sung and S. Mukkamala. The feature selection and intrusion detection
problems. In Advances in Computer Science-ASIAN 2004. Higher-Level
Decision Making, pages 468–482. Springer, 2005.
2. S. Chebrolu, A. Abraham, and J. P. Thomas. Feature deduction and
ensemble design of intrusion detection systems. Computers & Security,
24(4):295–307, 2005.
3. R. Schoof and R. Koning. Detecting peer-to-peer botnets. University of
Amsterdam, 2007.
4. S. Saad, I. Traore, A. Ghorbani, B. Sayed, D. Zhao, W. Lu, J. Felix, and P.
Hakimian. Detecting P2P botnets through network behavior analysis and
machine learning. In Privacy, Security and Trust (PST), 2011 Ninth Annual
International Conference on, pages 174–180. IEEE, 2011.
5. B. Rahbarinia, R. Perdisci, A. Lanzi, and K. Li. PeerRush: Mining for unwanted
P2P traffic. In DIMVA, 2013.
narangpratik@gmail.com
Visit our Research Group: www.netclique.in

More Related Content

What's hot

Early application identification. CONEXT 2006
Early application identification. CONEXT 2006Early application identification. CONEXT 2006
Early application identification. CONEXT 2006Laurent Bernaille
 
Network Packet Analysis with Wireshark
Network Packet Analysis with WiresharkNetwork Packet Analysis with Wireshark
Network Packet Analysis with WiresharkJim Gilsinn
 
Network Analysis using Wireshark 5: display filters
Network Analysis using Wireshark 5: display filtersNetwork Analysis using Wireshark 5: display filters
Network Analysis using Wireshark 5: display filtersYoram Orzach
 
lesson 7- Network analysis Using Wireshark - advanced statistics tools
lesson 7- Network analysis Using Wireshark - advanced statistics toolslesson 7- Network analysis Using Wireshark - advanced statistics tools
lesson 7- Network analysis Using Wireshark - advanced statistics toolsYoram Orzach
 
Network Measurement and Monitori - Assigment 1, Group3, "Classification"
Network Measurement and Monitori - Assigment 1, Group3, "Classification"Network Measurement and Monitori - Assigment 1, Group3, "Classification"
Network Measurement and Monitori - Assigment 1, Group3, "Classification"Valentin Thirion
 
Co se skrývá v datovém provozu? - Pavel Minařík
Co se skrývá v datovém provozu? - Pavel MinaříkCo se skrývá v datovém provozu? - Pavel Minařík
Co se skrývá v datovém provozu? - Pavel MinaříkSecurity Session
 
Fukuoka University Public NTP Service and BCP38
Fukuoka University Public NTP Service and BCP38Fukuoka University Public NTP Service and BCP38
Fukuoka University Public NTP Service and BCP38APNIC
 
Empirically Characterizing the Buffer Behaviour of Real Devices
Empirically Characterizing the Buffer Behaviour of Real DevicesEmpirically Characterizing the Buffer Behaviour of Real Devices
Empirically Characterizing the Buffer Behaviour of Real DevicesJose Saldana
 
Network Situational Awareness with d00gle
Network Situational Awareness with d00gleNetwork Situational Awareness with d00gle
Network Situational Awareness with d00gleDug Song
 
2014.7.9 detecting p2 p botnets through network behavior analysis and machine...
2014.7.9 detecting p2 p botnets through network behavior analysis and machine...2014.7.9 detecting p2 p botnets through network behavior analysis and machine...
2014.7.9 detecting p2 p botnets through network behavior analysis and machine...ericsuboy
 
CapAnalysis - Deep Packet Inspection
CapAnalysis - Deep Packet InspectionCapAnalysis - Deep Packet Inspection
CapAnalysis - Deep Packet InspectionChris Harrington
 
SSL basics and SSL packet analysis using wireshark
SSL basics and SSL packet analysis using wiresharkSSL basics and SSL packet analysis using wireshark
SSL basics and SSL packet analysis using wiresharkAl Imran, CISA
 
Wireshark Inroduction Li In
Wireshark Inroduction  Li InWireshark Inroduction  Li In
Wireshark Inroduction Li Inmhaviv
 
Ch 08 -- Ethernet & LAN Switching Troubleshooting
Ch 08 -- Ethernet & LAN Switching TroubleshootingCh 08 -- Ethernet & LAN Switching Troubleshooting
Ch 08 -- Ethernet & LAN Switching TroubleshootingYoram Orzach
 
Network Analysis Using Wireshark -Chapter 6- basic statistics tools
Network Analysis Using Wireshark -Chapter 6- basic statistics toolsNetwork Analysis Using Wireshark -Chapter 6- basic statistics tools
Network Analysis Using Wireshark -Chapter 6- basic statistics toolsYoram Orzach
 

What's hot (20)

Early application identification. CONEXT 2006
Early application identification. CONEXT 2006Early application identification. CONEXT 2006
Early application identification. CONEXT 2006
 
Wireshark Basics
Wireshark BasicsWireshark Basics
Wireshark Basics
 
Wireshark ppt
Wireshark pptWireshark ppt
Wireshark ppt
 
Network Packet Analysis with Wireshark
Network Packet Analysis with WiresharkNetwork Packet Analysis with Wireshark
Network Packet Analysis with Wireshark
 
Network Analysis using Wireshark 5: display filters
Network Analysis using Wireshark 5: display filtersNetwork Analysis using Wireshark 5: display filters
Network Analysis using Wireshark 5: display filters
 
lesson 7- Network analysis Using Wireshark - advanced statistics tools
lesson 7- Network analysis Using Wireshark - advanced statistics toolslesson 7- Network analysis Using Wireshark - advanced statistics tools
lesson 7- Network analysis Using Wireshark - advanced statistics tools
 
Network Measurement and Monitori - Assigment 1, Group3, "Classification"
Network Measurement and Monitori - Assigment 1, Group3, "Classification"Network Measurement and Monitori - Assigment 1, Group3, "Classification"
Network Measurement and Monitori - Assigment 1, Group3, "Classification"
 
Co se skrývá v datovém provozu? - Pavel Minařík
Co se skrývá v datovém provozu? - Pavel MinaříkCo se skrývá v datovém provozu? - Pavel Minařík
Co se skrývá v datovém provozu? - Pavel Minařík
 
Fukuoka University Public NTP Service and BCP38
Fukuoka University Public NTP Service and BCP38Fukuoka University Public NTP Service and BCP38
Fukuoka University Public NTP Service and BCP38
 
Empirically Characterizing the Buffer Behaviour of Real Devices
Empirically Characterizing the Buffer Behaviour of Real DevicesEmpirically Characterizing the Buffer Behaviour of Real Devices
Empirically Characterizing the Buffer Behaviour of Real Devices
 
Presentation1
Presentation1Presentation1
Presentation1
 
Wireshark
WiresharkWireshark
Wireshark
 
Network Situational Awareness with d00gle
Network Situational Awareness with d00gleNetwork Situational Awareness with d00gle
Network Situational Awareness with d00gle
 
2014.7.9 detecting p2 p botnets through network behavior analysis and machine...
2014.7.9 detecting p2 p botnets through network behavior analysis and machine...2014.7.9 detecting p2 p botnets through network behavior analysis and machine...
2014.7.9 detecting p2 p botnets through network behavior analysis and machine...
 
CapAnalysis - Deep Packet Inspection
CapAnalysis - Deep Packet InspectionCapAnalysis - Deep Packet Inspection
CapAnalysis - Deep Packet Inspection
 
SSL basics and SSL packet analysis using wireshark
SSL basics and SSL packet analysis using wiresharkSSL basics and SSL packet analysis using wireshark
SSL basics and SSL packet analysis using wireshark
 
Wireshark Inroduction Li In
Wireshark Inroduction  Li InWireshark Inroduction  Li In
Wireshark Inroduction Li In
 
Ch 08 -- Ethernet & LAN Switching Troubleshooting
Ch 08 -- Ethernet & LAN Switching TroubleshootingCh 08 -- Ethernet & LAN Switching Troubleshooting
Ch 08 -- Ethernet & LAN Switching Troubleshooting
 
Zmap talk-sec13
Zmap talk-sec13Zmap talk-sec13
Zmap talk-sec13
 
Network Analysis Using Wireshark -Chapter 6- basic statistics tools
Network Analysis Using Wireshark -Chapter 6- basic statistics toolsNetwork Analysis Using Wireshark -Chapter 6- basic statistics tools
Network Analysis Using Wireshark -Chapter 6- basic statistics tools
 

Viewers also liked

Machine Learning Based Botnet Detection
Machine Learning Based Botnet DetectionMachine Learning Based Botnet Detection
Machine Learning Based Botnet Detectionbutest
 
Introduction to Neural Networks - Perceptron
Introduction to Neural Networks - PerceptronIntroduction to Neural Networks - Perceptron
Introduction to Neural Networks - PerceptronHannes Hapke
 
Artificial intelligence Pattern recognition system
Artificial intelligence Pattern recognition systemArtificial intelligence Pattern recognition system
Artificial intelligence Pattern recognition systemREHMAT ULLAH
 
Neural network & its applications
Neural network & its applications Neural network & its applications
Neural network & its applications Ahmed_hashmi
 
Lecture artificial neural networks and pattern recognition
Lecture   artificial neural networks and pattern recognitionLecture   artificial neural networks and pattern recognition
Lecture artificial neural networks and pattern recognitionHưng Đặng
 
State of the Word 2011
State of the Word 2011State of the Word 2011
State of the Word 2011photomatt
 

Viewers also liked (8)

Machine Learning Based Botnet Detection
Machine Learning Based Botnet DetectionMachine Learning Based Botnet Detection
Machine Learning Based Botnet Detection
 
Introduction to Neural Networks - Perceptron
Introduction to Neural Networks - PerceptronIntroduction to Neural Networks - Perceptron
Introduction to Neural Networks - Perceptron
 
Neural networks introduction
Neural networks introductionNeural networks introduction
Neural networks introduction
 
Artificial intelligence Pattern recognition system
Artificial intelligence Pattern recognition systemArtificial intelligence Pattern recognition system
Artificial intelligence Pattern recognition system
 
Introduction to pattern recognition
Introduction to pattern recognitionIntroduction to pattern recognition
Introduction to pattern recognition
 
Neural network & its applications
Neural network & its applications Neural network & its applications
Neural network & its applications
 
Lecture artificial neural networks and pattern recognition
Lecture   artificial neural networks and pattern recognitionLecture   artificial neural networks and pattern recognition
Lecture artificial neural networks and pattern recognition
 
State of the Word 2011
State of the Word 2011State of the Word 2011
State of the Word 2011
 

Similar to Feature selection for detection of peer to-peer botnet traffic

Network State Awareness & Troubleshooting
Network State Awareness & TroubleshootingNetwork State Awareness & Troubleshooting
Network State Awareness & TroubleshootingAPNIC
 
A Brief Incursion into Botnet Detection
A Brief Incursion into Botnet DetectionA Brief Incursion into Botnet Detection
A Brief Incursion into Botnet DetectionAnant Narayanan
 
Cotopaxi - IoT testing toolkit (Black Hat Asia 2019 Arsenal)
Cotopaxi - IoT testing toolkit (Black Hat Asia 2019 Arsenal)Cotopaxi - IoT testing toolkit (Black Hat Asia 2019 Arsenal)
Cotopaxi - IoT testing toolkit (Black Hat Asia 2019 Arsenal)Jakub Botwicz
 
Application of the Actor Model to Large Scale NDE Data Analysis
Application of the Actor Model to Large Scale NDE Data AnalysisApplication of the Actor Model to Large Scale NDE Data Analysis
Application of the Actor Model to Large Scale NDE Data AnalysisChrisCoughlin9
 
HPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural NetworksHPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural Networksinside-BigData.com
 
Flink for Everyone: Self-Service Data Analytics with StreamPipes
Flink for Everyone: Self-Service Data Analytics with StreamPipesFlink for Everyone: Self-Service Data Analytics with StreamPipes
Flink for Everyone: Self-Service Data Analytics with StreamPipesApache StreamPipes
 
Towards Benchmaking Modern Distruibuted Systems-(Grace Huang, Intel)
Towards Benchmaking Modern Distruibuted Systems-(Grace Huang, Intel)Towards Benchmaking Modern Distruibuted Systems-(Grace Huang, Intel)
Towards Benchmaking Modern Distruibuted Systems-(Grace Huang, Intel)Spark Summit
 
Applied Detection and Analysis Using Flow Data - MIRCon 2014
Applied Detection and Analysis Using Flow Data - MIRCon 2014Applied Detection and Analysis Using Flow Data - MIRCon 2014
Applied Detection and Analysis Using Flow Data - MIRCon 2014chrissanders88
 
Detecting Hacks: Anomaly Detection on Networking Data
Detecting Hacks: Anomaly Detection on Networking DataDetecting Hacks: Anomaly Detection on Networking Data
Detecting Hacks: Anomaly Detection on Networking DataJames Sirota
 
ClickHouse Paris Meetup. Pragma Analytics Software Suite w/ClickHouse, by Mat...
ClickHouse Paris Meetup. Pragma Analytics Software Suite w/ClickHouse, by Mat...ClickHouse Paris Meetup. Pragma Analytics Software Suite w/ClickHouse, by Mat...
ClickHouse Paris Meetup. Pragma Analytics Software Suite w/ClickHouse, by Mat...Altinity Ltd
 
Network Monitoring System ppt.pdf
Network Monitoring System ppt.pdfNetwork Monitoring System ppt.pdf
Network Monitoring System ppt.pdfkristinatemen
 
network monitoring system ppt
network monitoring system pptnetwork monitoring system ppt
network monitoring system pptashutosh rai
 
Introduction to NBL
Introduction to NBLIntroduction to NBL
Introduction to NBLFei Ji Siao
 
Tcp congestion avoidance algorithm identification
Tcp congestion avoidance algorithm identificationTcp congestion avoidance algorithm identification
Tcp congestion avoidance algorithm identificationBala Lavanya
 
Kalman Graffi - Disputation Talk - Monitoring and Management of P2P Systems -...
Kalman Graffi - Disputation Talk - Monitoring and Management of P2P Systems -...Kalman Graffi - Disputation Talk - Monitoring and Management of P2P Systems -...
Kalman Graffi - Disputation Talk - Monitoring and Management of P2P Systems -...Kalman Graffi
 

Similar to Feature selection for detection of peer to-peer botnet traffic (20)

Network State Awareness & Troubleshooting
Network State Awareness & TroubleshootingNetwork State Awareness & Troubleshooting
Network State Awareness & Troubleshooting
 
A Brief Incursion into Botnet Detection
A Brief Incursion into Botnet DetectionA Brief Incursion into Botnet Detection
A Brief Incursion into Botnet Detection
 
Cotopaxi - IoT testing toolkit (Black Hat Asia 2019 Arsenal)
Cotopaxi - IoT testing toolkit (Black Hat Asia 2019 Arsenal)Cotopaxi - IoT testing toolkit (Black Hat Asia 2019 Arsenal)
Cotopaxi - IoT testing toolkit (Black Hat Asia 2019 Arsenal)
 
Application of the Actor Model to Large Scale NDE Data Analysis
Application of the Actor Model to Large Scale NDE Data AnalysisApplication of the Actor Model to Large Scale NDE Data Analysis
Application of the Actor Model to Large Scale NDE Data Analysis
 
NetBrain CE 5.0
NetBrain CE 5.0NetBrain CE 5.0
NetBrain CE 5.0
 
HPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural NetworksHPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural Networks
 
Flink for Everyone: Self-Service Data Analytics with StreamPipes
Flink for Everyone: Self-Service Data Analytics with StreamPipesFlink for Everyone: Self-Service Data Analytics with StreamPipes
Flink for Everyone: Self-Service Data Analytics with StreamPipes
 
Towards Benchmaking Modern Distruibuted Systems-(Grace Huang, Intel)
Towards Benchmaking Modern Distruibuted Systems-(Grace Huang, Intel)Towards Benchmaking Modern Distruibuted Systems-(Grace Huang, Intel)
Towards Benchmaking Modern Distruibuted Systems-(Grace Huang, Intel)
 
Applied Detection and Analysis Using Flow Data - MIRCon 2014
Applied Detection and Analysis Using Flow Data - MIRCon 2014Applied Detection and Analysis Using Flow Data - MIRCon 2014
Applied Detection and Analysis Using Flow Data - MIRCon 2014
 
Detecting Hacks: Anomaly Detection on Networking Data
Detecting Hacks: Anomaly Detection on Networking DataDetecting Hacks: Anomaly Detection on Networking Data
Detecting Hacks: Anomaly Detection on Networking Data
 
ClickHouse Paris Meetup. Pragma Analytics Software Suite w/ClickHouse, by Mat...
ClickHouse Paris Meetup. Pragma Analytics Software Suite w/ClickHouse, by Mat...ClickHouse Paris Meetup. Pragma Analytics Software Suite w/ClickHouse, by Mat...
ClickHouse Paris Meetup. Pragma Analytics Software Suite w/ClickHouse, by Mat...
 
Io t data streaming
Io t data streamingIo t data streaming
Io t data streaming
 
Решения WANDL и NorthStar для операторов
Решения WANDL и NorthStar для операторовРешения WANDL и NorthStar для операторов
Решения WANDL и NorthStar для операторов
 
Network Monitoring System ppt.pdf
Network Monitoring System ppt.pdfNetwork Monitoring System ppt.pdf
Network Monitoring System ppt.pdf
 
network monitoring system ppt
network monitoring system pptnetwork monitoring system ppt
network monitoring system ppt
 
Introduction to NBL
Introduction to NBLIntroduction to NBL
Introduction to NBL
 
Tcp congestion avoidance algorithm identification
Tcp congestion avoidance algorithm identificationTcp congestion avoidance algorithm identification
Tcp congestion avoidance algorithm identification
 
Kalman Graffi - Disputation Talk - Monitoring and Management of P2P Systems -...
Kalman Graffi - Disputation Talk - Monitoring and Management of P2P Systems -...Kalman Graffi - Disputation Talk - Monitoring and Management of P2P Systems -...
Kalman Graffi - Disputation Talk - Monitoring and Management of P2P Systems -...
 
deCaptcha
deCaptchadeCaptcha
deCaptcha
 
Stress your DUT
Stress your DUTStress your DUT
Stress your DUT
 

More from Pratik Narang

Machine-learning Approaches for P2P Botnet Detection using Signal-processing...
Machine-learning Approaches for P2P Botnet Detection using Signal-processing...Machine-learning Approaches for P2P Botnet Detection using Signal-processing...
Machine-learning Approaches for P2P Botnet Detection using Signal-processing...Pratik Narang
 
Abhishek presentation october 2013
Abhishek presentation october 2013Abhishek presentation october 2013
Abhishek presentation october 2013Pratik Narang
 

More from Pratik Narang (6)

Hades_poster_Comad
Hades_poster_ComadHades_poster_Comad
Hades_poster_Comad
 
Hades
HadesHades
Hades
 
Machine-learning Approaches for P2P Botnet Detection using Signal-processing...
Machine-learning Approaches for P2P Botnet Detection using Signal-processing...Machine-learning Approaches for P2P Botnet Detection using Signal-processing...
Machine-learning Approaches for P2P Botnet Detection using Signal-processing...
 
Gokul seminar
Gokul seminarGokul seminar
Gokul seminar
 
Abhishek presentation october 2013
Abhishek presentation october 2013Abhishek presentation october 2013
Abhishek presentation october 2013
 
Hota iitd
Hota iitdHota iitd
Hota iitd
 

Recently uploaded

Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 

Recently uploaded (20)

Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 

Feature selection for detection of peer-to-peer botnet traffic

  • 1. Feature Selection for Detection of Peer-to-Peer Botnet Traffic. Pratik Narang, Jagan Mohan Reddy, Chittaranjan Hota (BITS Pilani, Hyderabad Campus), narangpratik@gmail.com. ACM Compute 2013, Vellore, 23rd August 2013.
  • 2. Outline
      • Introduction
          o P2P Networks
          o P2P Botnets
      • Work overview
      • Related Work
      • Our work
          o Generating traffic
          o Feature extraction & selection
          o Evaluation of feature selection techniques
          o Future scope of work
  • 3. What is a P2P Network? [Figure: peers A to H in the P2P overlay layer, mapped onto the native IP layer across autonomous systems AS1 to AS6]
  • 4. Generic P2P Architecture [Figure: layered diagram with the components Capability & Configuration; Peer Role Selection; Operating System; NAT/Firewall Traversal; Routing and Forwarding; Neighbor Discovery; Join/Leave; Bootstrap; Overlay Messaging API; Content Storage; Search API]
  • 8. Work overview
      • Evaluation of 3 feature selection algorithms:
          o Correlation-based Feature Selection
          o Consistency-based Subset Evaluation
          o Principal Component Analysis
      • Models built with 3 machine learning algorithms:
          o Naïve Bayes classifier
          o Bayes Networks
          o C4.5 Decision trees
      • Performance evaluation for the detection of some recent and well-known P2P botnets.
  • 9. Related work
      • Early work using feature selection algorithms [1] [2] used the DARPA dataset, which is no longer suitable for today’s security research.
      • Early approaches to P2P botnet detection [3] applied static, port-based analysis, which modern botnets easily defeat.
      • Recent work [4] [5] has employed machine learning and data mining techniques for detection of P2P botnets.
  • 10. Our work [Figure: processing pipeline, listed from the top of the diagram downward]
      • Machine Learning Algorithms: Bayes Network, Naïve Bayes, C4.5 Decision Trees
      • Feature Selection: Correlation-based Feature Selection, Consistency-based Subset Evaluation, Principal Component Analysis
      • Feature Extraction: source min. packet size, dest. TCP Push flag count, source avg. packet size, dest. total volume, duration, …
      • Flow Extraction: <Source IP, Source port, Destination IP, Destination port, Protocol>
      • Network captures
      • jNetPcap Library with Java module
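The flow extraction stage can be sketched in Python as follows. This is a minimal illustration, not the authors' jNetPcap/Java module: the `Packet` record, `flow_key` and `group_into_flows` are hypothetical names, and folding reply packets into the flow opened by the first packet seen is an assumption that the slides do not spell out.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    """Hypothetical packet record; only the 5-tuple fields come from the slides."""
    timestamp: float
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str
    size: int


def flow_key(pkt):
    """Return the 5-tuple <src IP, src port, dst IP, dst port, protocol>."""
    return (pkt.src_ip, pkt.src_port, pkt.dst_ip, pkt.dst_port, pkt.protocol)


def group_into_flows(packets):
    """Group packets into flows keyed by the 5-tuple of the first packet seen.

    Assumption: a packet whose reversed 5-tuple matches an existing flow is a
    reply and is stored in that flow, so each flow holds both directions.
    """
    flows = defaultdict(list)
    for pkt in packets:
        key = flow_key(pkt)
        reverse = (pkt.dst_ip, pkt.dst_port, pkt.src_ip, pkt.src_port, pkt.protocol)
        if key not in flows and reverse in flows:
            key = reverse
        flows[key].append(pkt)
    return dict(flows)
```

Within a flow, a packet travels in the "source" direction when its own 5-tuple equals the flow key and in the "destination" direction otherwise, which is what bi-directional features would rely on.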
  • 11. Generating Traffic [Figure: campus network testbed with the labels Botnet traffic generation; Internet; Info. Sec. Lab; Dist. Sys. Lab; Multimedia Lab; Hostels Wing; Data collection for P2P and web traffic; Anonymization (Anon tool); Botnet detection module; Firewall; Core Switch 6509; Distribution Switch 4500; Access Switch 2500; Content Mgmt.; Application Servers; DB Cluster; IDS; Ethernet]
  • 12. Dataset (columns: Data, Application, Number of flows)
      • Benign data
          o HTTP, HTTPS, SMTP, FTP, POP: 30,000 flows
          o P2P apps (eMule, BitTorrent, Mute, Gnutella, etc.): 50,000 flows
      • Botnet data [4,5]
          o Zero Access: 720 flows
          o SkyNet: 770 flows
          o Waledac: 80,000 flows
          o Storm: 2,20,000 flows (220,000)
  • 13. Feature Extraction & Selection
      • A ‘Flow’ is defined by <Source IP, Source port, Dest. IP, Dest. port, Protocol>
      • Features extracted from each flow:
          o Packet count (bi-directional)
          o Packet size in bytes: min, max, mean and standard deviation (bi-directional)
          o Total volume in bytes (bi-directional)
          o Inter-arrival times: min, max, mean and standard deviation (bi-directional)
          o TCP Push flag count (bi-directional)
          o Duration of the flow (no context of direction)
      • TOTAL: 23 features extracted from each flow
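A minimal Python sketch of how the 23 per-flow features could be computed: 11 per direction (packet count, four packet-size statistics, total volume, four inter-arrival statistics, Push flag count) plus the flow duration. Several details are assumptions, not taken from the slides: each packet is a hypothetical `(timestamp, size_bytes, push_flag)` tuple already split by direction, the standard deviation is the population form, and statistics of an empty list are reported as 0.

```python
from statistics import mean, pstdev


def summary(values):
    """Return [min, max, mean, std] as floats; zeros for an empty list (assumption)."""
    if not values:
        return [0.0, 0.0, 0.0, 0.0]
    return [float(min(values)), float(max(values)),
            float(mean(values)), float(pstdev(values))]


def direction_features(pkts):
    """Return the 11 features of one direction.

    Layout: [packet count, size min/max/mean/std, total volume,
             inter-arrival min/max/mean/std, TCP Push flag count].
    """
    times = sorted(t for t, _, _ in pkts)
    sizes = [s for _, s, _ in pkts]
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    push_count = sum(1 for _, _, psh in pkts if psh)
    return ([float(len(pkts))] + summary(sizes) + [float(sum(sizes))]
            + summary(gaps) + [float(push_count)])


def flow_features(src_pkts, dst_pkts):
    """Return 23 features: 11 source, 11 destination, then the flow duration."""
    all_times = [t for t, _, _ in src_pkts + dst_pkts]
    duration = max(all_times) - min(all_times) if all_times else 0.0
    return direction_features(src_pkts) + direction_features(dst_pkts) + [duration]
```

For example, `flow_features([(0.0, 100, False), (1.0, 200, True)], [(0.5, 60, False)])` returns a list of 23 numbers whose last element is the duration, 1.0.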
  • 15. Feature Extraction & Selection
      • Three feature selection techniques used:
          1. Correlation-based Feature Selection (CFS)
          2. Consistency-based Subset Evaluation (CSE)
          3. Principal Component Analysis (PCA)
      • Evaluated with three algorithms:
          1. Naïve Bayes
          2. Bayes Network
          3. C4.5 Decision Trees
  • 16. Feature Extraction & Selection (columns: Feature selection, Search method, No. of features, Description)
      • CFS (best first search, 5 features): source packet count, source min. packet size, source max. packet size, dest. max. packet size, source inter-arrival time std.
      • CSE (best first search, 8 features): source min. packet size, source max. packet size, dest. max. packet size, source avg. packet size, dest. avg. packet size, source max. inter-arrival time, flow duration, source volume
      • PCA (no search method, 12 features): a linear combination of features
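To illustrate how CFS arrives at a small subset, the sketch below scores a candidate subset with the standard CFS merit, merit = k·r_cf / sqrt(k + k(k−1)·r_ff), where k is the subset size, r_cf the mean feature-class correlation and r_ff the mean feature-feature correlation. It is only a sketch under stated assumptions: it uses absolute Pearson correlation as the correlation measure, and it swaps the best first search named on the slide for a simpler greedy forward search. The function names are hypothetical, and this is not the tool the authors used.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def cfs_merit(columns, labels, subset):
    """CFS merit of the feature indices in `subset` (higher is better)."""
    k = len(subset)
    if k == 0:
        return 0.0
    r_cf = sum(abs(pearson(columns[i], labels)) for i in subset) / k
    pairs = [(i, j) for pos, i in enumerate(subset) for j in subset[pos + 1:]]
    r_ff = (sum(abs(pearson(columns[i], columns[j])) for i, j in pairs) / len(pairs)
            if pairs else 0.0)
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)


def greedy_cfs(columns, labels):
    """Greedy forward search (a simplification of best first search).

    Adds the feature that raises the merit most and stops when no
    remaining feature improves it.
    """
    selected, best = [], 0.0
    remaining = list(range(len(columns)))
    while remaining:
        merit, idx = max((cfs_merit(columns, labels, selected + [i]), i)
                         for i in remaining)
        if merit <= best:
            break
        selected.append(idx)
        remaining.remove(idx)
        best = merit
    return selected, best
```

The merit rewards features that correlate with the class and penalizes features that correlate with each other, so a redundant copy of an already selected feature does not raise the score.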
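  • 17. Evaluation of Feature Selection Techniques
      • Metrics: $\text{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN}$ and $\text{Detection rate} = \frac{TP}{TP + FN}$
      • [Figure: accuracy in % by classification algorithm, comparing the Full, CFS, CSE and PCA feature sets; axis from 0 to 100.] Data labels as extracted, apparently grouped by series in the order NaiveBayes / BayesNet / C4.5:
          o Full: 85.2 / 97.08 / 98.23
          o CFS: 81.51 / 95.92 / 98.18
          o CSE: 80.24 / 96.2 / 98.23
          o PCA: 82.16 / 96.67 / 98.17
      • [Figure: detection rate in % by classification algorithm, comparing the Full, CFS, CSE and PCA feature sets; axis from 93 to 99.] Data labels as extracted, apparently grouped in the same way:
          o Full: 98.9 / 96.9 / 98.9
          o CFS: 95.2 / 95.3 / 98.9
          o CSE: 96.1 / 95.7 / 99
          o PCA: 95.4 / 96.2 / 98.9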
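The two evaluation metrics, accuracy = (TP + TN) / (TP + FP + TN + FN) and detection rate = TP / (TP + FN), can be computed from confusion-matrix counts as in this small Python sketch. The function names are hypothetical, and returning 0.0 for an empty denominator is an assumption made to keep the functions total.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all flows classified correctly: (TP + TN) / (TP + FP + TN + FN)."""
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0


def detection_rate(tp, fn):
    """Fraction of botnet flows that were detected: TP / (TP + FN)."""
    positives = tp + fn
    return tp / positives if positives else 0.0
```

With made-up counts of 90 true positives, 80 true negatives, 20 false positives and 10 false negatives, `accuracy(90, 80, 20, 10)` gives 0.85 and `detection_rate(90, 10)` gives 0.9.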
  • 18. Evaluation of Feature Selection Techniques [Figures: normalized classification speed and normalized build times (scale 0 to 1) by classification algorithm (NaiveBayes, BayesNet, C4.5), comparing the Full, CFS, CSE and PCA feature sets]
  • 20. Future Scope
      • Ensemble of classifiers (work in progress; paper submitted to I-CARE 2013)
      • Close-to-real-time detection tool (work in progress)
      • Space-efficient data structures
  • 21. References
      1. A. H. Sung and S. Mukkamala. The feature selection and intrusion detection problems. In Advances in Computer Science - ASIAN 2004. Higher-Level Decision Making, pages 468–482. Springer, 2005.
      2. S. Chebrolu, A. Abraham, and J. P. Thomas. Feature deduction and ensemble design of intrusion detection systems. Computers & Security, 24(4):295–307, 2005.
      3. R. Schoof and R. Koning. Detecting peer-to-peer botnets. University of Amsterdam, 2007.
      4. S. Saad, I. Traore, A. Ghorbani, B. Sayed, D. Zhao, W. Lu, J. Felix, and P. Hakimian. Detecting P2P botnets through network behavior analysis and machine learning. In Privacy, Security and Trust (PST), 2011 Ninth Annual International Conference on, pages 174–180. IEEE, 2011.
      5. B. Rahbarinia, R. Perdisci, A. Lanzi, and K. Li. PeerRush: Mining for unwanted P2P traffic. In DIMVA, 2013.
  • 22. narangpratik@gmail.com Visit our Research Group: www.netclique.in