This document discusses using data mining techniques to improve intrusion detection systems (IDS). It begins by introducing computer network risks and limitations of existing IDS approaches. It then discusses using data mining algorithms like ID3, k-means clustering, and Apriori pattern mining within a hybrid IDS framework. The framework includes sensors to collect host and network data, a data warehouse for storage, and an analysis engine using misuse detection, anomaly detection and data mining algorithms to detect intrusions. It concludes that data mining allows IDS to detect both known and unknown attacks more efficiently.
2. Introduction
Computer Networks And Related Risks
Limitations of currently available measures
Intrusion Detection Systems (IDS) to the rescue
Difficulties in implementing IDS
Limitations Of IDS
Data Mining as an Improvement
Hybrid IDS
3. Contents
Intrusion and the methods to prevent it
Intrusion Detection Technology
Data Mining Technology
Data Mining Algorithms In Intrusion Detection
A Framework for Hybrid IDS
4. Attacks and Intrusion
Malicious, externally induced operational fault
Any set of actions that attempts to compromise the integrity, confidentiality or availability of a resource
An attack is not an intrusion
We need to detect intrusion! Why not just make it impossible?
Intrusion Detection Technology
Definition
A system that identifies and deals with the malicious use of computer and network resources
Classification, based on
Object of detection – host-based, network-based and hybrid
System architecture – centralized and distributed
Data analysis method –
Misuse detection
Anomaly detection
Primitive Approaches To Detect Intrusion
Rule-based analysis (manual coding) for pattern comparison
Expert Systems as an improvement
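The two analysis methods named above can be contrasted in a minimal sketch. The signature strings and the login-rate range are hypothetical values chosen for illustration; they are not taken from the slides or from any real IDS rule set.

```python
# Hypothetical signatures and baseline; none of these values come from the slides.
KNOWN_ATTACK_SIGNATURES = {"/etc/passwd", "' OR '1'='1", "../../"}
NORMAL_LOGINS_PER_MINUTE = (0, 5)  # assumed normal range for one user


def misuse_detect(request: str) -> bool:
    """Misuse detection: flag a request that matches a known attack signature."""
    return any(sig in request for sig in KNOWN_ATTACK_SIGNATURES)


def anomaly_detect(logins_per_minute: int) -> bool:
    """Anomaly detection: flag behaviour outside the assumed normal profile."""
    low, high = NORMAL_LOGINS_PER_MINUTE
    return not (low <= logins_per_minute <= high)
```

The misuse check can only catch what is already in its signature set, while the anomaly check can flag something new but depends on how well the normal profile was chosen.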
5. Data Mining
Cyber Crime
Methods useful for mining audit data
Link Analysis
Determines relations between fields in the
database records
E.g. correlation between a command and its arguments, like “javac a.java”
Classification
Maps a data item into one of several predefined categories
E.g. it may group cyber attacks as “severe”, “average” or “harmless” and then use profiles to detect when an attack occurs
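As a rough illustration of link analysis on audit data, the sketch below measures how strongly a command is tied to the type of its argument. The audit records and the `confidence` helper are hypothetical; a real system would read such pairs from shell or process logs.

```python
from collections import Counter

# Hypothetical audit records: (command, file extension of its argument).
audit_log = [
    ("javac", ".java"), ("javac", ".java"), ("javac", ".java"),
    ("vi", ".java"), ("vi", ".txt"), ("javac", ".txt"),
]


def confidence(records, command, arg):
    """Fraction of records with `command` whose argument type is `arg`."""
    pair_counts = Counter(records)
    command_counts = Counter(cmd for cmd, _ in records)
    if command_counts[command] == 0:
        return 0.0
    return pair_counts[(command, arg)] / command_counts[command]


print(confidence(audit_log, "javac", ".java"))  # 0.75
```

A pair that normally has high confidence, such as "javac" with a ".java" argument, gives a baseline; a user whose records break that pattern may be worth a closer look.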
6. Data Mining Algorithms Used in Intrusion Detection
Classification algorithm
ID3 algorithm
Clustering algorithm
K-means algorithm
Pattern Mining algorithm
Apriori algorithm
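For the classification algorithm, ID3 builds a decision tree by repeatedly splitting on the attribute with the highest information gain. The sketch below shows only that selection step. The connection records, attribute names and flag values are hypothetical and are not drawn from the slides.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, label_key="label"):
    """Entropy reduction obtained by splitting `rows` on `attribute`."""
    base = entropy([r[label_key] for r in rows])
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[label_key] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


# Hypothetical labelled connection records.
records = [
    {"protocol": "tcp", "flag": "SF", "label": "normal"},
    {"protocol": "tcp", "flag": "S0", "label": "attack"},
    {"protocol": "udp", "flag": "SF", "label": "normal"},
    {"protocol": "tcp", "flag": "S0", "label": "attack"},
]

# ID3 would split first on the attribute with the largest gain.
best = max(["protocol", "flag"], key=lambda a: information_gain(records, a))
print(best)  # flag
```

In this toy data the flag separates the two classes perfectly, so it is chosen as the root; a full ID3 implementation would then recurse on each branch.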
7. The framework of Hybrid IDS
FIG: The Hybrid IDS based data mining system
The Hybrid IDS consists of four parts: data warehouse, sensors, analysis engine and alarm systems
8. Data warehouse
Improves the speed of the data mining and analysis engine
Different components asynchronously handle the same amount of data
Time-related data collection; supports multi-threading and multi-process technology
Sensors
Host Sensors
Gather information from monitored hosts (security logs, application logs, event logs, running applications, registry changes)
Hook functions to intercept API calls
Network Sensors
Collect the network connection information
“netstat” tool on Windows NT or UNIX/Linux
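A network sensor of this kind could turn connection listings into records for the data warehouse. The sketch below parses sample text written in the style of Linux `netstat -an` output. The addresses are made up, and the six-column layout is an assumption: the columns differ between platforms, so a real sensor would need a parser per operating system.

```python
from collections import Counter

# Sample text in the style of `netstat -an` on Linux; the addresses are made up.
SAMPLE_OUTPUT = """\
tcp        0      0 192.168.1.5:22          10.0.0.7:51234          ESTABLISHED
tcp        0      0 192.168.1.5:80          10.0.0.9:40112          SYN_RECV
tcp        0      0 192.168.1.5:80          10.0.0.9:40113          SYN_RECV
udp        0      0 0.0.0.0:68              0.0.0.0:*
"""


def parse_connections(text):
    """Turn TCP lines into dicts holding local address, remote address and state."""
    records = []
    for line in text.splitlines():
        fields = line.split()
        # Assumed layout: proto, recv-q, send-q, local, remote, state.
        if len(fields) == 6 and fields[0].startswith("tcp"):
            records.append({"local": fields[3], "remote": fields[4], "state": fields[5]})
    return records


connections = parse_connections(SAMPLE_OUTPUT)
states = Counter(c["state"] for c in connections)
print(states["SYN_RECV"])  # 2
```

Counts per state or per remote address are the sort of time-related features the analysis engine could later mine.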
9. Analysis Engine
Network/host sensor manager
Receives data, analyzes it, makes it warehouse-compatible and then stores it
Misuse and anomaly detector
Anomaly – an intrusion is reported if a behavior is not consistent with the known normal behaviors
Misuse – an intrusion is reported if a behavior is consistent with the known attack behaviors
Mining algorithm and pattern mining
Clustering analysis – K-means algorithm
Classification analysis – ID3 algorithm
Pattern matching analysis – Apriori algorithm
Alarm systems
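The clustering analysis step of the analysis engine can be sketched with a plain k-means loop. The two features per connection (duration and kilobytes sent) and the sample values are hypothetical, and seeding the centroids with the first k points is a simplification chosen to keep the example deterministic.

```python
def kmeans(points, k, iterations=10):
    """Plain k-means on tuples of floats; the first k points seed the centroids."""
    centroids = [points[i] for i in range(k)]
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, clusters


# Hypothetical features per connection: (duration in seconds, kilobytes sent).
connections = [(1.0, 2.0), (50.0, 900.0), (1.2, 1.8), (0.9, 2.2), (55.0, 950.0)]
centroids, clusters = kmeans(connections, k=2)
print([len(c) for c in clusters])  # [3, 2]
```

Small or distant clusters are candidates for unknown attacks, which the misuse detector cannot match against any stored pattern.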
10. Conclusions
IDSs are essential because intrusion is inevitable
Early IDSs used expert systems as an improvement over rule-based systems
Data mining technologies made IDSs more dynamic
A hybrid IDS detects both known and unknown attacks efficiently
A future trend in this field is implementing such systems with a database-centric approach