HMM-Web: a framework for the detection of attacks against Web applications

HMM-Web: a framework for the
detection of attacks against Web
Applications
I. Corona, D. Ariu, G. Giacinto

Presenter
Davide Ariu

Pattern Recognition and Applications Group
Department of Electrical and Electronic Engineering
University of Cagliari, Italy
June 17, 2009 - ICC 2009 - HMMWeb - Davide Ariu

Outline

• Motivations
• HMM-Web vs. Web Application Firewalls
• Description of the IDS Scheme
• Noise inside the training set
• Sequences codification
• Experimental Setup
• Experimental Results
• Conclusions


Motivations

Why do we address the problem
of securing Web Applications?


Motivations

Source: X-Force® 2008 Trend & Risk Report – January 2009

Protection of Web Applications
• Web Applications can be protected using a
Web-Application Firewall (WAF)
– A WAF filters the applications’ input using a set of rules.
• Writing rules for a Web-Application Firewall is a
procedure that is:
– Vulnerable to zero-day attacks
• A WAF can’t stop an attack if it doesn’t have a rule
against it
– Time-consuming
• Rules must be written by hand by the administrator
– Prone to errors
• Requires the administrator to have in-depth
knowledge of the applications that reside on the
Web-Server

HMM-Web

• HMM-Web addresses these weaknesses of
Web-Application Firewalls because it is an Intrusion
Detection System that is:
– Anomaly-based
• This means that it can also cope with zero-day
attacks
– Fully automated as far as the training procedure
is concerned
• Time saving
• Doesn’t require the administrator to have knowledge
of the applications that reside on the Web-Server


A usage scenario


Request URI Modelling

• As attacks like XSS and SQL-Injection exploit input
validation flaws, we want to model the input
provided by the user.
• User-provided data are passed by the browser to
the Web-Server (then to the application) using a
sequence of attribute-value pairs.
• Consequently, we want to model:
– The sequence of attributes
– The value of each attribute


Request URI Modelling

• From the example request URI
GET /search.php?cat=32&key=hmm HTTP/1.1
we extract:
– The name of the application: “search.php”
– The sequence of attributes: “cat-key”
– The value of each attribute:
• “32” for the attribute cat
• “hmm” for the attribute key

• These are the elements that HMM-Web analyses
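
The extraction step above can be sketched in a few lines of Python. This is only an illustration built on the standard `urllib.parse` module, not the parser used by HMM-Web; the function name `parse_request_line` is hypothetical. Note that `parse_qsl` also percent-decodes values, which may or may not match what the original system does.

```python
from urllib.parse import urlsplit, parse_qsl


def parse_request_line(request_line):
    """Split a request line into application name, attribute sequence, and pairs."""
    # "GET /search.php?cat=32&key=hmm HTTP/1.1" -> method, target, version
    _method, target, _version = request_line.split(" ", 2)
    parts = urlsplit(target)
    # The application name is taken as the last segment of the path.
    application = parts.path.rsplit("/", 1)[-1]
    # Attribute-value pairs, in the order in which they appear in the query.
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    attributes = [name for name, _value in pairs]
    return application, attributes, pairs


application, attributes, pairs = parse_request_line(
    "GET /search.php?cat=32&key=hmm HTTP/1.1"
)
print(application)           # search.php
print("-".join(attributes))  # cat-key
print(pairs)                 # [('cat', '32'), ('key', 'hmm')]
```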


Classifier Ensemble

• HMM-Web is based on Hidden Markov Models
• For each application running on the Web Server
HMM-Web creates a module consisting of
– An HMM-Ensemble to model the sequence of attributes
• This makes it possible to detect request URIs modified by
hand
– An HMM-Ensemble for each of the attributes received
by the Web Application
• This makes it possible to detect whether an attribute is
receiving an anomalous value.
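
To make the idea of scoring a sequence with an HMM ensemble concrete, here is a minimal Python sketch. It uses the standard forward algorithm for a discrete HMM. The slides do not state how the outputs of the HMMs in an ensemble are combined, so the maximum rule below is an assumption, and the function names and hand-set model parameters are hypothetical (in practice the parameters would be learned from the training set).

```python
def forward_probability(symbols, start, trans, emit):
    """P(symbols) under a discrete HMM, computed with the forward algorithm.

    start[s]    : probability of starting in state s
    trans[r][s] : probability of moving from state r to state s
    emit[s]     : dict mapping a symbol to its emission probability in state s
    """
    if not symbols:
        return 1.0  # the empty sequence has probability 1 by convention
    states = range(len(start))
    # alpha[s] = P(symbols seen so far, current state = s)
    alpha = [start[s] * emit[s].get(symbols[0], 0.0) for s in states]
    for symbol in symbols[1:]:
        alpha = [
            sum(alpha[r] * trans[r][s] for r in states) * emit[s].get(symbol, 0.0)
            for s in states
        ]
    return sum(alpha)


def ensemble_score(symbols, hmms):
    """Fuse the outputs of several HMMs; the maximum rule is an assumption."""
    return max(forward_probability(symbols, *hmm) for hmm in hmms)


# Hypothetical hand-set models, each given as (start, trans, emit).
hmm_a = ([1.0], [[1.0]], [{"A": 0.5, "N": 0.5}])
hmm_b = ([0.5, 0.5], [[0.5, 0.5], [0.5, 0.5]], [{"A": 1.0}, {"N": 1.0}])

print(ensemble_score(list("AN"), [hmm_a, hmm_b]))  # 0.25
print(ensemble_score(list("A<"), [hmm_a, hmm_b]))  # 0.0 (symbol never seen)
```

A sequence whose score falls below a chosen threshold would be flagged as anomalous; how that threshold is set is not covered by this sketch.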


IDS-Scheme


Noise in the training set

• HMM-Web is trained on a training set made of
requests toward the Web-Server we want to
protect.
• This means that this training set might contain
both legitimate and attack requests.
• From a Pattern Recognition point of view, this is a
problem of training on noisy data.

How does this noise affect the performance of
HMM-Web?


• The assumption that most of the queries in the
training set are legitimate is not reasonable for applications
that are rarely queried.


Countermeasure
• We propose to model the fraction of attacks
inside the training set as:
$$\alpha = \frac{1}{N} \sum_{i=1}^{M} \alpha_i \cdot |q(w_i)|$$
• Where:
– $M$ is the number of applications on the Web Server
– $N$ is the number of queries in the training set
– $|q(w_i)|$ is the number of queries on the i-th application
– $\alpha_i$ is the fraction of attacks on the i-th application

How can we estimate $\alpha_i$ effectively for each application?

Countermeasure
• Experimental results show that even a rough
estimate of the fraction of attacks inside the
training set improves the performance
of the IDS.
• A good estimate of $\alpha_i$ is given by the
following formula:
$$\alpha_i = \frac{\alpha}{M \cdot \mathrm{freq}(w_i)}, \quad \forall i \in [1, M]$$
• $\mathrm{freq}(w_i)$ is simply the ratio between the number
of queries toward the i-th application and the
overall number of queries, i.e. $|q(w_i)| / N$.
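
As a worked illustration of the estimate above, the following Python sketch computes the per-application attack fraction from an overall fraction and per-application query counts. The function name, the application names, and the numbers are hypothetical. The sketch applies the formula as written and does not clip values, so a very rarely queried application could receive a value above 1; how HMM-Web handles that case is not stated in the slides.

```python
def per_application_attack_fraction(alpha, query_counts):
    """Estimate alpha_i = alpha / (M * freq(w_i)) for every application.

    alpha        : overall fraction of attacks in the training set
    query_counts : dict mapping application name -> number of queries |q(w_i)|
    """
    n = sum(query_counts.values())  # N, total number of queries
    m = len(query_counts)           # M, number of applications
    return {
        app: alpha / (m * (count / n))  # count / n is freq(w_i)
        for app, count in query_counts.items()
    }


# Hypothetical training set: one popular and one rarely queried application.
counts = {"search.php": 900, "admin.php": 100}
fractions = per_application_attack_fraction(0.01, counts)

# Consistency check: the weighted average of the alpha_i gives back alpha.
n = sum(counts.values())
overall = sum(fractions[app] * counts[app] for app in counts) / n
print(fractions, overall)
```

The rarely queried application receives the larger estimate, which reflects the earlier point that the "mostly legitimate" assumption is weakest for applications with few queries.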


Attribute value codification

• The values passed to the attributes might
contain digits, alphabetic letters or meta-characters.
• As it is not important to distinguish among the
elements within each of these
categories, HMM-Web
– Replaces all the digits with the symbol “N”
– Replaces all the alphabetic letters with the symbol “A”
– Leaves meta-characters unchanged
• E.g. the attribute value “/dir/sub/1,2” becomes
“/AAA/AAA/N,N”
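
The codification rule can be sketched in Python as follows. The function name `encode_value` is hypothetical, and the use of `str.isdigit` and `str.isalpha` (which are Unicode-aware) is an assumption; the slides do not say whether only ASCII digits and letters are mapped.

```python
def encode_value(value):
    """Map digits to "N", letters to "A", and keep every other character."""
    encoded = []
    for ch in value:
        if ch.isdigit():
            encoded.append("N")
        elif ch.isalpha():
            encoded.append("A")
        else:
            encoded.append(ch)  # meta-characters are left unchanged
    return "".join(encoded)


print(encode_value("/dir/sub/1,2"))  # /AAA/AAA/N,N
```

Because meta-characters survive the mapping, a value carrying characters such as quotes or angle brackets keeps them in its encoded form, while ordinary values collapse to a small alphabet.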


Experimental Setup

• We tested HMM-Web on a production Web-Server
of our Academic Institution.
• The Web-Server hosts 52 Applications:
– 24 provide services for registered users
– 28 provide public services
• Dataset D: 150,000 queries toward the Web-Server
• Dataset A: 38 attacks against 18 applications
– 19 Cross Site Scripting Attacks
– 19 SQL Injection Attacks


Experimental Results
Effectiveness of attributes’ codification

The curve on the right has been obtained using the codification proposed by Kruegel et al. in
“A multimodel approach to the detection of web-based attacks”, Computer Networks, 2005.


Experimental Results
Effectiveness of the MCS Approach


Conclusions
• In this work we propose an anomaly-based IDS for
the protection of Web-Applications

• Compared with a traditional WAF, HMM-Web can cope
with zero-day attacks and doesn’t require the
administrator to have in-depth knowledge of the
applications to be protected.

• We also propose a codification of
queries toward the Web server and a strategy to take
into account the noise in the training set.
• HMM-Web achieves excellent results in terms of
detection/false positive rate, even against attacks
that are similar to those inside the training set.


Questions?

