Pattern Recognition and Applications Lab
University of Cagliari, Italy
Department of Electrical and Electronic Engineering

Machine Learning Under Attack:
Vulnerability Exploitation and Security Measures

Battista Biggio
battista.biggio@diee.unica.it

Dept. of Electrical and Electronic Engineering
University of Cagliari, Italy

IH&MMSec, Vigo, Spain, June 21, 2016
Recent Applications of ML and PR
•  Machine Learning (ML) and Pattern Recognition (PR) are increasingly used in personal and consumer applications
iPhone 5s with Fingerprint Recognition…

… Cracked a Few Days After Its Release
EU FP7 Project: TABULA RASA
Smart Fridge Caught Sending Spam
•  Jan. 2014: a fridge was caught sending spam after a web attack compromised smart gadgets
•  The fridge was one of the 100,000 compromised devices used in the spam campaign
http://www.bbc.com/news/technology-25780908
New Challenges for ML/PR
•  We are living in exciting times for ML/PR technologies
–  Our work feeds many consumer technologies for personal applications
•  This opens up big new possibilities, but also new security risks
•  Proliferation and sophistication of attacks and cyberthreats
–  Skilled / economically-motivated attackers (e.g., ransomware)
•  Several security systems use machine learning to detect attacks
–  but… is machine learning secure enough?
Are we ready for this?
Can we use classical pattern recognition and machine learning techniques under attack?
No, we cannot. We are facing an adversarial setting…
We should learn to find secure patterns.
Secure Patterns in Nature
•  Learning secure patterns is a well-known problem in nature
–  Mimicry and camouflage
–  Arms race between predators and prey
Secure Patterns in Computer Security
•  A similar phenomenon occurs in machine learning and computer security
–  Obfuscation and polymorphism to hide malicious content

Spam email:
    Start 2007 with a bang!
    Make WBFS YOUR PORTFOLIO's first winner of the year
    ...

Malware (obfuscated JavaScript):
    <script type="text/javascript" src="http://palwas.servehttp.com//ml.php"></script>
    ...
    var PGuDO0uq19+PGuDO0uq20;
    EbphZcei=PVqIW5sV.replace(/jTUZZ/g,"%"); var eWfleJqh=unescape;
    var NxfaGVHq="pqXdQ23KZril30";
    q9124=this; var SkuyuppD=q9124["WYd1GoGYc2uG1mYGe2YnltY".replace(/[Y12WlG:]/g,"")];
    SkuyuppD.write(eWfleJqh(EbphZcei));
    ...
Arms Race
•  Adaptation/evolution is crucial to survive!
–  Attackers: evasion techniques (e.g., image-based spam to evade text analysis of spam emails)
–  System designers: design of effective countermeasures (e.g., visual analysis of attached images)
–  …
Machine Learning in Computer Security
Design of Learning-based Systems
Training phase
[figure: training data (with labels) → pre-processing and feature extraction (x1, x2, …, xd) → classifier learning]

Linear classifiers: assign a weight to each feature and classify a sample based on the sign of its score:
f(x) = sign(w^T x), with +1 = malicious, −1 = legitimate

Example: the spam email "Start 2007 with a bang! Make WBFS YOUR PORTFOLIO's first winner of the year …" maps to the feature vector x, with learned weights w (a code sketch follows):

feature      x (SPAM)   w
start        1          +2
bang         1          +1
portfolio    1          +1
winner       1          +1
year         1          +1
…            …          …
university   0          −3
campus       0          −4
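As a concrete illustration, here is a minimal sketch (not the authors' code) of this scoring rule in Python, using the toy vocabulary and hand-set weights of the table above:

```python
import numpy as np

# Toy vocabulary and hand-set weights from the slide (illustrative values)
vocab = ["start", "bang", "portfolio", "winner", "year", "university", "campus"]
w = np.array([+2, +1, +1, +1, +1, -3, -4], dtype=float)

def extract(email):
    """Binary bag-of-words features: x_i = 1 iff the i-th vocabulary word occurs."""
    tokens = email.lower().split()
    return np.array([1.0 if word in tokens else 0.0 for word in vocab])

def classify(email):
    score = w @ extract(email)  # g(x) = w^T x
    return f"score = {score:+.0f} -> " + ("SPAM" if score > 0 else "HAM")

spam = "start 2007 with a bang ! make wbfs your portfolio winner of the year"
print(classify(spam))  # score = +6 -> SPAM
```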
Design of Learning-based Systems
Test phase
[figure: test data → pre-processing and feature extraction (x1, x2, …, xd) → classification and performance evaluation (e.g., classification accuracy)]

For the same spam email, the feature vector x and the weights w above yield the score w^T x = +6 > 0, i.e., SPAM (correctly classified).
Can Machine Learning Be Secure?
•  Problem: how to evade a linear (trained) classifier?

Original email x: "Start 2007 with a bang! Make WBFS YOUR PORTFOLIO's first winner of the year …"
score: w^T x = +6 > 0, SPAM (correctly classified)

Manipulated email x': "St4rt 2007 with a b4ng! Make WBFS YOUR PORTFOLIO's first winner of the year … campus"
Misspelling "start" and "bang" removes their weights (+2, +1); appending "campus" adds its weight (−4):
score: w^T x' = +3 − 4 < 0, HAM (misclassified email)
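The manipulation above is easily automated. A minimal sketch (assuming, as in the slide, that the attacker knows the weights w): greedily flip the feature whose change decreases the score the most, within a budget of d_max modified words.

```python
import numpy as np

vocab = ["start", "bang", "portfolio", "winner", "year", "university", "campus"]
w = np.array([+2, +1, +1, +1, +1, -3, -4], dtype=float)
x = np.array([1, 1, 1, 1, 1, 0, 0], dtype=float)  # the spam email above, score +6

def evade(x, w, d_max):
    x = x.copy()
    for _ in range(d_max):
        if w @ x <= 0:                   # already classified as HAM
            break
        # Score change if feature i is flipped: -w_i if present (misspell the
        # word), +w_i if absent (append the word); pick the largest decrease.
        gains = np.where(x == 1, -w, w)
        i = int(np.argmin(gains))
        x[i] = 1 - x[i]
    return x

x_adv = evade(x, w, d_max=3)
print(x_adv, w @ x_adv)  # appends 'campus' and 'university': +6 - 4 - 3 = -1, HAM
```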
Can Machine Learning Be Secure?
•  Underlying assumption of machine learning techniques
–  Training and test data are sampled from the same distribution
•  In practice
–  The classifier generalizes well from known examples (+ random noise)
–  … but cannot cope with carefully-crafted attacks!
•  It should be taught how to do that
–  Explicitly taking into account adversarial data manipulation
–  Adversarial machine learning
•  Problem: how can we assess classifier security in a more systematic manner?

(1)  M. Barreno, B. Nelson, R. Sears, A. D. Joseph, and J. D. Tygar. Can machine learning be secure? ASIACCS 2006.
(2)  B. Biggio, G. Fumera, F. Roli. Security evaluation of pattern classifiers under attack. IEEE Trans. on Knowl. and Data Engineering, 2014.
Security Evaluation of Pattern Classifiers

(1)  B. Biggio, G. Fumera, F. Roli. Security evaluation of pattern classifiers under attack. IEEE Trans. on Knowl. and Data Engineering, 2014.
(2)  B. Biggio et al. Security evaluation of SVMs. SVM Applications. Springer, 2014.
Security Evaluation of Pattern Classifiers
[B. Biggio, G. Fumera, F. Roli, IEEE Trans. KDE 2014]

Adversary model
•  Goal of the attack
•  Knowledge of the attacked system
•  Capability of manipulating data
•  Attack strategy as an optimization problem
Bounded adversary!

Security evaluation curve: classification accuracy vs. the bound on the adversary's knowledge/capability (e.g., the number of modified words). The performance of more secure classifiers should degrade more gracefully under attack. Example: performance degradation of text classifiers in spam filtering against an increasing number of modified words in spam emails.
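Such a curve can be produced by re-running a bounded attack for increasing bounds. A toy sketch (assuming the greedy word-flip attack above and random data, not the spam experiments of the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 50
w = rng.normal(size=d)                              # a given linear classifier (toy)
spam = (rng.random((200, d)) < 0.3).astype(float)   # toy spam samples

def attack(x, w, d_max):                            # greedy word-flip attack
    x = x.copy()
    for _ in range(d_max):
        if w @ x <= 0:
            break
        gains = np.where(x == 1, -w, w)
        i = int(np.argmin(gains))
        x[i] = 1 - x[i]
    return x

# Detection rate on attacked spam vs. the bound on the attacker's capability
for d_max in range(0, 11, 2):
    detected = np.mean([w @ attack(x, w, d_max) > 0 for x in spam])
    print(f"d_max = {d_max:2d}   detected fraction = {detected:.2f}")
```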
Adversary's Goal
1.  Security violation [Barreno et al., ASIACCS 2006]
–  Integrity: evade detection without compromising system operation
–  Availability: cause classification errors to compromise system operation
–  Privacy: gain confidential information about system users
2.  Attack's specificity [Barreno et al., ASIACCS 2006]
–  Targeted/Indiscriminate: misclassification of a specific set / of any sample (e.g., spearphishing vs. phishing)
Adversary's Knowledge
•  Perfect knowledge
–  gives an upper bound on the performance degradation under attack
•  Components of knowledge:
–  training data
–  feature representation
–  learning algorithm (e.g., SVM)
–  parameters (e.g., feature weights)
–  feedback on decisions
Adversary's Capability
•  Attack's influence [Barreno et al., ASIACCS 2006]
–  Manipulation of training/test data
•  Constraints on data manipulation
–  maximum number of samples that can be added to the training data
•  the attacker usually controls only a small fraction of the training samples
–  maximum amount of modification: d(x, x') ≤ dmax
•  application-specific constraints in feature space
•  e.g., max. number of words that are modified in spam emails
[figure: feasible domain around x; the attack point x' crosses the decision boundary f(x)]
Main Attack Scenarios
•  Evasion attacks
–  Goal: integrity violation, indiscriminate attack
–  Knowledge: perfect / limited
–  Capability: manipulating test samples
e.g., manipulation of spam emails at test time to evade detection
•  Poisoning attacks
–  Goal: availability violation, indiscriminate attack
–  Knowledge: perfect / limited
–  Capability: injecting samples into the training data
e.g., send spam with some 'good words' to poison the anti-spam filter, which may subsequently misclassify legitimate emails containing such 'good words'
Targeted Classifier: SVM
•  Maximum-margin linear classifier: f(x) = sign(g(x)), g(x) = w^T x + b
•  Learning problem, trading 1/margin against the classification error on the training data (hinge loss):

min_{w,b}  (1/2) w^T w + C Σ_i max(0, 1 − y_i g(x_i))
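For reference, a minimal sketch of training such a soft-margin SVM with scikit-learn (toy Gaussian data; C trades margin width against the hinge loss):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 1, (50, 2)), rng.normal(+1, 1, (50, 2))])
y = np.array([-1] * 50 + [+1] * 50)          # -1 legitimate, +1 malicious

clf = SVC(kernel="linear", C=1.0).fit(X, y)
w, b = clf.coef_.ravel(), clf.intercept_[0]  # g(x) = w^T x + b
print("w =", w, " b =", b, " training accuracy =", clf.score(X, y))
```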
Kernels and Nonlinearity
•  The dual form enables learning and classification using only dot products between samples:

w = Σ_i α_i y_i x_i   →   g(x) = Σ_i α_i y_i ⟨x, x_i⟩ + b   (the sum runs over the support vectors)

•  Kernel functions enable nonlinear classification
–  e.g., RBF kernel: k(x, x_i) = exp(−γ ||x − x_i||²)
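The dual form can be evaluated by hand from the support vectors. A sketch (toy nonlinear data; scikit-learn exposes α_i y_i as dual_coef_), checked against the library's own decision function:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = np.sign(X[:, 0] * X[:, 1])             # XOR-like, not linearly separable
gamma = 0.5
clf = SVC(kernel="rbf", gamma=gamma, C=1.0).fit(X, y)

def g(x):
    sv = clf.support_vectors_
    coef = clf.dual_coef_.ravel()          # coef_i = alpha_i * y_i
    k = np.exp(-gamma * np.sum((sv - x) ** 2, axis=1))  # RBF kernel values
    return coef @ k + clf.intercept_[0]

x = np.array([0.3, -0.7])
print(g(x), clf.decision_function([x])[0])  # the two values should match
```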
Evasion Attacks

1.  B. Biggio, I. Corona, D. Maiorca, B. Nelson, N. Srndic, P. Laskov, G. Giacinto, and F. Roli. Evasion attacks against machine learning at test time. ECML PKDD, 2013.
2.  B. Biggio et al. Security evaluation of SVMs. SVM Applications. Springer, 2014.
3.  F. Zhang et al. Adversarial feature selection against evasion attacks. IEEE TCYB, 2016.
A Simple Example
•  Problem: how to evade a linear (trained) classifier?
–  We have seen this already: "St4rt 2007 with a b4ng! … campus" gives w^T x' = +3 − 4 < 0, HAM (misclassified email)
•  But… what if the classifier is nonlinear?
–  Decision functions can be arbitrarily complicated, with no clear relationship between the features (x) and the classifier parameters (w)
Gradient-descent Evasion Attacks
•  Goal: maximum-confidence evasion
•  Knowledge: perfect
•  Attack strategy:

min_{x'} g(x')   s.t.   d(x, x') ≤ dmax

where f(x) = sign(g(x)) = +1 (malicious), −1 (legitimate)

•  Non-linear, constrained optimization
–  Gradient descent: approximate solution for smooth functions
•  Gradients of g(x) can be analytically computed in many cases
–  SVMs, neural networks
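A minimal sketch of the attack loop: gradient descent on g(x'), projecting back onto an L2 ball of radius d_max around the original sample (the choice of d(·,·), the step size, and the stopping rule here are assumptions, not the paper's exact settings). The demo uses a linear g; the next slide gives the gradients for SVMs and neural networks.

```python
import numpy as np

def evade(x0, grad_g, d_max, eta=0.1, steps=200):
    """Projected gradient descent: min g(x') s.t. ||x' - x0||_2 <= d_max."""
    x = x0.copy()
    for _ in range(steps):
        x = x - eta * grad_g(x)            # descend on the classifier score
        delta = x - x0
        n = np.linalg.norm(delta)
        if n > d_max:                      # project back onto the feasible ball
            x = x0 + delta * (d_max / n)
    return x

# Toy demo with a linear discriminant g(x) = w^T x + b
w, b = np.array([1.0, 2.0]), -0.5
g = lambda x: w @ x + b
x0 = np.array([1.0, 1.0])                  # g(x0) = +2.5 (malicious)
x_adv = evade(x0, lambda x: w, d_max=1.5)
print(g(x0), g(x_adv))                     # the score drops below zero
```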
Computing Descent Directions
Support vector machines:
g(x) = Σ_i α_i y_i k(x, x_i) + b,   ∇g(x) = Σ_i α_i y_i ∇k(x, x_i)
RBF kernel gradient: ∇k(x, x_i) = −2γ exp(−γ ||x − x_i||²) (x − x_i)

Neural networks (inputs x_1 … x_d, one hidden layer of sigmoidal units δ_1 … δ_m with input weights v_kf and output weights w_k):
g(x) = (1 + exp(−Σ_{k=1}^{m} w_k δ_k(x)))^{−1}
∂g(x)/∂x_f = g(x) (1 − g(x)) Σ_{k=1}^{m} w_k δ_k(x) (1 − δ_k(x)) v_kf
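A sketch of the SVM case: the analytic gradient ∇g(x) = −2γ Σ_i α_i y_i exp(−γ ||x − x_i||²)(x − x_i) for the RBF kernel, computed from scikit-learn's dual coefficients and verified against a numerical gradient:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = np.sign(X[:, 0] * X[:, 1])
gamma = 0.5
clf = SVC(kernel="rbf", gamma=gamma, C=1.0).fit(X, y)

def grad_g(x):
    sv = clf.support_vectors_
    coef = clf.dual_coef_.ravel()               # alpha_i * y_i
    diff = x - sv                               # rows: x - x_i
    k = np.exp(-gamma * np.sum(diff ** 2, axis=1))
    return -2.0 * gamma * (coef * k) @ diff

# Numerical check via central differences
x, eps = np.array([0.2, -0.4]), 1e-5
num = [(clf.decision_function([x + eps * e])[0]
        - clf.decision_function([x - eps * e])[0]) / (2 * eps) for e in np.eye(2)]
print(grad_g(x), num)                           # should agree closely
```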
An Example on Handwritten Digits
•  Nonlinear SVM (RBF kernel) to discriminate between '3' and '7'
•  Features: gray-level pixel values
–  28 x 28 image = 784 features
[figure: the digit before the attack (3 vs 7), after the attack at g(x) = 0, and after the attack at the last iteration; g(x) decreases with the number of iterations, i.e., with the number of modified gray-level values]
Few modifications are enough to evade detection! … without even mimicking the targeted class ('7')
Bounding the Adversary's Knowledge
Limited knowledge attacks
•  Only the feature representation and the learning algorithm are known
•  Surrogate data sampled from the same distribution p(X, Y) as the classifier's training data
•  The classifier's feedback is used to label the surrogate data
[figure: surrogate training data → send queries to f(x), get labels → learn a surrogate classifier f'(x)]
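A sketch of this surrogate-training loop (the toy data and the RBF SVM playing the unknown target are assumptions):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 1, (200, 2)), rng.normal(+1, 1, (200, 2))])
y = np.array([-1] * 200 + [+1] * 200)
target = SVC(kernel="rbf", gamma=0.5, C=1.0).fit(X, y)  # hidden from the attacker

X_sur = rng.normal(0, 1.5, (100, 2))   # surrogate data from a similar distribution
y_sur = target.predict(X_sur)          # labels obtained by querying the target
surrogate = SVC(kernel="rbf", gamma=0.5, C=1.0).fit(X_sur, y_sur)

# The surrogate approximates the target and can be attacked in its place
agreement = np.mean(surrogate.predict(X) == target.predict(X))
print(f"surrogate/target agreement: {agreement:.2f}")
```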
Experiments on PDF Malware Detection
•  PDF: hierarchy of interconnected objects (keyword/value pairs); features: keyword counts (e.g., /Type: 2, /Page: 1, /Encoding: 1, …)

    13 0 obj
    << /Kids [ 1 0 R 11 0 R ]
    /Type /Page
    ... >> endobj

    17 0 obj
    << /Type /Encoding
    /Differences [ 0 /C0032 ] >>
    endobj

•  Adversary's capability
–  adding up to dmax objects to the PDF
–  removing objects may compromise the PDF file (and the embedded malware code)!
•  Attack strategy:

min_{x'} g(x')   s.t.   d(x, x') ≤ dmax,   x ≤ x'
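The x ≤ x' constraint is easy to honor for a linear surrogate: only add objects whose keywords carry negative weight. A toy sketch (the weights are illustrative, not learned from the PDF dataset):

```python
import numpy as np

def evade_by_insertion(x, w, d_max):
    """Only feature increments (object insertions) are allowed: x <= x'."""
    x = x.copy()
    for _ in range(d_max):
        i = int(np.argmin(w))      # keyword whose insertion lowers g(x) the most
        if w[i] >= 0:              # no insertion can decrease the score further
            break
        x[i] += 1                  # add one more object with that keyword
    return x

w = np.array([+0.8, +0.3, -0.5, -1.2])   # per-keyword weights (toy values)
x = np.array([2.0, 1.0, 0.0, 0.0])       # keyword counts of a malicious PDF
x_adv = evade_by_insertion(x, w, d_max=3)
print(x_adv, w @ x, "->", w @ x_adv)     # score drops from +1.9 to -1.7
```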
Experiments on PDF Malware Detection
•  Dataset: 500 malware samples (Contagio), 500 benign (Internet)
–  Targeted (surrogate) classifier trained on 500 (100) samples
•  Evasion rate (FN) at FP = 1% vs. max. number of added keywords
–  Averaged over 5 repetitions
–  Perfect knowledge (PK); limited knowledge (LK)
[figure: FN vs. dmax, the number of keywords added to each PDF, for the linear SVM and the nonlinear SVM (RBF kernel), under PK (C = 1) and LK (C = 1)]
Security Measures against Evasion Attacks
•  Multiple Classifier Systems (MCSs)
–  Feature equalization [Kolcz and Teo, CEAS 2009; Biggio et al., IJMLC 2010]
–  1.5-class classification [Biggio et al., MCS 2015]
•  Adversarial feature selection [Zhang et al., IEEE TCYB 2016]
•  Learning with invariances: Nightmare at Test Time (InvarSVM)
–  Robust optimization (zero-sum games) [Globerson and Teo, ICML 2006]
•  Game theory (NashSVM)
–  Classifier vs. adversary (non-zero-sum games) [Brueckner et al., JMLR 2012]
MCSs for Feature Equalization
•  Rationale: more uniform feature weight distributions require the attacker to modify more features to evade detection
•  MCSs (e.g., bagging, random subspace method) can be exploited to obtain more uniform feature weights: the base linear classifiers f_k(x) = Σ_i w_i^k x_i + w_0^k are averaged as (1/K) Σ_{k=1}^{K} f_k(x) (a sketch follows the references)

1.  Kolcz and C. H. Teo. Feature weighting for improved classifier robustness. CEAS 2009.
2.  B. Biggio, G. Fumera, and F. Roli. Multiple classifier systems for robust classifier design in adversarial environments. Int'l J. Mach. Learn. Cyb., 1(1):27–41, 2010.
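A sketch of the mechanism (toy data; whether averaging actually flattens the weights depends on the data and the base learner): train linear SVMs on bootstrap replicates and average their weight vectors; the ensemble is itself a linear classifier.

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X = (rng.random((300, 20)) < 0.2).astype(float)     # binary word features
s = X @ rng.normal(size=20)
y = np.where(s > np.median(s), 1.0, -1.0)           # balanced toy labels

K = 10
W = []
for _ in range(K):
    idx = rng.integers(0, len(X), len(X))           # bootstrap resample
    W.append(LinearSVC(C=1.0).fit(X[idx], y[idx]).coef_.ravel())
w_avg = np.mean(W, axis=0)                          # averaged (bagged) weights

w_single = LinearSVC(C=1.0).fit(X, y).coef_.ravel()
evenness = lambda w: np.max(np.abs(w)) / np.mean(np.abs(w))  # lower = more uniform
print(evenness(w_single), evenness(w_avg))
```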
Wrapper-based Adversarial Feature Selection (WAFS)
[F. Zhang et al., IEEE TCYB 2016]
•  Rationale: feature selection based on both accuracy and security
–  wrapper-based backward/forward feature selection
–  main limitation: computational complexity
[figure: experimental results on spam filtering (linear SVM): TP at FP = 1% vs. the max. number of modified words, for feature set sizes 100, 200, and 300, comparing traditional feature selection and WAFS under PK and LK]
1.5-class Classification
Underlying rationale
•  2-class classification is usually more accurate in the absence of attack
•  … but potentially more vulnerable under attack (it does not enclose the legitimate data)
•  1.5-class classification aims at retaining high accuracy and security under attack
[figure: decision regions of 2-class classification, 1-class classification (legitimate), and 1.5C classification (MCS)]
Secure 1.5-class Classification with MCSs
•  Heuristic approach to 1.5-class classification
–  a 2-class classifier is combined with two 1-class classifiers (one trained on legitimate data, one on malicious data); their scores g1(x), g2(x), g3(x) are fused into g(x), and x is flagged as malicious if g(x) ≥ t
[figure: spam filtering results, AUC at 1% FP vs. the maximum number of modified words, under PK and LK, for the 2C SVM, 1C SVM (L), 1C SVM (M), and the 1.5C MCS]
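A rough sketch of one way to wire such a 1.5-class MCS (the exact fusion rule and score normalization in the paper may differ; here the three scores are simply averaged and thresholded at t = 0):

```python
import numpy as np
from sklearn.svm import SVC, OneClassSVM

rng = np.random.default_rng(0)
X_leg = rng.normal(-1, 1, (200, 2))
X_mal = rng.normal(+1, 1, (200, 2))
X = np.vstack([X_leg, X_mal])
y = np.array([-1] * 200 + [+1] * 200)

two_c = SVC(kernel="rbf", gamma=0.5).fit(X, y)          # g1: > 0 means malicious
one_c_mal = OneClassSVM(gamma=0.5, nu=0.1).fit(X_mal)   # g2: inside malicious region
one_c_leg = OneClassSVM(gamma=0.5, nu=0.1).fit(X_leg)   # g3: inside legitimate region

def g(x):
    x = np.atleast_2d(x)
    g1 = two_c.decision_function(x)
    g2 = one_c_mal.decision_function(x)
    g3 = -one_c_leg.decision_function(x)   # sign flipped: high = NOT legitimate
    return (g1 + g2 + g3) / 3.0            # simple average fusion (assumption)

for x in ([-1.0, -1.0], [2.0, 2.0]):
    print(x, "malicious" if g(np.array(x))[0] >= 0 else "legitimate")
```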
Poisoning Machine Learning

1.  B. Biggio, B. Nelson, P. Laskov. Poisoning attacks against SVMs. ICML, 2012.
2.  B. Biggio et al. Security evaluation of SVMs. SVM Applications. Springer, 2014.
3.  H. Xiao et al. Is feature selection secure against training data poisoning? ICML, 2015.
Poisoning Attacks against SVMs
[B. Biggio, B. Nelson, P. Laskov, ICML 2012]
•  A classifier is trained on HTTP requests (Tr) collected at a web server, e.g.:

http://www.vulnerablehotel.com/components/com_hbssearch/longDesc.php?h_id=1&id=-2%20union%20select%20concat%28username,0x3a,password%29%20from%20jos_users--
→ malicious ✔

http://www.vulnerablehotel.com/login/
→ legitimate ✔
Poisoning Attacks against SVMs
[B. Biggio, B. Nelson, P. Laskov, ICML 2012]
•  Poisoning the classifier to cause a denial of service: the attacker injects well-crafted samples into the training set (Tr), e.g.:

http://www.vulnerablehotel.com/login/components/
http://www.vulnerablehotel.com/login/longDesc.php?h_id=1&

After retraining, the classifier mislabels the requests above: the malicious one is classified as legitimate ✖, and the legitimate one as malicious ✖.
Adversary Model and Attack Strategy
•  Adversary model
–  Goal: to maximize the classification error (availability, indiscriminate)
–  Knowledge: perfect knowledge (the trained SVM and the training set TR are known)
–  Capability: injecting samples into TR
•  Attack strategy
–  find the attack point xc in TR that maximizes the classification error
[figure: injecting xc raises the classification error from 0.022 to 0.039]
Adversary Model and Attack Strategy (cont.)
[figure: the classification error as a function of xc; the attacker places xc at the maximum of this error surface]
Poisoning Attack Algorithm
•  Maximize the classification error L(xc) w.r.t. xc through gradient ascent (a sketch follows the reference)
•  The gradient is not easy to compute
–  The training point affects the classification function
–  Details of the derivation are in the paper
[figure: the attack point moves along the gradient-ascent path from xc(0) to xc]

1.  B. Biggio, B. Nelson, P. Laskov. Poisoning attacks against SVMs. ICML, 2012.
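A crude sketch of the loop (the paper differentiates analytically through the SVM's KKT conditions; here a numerical gradient of the validation hinge loss is used instead, retraining the SVM at every evaluation):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_tr = np.vstack([rng.normal(-1, .7, (40, 2)), rng.normal(+1, .7, (40, 2))])
y_tr = np.array([-1] * 40 + [+1] * 40)
X_val, y_val = X_tr.copy(), y_tr.copy()    # toy: validate on the same data

def val_loss(x_c, y_c=+1):
    """Hinge loss on validation data after training with the attack point."""
    X = np.vstack([X_tr, x_c])
    y = np.append(y_tr, y_c)
    clf = SVC(kernel="linear", C=1.0).fit(X, y)
    margins = y_val * clf.decision_function(X_val)
    return np.mean(np.maximum(0.0, 1.0 - margins))

x_c, eps, eta = np.array([-0.5, -0.5]), 1e-2, 0.5   # attack point, labeled +1
for _ in range(20):                                 # gradient ASCENT on the loss
    grad = np.array([(val_loss(x_c + eps * e) - val_loss(x_c - eps * e)) / (2 * eps)
                     for e in np.eye(2)])
    x_c = x_c + eta * grad
print(x_c, val_loss(x_c))
```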
Experiments on the MNIST Digits
Single-point attack
•  Linear SVM; 784 features; TR: 100, VAL: 500, TS: about 2000
–  '0' is the malicious (attacking) class
–  '4' is the legitimate (attacked) one
[figure: the attack digit morphs from xc(0) to xc]

Experiments on the MNIST Digits
Multiple-point attack
•  Same setting as above, with multiple poisoning points injected into TR
Poisoning Linear Models for Feature Selection
[H. Xiao et al., ICML 2015]
•  Embedded feature selection with linear models: LASSO (Tibshirani, 1996), ridge regression (Hoerl & Kennard, 1970), elastic net (Zou & Hastie, 2005)
–  features are selected according to |w|
•  The attacker maximizes the classification error w.r.t. the attack point xc (gradient ascent)
Experiments on PDF Malware Detection
•  PDF: hierarchy of interconnected objects (keyword/value pairs); features: keyword counts
[Maiorca et al., 2012; 2013; Smutz & Stavrou, 2012; Srndic & Laskov, 2013]
•  Learner's task: to classify benign vs. malware PDF files
•  Attacker's task: to maximize the classification error by injecting poisoning attack samples
–  only feature increments are considered (object insertion), since object removal may compromise the PDF file
Experimental Results
•  Perfect knowledge; data: 300 (TR) and 5,000 (TS) samples; 114 features
•  Similar results were obtained for limited-knowledge attacks!
Security Measures against Poisoning
•  Rationale: poisoning injects outlying training samples
•  Two main strategies for countering this threat
1.  Data sanitization: remove poisoning samples from the training data
•  bagging for fighting poisoning attacks
•  Reject-On-Negative-Impact (RONI) defense
2.  Robust learning: learning algorithms that are robust in the presence of poisoning samples
Security Measures against Poisoning
Data Sanitization :: Multiple Classifier Systems
•  (Weighted) bagging for fighting poisoning attacks
–  underlying idea: resample outlying samples with lower probability
•  Two-step algorithm (a sketch follows the reference):
1.  density estimation to assign lower resampling weights to outliers
2.  bagging to train an MCS
•  Promising results on spam (see figure) and web-based intrusion detection
[figure: spam results for S (standard classifier), B (standard bagging), and WB (weighted bagging); the numbers in the legend correspond to different ensemble sizes]

1.  B. Biggio, I. Corona, G. Fumera, G. Giacinto, and F. Roli. Bagging classifiers for fighting poisoning attacks in adversarial classification tasks. MCS, 2011.
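A sketch of the two steps (a kernel density estimate stands in for the paper's density estimator; toy 2-D data with one injected outlier):

```python
import numpy as np
from sklearn.neighbors import KernelDensity
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, .6, (50, 2)), rng.normal(+1, .6, (50, 2)),
               [[4.0, -4.0]]])                       # one outlying (poisoning) point
y = np.append(np.array([-1] * 50 + [+1] * 50), +1)

# Step 1: density estimation -> low resampling weight for outliers
dens = np.exp(KernelDensity(bandwidth=0.5).fit(X).score_samples(X))
p = dens / dens.sum()
print("weight of the outlier vs. average:", p[-1], p[:-1].mean())

# Step 2: weighted bootstrap + bagging
ensemble = []
for _ in range(10):
    idx = rng.choice(len(X), size=len(X), p=p)       # weighted bootstrap
    ensemble.append(SVC(kernel="linear", C=1.0).fit(X[idx], y[idx]))

def predict(x):
    votes = np.mean([clf.predict(np.atleast_2d(x))[0] for clf in ensemble])
    return +1 if votes >= 0 else -1

print(predict([-1.0, -1.0]), predict([1.0, 1.0]))    # -1 (legitimate), +1 (malicious)
```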
Attacking Clustering

1.  B. Biggio, I. Pillai, S. R. Bulò, D. Ariu, M. Pelillo, and F. Roli. Is data clustering in adversarial settings secure? AISec, 2013.
2.  B. Biggio, S. R. Bulò, I. Pillai, M. Mura, E. Z. Mequanint, M. Pelillo, and F. Roli. Poisoning complete-linkage hierarchical clustering. S+SSPR, 2014.
Attacking Clustering
•  So far, we have considered supervised learning
–  training data consisting of samples and class labels
•  In many applications, labels are not available or are costly to obtain
–  unsupervised learning: training data only include samples, no labels!
•  Malware clustering
–  to identify variants of existing malware or new malware families
[pipeline: data collection (honeypots) → feature extraction (e.g., URL length, number of parameters, etc.) → clustering of malware families (e.g., similar HTTP requests) → data analysis / countermeasure design (e.g., signature generation: for each cluster, if … then … else …)]
Is Data Clustering Secure?
•  Attackers can poison the input data to subvert malware clustering
–  well-crafted HTTP requests (e.g., variants of http://www.vulnerablehotel.com/…) are injected into the collected data
–  the clustering of malware families is significantly compromised
–  the derived countermeasures become useless (too many false alarms, low detection rate)
Our Work
•  A framework to identify/design attacks against clustering algorithms
–  Poisoning: add samples to maximally compromise the clustering output
–  Obfuscation: hide samples within existing clusters
•  Some clustering algorithms can be very sensitive to poisoning!
–  single- and complete-linkage hierarchical clustering can be easily compromised by creating heterogeneous clusters (see the sketch after the references)
•  Details of the attack derivation and implementation are in the papers
[figure: clustering on untainted data (80 samples) vs. clustering after adding 10 attack samples]

1.  B. Biggio et al. Is data clustering in adversarial settings secure? AISec, 2013.
2.  B. Biggio et al. Poisoning complete-linkage hierarchical clustering. S+SSPR, 2014.
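To see why linkage-based clustering is fragile, here is a toy sketch of the bridging idea behind poisoning (single linkage, via SciPy; the papers derive principled attacks rather than this hand-placed bridge):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
A = rng.normal([0, 0], 0.2, (40, 2))       # cluster A
B = rng.normal([3, 0], 0.2, (40, 2))       # cluster B
X = np.vstack([A, B])

cut = lambda Z: fcluster(Z, t=1.0, criterion="distance")
print(len(set(cut(linkage(X, method="single")))))      # 2 clusters (untainted data)

# A few attack samples forming a 'bridge' chain the two clusters together,
# so single linkage fuses them below the same distance threshold.
bridge = np.column_stack([np.linspace(0.5, 2.5, 5), np.zeros(5)])
Xp = np.vstack([X, bridge])
print(len(set(cut(linkage(Xp, method="single")))))     # 1 cluster (poisoned data)
```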
Conclusions and Future Work
•  Learning-based systems are vulnerable to well-crafted, sophisticated attacks devised by skilled attackers
–  … attacks that exploit specific vulnerabilities of machine learning algorithms!
•  The arms race continues on both fronts: attacks against learning, and secure learning algorithms
Joint work with … and many others.
Any questions?
Thanks for your attention!

An introduction to Semiconductor and its types.pptxAn introduction to Semiconductor and its types.pptx
An introduction to Semiconductor and its types.pptx
 
Design and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdfDesign and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdf
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx
 
Risk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfRisk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdf
 
Heart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxHeart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptx
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
 
An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptx
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
 
Churning of Butter, Factors affecting .
Churning of Butter, Factors affecting  .Churning of Butter, Factors affecting  .
Churning of Butter, Factors affecting .
 
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor CatchersTechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
 
Work Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvWork Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvv
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptx
 
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdfCCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
 
Instrumentation, measurement and control of bio process parameters ( Temperat...
Instrumentation, measurement and control of bio process parameters ( Temperat...Instrumentation, measurement and control of bio process parameters ( Temperat...
Instrumentation, measurement and control of bio process parameters ( Temperat...
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCRCall Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfid
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
 

Machine Learning under Attack: Vulnerability Exploitation and Security Measures

  • 13. Design of Learning-based Systems: Test Phase
–  Test data undergo the same pre-processing and feature extraction as the training data, and are then classified; performance is evaluated, e.g., in terms of classification accuracy.
–  Linear classifiers: assign a weight to each feature and classify a sample based on the sign of its score, f(x) = sign(w^T x), with +1 denoting malicious and −1 legitimate.
–  Example (spam filtering): the email "Start 2007 with a bang! Make WBFS YOUR PORTFOLIO's first winner of the year ..." maps to the binary feature vector x = (1, 1, 1, 1, 1, ..., 0, 0) over the vocabulary (start, bang, portfolio, winner, year, ..., university, campus); with the learned weights w = (+2, +1, +1, +1, +1, ..., −3, −4), the score is w^T x = +6 > 0: SPAM (correctly classified).
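To make the scoring step concrete, here is a minimal sketch in Python/numpy. The vocabulary, weights and email are the toy values from the slide; the words "first" and "make" are hypothetical zero-weight fillers for the elided "..." entries.

    import re
    import numpy as np

    # Toy vocabulary and learned weights from the slide; "first" and "make"
    # are hypothetical fillers (zero weight) for the "..." entries.
    vocab = ["start", "bang", "portfolio", "winner", "year",
             "first", "make", "university", "campus"]
    w = np.array([+2, +1, +1, +1, +1, 0, 0, -3, -4], dtype=float)

    def extract_features(text):
        """Binary bag of words: x_i = 1 if the i-th vocabulary word occurs."""
        tokens = set(re.findall(r"[a-z]+", text.lower()))
        return np.array([1.0 if word in tokens else 0.0 for word in vocab])

    def classify(x):
        """f(x) = sign(w^T x): +1 = SPAM (malicious), -1 = HAM (legitimate)."""
        return 1 if w.dot(x) > 0 else -1

    spam = ("Start 2007 with a bang! Make WBFS YOUR PORTFOLIO's "
            "first winner of the year")
    x = extract_features(spam)
    print(w.dot(x), classify(x))   # 6.0, +1: SPAM, correctly classified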
  • 14. Can Machine Learning Be Secure?
–  Problem: how to evade a linear (trained) classifier?
–  The spam email above scores w^T x = +6 > 0 under f(x) = sign(w^T x) and is correctly classified as SPAM.
–  The attacker manipulates it into "St4rt 2007 with a b4ng! Make WBFS YOUR PORTFOLIO's first winner of the year ... campus": misspelling "start" and "bang" zeroes their features, and appending the 'good word' "campus" activates a negatively-weighted feature.
–  The modified feature vector x' = (0, 0, 1, 1, 1, ..., 0, 1) scores w^T x' = +3 − 4 < 0: HAM (misclassified email).
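Continuing the sketch above, the slide's manipulation (two misspellings plus one appended 'good word') flips the sign of the score:

    evaded = ("St4rt 2007 with a b4ng! Make WBFS YOUR PORTFOLIO's "
              "first winner of the year campus")
    x_prime = extract_features(evaded)
    print(w.dot(x_prime), classify(x_prime))   # -1.0, -1: HAM, evasion succeeded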
  • 15. Can Machine Learning Be Secure?
–  Underlying assumption of machine learning techniques: training and test data are sampled from the same distribution.
–  In practice, the classifier generalizes well from known examples (plus random noise), but it cannot cope with carefully-crafted attacks!
–  It should be taught how to do that, by explicitly taking adversarial data manipulation into account: this is the goal of adversarial machine learning.
–  Problem: how can we assess classifier security in a more systematic manner?
(1) M. Barreno, B. Nelson, R. Sears, A. D. Joseph, and J. D. Tygar. Can machine learning be secure? ASIACCS 2006
(2) B. Biggio, G. Fumera, F. Roli. Security evaluation of pattern classifiers under attack. IEEE Trans. on Knowl. and Data Engineering, 2014
  • 16. Security Evaluation of Pattern Classifiers
(1) B. Biggio, G. Fumera, F. Roli. Security evaluation of pattern classifiers under attack. IEEE Trans. on Knowl. and Data Engineering, 2014
(2) B. Biggio et al., Security evaluation of SVMs. SVM applications. Springer, 2014
  • 17. Security Evaluation of Pattern Classifiers [B. Biggio, G. Fumera, F. Roli, IEEE Trans. KDE 2014]
–  Adversary model: goal of the attack, knowledge of the attacked system, capability of manipulating data, and the attack strategy formulated as an optimization problem. The adversary is assumed to be bounded.
–  Security evaluation curve: plot performance (e.g., accuracy) against a bound on the adversary's knowledge/capability (e.g., the number of modified words in spam emails). The performance of more secure classifiers should degrade more gracefully under attack.
–  [Figure: security evaluation curves of two classifiers C1 and C2; performance degradation of text classifiers in spam filtering for an increasing number of words modified in spam emails.]
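As a sketch of how such a curve can be computed (assuming a scikit-learn-style classifier clf with a predict method, and a hypothetical attack routine evade(clf, x, d_max) that returns a manipulated copy of x within the capability bound d_max):

    import numpy as np

    def security_evaluation_curve(clf, X_test, y_test, evade, bounds):
        """Accuracy under attack for increasing attacker capability d_max;
        only malicious samples (y = +1) are manipulated by the attacker."""
        curve = []
        for d_max in bounds:
            X_adv = np.array([evade(clf, x, d_max) if label == 1 else x
                              for x, label in zip(X_test, y_test)])
            curve.append(np.mean(clf.predict(X_adv) == y_test))
        return curve   # plot against bounds to obtain the security evaluation curve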
  • 18. Adversary's Goal
1.  Security violation [Barreno et al., ASIACCS 2006]
–  Integrity: evade detection without compromising system operation
–  Availability: cause classification errors to compromise system operation
–  Privacy: gain confidential information about system users
2.  Attack specificity [Barreno et al., ASIACCS 2006]
–  Targeted/Indiscriminate: misclassification of a specific set of samples / of any sample (e.g., spearphishing vs. phishing)
  • 19. Adversary's Knowledge
The attacker may know, fully or in part:
–  the training data;
–  the feature representation;
–  the learning algorithm (e.g., an SVM) and its parameters (e.g., feature weights);
–  feedback on the classifier's decisions.
Perfect knowledge of all of the above yields an upper bound on the performance degradation under attack.
  • 20. Adversary's Capability
–  Attack influence [Barreno et al., ASIACCS 2006]: manipulation of training and/or test data
–  Constraints on data manipulation:
   •  maximum number of samples that can be added to the training data (the attacker usually controls only a small fraction of the training samples);
   •  maximum amount of modification per sample, d(x, x') ≤ d_max, with application-specific constraints in feature space (e.g., the maximum number of words that can be modified in a spam email); this defines a feasible domain around each sample x.
  • 21. Main Attack Scenarios
–  Evasion attacks
   •  Goal: integrity violation, indiscriminate attack
   •  Knowledge: perfect / limited
   •  Capability: manipulating test samples, e.g., manipulation of spam emails at test time to evade detection
–  Poisoning attacks
   •  Goal: availability violation, indiscriminate attack
   •  Knowledge: perfect / limited
   •  Capability: injecting samples into the training data, e.g., sending spam containing some 'good words' to poison the anti-spam filter, which may subsequently misclassify legitimate emails containing such 'good words'
  • 22. Targeted Classifier: SVM
–  Maximum-margin linear classifier: f(x) = sign(g(x)), with g(x) = w^T x + b
–  Learning problem:
   min_{w,b}  (1/2) w^T w + C Σ_i max(0, 1 − y_i (w^T x_i + b))
   where the first term is the inverse of the margin (regularizer) and the second is the classification error on the training data (hinge loss).
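As a minimal sketch of this objective (a didactic batch subgradient descent, not a production SVM solver; the learning rate and epoch count are arbitrary choices):

    import numpy as np

    def train_linear_svm(X, y, C=1.0, lr=0.01, epochs=200):
        """Minimize (1/2)||w||^2 + C * sum_i max(0, 1 - y_i (w^T x_i + b))
        by batch subgradient descent. Labels y must be in {-1, +1}."""
        n, d = X.shape
        w, b = np.zeros(d), 0.0
        for _ in range(epochs):
            margins = y * (X.dot(w) + b)
            viol = margins < 1                      # samples violating the margin
            grad_w = w - C * (y[viol, None] * X[viol]).sum(axis=0)
            grad_b = -C * y[viol].sum()
            w -= lr * grad_w
            b -= lr * grad_b
        return w, b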
  • 23. Kernels and Nonlinearity
–  The SVM enables learning and classification using only dot products between samples:
   w = Σ_i α_i y_i x_i  →  g(x) = Σ_i α_i y_i ⟨x, x_i⟩ + b   (the x_i with α_i > 0 are the support vectors)
–  Kernel functions enable nonlinear classification by replacing the dot product, e.g., the RBF kernel:
   k(x, x_i) = exp(−γ ||x − x_i||^2)
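In code, the kernelized discriminant function takes a few lines over the support vectors (a minimal sketch; alpha_y stands for the products α_i y_i, and all names are illustrative):

    import numpy as np

    def rbf_kernel(x, X_sv, gamma):
        """k(x, x_i) = exp(-gamma * ||x - x_i||^2) for all support vectors x_i."""
        return np.exp(-gamma * np.sum((X_sv - x) ** 2, axis=1))

    def g(x, X_sv, alpha_y, b, gamma):
        """g(x) = sum_i alpha_i y_i k(x, x_i) + b."""
        return alpha_y.dot(rbf_kernel(x, X_sv, gamma)) + b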
  • 24. Evasion Attacks
1.  B. Biggio, I. Corona, D. Maiorca, B. Nelson, N. Srndic, P. Laskov, G. Giacinto, and F. Roli. Evasion attacks against machine learning at test time. ECML PKDD, 2013.
2.  B. Biggio et al., Security evaluation of SVMs. SVM applications. Springer, 2014
3.  F. Zhang et al., Adversarial feature selection against evasion attacks. IEEE TCYB, 2016.
  • 25. A Simple Example
–  Problem: how to evade a linear (trained) classifier? We have seen this already: misspell high-weight spam words and add negatively-weighted 'good words', so that w^T x' = +3 − 4 < 0 and the spam is misclassified as HAM.
–  But what if the classifier is nonlinear? Decision functions can be arbitrarily complicated, with no clear relationship between the features x and the classifier parameters w.
  • 26. Gradient-descent Evasion Attacks
–  Goal: maximum-confidence evasion; Knowledge: perfect
–  Attack strategy: given f(x) = sign(g(x)) (+1 malicious, −1 legitimate), solve
   min_{x'} g(x')   s.t.   d(x, x') ≤ d_max
–  This is a nonlinear, constrained optimization problem: gradient descent yields an approximate solution for smooth functions, and the gradient of g(x) can be computed analytically in many cases (e.g., SVMs and neural networks).
  • 27. Computing Descent Directions
–  Support vector machines:
   g(x) = Σ_i α_i y_i k(x, x_i) + b,   ∇g(x) = Σ_i α_i y_i ∇k(x, x_i)
   For the RBF kernel: ∇k(x, x_i) = −2γ exp(−γ ||x − x_i||^2) (x − x_i)
–  Neural networks (one hidden layer of m sigmoid units δ_k, with input weights v_k and output weights w_k):
   g(x) = (1 + exp(−Σ_{k=1..m} w_k δ_k(x)))^{−1}
   ∂g(x)/∂x_f = g(x)(1 − g(x)) Σ_{k=1..m} w_k δ_k(x)(1 − δ_k(x)) v_kf
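Combining the attack strategy of slide 26 with these gradients, here is a sketch of the attack loop against an RBF-kernel SVM. It reuses rbf_kernel and g from the sketch after slide 23; the step size, iteration budget and Euclidean distance constraint are illustrative choices, and box constraints on the features are omitted.

    import numpy as np

    def grad_g(x, X_sv, alpha_y, gamma):
        """grad g(x) = sum_i alpha_i y_i grad k(x, x_i), with the RBF gradient
        grad k(x, x_i) = -2*gamma*exp(-gamma*||x - x_i||^2)*(x - x_i)."""
        k = rbf_kernel(x, X_sv, gamma)
        return -2.0 * gamma * ((alpha_y * k)[:, None] * (x - X_sv)).sum(axis=0)

    def evade(x, X_sv, alpha_y, b, gamma, d_max, step=0.1, iters=500):
        """Gradient descent on g(x') subject to ||x' - x||_2 <= d_max;
        running the full budget seeks a maximum-confidence evasion point."""
        x_adv = x.copy()
        for _ in range(iters):
            x_adv = x_adv - step * grad_g(x_adv, X_sv, alpha_y, gamma)
            delta = x_adv - x
            norm = np.linalg.norm(delta)
            if norm > d_max:                        # project onto feasible domain
                x_adv = x + delta * (d_max / norm)
        return x_adv, g(x_adv, X_sv, alpha_y, b, gamma)   # g < 0 means evasion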
  • 28. An Example on Handwritten Digits
–  Nonlinear SVM (RBF kernel) trained to discriminate between '3' and '7'
–  Features: gray-level pixel values (28 × 28 image = 784 features)
–  [Figure: the digit before the attack ('3' vs '7'), after the attack at g(x) = 0, and after the attack at the last iteration; g(x) as a function of the number of iterations / number of modified gray-level values.]
–  Few modifications are enough to evade detection, without even mimicking the targeted class ('7')!
  • 29. Bounding the Adversary's Knowledge: Limited-Knowledge Attacks
–  Only the feature representation and the learning algorithm are known
–  Surrogate data are sampled from the same distribution p(X, Y) as the classifier's training data
–  The classifier's feedback can be used to label the surrogate data: send queries to the targeted classifier f(x), get labels, and learn a surrogate classifier f'(x) to attack in place of f(x).
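A sketch of the surrogate-learning step (scikit-learn-style; target is the deployed classifier we can query, X_surr the surrogate data, and the SVC hyperparameters are arbitrary choices):

    from sklearn.svm import SVC

    def learn_surrogate(target, X_surr, gamma=0.1, C=1.0):
        """Label surrogate data by querying the target, then fit a surrogate
        classifier; evasion points are then optimized against the surrogate."""
        y_surr = target.predict(X_surr)          # classifier's feedback as labels
        surrogate = SVC(kernel="rbf", gamma=gamma, C=C).fit(X_surr, y_surr)
        return surrogate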
  • 30. Experiments on PDF Malware Detection
–  A PDF is a hierarchy of interconnected objects (keyword/value pairs), e.g.:
   13 0 obj << /Kids [ 1 0 R 11 0 R ] /Type /Page ... >> endobj
   17 0 obj << /Type /Encoding /Differences [ 0 /C0032 ] >> endobj
–  Features: keyword counts (e.g., /Type: 2, /Page: 1, /Encoding: 1, ...)
–  Adversary's capability: adding up to d_max objects to the PDF; removing objects may compromise the PDF file (and the embedded malware code)!
–  The attack strategy thus becomes: min_{x'} g(x')  s.t.  d(x, x') ≤ d_max  and  x ≤ x'  (feature values can only be increased).
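In the attack loop sketched after slide 27, the Euclidean projection would be replaced by a feasibility projection such as the following (a sketch, assuming integer keyword counts and an L1 budget of at most d_max added objects):

    import numpy as np

    def project_pdf(x_adv, x, d_max):
        """Keep x' feasible: counts can only grow (x' >= x), must be integers,
        and at most d_max objects may be added in total (L1 budget)."""
        x_adv = np.maximum(np.round(x_adv), x)   # only insertions, integer counts
        added = x_adv - x
        if added.sum() > d_max:                  # trim additions to the budget
            # greedily keep the largest additions until the budget is met
            order = np.argsort(added)[::-1]
            budget, trimmed = d_max, np.zeros_like(added)
            for i in order:
                take = min(added[i], budget)
                trimmed[i] = take
                budget -= take
            x_adv = x + trimmed
        return x_adv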
  • 31. Experiments on PDF Malware Detection
–  Dataset: 500 malware samples (Contagio), 500 benign (Internet); the targeted (surrogate) classifier is trained on 500 (100) samples
–  Evasion rate (FN rate at FP = 1%) vs. maximum number of keywords added to each PDF, averaged over 5 repetitions, under perfect knowledge (PK) and limited knowledge (LK)
–  [Figure: FN rate vs. d_max (number of keywords added to each PDF) for a linear SVM and a nonlinear SVM (RBF kernel), under PK and LK, with C = 1.]
  • 32. Security Measures against Evasion Attacks
–  Multiple Classifier Systems (MCSs)
   •  Feature equalization [Kolcz and Teo, CEAS 2009; Biggio et al., IJMLC 2010]
   •  1.5-class classification [Biggio et al., MCS 2015]
–  Adversarial feature selection [Zhang et al., IEEE TCYB 2016]
–  Learning with invariances: "nightmare at test time" (InvarSVM), robust optimization / zero-sum games [Globerson and Teo, ICML 2006]
–  Game theory (NashSVM): classifier vs. adversary, non-zero-sum games [Brueckner et al., JMLR 2012]
  • 33. MCSs for Feature Equalization
–  Rationale: more uniform feature weight distributions force the attacker to modify more features to evade detection.
–  MCSs can be exploited to obtain more uniform feature weights: train K linear classifiers f_k(x) = Σ_i w_i^k x_i + w_0^k on perturbed versions of the data (e.g., via bagging or the random subspace method, RSM), and average them: f(x) = (1/K) Σ_{k=1..K} f_k(x).
1.  A. Kolcz and C. H. Teo. Feature weighting for improved classifier robustness. CEAS 2009.
2.  B. Biggio, G. Fumera, and F. Roli. Multiple classifier systems for robust classifier design in adversarial environments. Int'l J. Mach. Learn. Cyb., 1(1):27–41, 2010.
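A sketch of the averaging construction (scikit-learn-style; bootstrap replicates stand in for bagging, and since the base learners are linear, the averaged weights can be inspected directly):

    import numpy as np
    from sklearn.svm import LinearSVC

    def equalized_linear_mcs(X, y, K=10, C=1.0, seed=0):
        """Average K linear SVMs trained on bootstrap replicates (bagging);
        averaging tends to flatten the feature weight distribution."""
        rng = np.random.RandomState(seed)
        n = X.shape[0]
        ws, bs = [], []
        for _ in range(K):
            idx = rng.randint(0, n, size=n)      # bootstrap sample
            clf = LinearSVC(C=C).fit(X[idx], y[idx])
            ws.append(clf.coef_.ravel())
            bs.append(clf.intercept_[0])
        w_avg, b_avg = np.mean(ws, axis=0), np.mean(bs)
        return w_avg, b_avg                      # classify with sign(w_avg.x + b_avg)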
  • 34. Wrapper-based Adversarial Feature Selection (WAFS) [F. Zhang et al., IEEE TCYB 2016]
–  Rationale: select features based on both accuracy and security
–  Wrapper-based backward/forward feature selection; main limitation: computational complexity
–  [Figure: experimental results on spam filtering (linear SVM); TP rate at FP = 1% vs. maximum number of modified words, for feature set sizes 100, 200 and 300, comparing traditional feature selection with WAFS under PK and LK attacks.]
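A coarse sketch of the wrapper idea, with hypothetical helpers: accuracy(subset) scores a candidate subset by cross-validation, security(subset) estimates, e.g., the average evasion effort against a classifier trained on that subset, and lam trades the two off. The exact criterion in the paper may differ.

    def wafs_forward(features, accuracy, security, k, lam=0.5):
        """Greedy forward selection maximizing a weighted combination of
        estimated accuracy and estimated security (both are wrapper calls,
        i.e., each candidate subset requires training and attacking a model)."""
        selected = []
        while len(selected) < k:
            best, best_score = None, -float("inf")
            for f in features:
                if f in selected:
                    continue
                cand = selected + [f]
                score = (1 - lam) * accuracy(cand) + lam * security(cand)
                if score > best_score:
                    best, best_score = f, score
            selected.append(best)
        return selected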
  • 35. 1.5-class Classification: Underlying Rationale
–  2-class classification is usually more accurate in the absence of attack...
–  ... but potentially more vulnerable under attack, since it does not enclose the legitimate data
–  1.5-class classification (an MCS combining two-class and one-class classifiers) aims at retaining high accuracy and security under attack
–  [Figure: decision regions of a 2-class classifier, a 1-class classifier trained on legitimate data, and the 1.5-class MCS on a toy 2D dataset.]
  • 36. Secure 1.5-class Classification with MCSs
–  Heuristic approach to 1.5-class classification: after feature extraction, combine a two-class classifier, a one-class classifier trained on legitimate data, and a one-class classifier trained on malicious data; their outputs g1(x), g2(x), g3(x) are fused into a single score g(x), and the sample is labeled malicious if g(x) ≥ t, legitimate otherwise.
–  [Figure: spam filtering results, AUC at 1% FP vs. maximum number of modified words, under PK and LK, comparing 2C SVM, 1C SVM (L), 1C SVM (M), and the 1.5C MCS.]
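A sketch of the combiner (scikit-learn-style; the equal-weight average of the three scores is an illustrative choice, not necessarily the combination rule from the paper):

    import numpy as np
    from sklearn.svm import SVC, OneClassSVM

    def fit_15c(X, y):
        """1.5-class MCS: a 2C classifier plus one-class models of each class;
        labels y are in {-1 (legitimate), +1 (malicious)}."""
        two_c = SVC(kernel="rbf").fit(X, y)
        one_c_mal = OneClassSVM(nu=0.1).fit(X[y == 1])    # malicious model
        one_c_leg = OneClassSVM(nu=0.1).fit(X[y == -1])   # legitimate model
        return two_c, one_c_mal, one_c_leg

    def score_15c(x, two_c, one_c_mal, one_c_leg):
        """Higher score = more malicious. Combines g1 (2C margin), g2 (support
        of the malicious class), g3 (negated support of the legitimate class)."""
        g1 = two_c.decision_function([x])[0]
        g2 = one_c_mal.decision_function([x])[0]
        g3 = -one_c_leg.decision_function([x])[0]
        return (g1 + g2 + g3) / 3.0                       # compare against threshold t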
  • 37. Poisoning Machine Learning
1.  B. Biggio, B. Nelson, P. Laskov. Poisoning attacks against SVMs. ICML, 2012
2.  B. Biggio et al., Security evaluation of SVMs. SVM applications. Springer, 2014
3.  H. Xiao et al., Is feature selection secure against training data poisoning? ICML, 2015
  • 38. Poisoning Attacks against SVMs [B. Biggio, B. Nelson, P. Laskov, ICML 2012]
–  Scenario: a classifier is trained on HTTP requests collected at a web server, and correctly separates malicious requests, e.g.
   http://www.vulnerablehotel.com/components/com_hbssearch/longDesc.php?h_id=1&id=-2%20union%20select%20concat%28username,0x3a,password%29%20from%20jos_users--
   from legitimate ones, e.g.
   http://www.vulnerablehotel.com/login/
  • 39. Poisoning Attacks against SVMs [B. Biggio, B. Nelson, P. Laskov, ICML 2012]
–  Poisoning the classifier causes a denial of service: after attack samples such as
   http://www.vulnerablehotel.com/login/components/
   http://www.vulnerablehotel.com/login/longDesc.php?h_id=1&
   are injected into the training data, the malicious request above is misclassified as legitimate and the legitimate one as malicious.
  • 40. Adversary Model and Attack Strategy
–  Adversary model
   •  Goal: to maximize the classification error (availability violation, indiscriminate attack)
   •  Knowledge: perfect (the trained SVM and the training set are known)
   •  Capability: injecting samples into the training set
–  Attack strategy: find the optimal attack point xc whose injection into the training set maximizes the classification error (in the example, injecting xc raises the error from 0.022 to 0.039).
  • 41. Adversary Model and Attack Strategy
–  Same adversary model as above; the classification error can be visualized as a function of the location of the attack point xc, and the attack strategy amounts to finding the maximum of this error surface (baseline error without attack: 0.022).
  • 42. Poisoning Attack Algorithm
–  Maximize the classification error L(xc) w.r.t. the attack point xc through gradient ascent
–  The gradient is not easy to compute: the injected training point itself affects the learned classification function (details of the derivation are in the paper)
–  [Figure: gradient-ascent trajectories of xc from its initial location xc(0) on the classification error surface.]
1.  B. Biggio, B. Nelson, P. Laskov. Poisoning attacks against SVMs. ICML, 2012
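The paper derives the gradient analytically; as an illustration only, here is a model-agnostic sketch that ascends a finite-difference estimate of the validation error. It retrains the SVM at every gradient evaluation, so it is slow, but it conveys the structure of the attack. All parameter values are arbitrary.

    import numpy as np
    from sklearn.svm import SVC

    def validation_error(X_tr, y_tr, x_c, y_c, X_val, y_val):
        """Train on the poisoned set and measure error on validation data."""
        clf = SVC(kernel="rbf").fit(np.vstack([X_tr, x_c]), np.append(y_tr, y_c))
        return np.mean(clf.predict(X_val) != y_val)

    def poison_point(X_tr, y_tr, X_val, y_val, x0, y_c, step=0.5, iters=50, eps=1e-2):
        """Gradient ascent on L(x_c) with a central-difference gradient."""
        x_c = x0.copy()
        for _ in range(iters):
            grad = np.zeros_like(x_c)
            for j in range(x_c.size):            # coordinate-wise finite differences
                e = np.zeros_like(x_c)
                e[j] = eps
                grad[j] = (validation_error(X_tr, y_tr, x_c + e, y_c, X_val, y_val)
                           - validation_error(X_tr, y_tr, x_c - e, y_c, X_val, y_val)) / (2 * eps)
            x_c = x_c + step * grad
        return x_c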
  • 43. Experiments on the MNIST Digits: Single-point Attack
–  Linear SVM; 784 features; TR: 100, VAL: 500, TS: about 2000
–  '0' is the malicious (attacking) class; '4' is the legitimate (attacked) one
–  [Figure: the attack digit evolving from its initial location xc(0) to the final xc.]
  • 44. Experiments on the MNIST Digits: Multiple-point Attack
–  Linear SVM; 784 features; TR: 100, VAL: 500, TS: about 2000
–  '0' is the malicious (attacking) class; '4' is the legitimate (attacked) one
  • 45. Poisoning Linear Models for Feature Selection [H. Xiao et al., ICML 2015]
–  Linear models select features according to |w|:
   •  LASSO [Tibshirani, 1996]
   •  Ridge regression [Hoerl & Kennard, 1970]
   •  Elastic net [Zou & Hastie, 2005]
–  The attacker maximizes the classification error w.r.t. the attack point xc (gradient ascent)
  • 46. Experiments on PDF Malware Detection
–  PDF: hierarchy of interconnected objects (keyword/value pairs); features: keyword counts (e.g., /Type: 2, /Page: 1, /Encoding: 1, ...)
–  Learner's task: classify benign vs. malware PDF files [Maiorca et al., 2012; 2013; Smutz & Stavrou, 2012; Srndic & Laskov, 2013]
–  Attacker's task: maximize the classification error by injecting poisoning attack samples; only feature increments are considered (object insertion), since object removal may compromise the PDF file
  • 47. Experimental Results
–  Perfect knowledge; data: 300 (TR) and 5,000 (TS) samples, 114 features
–  Similar results were obtained for limited-knowledge attacks!
  • 48. Security Measures against Poisoning
–  Rationale: poisoning injects outlying training samples
–  Two main strategies for countering this threat (a sketch of the first follows):
   1.  Data sanitization: remove poisoning samples from the training data, e.g., bagging for fighting poisoning attacks, or the Reject-On-Negative-Impact (RONI) defense
   2.  Robust learning: learning algorithms that remain robust in the presence of poisoning samples
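A hedged reconstruction of the RONI idea, not the original implementation: a new training point is accepted only if adding it to a trusted base set does not reduce accuracy on clean calibration data.

    import numpy as np
    from sklearn.svm import LinearSVC

    def roni_filter(X_base, y_base, X_new, y_new, X_cal, y_cal, tol=0.0):
        """Reject-On-Negative-Impact: accept a candidate training point only if
        adding it to the trusted base set does not hurt calibration accuracy."""
        acc_base = LinearSVC().fit(X_base, y_base).score(X_cal, y_cal)
        keep = []
        for i in range(len(y_new)):
            X_aug = np.vstack([X_base, X_new[i]])
            y_aug = np.append(y_base, y_new[i])
            acc_aug = LinearSVC().fit(X_aug, y_aug).score(X_cal, y_cal)
            if acc_aug >= acc_base - tol:        # no negative impact: keep it
                keep.append(i)
        return X_new[keep], y_new[keep]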
  • 49. Security Measures against Poisoning: Data Sanitization with Multiple Classifier Systems
–  (Weighted) bagging for fighting poisoning attacks; underlying idea: resample outlying samples with lower probability
–  Two-step algorithm:
   1.  Density estimation to assign lower resampling weights to outliers
   2.  Bagging to train an MCS
–  Promising results on spam filtering and web-based intrusion detection
–  [Figure: spam filtering results for S (standard classifier), B (standard bagging) and WB (weighted bagging); numbers in the legend correspond to different ensemble sizes.]
1.  B. Biggio, I. Corona, G. Fumera, G. Giacinto, and F. Roli. Bagging classifiers for fighting poisoning attacks in adversarial classification tasks. MCS, 2011.
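A sketch of the two steps (kernel density estimation is one possible choice for step 1, not necessarily the estimator used in the paper; resampling probabilities proportional to the estimated density down-weight outliers):

    import numpy as np
    from sklearn.neighbors import KernelDensity
    from sklearn.svm import LinearSVC

    def weighted_bagging(X, y, K=10, bandwidth=1.0, seed=0):
        """Step 1: estimate the density of each training point; step 2: bagging
        with resampling probabilities proportional to density, so outlying
        (likely poisoned) samples are rarely drawn."""
        rng = np.random.RandomState(seed)
        log_dens = KernelDensity(bandwidth=bandwidth).fit(X).score_samples(X)
        p = np.exp(log_dens)
        p /= p.sum()
        n = len(y)
        ensemble = []
        for _ in range(K):
            idx = rng.choice(n, size=n, replace=True, p=p)
            ensemble.append(LinearSVC().fit(X[idx], y[idx]))
        return ensemble   # predict by majority vote over the ensemble members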
  • 50. Attacking Clustering
1.  B. Biggio, I. Pillai, S. R. Bulò, D. Ariu, M. Pelillo, and F. Roli. Is data clustering in adversarial settings secure? AISec, 2013
2.  B. Biggio, S. R. Bulò, I. Pillai, M. Mura, E. Z. Mequanint, M. Pelillo, and F. Roli. Poisoning complete-linkage hierarchical clustering. S+SSPR, 2014
  • 51. Attacking Clustering
–  So far, we have considered supervised learning: training data consist of samples and class labels
–  In many applications labels are not available or costly to obtain: unsupervised learning works on samples only
–  Example: malware clustering, to identify variants of existing malware or new malware families
   •  Pipeline: data collection (honeypots) → feature extraction (e.g., URL length, number of parameters, etc.) → clustering of malware families (e.g., grouping similar HTTP requests) → data analysis / countermeasure design (e.g., signature generation for each cluster)
  • 52. Is Data Clustering Secure?
–  Attackers can poison the input data to subvert malware clustering: well-crafted HTTP requests (e.g., variants of http://www.vulnerablehotel.com/...) are injected into the collected data
–  The clustering of malware families is significantly compromised, and the subsequent analysis and countermeasures become useless (too many false alarms, low detection rate)
  • 53. Our Work
–  A framework to identify/design attacks against clustering algorithms
   •  Poisoning: add samples to maximally compromise the clustering output
   •  Obfuscation: hide samples within existing clusters
–  Some clustering algorithms can be very sensitive to poisoning: single- and complete-linkage hierarchical clustering can be easily compromised by creating heterogeneous clusters
–  [Figure: clustering on untainted data (80 samples) vs. clustering after adding 10 attack samples.]
–  Details on the attack derivation and implementation are in the papers; a toy bridging sketch follows.
1.  B. Biggio et al. Is data clustering in adversarial settings secure? AISec, 2013
2.  B. Biggio et al. Poisoning complete-linkage hierarchical clustering. S+SSPR, 2014
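To convey the intuition for single-linkage, here is a toy sketch (not the papers' optimization): attack points placed along the segment between two clusters make their single-linkage distance arbitrarily small, so the clusters merge.

    import numpy as np

    def bridge_points(a, b, k):
        """Place k attack samples evenly along the segment between a and b
        (e.g., the closest points of two clusters). With spacing below the
        single-linkage merge threshold, the two clusters become one."""
        ts = np.linspace(0.0, 1.0, k + 2)[1:-1]   # interior points only
        return np.array([a + t * (b - a) for t in ts])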
  • 54. Conclusions and Future Work
–  Learning-based systems are vulnerable to well-crafted, sophisticated attacks devised by skilled attackers, which exploit specific vulnerabilities of machine learning algorithms!
–  [Diagram: the arms race between attacks against learning and the design of secure learning algorithms.]
  • 55. Joint work with ... and many others.
Any questions? Thanks for your attention!