Causative Adversarial Learning
Huang Xiao, 24.06.2015
xiaohu(at)in.tum.de
Talk presented at Deep Learning in Action
@Munich
Motivation
Deep networks can be easily fooled [1]
Images generated by an evolutionary algorithm are classified with 99.99% confidence.
“It turns out some DNNs only focus on discriminative features in images.”
[1] Nguyen A, Yosinski J, Clune J. Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images. In Computer Vision and Pattern Recognition (CVPR '15), IEEE, 2015.
Motivation
Spam alerts
Google brain, 16000 CPUs
Learning is expensive!
Adversarial Learning
Reverse engineering of machine learning. It
aims to design robust and secure learning
algorithms.
Big Picture
Are the modern learning systems really secure?
[Diagram: training dataset → training → model → test on the test (validation) dataset, with an update loop]
● Increase test error
● Reduce learning accuracy
● Fool the intelligent system
● Achieve personal gain
Big Picture
Are the modern learning systems really secure?
[Same diagram, annotated with the two attack types: Causative Attack and Exploratory Attack]
Attacker's capability
                   | Access to data | Knowledge about features | Knowledge about the classifier
Limited knowledge  | Partially      | Maybe                    | Yes
Perfect knowledge  | Yes            | Yes                      | Yes
These are real inputs from users.
Basics
❏ Observations
❏ True signal:
❏ Polynomial curve fitting
❏ The true signal is unknown
❏ => learn the green curve
[Figure: observations and the original signal]
Least squares
Training: minimize the empirical squared error between the estimated output and the observed output.
Least squares
Training: minimizing the empirical squared error alone can lead to overfitting.
Overfitting
❏ Bad on unseen test set
❏ Central problem of ML.
❏ Generalization
❏ E.g., regularization, prior,
more data, model
selection
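The least-squares and regularization ideas above can be sketched in a few lines. This is a dependency-free illustration, not the speaker's code: the signal sin(2πx), the noise level, the polynomial degrees and the penalty `lam` are all assumptions chosen for the example, since the formulas on the slides did not survive extraction.

```python
import math
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_polynomial(xs, ts, degree, lam=0.0):
    """Minimise sum_n (y(x_n, w) - t_n)^2 + lam * ||w||^2 via the normal equations."""
    phi = [[x ** j for j in range(degree + 1)] for x in xs]
    k = degree + 1
    gram = [[sum(row[i] * row[j] for row in phi) + (lam if i == j else 0.0)
             for j in range(k)] for i in range(k)]
    rhs = [sum(row[i] * t for row, t in zip(phi, ts)) for i in range(k)]
    return solve(gram, rhs)


def mse(w, xs, ts):
    """Mean squared error of the polynomial with coefficients w."""
    preds = [sum(wj * x ** j for j, wj in enumerate(w)) for x in xs]
    return sum((p - t) ** 2 for p, t in zip(preds, ts)) / len(xs)


def signal(x):
    return math.sin(2 * math.pi * x)  # assumed true signal, for illustration only


rng = random.Random(0)
train_x = [i / 9 for i in range(10)]
train_t = [signal(x) + rng.gauss(0.0, 0.3) for x in train_x]  # noisy observations

w_low = fit_polynomial(train_x, train_t, degree=1)
w_high = fit_polynomial(train_x, train_t, degree=6)
w_ridge = fit_polynomial(train_x, train_t, degree=6, lam=1e-3)

train_err_low = mse(w_low, train_x, train_t)
train_err_high = mse(w_high, train_x, train_t)
norm_high = sum(w * w for w in w_high)
norm_ridge = sum(w * w for w in w_ridge)
print(train_err_low, train_err_high, norm_high, norm_ridge)
```

The higher-degree fit can only match or lower the training error, which is why training error alone says little about generalization; the L2 penalty shrinks the weights and makes the fit less sensitive to the noise in individual observations.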
Bias-Variance
❏ Trade off
❏ Overfitting == low bias, high variance
❏ Underfitting == high bias, low variance
❏ Noise is dominating! w is very sensitive.
Bias-Variance Decomposition
Objective: increase bias or variance?
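For reference, the standard bias-variance decomposition under squared loss can be written as below. The notation is an assumption, since the slide's own formula was lost: h(x) is the noise-free signal, t = h(x) + ε is the observed output with noise variance σ², and ŷ(x; D) is the model trained on data set D.

```latex
\mathbb{E}_{D,\varepsilon}\!\left[\big(t - \hat{y}(x; D)\big)^2\right]
  = \underbrace{\big(\mathbb{E}_{D}[\hat{y}(x; D)] - h(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}_{D}\!\left[\big(\hat{y}(x; D) - \mathbb{E}_{D}[\hat{y}(x; D)]\big)^2\right]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{noise}}
```

Read this way, an adversary who contaminates D can raise the expected error through either of the first two terms, which is the question the slide poses.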
Types of Adversaries
● Causative Attack (Poisoning)
○ Understanding how the learning algorithms work
○ Engineering on features or labels of training set
○ Change the discriminant function
● Exploratory Attack (Evasion)
○ Engineering features of a test point
○ Circumvent the legitimate detection
○ Change the discriminant result
Label Noise on SVM
● SVM: one of the state-of-the-art classifiers
● Binary case: +1, -1
● Label-flip attack under a given budget
● Maximizing the error on the validation set
● Methods:
○ ALFA
○ Distance-based: far-first, near-first, random
○ Continuous relaxation with gradient ascent
○ Correlated clusters
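A minimal sketch of the distance-based flipping heuristics listed above, under loud assumptions: a one-dimensional nearest-centroid classifier stands in for the SVM, the toy data are invented, and "distance" means distance to the clean decision boundary. It is not ALFA and not the implementation from the cited paper.

```python
def train_centroids(xs, ys):
    """Stand-in learner (not an SVM): store the mean of each class."""
    pos = [x for x, y in zip(xs, ys) if y == +1]
    neg = [x for x, y in zip(xs, ys) if y == -1]
    return sum(pos) / len(pos), sum(neg) / len(neg)


def predict(model, x):
    """Assign x to the class with the closer centroid."""
    c_pos, c_neg = model
    return +1 if abs(x - c_pos) < abs(x - c_neg) else -1


def error_rate(model, xs, ys):
    """Fraction of misclassified points."""
    wrong = sum(predict(model, x) != y for x, y in zip(xs, ys))
    return wrong / len(xs)


def flip_labels(xs, ys, budget, strategy):
    """Flip `budget` labels, chosen by distance to the clean decision boundary."""
    c_pos, c_neg = train_centroids(xs, ys)
    boundary = (c_pos + c_neg) / 2
    order = sorted(range(len(xs)),
                   key=lambda i: abs(xs[i] - boundary),
                   reverse=(strategy == "far-first"))
    flipped = list(ys)
    for i in order[:budget]:
        flipped[i] = -flipped[i]
    return flipped


# Hypothetical toy data: negatives on the left, positives on the right.
train_x = [-4.0, -2.0, -1.0, 1.0, 2.0, 3.0]
train_y = [-1, -1, -1, +1, +1, +1]
val_x = [-2.5, -1.5, -0.5, 0.5, 1.5, 2.5]
val_y = [-1, -1, -1, +1, +1, +1]

clean_error = error_rate(train_centroids(train_x, train_y), val_x, val_y)
far_y = flip_labels(train_x, train_y, budget=2, strategy="far-first")
near_y = flip_labels(train_x, train_y, budget=2, strategy="near-first")
far_error = error_rate(train_centroids(train_x, far_y), val_x, val_y)
near_error = error_rate(train_centroids(train_x, near_y), val_x, val_y)
print(clean_error, far_error, near_error)
```

On this toy set, flipping the two farthest points swaps the order of the class centroids, while flipping the two nearest points barely moves the boundary. How the strategies compare on a real SVM depends on the data and the kernel.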
Basics
We measure the error on a validation set using the function trained on the training set.
● A training data set
● A validation data set
● A classifier trained on the training data set
● A regularization coefficient
● A risk measurement on the validation set
Flip Labels
Huang Xiao, B. Biggio, B. Nelson, Han Xiao, C. Eckert, and F. Roli, “Support Vector Machines under Adversarial Label
Contamination”, Neurocomputing, vol. Special Issue on Advances in Learning with Label Noise, In Press.
Poisoning Attack on SVM
● Noise on features, not on labels
● Design a malicious training point
● Maximize the error (e.g., test error, hinge loss, ...)
● Gradient ascent
How to?
Retrain the SVM after injecting a malicious point, then move the point such that the classification error on the validation set is maximized.
● Validation data set with m samples
● SVM trained on the training set with a malicious point
Poisoning Attack on SVM
B. Biggio, B. Nelson, and P. Laskov, “Poisoning attacks against support vector machines”, in 29th Int'l Conf. on Machine
Learning (ICML), 2012.
Walk-through example
You can:
● Make a ‘9’ mimic an ‘8’, or
● Label a ‘9’ as an ‘8’
B. Biggio, B. Nelson, and P. Laskov, “Poisoning attacks against support vector machines”, in 29th Int'l Conf. on Machine Learning (ICML), 2012.
Poisoning Lasso
● Lasso: feature selection; more generally, L1 regularization
● Feature selection is often the first step in many learning systems
● Other targets: ridge regression, elastic net
● Gradient-based method
Lasso
Capture the most relevant features in data set
automatically by shrinking the feature weights.
From: Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. J. Royal Statist. Soc. B, Vol. 58, No. 1, pages 267-288.
Feature selection
Feature:  x1   x2   x3   x4   x5   x6   x7   x8   x9   x10
Weight:   5.1  4.6  4.5  4.0  4.0  1.8  0    0    0    0
Features with non-zero weights are selected for next-stage training!
Feature selection (after adding a malicious point to the training set)
Feature:  x1   x2   x3   x4   x5   x6   x7   x8   x9   x10
Weight:   5.1  3.6  4.2  3.1  4.2  1.8  0    0    0    0
Features with non-zero weights are selected for next-stage training!
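To make the feature-selection slides concrete, here is a small sketch of Lasso solved by coordinate descent with soft-thresholding, followed by one hand-picked (not optimized) malicious training point. The data, the penalty `lam` and the objective scaling 0.5 * ||y - Xw||^2 + lam * ||w||_1 are assumptions for illustration; they are not the numbers shown on the slides.

```python
def soft_threshold(rho, lam):
    """Shrink rho towards zero by lam; values inside [-lam, lam] become exactly 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso(X, y, lam, sweeps=50):
    """Coordinate descent for 0.5 * ||y - Xw||^2 + lam * ||w||_1."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(sweeps):
        for j in range(d):
            rho = 0.0
            for i in range(n):
                # Prediction using every feature except j.
                partial = sum(w[k] * X[i][k] for k in range(d) if k != j)
                rho += X[i][j] * (y[i] - partial)
            z = sum(X[i][j] ** 2 for i in range(n))
            w[j] = soft_threshold(rho, lam) / z
    return w


def selected(w):
    """Indices of features with non-zero weight."""
    return [j for j, wj in enumerate(w) if wj != 0.0]


# Hypothetical clean data: y = 3*x1 + 2*x2 + 0.1*x3, so x3 is nearly irrelevant.
X_clean = [[1, 1, 1], [1, -1, 1], [1, 1, -1], [1, -1, -1]]
y_clean = [5.1, 1.1, 4.9, 0.9]

# One hand-picked malicious point that makes x3 look predictive.
X_poisoned = X_clean + [[0, 0, 10]]
y_poisoned = y_clean + [10.0]

w_clean = lasso(X_clean, y_clean, lam=1.0)
w_poisoned = lasso(X_poisoned, y_poisoned, lam=1.0)
selected_clean = selected(w_clean)
selected_poisoned = selected(w_poisoned)
print(selected_clean, selected_poisoned)
```

On the clean data the weight of the third feature is shrunk to exactly zero, so only the first two features are selected. After the single injected point the third feature receives a non-zero weight and would be passed on to the next training stage.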
Intuition
[Figure: two data matrices of size #samples × #features, one with #samples ≪ #features and one with #samples ≫ #features]
Intuition
[Same figure, with a "Danger!" warning]
Add some random noise
Research goals
● Investigate the robustness of feature selection algorithms
● Design a multiple-point attack method
● Warning: feature selection might not be reliable
● Provide a gradient-based poisoning framework
Objective function
We inject a malicious point into the training data to form a new, compromised data set.
Variable: the malicious point; we maximise with respect to it.
Note that the model is learnt on the contaminated data.
Maximise the generalization error!
Gradient Ascent
Update rule: gradient ascent (max) rather than gradient descent (min), with the attack point kept within a bounding box.
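The loop of injecting a point, retraining, and moving the point uphill on the validation error can be sketched as follows. Everything specific here is an assumption: a one-dimensional ridge regressor without intercept stands in for the Lasso and SVM learners, the gradient is a central finite difference rather than the analytical gradient derived in the cited papers, and the data, the fixed response `y_c`, the step size and the box are invented.

```python
def train_ridge(xs, ys, lam):
    """Closed-form 1-D ridge regression without intercept."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def validation_error(x_c, y_c, train, val, lam):
    """Retrain on the training set plus the attack point, then score on validation data."""
    xs, ys = train
    w = train_ridge(xs + [x_c], ys + [y_c], lam)
    vx, vy = val
    return sum((w * x - y) ** 2 for x, y in zip(vx, vy)) / len(vx)


def poison(x_c, y_c, train, val, lam, box, step=0.5, iters=50, eps=1e-6):
    """Projected gradient ascent on the attack point's feature value."""
    lo, hi = box
    for _ in range(iters):
        up = validation_error(x_c + eps, y_c, train, val, lam)
        down = validation_error(x_c - eps, y_c, train, val, lam)
        grad = (up - down) / (2 * eps)             # finite-difference gradient
        x_c = min(hi, max(lo, x_c + step * grad))  # ascent step, clipped to the box
    return x_c


# Hypothetical data lying on y = x.
train = ([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
val = ([1.5, 2.5], [1.5, 2.5])
lam = 1.0
y_c = -3.0          # attacker-chosen response, kept fixed in this sketch
box = (0.0, 4.0)    # feasible range for the attack point
x_start = 0.5

initial_err = validation_error(x_start, y_c, train, val, lam)
x_final = poison(x_start, y_c, train, val, lam, box)
final_err = validation_error(x_final, y_c, train, val, lam)
print(x_final, initial_err, final_err)
```

The only difference from ordinary gradient descent training is the sign of the step and the clipping to the box, which keeps the attack point inside a feasible range. Here the point ends on the edge of the box, where the validation error is higher than at the starting position.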
Demonstration
[Figure: error surface evaluated on each (x, y), with the initial attack point]
[Figure: gradient ascent path]
Xiao, Huang, Battista Biggio, Gavin Brown, Giorgio Fumera, Claudia Eckert, and Fabio Roli. Is Feature Selection Secure against Training Data Poisoning? In ICML'15, Lille, France, July 2015.
Wrap up
● Don’t expect your algorithms to be too fancy
● Don’t expect adversaries to be too silly
● Set up an objective and do the worst-case study
● Machine learning needs to be more robust
● There’s no innocent data
Thank you. Questions?
