PAGE©2018 ZIGHRA | WWW.ZIGHRA.COM 1
Anomaly Detection
through Reinforcement Learning
Dr. Hari Koduvely
Chief Data Scientist
ZIGHRA.COM
Outline of Talk:
● Zighra and SensifyID Platform
● Sequential Anomaly Detection Problem
● Introduction to Reinforcement Learning
● Markov Decision Process and Q-Learning
● Function Approximation using Neural Networks
● Application to Network Intrusion Detection Problem
● Implementation using TensorFlow
ZIGHRA.COM
● Zighra (https://zighra.com) provides solutions for Continuous Behavioural Authentication & Threat Detection
● Highlights of our SensifyID Platform:
○ The core is an AI-based 6-layer Anomaly Detection System combining behavioral biometrics with contextual, social and other signals
○ Covers use cases such as User Verification, Account Takeover, Remote Attacks and Bot Attacks
○ Can be integrated into any Web, Mobile & IoT application
○ 2 patents granted and 10+ in application stage
Sequential Anomaly Detection Problem
● The classical Anomaly Detection problem is to find patterns in a dataset that do not conform to expected normal behavior
● It is formulated as a one-class classification task in machine learning
● In many domains the data distribution changes continuously (concept shift)
● An online learning setting is better suited to dealing with concept shifts
[Figure: plot with axes current_week_purchase and average_weekly_purchase. Source of image: https://www.linkedin.com/pulse/part-2-keep-simple-machine-learning-algorithms-big-dr-dinesh/]
Sequential Anomaly Detection Problem
● In the Sequential Anomaly Detection problem the goal is to find out whether a subsequence of a sequence of events is anomalous
● Each event in isolation would appear to be normal; only the sequence of events indicates an anomaly
○ Username-Password, Username-Password, Username-Password, ...
○ Log in to the corporate network at midnight, access a rarely used DB, download a lot of data, transfer to USB, ...
● Straightforward supervised learning is not feasible here because of the credit assignment problem
Introduction to Reinforcement Learning
● In Reinforcement Learning, an autonomous agent interacts with an environment and takes an action $a_t$ in each state $s_t$
● In return, the environment supplies a reward $r_t$ for the action the agent performed, as a supervision signal, together with a new state $s_{t+1}$
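[Figure: agent-environment loop. The agent sends action $a_t$ to the environment; the environment returns reward $r_t$ and next state $s_{t+1}$.]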
Introduction to Reinforcement Learning
● Reinforcement Learning can be formally defined as a Markov Decision Process
● A Markov Decision Process (MDP) is defined by the 5-tuple $\{s_t, a_t, P(s_{t+1} \mid s_t, a_t), \gamma, r_t\}$
○ $s_t$ - State at time $t$
○ $a_t$ - Action taken in state $s_t$
○ $P(s_{t+1} \mid s_t, a_t)$ - State transition probabilities
○ $\gamma$ - Discount factor
○ $r_t$ - Reward function
● The objective of an MDP is to come up with an optimum policy that achieves the maximum cumulative reward over a long period of time
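One common way to make "cumulative reward over a long period" concrete is the discounted return, the sum of $\gamma^t r_t$ over an episode. The following is a minimal sketch of that sum; the function name and the reward values are made up for illustration and do not come from the talk.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t over one episode, where rewards[0] is r_0."""
    total = 0.0
    # Accumulate backwards so that each step needs one multiply and one add.
    for r in reversed(rewards):
        total = r + gamma * total
    return total


# Hypothetical episode: three correct decisions followed by one mistake.
episode_rewards = [1.0, 1.0, 1.0, -1.0]
print(discounted_return(episode_rewards, gamma=0.9))
```

With a discount factor below 1, rewards received later contribute less, which keeps the sum finite for long episodes.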
Q-Learning and Markov Decision Process
● Q-value function $Q(s, a)$ - an estimate of the maximum total long-term reward obtainable by starting from state $s$ and performing action $a$
● Bellman Equation:
$$Q(s, a) = r(s) + \gamma \sum_{s'} P(s' \mid s, a) \max_{a'} Q(s', a')$$
The Q-value of a state-action pair is the current reward plus the discounted expected Q-value of the best action in each successor state
● Central theoretical concept used in almost all formulations of reinforcement learning
● It can be proved that, starting from random initial conditions, repeated iteration of the Bellman equation makes $Q(s, a)$ converge to the optimum quality function $Q^*(s, a)$
● The optimum policy is given by
$$\pi^*(s) = \arg\max_{a} Q^*(s, a)$$
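As an illustration of iterating the Bellman equation, here is a small sketch on a toy MDP using the backup $Q(s, a) = r(s) + \gamma \sum_{s'} P(s' \mid s, a) \max_{a'} Q(s', a')$. The two states, two actions, rewards and transition probabilities are all hypothetical values chosen for the example; the iteration starts from zero rather than random values for reproducibility.

```python
# Hypothetical two-state MDP; every number below is made up for illustration.
REWARD = {"safe": 1.0, "compromised": -1.0}  # r(s)
P = {  # P[s][a][s'] = P(s' | s, a)
    "safe": {
        "monitor": {"safe": 0.9, "compromised": 0.1},
        "ignore": {"safe": 0.5, "compromised": 0.5},
    },
    "compromised": {
        "monitor": {"safe": 0.6, "compromised": 0.4},
        "ignore": {"safe": 0.1, "compromised": 0.9},
    },
}


def q_iteration(reward, P, gamma=0.9, sweeps=200):
    """Iterate the Bellman equation from Q = 0 for a fixed number of sweeps."""
    Q = {s: {a: 0.0 for a in P[s]} for s in P}
    for _ in range(sweeps):
        # Synchronous backup: the new table is computed entirely from the old one.
        Q = {
            s: {
                a: reward[s] + gamma * sum(
                    p * max(Q[s2].values()) for s2, p in P[s][a].items()
                )
                for a in P[s]
            }
            for s in P
        }
    return Q


def greedy_policy(Q):
    """pi*(s) = argmax_a Q(s, a)."""
    return {s: max(Q[s], key=Q[s].get) for s in Q}


Q_star = q_iteration(REWARD, P)
print(greedy_policy(Q_star))
```

Note that this version needs the full transition table `P`, which is rarely available in practice.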
Q-Learning and Markov Decision Process
● It is difficult to know the state transition probabilities $P(s_{t+1} \mid s_t, a_t)$ for a given problem
● The Bellman equation can be recast in an incremental (temporal-difference) form in which the transition probabilities are not needed
● Only the state actually observed from the environment is used
● Temporal Difference Learning Algorithm:
When an agent makes a transition from state $s$ to state $s'$ by performing action $a$, the Q-value is updated as follows:
$$Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r(s) + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]$$
$\alpha \ll 1$ is the learning rate
● Each update moves the Q-values towards the local equilibrium at which the Bellman equation holds.
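The temporal-difference update can be sketched for a tabular Q-function as follows. The state and action names and the default values of `alpha` and `gamma` are assumptions made for the example, not values from the talk.

```python
from collections import defaultdict


def td_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Apply one temporal-difference update to the table Q in place.

    Q maps (state, action) pairs to values; unseen pairs default to 0.
    Returns the TD error r + gamma * max_a' Q(s', a') - Q(s, a).
    """
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    td_error = r + gamma * best_next - Q[(s, a)]
    Q[(s, a)] += alpha * td_error
    return td_error


# Hypothetical transition: alerting in state "s0" earned reward +1 and led to "s1".
ACTIONS = ["alert", "no_alert"]
Q = defaultdict(float)
td_update(Q, "s0", "alert", 1.0, "s1", ACTIONS)
print(Q[("s0", "alert")])
```

Only the observed transition `(s, a, r, s_next)` is used; no transition probabilities appear anywhere in the update.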
Function Approximation using Neural Networks
● Iterating the Bellman equation is a deterministic algorithm
● For problems where the state and action spaces are small, one can use a table to represent $Q(s, a)$
● In many practical applications, the state and action spaces are continuous
● One then needs an efficient function approximation method for representing $Q(s, a)$
● Two standard approaches for this are
○ Tile Coding: Partition the continuous space into a set of overlapping tiles.
➢ Success depends on the number and width of the tiles.
➢ It is a linear function approximation
○ Neural Networks: Nonlinear function approximation, a more powerful representation
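To illustrate tile coding, the sketch below encodes a single scalar feature with several overlapping tilings and evaluates a linear Q-value from the active tiles. The number of tilings, the number of tiles per tiling and the uniform offset scheme are assumptions chosen for the example.

```python
def active_tiles(x, low, high, n_tilings=4, tiles_per_tiling=8):
    """Return one active tile index per tiling for a scalar x in [low, high]."""
    width = (high - low) / tiles_per_tiling
    indices = []
    for k in range(n_tilings):
        # Each tiling is shifted by a different fraction of one tile width.
        offset = k * width / n_tilings
        i = int((x - low + offset) / width)
        i = min(i, tiles_per_tiling)  # one extra tile per tiling absorbs the shift
        indices.append(k * (tiles_per_tiling + 1) + i)
    return indices


def q_value(weights, tiles):
    """Linear function approximation: sum the weights of the active tiles."""
    return sum(weights[t] for t in tiles)


n_weights = 4 * (8 + 1)
weights = [0.0] * n_weights
print(active_tiles(0.5, 0.0, 1.0))
print(q_value(weights, active_tiles(0.5, 0.0, 1.0)))
```

Nearby inputs share some but not all active tiles, which is what lets a linear model over the tiles generalize locally.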
Function Approximation using Neural Networks
● One can use Neural Networks to approximate $Q(s, a)$ as follows:
○ Inputs: State $s$ represented by the $D$-dimensional vector $\{s_1, s_2, \ldots, s_D\}$
○ Outputs: Q-values for each of the $N$ actions $\{Q_1, Q_2, \ldots, Q_N\}$
[Figure: feed-forward network with inputs $s_1, s_2, s_3, \ldots, s_D$, hidden layers, and outputs $Q_1, Q_2, \ldots, Q_N$.]
Function Approximation using Neural Networks
● The loss function for training the NN is the squared difference between the Q-value predicted by the DNN and the target Q-value given by the Bellman equation
$$L = \tfrac{1}{2} \left[ \left( r + \gamma \max_{a'} Q(s', a') \right) - Q(s, a) \right]^2$$
● The NN is trained using backpropagation as follows:
1. Start an episode of explorations
2. Initialize the NN (before the first episode only) and start from a random state $s$
3. Do a forward pass of state $s$ through the DNN and get Q-values for all actions
4. Perform an ε-greedy exploration to choose an action $a$ for the current state $s$
5. Get the next state $s'$ and reward $r$ from the environment
6. Pass $s'$ through the DNN as well and compute $\max_{a'} Q(s', a')$
7. Set the target Q-value for the output node corresponding to action $a$ to $r + \gamma \max_{a'} Q(s', a')$
8. For all other nodes, keep the target Q-value the same as that obtained from the DNN prediction in step 3
9. Update the weights using backpropagation
10. Set $s \leftarrow s'$ and repeat steps 3-9 until a termination condition is reached
11. Repeat the episodes until the network is trained
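The action choice in step 4 and the target construction in steps 7 and 8 can be sketched without any neural network library. In this sketch the Q-value lists stand in for DNN outputs, and all numbers and function names are illustrative assumptions.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a random action, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


def td_targets(q_s, q_s_next, action, reward, gamma):
    """Build the target vector for one transition.

    Only the entry of the chosen action is replaced by
    reward + gamma * max_a' Q(s', a'); the other entries keep the
    predicted values, so they contribute zero error.
    """
    targets = list(q_s)
    targets[action] = reward + gamma * max(q_s_next)
    return targets


def squared_loss(q_s, targets):
    """L = 1/2 * sum of squared differences between targets and predictions."""
    return 0.5 * sum((t - q) ** 2 for q, t in zip(q_s, targets))


# Stand-ins for DNN outputs on s and s' (two actions); values are made up.
q_s, q_s_next = [0.2, -0.1], [0.5, 0.3]
targets = td_targets(q_s, q_s_next, action=1, reward=1.0, gamma=0.9)
print(targets, squared_loss(q_s, targets))
```

In a real training loop the loss would be backpropagated through the network that produced `q_s`; here it is only evaluated.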
Function Approximation using Neural Networks
● High-level TD NN learning iteration flow
[Figure: flow diagram showing the DNN model, the iteration over episodes, and the iteration over exploration steps.]
Network Intrusion Detection
● Can we use Reinforcement Learning for Network Intrusion Detection?
● Related research works:
○ James Cannady used a CMAC Neural Network and formulated Network Intrusion Detection as an online learning problem [1]
○ Xin Xu studied the problem of host-based intrusion detection as a multi-stage cyber attack and applied reinforcement learning [2]
○ Arturo Servin studied the DDoS attack as a traffic anomaly problem and used reinforcement learning for detection [3]
○ Kleanthis M used a distributed reinforcement network for network intrusion response [4]
● None of these used a DNN for function approximation
Network Intrusion Detection
● The NSL-KDD dataset [5] is a standard dataset for scientific research
● The dataset contains 4 categories of attacks in a local area network
○ DoS - Denial of Service attacks
○ R2L - Remote to Local, where a remote hacker tries to get local user privileges
○ U2R - User to Root, where the hacker operates as a normal user and exploits vulnerabilities
○ Probing - The hacker scans the machine to determine vulnerabilities
● The dataset contains 125,973 connections for training and 22,543 for testing
● The training set has 53.5% normal connections and 46.5% abnormal connections
● There are 41 features (32 continuous, 3 nominal and 6 binary)
● E.g. type of protocol (TCP, UDP), port number, packet size, rate of transmission
Network Intrusion Detection
Source of image https://nycdatascience.com/blog/student-works/network-intrusion-detection-2/
Network Intrusion Detection
● However, the NSL-KDD dataset cannot be used for sequential anomaly detection
○ There is no timestamp; the dataset is not time series data
○ There is no way to tell whether different connections come from the same user/hacker
○ One could still use the dataset for the standard anomaly detection problem using reinforcement learning
Network Intrusion Detection
● Reinforcement Learning Formulation with NSL-KDD Dataset
○ The states are characterized by the 41 features in the data set
○ In every state the agent takes one of two actions:
■ Send an alert
■ Not send an alert
○ The rewards generated by the environment:
■ +1 if the state is normal and action is not send alert
■ +1 if the state is malicious and action is send alert
■ -1 if the state is malicious and action is not send alert
■ -1 if the state is normal and action is send alert
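The reward scheme above amounts to +1 for a correct decision and -1 for a wrong one. A minimal sketch, assuming the actions are encoded as the integers 1 (send an alert) and 0 (do not send an alert); that encoding and the function name are choices made for this example only.

```python
ALERT, NO_ALERT = 1, 0  # hypothetical action encoding


def reward(is_malicious, action):
    """Return +1 when the action matches the true label, -1 otherwise.

    A missed attack (malicious, no alert) and a false alarm (normal, alert)
    are both penalized with -1.
    """
    correct = (action == ALERT) == bool(is_malicious)
    return 1 if correct else -1


for malicious in (False, True):
    for act in (NO_ALERT, ALERT):
        print(malicious, act, reward(malicious, act))
```

Because missed attacks and false alarms cost the same here, an asymmetric scheme would need different magnitudes for the two error types.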
Implementation using TensorFlow
● Creation of the Environment
○ The goal of the environment is to simulate the reward scheme described for the NSL-KDD dataset and to supply a new state at every step
○ This can be done using the Gym toolkit from OpenAI
https://github.com/openai/gym/tree/master/gym/envs
gym-network_intrusion/
    README.md
    setup.py
    gym_network_intrusion/
        __init__.py
        envs/
            __init__.py
            network_intrusion_env.py
from gym.envs.registration import register

# Register the environment so that gym.make('NetworkIntrusion-v0') can find it.
register(
    id='NetworkIntrusion-v0',
    entry_point='gym_network_intrusion.envs:NetworkIntrusionEnv',
)
Implementation using TensorFlow
● Creation of the Environment
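import gym
from gym import error, spaces, utils
from gym.utils import seeding

# Skeleton only: the bodies marked "..." are left unfilled, as on the slide.
# Older Gym releases call _step()/_reset(); newer releases expect the
# methods to be named step() and reset().
class NetworkIntrusionEnv(gym.Env):
    def __init__(self):
        ...

    def _step(self, action):
        ...  # compute new_state, reward, episode_over and details here
        return new_state, reward, episode_over, details

    def _reset(self):
        ...  # choose initial_state here
        return initial_state

    def _get_reward(self, action):
        ...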
Implementation using TensorFlow
● Implementation using TensorFlow
● Two architectures:
○ Deep NN architecture:
■ Discretize continuous variables and use one hot representation
○ Deep and Wide NN architecture:
■ Useful for combining continuous and discrete variables into one NN model
■ Also combines the power of memorization and generalization
■ https://www.tensorflow.org/tutorials/wide_and_deep
Implementation using TensorFlow
● Implementation of a simple NN using TensorFlow
○ Discretize continuous variables and use a one-hot representation
○ Used binning (#bins = 5) to convert continuous variables to categorical ones
○ The resulting one-hot input has 226 dimensions
○ 3-layer feed-forward neural network (226 × 10 × 1)
● Code available at https://github.com/harik68/RL4AD
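The discretization step can be illustrated with a short sketch of binning followed by one-hot encoding. This is not taken from the repository above: equal-width bins, the feature range, and the example feature names and values are all assumptions made for illustration.

```python
def bin_index(x, low, high, n_bins=5):
    """Equal-width binning of x over [low, high]; out-of-range values are clipped."""
    if high <= low:
        return 0
    i = int((x - low) / (high - low) * n_bins)
    return max(0, min(i, n_bins - 1))


def one_hot(index, size):
    """Return a list of zeros of the given size with a single 1 at index."""
    vector = [0] * size
    vector[index] = 1
    return vector


def encode(value, value_range, category, vocabulary, n_bins=5):
    """Concatenate the one-hot codes of one continuous and one nominal feature."""
    low, high = value_range
    binned = one_hot(bin_index(value, low, high, n_bins), n_bins)
    nominal = one_hot(vocabulary.index(category), len(vocabulary))
    return binned + nominal


# Hypothetical record: a duration of 42 on a 0-100 scale, sent over UDP.
PROTOCOLS = ["tcp", "udp", "icmp"]
print(encode(42.0, (0.0, 100.0), "udp", PROTOCOLS))
```

Applying the same idea to every feature and concatenating the results yields one long binary input vector for the network.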
Implementation using TensorFlow
● Model performance (work in progress!)
[Figure: TPR and FPR of the baseline compared with DNN-RL Model V0.1. Source of image for baseline: https://nycdatascience.com/blog/student-works/network-intrusion-detection-2/]
Next Steps
● Experiment with different discretization schemes, or even tile coding
● Experiment with different NN architectures (Deep and Wide)
References
1. J. Cannady, Next Generation Intrusion Detection: Autonomous Reinforcement Learning of Network Attacks, 23rd National Information Systems Security Conference (2000)
2. Xin Xu, Sequential anomaly detection based on temporal-difference learning: Principles, models and case studies, Applied Soft Computing 10 (2010) 859–867
3. A. Servin, Towards Traffic Anomaly Detection via Reinforcement Learning and Data Flow (PDF, york.ac.uk)
4. K. Malialis and D. Kudenko, Distributed response to network intrusions using multiagent reinforcement learning, Engineering Applications of Artificial Intelligence, Volume 41, May 2015, pages 270–284
5. NSL-KDD dataset, Canadian Institute for Cybersecurity, University of New Brunswick (http://www.unb.ca/cic/datasets/nsl.html)
6. S. J. Russell and P. Norvig, Artificial Intelligence: A Modern Approach, Prentice Hall (2009)
THANK YOU !
We are hiring Data Scientists, Machine Learning Engineers and Mobile Developers
Apply at career@zighra.com
Anomaly Detection through Reinforcement Learning

More Related Content

Similar to Anomaly Detection through Reinforcement Learning

Graph Gurus Episode 32: Using Graph Algorithms for Advanced Analytics Part 5
Graph Gurus Episode 32: Using Graph Algorithms for Advanced Analytics Part 5Graph Gurus Episode 32: Using Graph Algorithms for Advanced Analytics Part 5
Graph Gurus Episode 32: Using Graph Algorithms for Advanced Analytics Part 5TigerGraph
 
Using Graph Algorithms for Advanced Analytics - Part 5 Classification
Using Graph Algorithms for Advanced Analytics - Part 5 ClassificationUsing Graph Algorithms for Advanced Analytics - Part 5 Classification
Using Graph Algorithms for Advanced Analytics - Part 5 ClassificationTigerGraph
 
A Brief Survey of Reinforcement Learning
A Brief Survey of Reinforcement LearningA Brief Survey of Reinforcement Learning
A Brief Survey of Reinforcement LearningGiancarlo Frison
 
Lifecycle Inference on Unreliable Event Data
Lifecycle Inference on Unreliable Event DataLifecycle Inference on Unreliable Event Data
Lifecycle Inference on Unreliable Event DataDatabricks
 
Online advertising and large scale model fitting
Online advertising and large scale model fittingOnline advertising and large scale model fitting
Online advertising and large scale model fittingWush Wu
 
DDPG algortihm for angry birds
DDPG algortihm for angry birdsDDPG algortihm for angry birds
DDPG algortihm for angry birdsWangyu Han
 
Machine Learning with Python
Machine Learning with PythonMachine Learning with Python
Machine Learning with PythonGLC Networks
 
Designing States, Actions, and Rewards for Using POMDP in Session Search
Designing States, Actions, and Rewards for Using POMDP in Session SearchDesigning States, Actions, and Rewards for Using POMDP in Session Search
Designing States, Actions, and Rewards for Using POMDP in Session SearchGrace Yang
 
Machine Learning with Python
Machine Learning with Python Machine Learning with Python
Machine Learning with Python GLC Networks
 
Semantic Segmentation on Satellite Imagery
Semantic Segmentation on Satellite ImagerySemantic Segmentation on Satellite Imagery
Semantic Segmentation on Satellite ImageryRAHUL BHOJWANI
 
MUM Melbourne : Build Enterprise Wireless with CAPsMAN
MUM Melbourne : Build Enterprise Wireless with CAPsMANMUM Melbourne : Build Enterprise Wireless with CAPsMAN
MUM Melbourne : Build Enterprise Wireless with CAPsMANGLC Networks
 
Graph Gurus Episode 19: Deep Learning Implemented by GSQL on a Native Paralle...
Graph Gurus Episode 19: Deep Learning Implemented by GSQL on a Native Paralle...Graph Gurus Episode 19: Deep Learning Implemented by GSQL on a Native Paralle...
Graph Gurus Episode 19: Deep Learning Implemented by GSQL on a Native Paralle...TigerGraph
 
Ad Click Prediction - Paper review
Ad Click Prediction - Paper reviewAd Click Prediction - Paper review
Ad Click Prediction - Paper reviewMazen Aly
 
An introduction to deep reinforcement learning
An introduction to deep reinforcement learningAn introduction to deep reinforcement learning
An introduction to deep reinforcement learningBig Data Colombia
 
Sequence to Sequence Pattern Learning Algorithm for Real-time Anomaly Detecti...
Sequence to Sequence Pattern Learning Algorithm for Real-time Anomaly Detecti...Sequence to Sequence Pattern Learning Algorithm for Real-time Anomaly Detecti...
Sequence to Sequence Pattern Learning Algorithm for Real-time Anomaly Detecti...Gobinath Loganathan
 
Deep learning approach for network intrusion detection system
Deep learning approach for network intrusion detection systemDeep learning approach for network intrusion detection system
Deep learning approach for network intrusion detection systemAvinash Kumar
 
Botnet detection in SDN by DL techniques
Botnet detection in SDN by DL techniquesBotnet detection in SDN by DL techniques
Botnet detection in SDN by DL techniquesIvan Letteri
 
Policy Based reinforcement Learning for time series Anomaly detection
Policy Based reinforcement Learning for time series Anomaly detectionPolicy Based reinforcement Learning for time series Anomaly detection
Policy Based reinforcement Learning for time series Anomaly detectionKishor Datta Gupta
 

Similar to Anomaly Detection through Reinforcement Learning (20)

Graph Gurus Episode 32: Using Graph Algorithms for Advanced Analytics Part 5
Graph Gurus Episode 32: Using Graph Algorithms for Advanced Analytics Part 5Graph Gurus Episode 32: Using Graph Algorithms for Advanced Analytics Part 5
Graph Gurus Episode 32: Using Graph Algorithms for Advanced Analytics Part 5
 
Using Graph Algorithms for Advanced Analytics - Part 5 Classification
Using Graph Algorithms for Advanced Analytics - Part 5 ClassificationUsing Graph Algorithms for Advanced Analytics - Part 5 Classification
Using Graph Algorithms for Advanced Analytics - Part 5 Classification
 
A Brief Survey of Reinforcement Learning
A Brief Survey of Reinforcement LearningA Brief Survey of Reinforcement Learning
A Brief Survey of Reinforcement Learning
 
Lifecycle Inference on Unreliable Event Data
Lifecycle Inference on Unreliable Event DataLifecycle Inference on Unreliable Event Data
Lifecycle Inference on Unreliable Event Data
 
Online advertising and large scale model fitting
Online advertising and large scale model fittingOnline advertising and large scale model fitting
Online advertising and large scale model fitting
 
IDS for IoT.pptx
IDS for IoT.pptxIDS for IoT.pptx
IDS for IoT.pptx
 
DDPG algortihm for angry birds
DDPG algortihm for angry birdsDDPG algortihm for angry birds
DDPG algortihm for angry birds
 
Machine Learning with Python
Machine Learning with PythonMachine Learning with Python
Machine Learning with Python
 
Designing States, Actions, and Rewards for Using POMDP in Session Search
Designing States, Actions, and Rewards for Using POMDP in Session SearchDesigning States, Actions, and Rewards for Using POMDP in Session Search
Designing States, Actions, and Rewards for Using POMDP in Session Search
 
Machine Learning with Python
Machine Learning with Python Machine Learning with Python
Machine Learning with Python
 
Reinforcement Learning - DQN
Reinforcement Learning - DQNReinforcement Learning - DQN
Reinforcement Learning - DQN
 
Semantic Segmentation on Satellite Imagery
Semantic Segmentation on Satellite ImagerySemantic Segmentation on Satellite Imagery
Semantic Segmentation on Satellite Imagery
 
MUM Melbourne : Build Enterprise Wireless with CAPsMAN
MUM Melbourne : Build Enterprise Wireless with CAPsMANMUM Melbourne : Build Enterprise Wireless with CAPsMAN
MUM Melbourne : Build Enterprise Wireless with CAPsMAN
 
Graph Gurus Episode 19: Deep Learning Implemented by GSQL on a Native Paralle...
Graph Gurus Episode 19: Deep Learning Implemented by GSQL on a Native Paralle...Graph Gurus Episode 19: Deep Learning Implemented by GSQL on a Native Paralle...
Graph Gurus Episode 19: Deep Learning Implemented by GSQL on a Native Paralle...
 
Ad Click Prediction - Paper review
Ad Click Prediction - Paper reviewAd Click Prediction - Paper review
Ad Click Prediction - Paper review
 
An introduction to deep reinforcement learning
An introduction to deep reinforcement learningAn introduction to deep reinforcement learning
An introduction to deep reinforcement learning
 
Sequence to Sequence Pattern Learning Algorithm for Real-time Anomaly Detecti...
Sequence to Sequence Pattern Learning Algorithm for Real-time Anomaly Detecti...Sequence to Sequence Pattern Learning Algorithm for Real-time Anomaly Detecti...
Sequence to Sequence Pattern Learning Algorithm for Real-time Anomaly Detecti...
 
Deep learning approach for network intrusion detection system
Deep learning approach for network intrusion detection systemDeep learning approach for network intrusion detection system
Deep learning approach for network intrusion detection system
 
Botnet detection in SDN by DL techniques
Botnet detection in SDN by DL techniquesBotnet detection in SDN by DL techniques
Botnet detection in SDN by DL techniques
 
Policy Based reinforcement Learning for time series Anomaly detection
Policy Based reinforcement Learning for time series Anomaly detectionPolicy Based reinforcement Learning for time series Anomaly detection
Policy Based reinforcement Learning for time series Anomaly detection
 

Recently uploaded

Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Delhi Call girls
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...amitlee9823
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...shivangimorya083
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...amitlee9823
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightDelhi Call girls
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAroojKhan71
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...amitlee9823
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxolyaivanovalion
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Researchmichael115558
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxMohammedJunaid861692
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...SUHANI PANDEY
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionfulawalesam
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 

Recently uploaded (20)

Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 

Anomaly Detection through Reinforcement Learning

  • 1. PAGE©2018 ZIGHRA | WWW.ZIGHRA.COM 1 Anomaly Detection through Reinforcement Learning . . Dr. Hari Koduvely Chief Data Scientist ZIGHRA.COM
  • 2. PAGE©2018 ZIGHRA | WWW.ZIGHRA.COM 2 Outline of Talk: . ● Zighra and SensifyID Platform ● Sequential Anomaly Detection Problem ● Introduction to Reinforcement Learning ● Markov Decision Process and Q-Learning ● Function Approximation using Neural Networks ● Application to Network Intrusion Detection Problem ● Implementation using TensorFlow
  • 3. PAGE©2018 ZIGHRA | WWW.ZIGHRA.COM 3 ZIGHRA.COM . ● Zighra (https://zighra.com) provides solutions for Continuous Behavioural Authentication & Threat Detection ● Highlights of our SensifyID Platform: ○ Core is an AI based 6-layer Anomaly Detection System combining behavioral biometrics with contextual, social and other signals ○ Cover uses cases such as User Verification, Account Takeover, Remote Attacks and Bot Attacks ○ Can be integrated to any Web, Mobile & IoT application ○ 2 patents granted and 10+ in application stage
  • 4. Sequential Anomaly Detection Problem
    ● The classical Anomaly Detection Problem is to find patterns in a dataset that do not conform to expected normal behavior
    ● It is formulated as a one-class classification task in machine learning
    ● In many domains the data distribution changes continuously (concept shift)
    ● An online learning setting is better suited to dealing with concept shift
    [Figure: plot with axes current_week_purchase and average_weekly_purchase. Image source: https://www.linkedin.com/pulse/part-2-keep-simple-machine-learning-algorithms-big-dr-dinesh/]
  • 5. Sequential Anomaly Detection Problem
    ● In the Sequential Anomaly Detection problem the goal is to find out whether a subsequence of a sequence of events is anomalous
    ● Each event in isolation would appear to be normal; only the sequence of events indicates an anomaly
      ○ Username-Password, Username-Password, Username-Password, ...
      ○ Log in to the corporate network at midnight, access a rarely used DB, download a lot of data, transfer it to USB, ...
    ● Straightforward supervised learning is not feasible here because of the credit assignment problem: it is unclear which individual events in the sequence should be held responsible for the anomaly
  • 6. Introduction to Reinforcement Learning
    ● In Reinforcement Learning, an autonomous agent interacts with an environment and takes an action a_t in each state s_t
    ● In return, the environment supplies a reward r_t for the action the agent performed, which acts as a supervision signal, and also a new state s_{t+1}
    [Figure: agent-environment loop. The agent sends action a_t from state s_t to the environment; the environment returns r_t and s_{t+1}]
  • 7. Introduction to Reinforcement Learning
    ● Reinforcement Learning can be formally defined as a Markov Decision Process
    ● A Markov Decision Process (MDP) is defined by the 5-tuple {s_t, a_t, P(s_{t+1} | s_t, a_t), γ, r_t}
      ○ s_t - State at time t
      ○ a_t - Action taken in state s_t
      ○ P(s_{t+1} | s_t, a_t) - State transition probabilities
      ○ γ - Discount factor
      ○ r_t - Reward function
    ● The objective of an MDP is to find an optimal policy that maximizes the cumulative reward over a long period of time
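The "cumulative reward" in the objective is usually taken to be the discounted return, G = r_0 + γ r_1 + γ^2 r_2 + ... A minimal sketch of that arithmetic follows; the reward sequence and γ = 0.9 are made-up values for illustration, not numbers from the talk.

```python
def discounted_return(rewards, gamma):
    """Return sum over t of gamma**t * rewards[t] for a finite reward sequence."""
    total = 0.0
    # Walk backwards so each step applies G_t = r_t + gamma * G_{t+1}.
    for r in reversed(rewards):
        total = r + gamma * total
    return total


# Hypothetical rewards of +1 / -1, in the spirit of an alerting agent.
example_rewards = [1.0, -1.0, 1.0, 1.0]
example_return = discounted_return(example_rewards, gamma=0.9)
print(example_return)
```

With γ close to 1 the agent weighs distant rewards almost as much as immediate ones; with γ = 0 only the first reward counts.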
  • 8. Q-Learning and Markov Decision Process
    ● Q-value Function Q(s, a) - an estimate of the maximum total long-term reward starting from state s and performing action a
    ● Bellman Equation:
      Q(s, a) = r(s) + γ Σ_{s'} P(s' | s, a) max_{a'} Q(s', a')
      The Q-value for a state-action pair is the current reward plus the discounted expected value of the best Q-value in its successor states
    ● This is the central theoretical concept used in almost all formulations of reinforcement learning
    ● It can be proved that, starting from random initial conditions, iterating the Bellman equation makes Q(s, a) converge to an optimal quality function Q*(s, a)
    ● The optimal policy is given by π*(s) = argmax_a Q*(s, a)
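The convergence claim can be tried out on a toy problem. The sketch below repeatedly applies the backup Q(s, a) = r(s) + γ Σ_{s'} P(s' | s, a) max_{a'} Q(s', a') to a hypothetical two-state, two-action MDP; the states, rewards and transition probabilities are invented for illustration and do not come from the talk.

```python
GAMMA = 0.9
STATES = [0, 1]
ACTIONS = [0, 1]
REWARD = {0: 0.0, 1: 1.0}  # r(s): hypothetical reward for being in state s

# P[(s, a)] maps each possible next state to its probability (hypothetical values).
P = {
    (0, 0): {0: 1.0},
    (0, 1): {0: 0.2, 1: 0.8},
    (1, 0): {0: 1.0},
    (1, 1): {1: 1.0},
}


def q_iteration(n_sweeps=500):
    """Iterate the Bellman backup from an all-zero Q table."""
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(n_sweeps):
        new_q = {}
        for s in STATES:
            for a in ACTIONS:
                # Expected value of acting greedily in the successor state.
                expected = sum(
                    prob * max(q[(s2, a2)] for a2 in ACTIONS)
                    for s2, prob in P[(s, a)].items()
                )
                new_q[(s, a)] = REWARD[s] + GAMMA * expected
        q = new_q
    return q


def greedy_policy(q):
    """pi(s) = argmax_a Q(s, a)."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES}


q_star = q_iteration()
policy = greedy_policy(q_star)
print(policy)
```

In this toy MDP action 1 tends to lead to the rewarding state 1, so the greedy policy should pick action 1 in both states.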
  • 9. Q-Learning and Markov Decision Process
    ● It is difficult to know the state transition probabilities P(s_{t+1} | s_t, a_t) for a given problem
    ● Bellman's equation can be recast in an incremental form in which the transition probabilities are not needed
    ● Only the state actually observed from the environment is used
    ● Temporal Difference Learning Algorithm: when an agent makes a transition from state s to state s' by performing action a, its Q-value is updated as follows:
      Q(s, a) ← Q(s, a) + α [ r(s) + γ max_{a'} Q(s', a') - Q(s, a) ]
      where α is a learning rate << 1
    ● The Q-values are adjusted towards the local equilibrium at which Bellman's equation holds
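For a small, discrete problem the temporal-difference update can be written directly against a table. This is a minimal sketch; the state names, action names and the values α = 0.1, γ = 0.9 are placeholders chosen for illustration.

```python
from collections import defaultdict


def td_update(q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Apply Q(s,a) <- Q(s,a) + alpha * [r + gamma * max_a' Q(s',a') - Q(s,a)]."""
    best_next = max(q[(s_next, a2)] for a2 in actions)
    td_error = r + gamma * best_next - q[(s, a)]
    q[(s, a)] += alpha * td_error
    return td_error


# Q table that returns 0.0 for state-action pairs it has not seen yet.
q_table = defaultdict(float)
actions = ["alert", "no_alert"]

# One observed transition: in state "s0" the agent alerted, got reward +1, landed in "s1".
td_update(q_table, "s0", "alert", 1.0, "s1", actions)
print(q_table[("s0", "alert")])
```

No transition probabilities appear anywhere: the update only uses the single next state that was actually observed.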
  • 10. Function Approximation using Neural Networks
    ● Iterating Bellman's equation is a deterministic algorithm
    ● For problems where the state and action spaces are small, one can use a table to represent Q(s, a)
    ● In many practical applications, the state and action spaces are continuous
    ● One then needs an efficient function approximation method for representing Q(s, a)
    ● Two standard approaches for this are:
      ○ Tile Coding: partition the continuous space into overlapping sets of tiles
        ➢ Success depends on the number and width of the tiles
        ➢ It is a linear function approximation
      ○ Neural Networks: nonlinear function approximation, a more powerful representation
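To make the tile coding idea concrete, here is a minimal one-dimensional sketch. The helper names (`active_tiles`, `linear_value`, `update`), the uniform offsets between tilings and all numeric settings are assumptions made for illustration; practical tile coders differ in how they offset and hash tiles.

```python
def active_tiles(x, low, high, n_tiles, n_tilings):
    """Return one active tile index per tiling for a scalar x in [low, high]."""
    width = (high - low) / n_tiles
    indices = []
    for t in range(n_tilings):
        # Each tiling is shifted by a fraction of one tile width.
        offset = t * width / n_tilings
        idx = int((x - low + offset) / width)
        # Each tiling owns n_tiles + 1 tiles so the shifted grid still covers `high`.
        idx = min(idx, n_tiles)
        indices.append(t * (n_tiles + 1) + idx)
    return indices


def linear_value(weights, tiles):
    """Linear approximation: the value is the sum of the active tiles' weights."""
    return sum(weights[i] for i in tiles)


def update(weights, tiles, target, alpha=0.5):
    """Move the approximated value towards the target, spreading the step over the tiles."""
    error = target - linear_value(weights, tiles)
    for i in tiles:
        weights[i] += alpha / len(tiles) * error


N_TILES, N_TILINGS = 4, 2
weights = [0.0] * (N_TILINGS * (N_TILES + 1))
tiles_a = active_tiles(0.3, 0.0, 1.0, N_TILES, N_TILINGS)
tiles_b = active_tiles(0.4, 0.0, 1.0, N_TILES, N_TILINGS)

update(weights, tiles_a, target=1.0)
print(tiles_a, tiles_b, linear_value(weights, tiles_a), linear_value(weights, tiles_b))
```

The two nearby inputs share one tile, so training on the first also shifts the estimate for the second. That sharing is what gives tile coding its generalization, and it is controlled by the number and width of the tiles.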
  • 11. Function Approximation using Neural Networks
    ● One can use Neural Networks to approximate Q(s, a) as follows:
      ○ Inputs: state s represented by the D-dimensional vector {s_1, s_2, ..., s_D}
      ○ Outputs: Q-values for each of the N actions {Q_1, Q_2, ..., Q_N}
    [Figure: feed-forward network with inputs s_1, s_2, s_3, ..., s_D, hidden layers, and outputs Q_1, Q_2, ..., Q_N]
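The input/output contract of such a network can be sketched without any deep learning library. The code below is a plain-Python forward pass with one hidden ReLU layer; the layer sizes, random initialization and example state are arbitrary assumptions, and the weights are untrained, so the Q-values carry no meaning beyond showing the shapes.

```python
import random


def init_layer(n_in, n_out, rng):
    """Create a dense layer as (weights, biases) with small random weights."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer, relu):
    """Compute W x + b, optionally followed by a ReLU nonlinearity."""
    weights, biases = layer
    out = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [max(0.0, v) for v in out] if relu else out


def q_values(state, hidden, output):
    """Map a D-dimensional state to one Q-value per action."""
    return dense(dense(state, hidden, relu=True), output, relu=False)


rng = random.Random(0)
D, H, N = 4, 8, 2  # state size, hidden units, number of actions (hypothetical)
hidden_layer = init_layer(D, H, rng)
output_layer = init_layer(H, N, rng)

qs = q_values([0.5, -1.0, 0.0, 2.0], hidden_layer, output_layer)
greedy_action = max(range(N), key=lambda a: qs[a])
print(qs, greedy_action)
```

A single forward pass yields the Q-values of all N actions at once, which is why the state alone is used as input rather than the state-action pair.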
  • 12. Function Approximation using Neural Networks
    ● The loss function for training the NN is the squared difference between the Q-value predicted by the DNN and the target Q-value given by Bellman's equation:
      L = ½ [ (r + γ max_{a'} Q(s', a')) - Q(s, a) ]^2
    ● The NN is trained using backpropagation as follows:
      1. Start an episode of explorations
      2. Initialize the NN (before the first episode only) and start from a random state s
      3. Do a forward pass of state s through the DNN and get Q-values for all actions
      4. Perform an ε-greedy exploration to choose an action a for the current state s
      5. Get the next state s' and reward r from the environment
      6. Pass s' through the DNN as well and compute max_{a'} Q(s', a')
      7. Set the target Q-value for the output node corresponding to action a to r + γ max_{a'} Q(s', a')
      8. For all other nodes, keep the target Q-value equal to the DNN prediction from step 3
      9. Update the weights using backpropagation
      10. Repeat steps 3-9, with s' as the new current state, until a termination condition is reached
      11. Repeat the episodes until the network is trained
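The ε-greedy choice, the construction of the target vector and the loss from this slide can be sketched in isolation from the network itself. The function names and the example Q-values below are hypothetical; in a real training loop the two Q-value lists would come from forward passes of s and s' through the DNN.

```python
import random


def epsilon_greedy(q_s, epsilon, rng):
    """With probability epsilon pick a random action, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_s))
    return max(range(len(q_s)), key=lambda a: q_s[a])


def td_targets(q_s, action, reward, q_next, gamma):
    """Build the target vector for one transition (s, action, reward, s')."""
    # Outputs for actions that were not taken keep their predicted value,
    # so they contribute zero error and zero gradient.
    targets = list(q_s)
    targets[action] = reward + gamma * max(q_next)
    return targets


def td_loss(q_s, targets):
    """L = 1/2 * squared difference between targets and predictions."""
    return 0.5 * sum((t - q) ** 2 for q, t in zip(q_s, targets))


rng = random.Random(0)
q_s = [0.2, 0.5]     # hypothetical DNN output for state s
q_next = [0.1, 0.4]  # hypothetical DNN output for next state s'

action = 0           # suppose exploration picked action 0
targets = td_targets(q_s, action, reward=1.0, q_next=q_next, gamma=0.9)
loss = td_loss(q_s, targets)
print(targets, loss)
```

Only the output node of the chosen action receives a non-zero error, which matches the rule of leaving all other nodes at their predicted values.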
  • 13. Function Approximation using Neural Networks
    ● High-level TD NN learning iteration flow
    [Figure: flow diagram with the DNN model, the iteration over exploration steps, and the iteration over episodes]
  • 14. Network Intrusion Detection
    ● Can we use Reinforcement Learning for Network Intrusion Detection?
    ● Related research works:
      ○ James Cannady used a CMAC Neural Network and formulated Network Intrusion Detection as an online learning problem [1]
      ○ Xin Xu studied the problem of host-based intrusion detection as a multi-stage cyber attack and applied reinforcement learning [2]
      ○ Arturo Servin studied the DDoS attack as a traffic anomaly problem and used reinforcement learning for detection [3]
      ○ Kleanthis M used a distributed reinforcement network for network intrusion response [4]
    ● None of these have used a DNN for function approximation
  • 15. Network Intrusion Detection
    ● Standard dataset for scientific research: the NSL-KDD dataset [5]
    ● The dataset contains 4 categories of attacks in a local area network:
      ○ DOS - Denial of Service attacks
      ○ R2L - Remote to Local, where a remote hacker tries to gain local user privileges
      ○ U2R - User to Root, where the hacker operates as a normal user and exploits vulnerabilities
      ○ Probing - the hacker scans the machine to determine vulnerabilities
    ● The dataset contains 125,973 connections for training and 22,543 for testing
    ● The training set has 53.5% normal connections and 46.5% abnormal connections
    ● There are 41 features (32 continuous, 3 nominal and 6 binary)
    ● E.g. type of protocol (TCP, UDP), port number, packet size, rate of transmission
  • 16. Network Intrusion Detection
    [Figure: image-only slide. Image source: https://nycdatascience.com/blog/student-works/network-intrusion-detection-2/]
  • 17. Network Intrusion Detection
    ● However, the NSL-KDD dataset cannot be used for sequential anomaly detection:
      ○ There is no timestamp; the dataset is not time series data
      ○ There is no way to tell whether different connections come from the same user/hacker
      ○ One could still use it for the standard anomaly detection problem using reinforcement learning
  • 19. Network Intrusion Detection
    ● Reinforcement Learning formulation with the NSL-KDD dataset
      ○ The states are characterized by the 41 features in the dataset
      ○ For every state the agent takes one of two actions:
        ■ Send an alert
        ■ Do not send an alert
      ○ The rewards generated by the environment:
        ■ +1 if the state is normal and the action is not to send an alert
        ■ +1 if the state is malicious and the action is to send an alert
        ■ -1 if the state is malicious and the action is not to send an alert
        ■ -1 if the state is normal and the action is to send an alert
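The reward scheme is small enough to write as a lookup table. In the sketch below the string constants and function names are illustrative choices, not identifiers from the talk's code. It also shows a consequence of using ±1 rewards: the average reward equals 2 × accuracy − 1.

```python
NORMAL, MALICIOUS = "normal", "malicious"
SEND_ALERT, NO_ALERT = "send_alert", "no_alert"

# One entry per (state label, action) combination listed on the slide.
REWARDS = {
    (NORMAL, NO_ALERT): +1,
    (MALICIOUS, SEND_ALERT): +1,
    (MALICIOUS, NO_ALERT): -1,
    (NORMAL, SEND_ALERT): -1,
}


def get_reward(label, action):
    """Reward for taking `action` on a connection whose true label is `label`."""
    return REWARDS[(label, action)]


def average_reward(labels, actions):
    """Mean reward over a batch of decisions."""
    rewards = [get_reward(lab, act) for lab, act in zip(labels, actions)]
    return sum(rewards) / len(rewards)


# Made-up labels and decisions: three of the four decisions are correct.
labels = [NORMAL, MALICIOUS, MALICIOUS, NORMAL]
actions = [NO_ALERT, SEND_ALERT, NO_ALERT, NO_ALERT]
mean_reward = average_reward(labels, actions)
print(mean_reward)
```

Because false alarms and missed attacks are penalized equally, this scheme does not distinguish between the two kinds of error; asymmetric rewards would be one way to change that.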
  • 20. Implementation using TensorFlow
    ● Creation of the Environment
      ○ The goal of the environment is to simulate the reward scheme described for the NSL-KDD dataset and to supply a new state at every step
      ○ This can be done using the Gym toolkit from OpenAI: https://github.com/openai/gym/tree/master/gym/envs

    Package layout:

        gym-network_intrusion/
          README.md
          setup.py
          gym_network_intrusion/
            __init__.py
            envs/
              __init__.py
              network_intrusion_env.py

    Registration:

        from gym.envs.registration import register

        register(
            id='NetworkIntrusion-v0',
            entry_point='gym_network_intrusion.envs:NetworkIntrusionEnv',
        )
  • 21. Implementation using TensorFlow
    ● Creation of the Environment

        import gym
        from gym import error, spaces, utils
        from gym.utils import seeding

        class NetworkIntrusionEnv(gym.Env):

            def __init__(self):
                ...

            def _step(self, action):
                ...
                # Next state, reward for the action, end-of-episode flag, extra info
                return new_state, reward, episode_over, details

            def _reset(self):
                ...
                # First state of a new episode
                return initial_state

            def _get_reward(self, action):
                ...
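To show how the elided methods might fit together, here is a standalone sketch of an environment with the same reset/step/reward structure. It deliberately does not subclass `gym.Env` so that it runs without Gym installed; the class name `ToyIntrusionEnv`, the random sampling of records, the fixed episode length and the tiny hand-made records are all assumptions for illustration, not the implementation from the talk.

```python
import random


class ToyIntrusionEnv:
    """Gym-like environment that serves labeled records and scores alert decisions."""

    ALERT, NO_ALERT = 1, 0

    def __init__(self, records, episode_length=3, seed=0):
        self.records = records              # list of (feature_vector, is_malicious)
        self.episode_length = episode_length
        self.rng = random.Random(seed)
        self.current = None
        self.steps = 0

    def reset(self):
        """Start a new episode and return its first state."""
        self.steps = 0
        self.current = self.rng.choice(self.records)
        return self.current[0]

    def _get_reward(self, action):
        """+1 for a correct decision on the current record, -1 otherwise."""
        _, is_malicious = self.current
        return 1 if (action == self.ALERT) == is_malicious else -1

    def step(self, action):
        """Score the action on the current record, then move to a new record."""
        reward = self._get_reward(action)
        self.steps += 1
        episode_over = self.steps >= self.episode_length
        self.current = self.rng.choice(self.records)
        return self.current[0], reward, episode_over, {}


# Hypothetical two-feature records; real NSL-KDD states have 41 features.
toy_records = [([0.1, 0.0], False), ([0.9, 1.0], True)]
env = ToyIntrusionEnv(toy_records)
state = env.reset()
state, reward, episode_over, details = env.step(ToyIntrusionEnv.ALERT)
print(state, reward, episode_over)
```

Since the dataset has no temporal order, consecutive states here are independent draws, so the next state does not depend on the action taken.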
  • 22. Implementation using TensorFlow
    ● Two architectures:
      ○ Deep NN architecture:
        ■ Discretize continuous variables and use a one-hot representation
      ○ Deep and Wide NN architecture:
        ■ Useful for combining continuous and discrete variables into one NN model
        ■ Also combines the power of memorization and generalization
        ■ https://www.tensorflow.org/tutorials/wide_and_deep
  • 23. Implementation using TensorFlow
    ● Implementation of a simple NN using TensorFlow
      ○ Discretize continuous variables and use a one-hot representation
      ○ Used binning (#bins = 5) to convert continuous variables to categorical ones
      ○ The resulting one-hot input vector has 226 dimensions
      ○ 3-layer feed-forward neural network (226 × 10 × 1)
    ● Code available at https://github.com/harik68/RL4AD
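The discretize-then-one-hot step can be sketched as follows. The slide only states that 5 bins were used; equal-width binning over a known value range, the helper names and the example feature values are assumptions made for this illustration.

```python
def bin_index(value, low, high, n_bins=5):
    """Equal-width bin of `value` within [low, high], clamped to a valid index."""
    if high == low:
        return 0
    idx = int((value - low) / (high - low) * n_bins)
    return max(0, min(idx, n_bins - 1))


def one_hot(index, size):
    """Vector of zeros with a single 1 at position `index`."""
    vec = [0] * size
    vec[index] = 1
    return vec


def encode_record(continuous, ranges, nominal, vocabularies, n_bins=5):
    """Concatenate one-hot encodings of binned continuous and nominal features."""
    encoded = []
    for value, (low, high) in zip(continuous, ranges):
        encoded += one_hot(bin_index(value, low, high, n_bins), n_bins)
    for value, vocab in zip(nominal, vocabularies):
        encoded += one_hot(vocab.index(value), len(vocab))
    return encoded


# Hypothetical record: two continuous features and one nominal protocol feature.
encoded = encode_record(
    continuous=[0.5, 100.0],
    ranges=[(0.0, 1.0), (0.0, 100.0)],
    nominal=["udp"],
    vocabularies=[["tcp", "udp", "icmp"]],
)
print(encoded)
```

Each continuous feature contributes 5 inputs and each nominal feature contributes as many inputs as it has distinct values, so the total input width is driven mostly by the nominal vocabularies.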
  • 24. Implementation using TensorFlow
    ● Model performance (work in progress!)
    [Figure: TPR and FPR of the baseline compared with the DNN-RL Model V0.1. Image source for the baseline: https://nycdatascience.com/blog/student-works/network-intrusion-detection-2/]
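TPR (true positive rate) and FPR (false positive rate) follow from the confusion-matrix counts: TPR = TP / (TP + FN) and FPR = FP / (FP + TN). The sketch below computes both; the labels and predictions are made up to exercise the function and are not results from the talk.

```python
def tpr_fpr(y_true, y_pred):
    """Return (TPR, FPR), treating 1 as malicious/alert and 0 as normal/no alert."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Illustrative data only: 4 malicious and 4 normal connections.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
tpr, fpr = tpr_fpr(y_true, y_pred)
print(tpr, fpr)
```

For an intrusion detector, a high TPR means few missed attacks and a low FPR means few false alarms on normal traffic.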
  • 25. Next Steps
    ● Experiment with different discretization schemes, or even tile coding
    ● Experiment with different NN architectures (Deep and Wide)
  • 26. References
    1. J. Cannady, Next Generation Intrusion Detection: Autonomous Reinforcement Learning of Network Attacks, 23rd National Information Systems Security Conference (2000)
    2. Xin Xu, Sequential anomaly detection based on temporal-difference learning: Principles, models and case studies, Applied Soft Computing 10 (2010) 859–867
    3. A. Servin, Towards Traffic Anomaly Detection via Reinforcement Learning and Data Flow, york.ac.uk
    4. Distributed response to network intrusions using multiagent reinforcement learning, Engineering Applications of Artificial Intelligence, Volume 41, Issue C, May 2015, pages 270-284
    5. NSL-KDD dataset, Canadian Institute for Cyber Security, University of New Brunswick (http://www.unb.ca/cic/datasets/nsl.html)
    6. Stuart J. Russell and Peter Norvig, Artificial Intelligence: A Modern Approach, Prentice Hall (2009)
  • 27. THANK YOU!
    We are hiring Data Scientists, Machine Learning Engineers and Mobile Developers. Apply at career@zighra.com