Views data as a continuous signal
Split signal into frequency bands
High – short term spikes
Low – long term trends
Fourier Transforms (FFT)
Classically based & rigid
Has inverse function
Original signal can be reconstructed
Wavelet Analysis
More recently studied
Handles discontinuities and spikes better than FFT
Often preferred over FFT when the signal contains transients
FFT
* Fourier analysis is the process of decomposing a complex periodic waveform into a set of sinusoids with different amplitudes, frequencies and phases. The sum of these sinusoids can exactly match the original waveform. This lossless transform presents a new perspective of the signal under study (in the frequency domain), which has proved useful in many applications.
* The Inverse Discrete Fourier Transform (IDFT) is used to reconstruct the signal in the time domain; DFT and IDFT can be efficiently implemented by using the FFT.
* For anomaly detection, filter out the low-frequency components of the link traffic time series. In general, low-frequency components capture the daily and weekly traffic patterns, while high-frequency components represent sudden changes in traffic behavior.
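The band split and reconstruction described in the FFT notes can be sketched as follows. This is a minimal illustration, not the method of any cited paper: it uses a naive O(n^2) DFT from the standard library (an FFT computes the same values faster), and the traffic series, the `cutoff` value, and the helper names `dft`, `idft` and `split_bands` are all assumptions made for the example.

```python
import cmath
import math

def dft(x):
    """Naive O(n^2) discrete Fourier transform (an FFT gives the same result faster)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Inverse DFT: reconstructs the time-domain signal from its spectrum."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def split_bands(x, cutoff):
    """Split x into a low-frequency part (bins <= cutoff) and a high-frequency part."""
    n = len(x)
    X = dft(x)
    # Bins k and n-k carry the same frequency, so they are kept or dropped together.
    low_bins = [X[k] if min(k, n - k) <= cutoff else 0 for k in range(n)]
    high_bins = [X[k] - low_bins[k] for k in range(n)]
    low = [v.real for v in idft(low_bins)]
    high = [v.real for v in idft(high_bins)]
    return low, high

# Hypothetical traffic series: one slow cycle (the "trend") plus a sudden spike.
n = 48
traffic = [100 + 20 * math.sin(2 * math.pi * t / n) for t in range(n)]
traffic[30] += 50

low, high = split_bands(traffic, cutoff=2)

# The high-frequency band is where the sudden change stands out.
spike_index = max(range(n), key=lambda t: abs(high[t]))
```

Because the transform is lossless, `low[t] + high[t]` adds back up to the original series; the cutoff only decides which band each frequency bin is assigned to.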
Wavelet Analysis
* Considered superior to traditional Fourier methods where the signal contains transients such as discontinuities and sharp spikes
* Wavelet techniques are among the most up-to-date modeling tools for exploiting both non-stationarity and long-range dependence.
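To illustrate why wavelets handle spikes well, here is a small sketch using the Haar wavelet, the simplest wavelet family. It is an assumption chosen for brevity; the papers listed below use other wavelets and filters. The signal and all function names are hypothetical. The point is that a spike shows up as one large detail coefficient whose position maps directly back to a time range.

```python
import math

def haar_step(x):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = math.sqrt(2)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail

def haar_inverse_step(approx, detail):
    """Undo one Haar level, interleaving the reconstructed sample pairs."""
    s = math.sqrt(2)
    x = []
    for a, d in zip(approx, detail):
        x.extend([(a + d) / s, (a - d) / s])
    return x

def haar_decompose(x, levels):
    """Multi-level decomposition; details are returned finest scale first."""
    approx, details = list(x), []
    for _ in range(levels):
        approx, d = haar_step(approx)
        details.append(d)
    return approx, details

def haar_reconstruct(approx, details):
    """Rebuild the signal by inverting the levels from coarsest to finest."""
    for d in reversed(details):
        approx = haar_inverse_step(approx, d)
    return approx

# Hypothetical signal: a slow ramp (long-term trend) with a sharp spike at t = 21.
signal = [100 + 0.5 * t for t in range(32)]
signal[21] += 40

approx, details = haar_decompose(signal, levels=3)

# The finest-scale detail coefficient k covers samples 2k and 2k+1.
finest = details[0]
k = max(range(len(finest)), key=lambda i: abs(finest[i]))
spike_window = (2 * k, 2 * k + 1)

rebuilt = haar_reconstruct(approx, details)
```

Here `spike_window` is the pair of samples containing the spike, and `rebuilt` matches `signal` up to floating-point error, so the decomposition is invertible just as the Fourier transform is.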
ftp://net9.cs.utexas.edu/pub/techreports/tr05-38.pdf (Comparison of PCA to FFT & Wavelets & ARIMA)
http://cegroup.ece.tamu.edu/techpubs/2003/TAMU-ECE-2003-03.pdf (wavelet feature based, batch/real-time, correlation of port-numbers)
http://www.cs.cmu.edu/~srini/15-744/readings/BKPR02.pdf (wavelet signal based; generic features, exposes anomalies even when nested in large amounts of other traffic [noise])
http://www.cse.sc.edu/~huangct/wens06.pdf (framework paper, show various feature-based wavelets and their ability to detect anomalies)
http://pages.cs.wisc.edu/~pb/paper_imw_02.pdf (wavelets & spline filters)
http://users.ece.gatech.edu/~jic/anomaly-book-chap-09.pdf (2.2, background info)
Principal Component Analysis (PCA)
Feature selection algorithm
Reduces dimensionality
Iteratively select uncorrelated features with most variance
Applicable to traffic volume & other features
Source/Destination IP Address
Source/Destination Ports
Packet Size
Volume complements feature distribution
Batch method of selecting representative data
Sensitive to input parameters due to temporal correlation
* Functional Combination of Features
Variance (accounts for as much of the variability of the data as possible)
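The idea of keeping the directions of greatest variance and treating what is left over as the anomalous part can be sketched as below. This is a simplified illustration under stated assumptions, not the procedure of any paper listed here: only one principal component is kept, power iteration stands in for a full eigendecomposition, and the three-link traffic matrix and all function names are hypothetical.

```python
import math

def center(rows):
    """Subtract each column's mean so the variance is measured around zero."""
    m = len(rows[0])
    means = [sum(r[j] for r in rows) / len(rows) for j in range(m)]
    return [[r[j] - means[j] for j in range(m)] for r in rows]

def covariance(rows):
    """Sample covariance matrix of already-centered rows."""
    n, m = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(m)]
            for i in range(m)]

def top_component(cov, iterations=200):
    """Power iteration: approximates the unit eigenvector of largest eigenvalue."""
    v = [1.0] * len(cov)
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(len(v))) for i in range(len(v))]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v

def residual_energy(rows, component):
    """Squared norm of what remains after removing the projection on component."""
    energies = []
    for r in rows:
        score = sum(a * b for a, b in zip(r, component))
        residual = [a - score * b for a, b in zip(r, component)]
        energies.append(sum(c * c for c in residual))
    return energies

# Hypothetical volumes on 3 links over 24 time bins, all following one shared cycle.
bases = [100.0, 60.0, 40.0]
traffic = [[b * (1 + 0.3 * math.sin(2 * math.pi * t / 24)) for b in bases]
           for t in range(24)]
traffic[17][1] += 30.0  # injected anomaly: extra volume on one link in one bin

centered = center(traffic)
pc1 = top_component(covariance(centered))
spe = residual_energy(centered, pc1)

# The bin with the largest residual energy is the candidate anomaly.
anomalous_bin = max(range(len(spe)), key=spe.__getitem__)
```

How many components to keep and what residual threshold to use are input parameters in a real deployment, and the sensitivity of the results to those choices is what several of the references below examine.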
http://conferences.sigcomm.org/sigcomm/2004/papers/p405-lakhina111.pdf (volume-based, separate into normal component and noisy component [which contains spikes])
http://machinelearning.wustl.edu/mlpapers/paper_files/NIPS2006_868.pdf, http://www.dtic.mil/dtic/tr/fulltext/u2/a465712.pdf (application to intra-network anomaly detection toward addressing scalability concerns, stochastic matrix perturbation theory; claims upper bound to false positives)
http://db.cs.berkeley.edu/papers/infocom07-pca.pdf (stochastic matrix perturbation theory, reduces communication cost by 80-90%, allowing smaller time buckets than Lakhina by detecting at the sensor level, volume based)
https://ics.forth.gr/netlab/mobile/Bibliography/LoadBalancing/LB/PCA_Anomaly_Deytection.pdf (Sensitivity of PCA to number of principal components, volume based)
http://hal.univ-savoie.fr/file/index/docid/620090/filename/infocom2009.pdf (Extends work by Ringberg by showing that temporal correlation of data breaks PCA, feature based, uses smoothing filter, shows application to stochastic processes; better results by removing low- to mid-frequency trends [daily, weekly])
http://users.ece.gatech.edu/~jic/anomaly-book-chap-09.pdf (2.4, background info)
http://www.researchgate.net/profile/Monowar_Bhuyan/publication/260521527_Network_Anomaly_Detection_Methods_Systems_and_Tools/links/00b49539bad485a81b000000.pdf (Recent Survey Paper)