Using GSP data mining algorithm to
detect malicious flows in Lawrence
Berkeley National Laboratory FTP
Amir Razmjou
Pattern-based Techniques and
Today’s Cybersecurity Challenges
• Protocol specifications evolve rapidly.
• Vendor-specific, closed-standard protocols.
• Verifying network traffic against protocol
specifications does not always separate
legitimate traffic from malicious traffic:
– XML XXE attacks
– FTP bounce attacks
• Unknown attacks.
• Abnormality detection must account for
changes in user interactions.
Sequential Pattern Mining
• It is similar to frequent itemset mining,
but takes the ordering of events into account.
• Sequential pattern mining is useful in many
applications:
– Customer shopping sequences
– Medical treatments, natural disasters (e.g., earthquakes),
science & engineering processes, stocks and markets, etc.
• Useful for extracting knowledge from semi-structured
data (e.g., XML)
What is a sequence database and
sequential pattern mining?
• A sequence database consists of sequences of ordered
elements or events, where each element is an
unordered set of items.

Sequence database:
SID   sequence
10    <a(abc)(ac)d(cf)>
20    <(ad)c(bc)(ae)>
30    <(ef)(ab)(df)cb>
40    <eg(af)cbc>

Transaction database:
TID   itemset
10    a, b, d
20    a, c, d
30    a, d, e
40    b, e, f
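
The definition above can be made concrete with a small sketch. It encodes the four sequences from the SID table as lists of item sets and counts how many of them contain a given pattern as a subsequence. The helper names `parse`, `contains`, and `support` are illustrative assumptions, not part of the original deck.

```python
def parse(text):
    """Parse the slide notation, e.g. "a(abc)(ac)d(cf)", into a list of item sets.

    Hypothetical helper: a bare letter is a one-item element, and
    a parenthesised group is one element holding several items.
    """
    elements, i = [], 0
    while i < len(text):
        if text[i] == "(":
            j = text.index(")", i)
            elements.append(frozenset(text[i + 1:j]))
            i = j + 1
        else:
            elements.append(frozenset(text[i]))
            i += 1
    return elements


def contains(sequence, pattern):
    """True if every pattern element is a subset of a distinct sequence
    element, with the elements matched in order (greedy, earliest match)."""
    pos = 0
    for element in pattern:
        while pos < len(sequence) and not element <= sequence[pos]:
            pos += 1
        if pos == len(sequence):
            return False
        pos += 1
    return True


def support(database, pattern):
    """Number of sequences in the database that contain the pattern."""
    return sum(contains(seq, pattern) for seq in database.values())


# The sequence database from the SID table.
DB = {
    10: parse("a(abc)(ac)d(cf)"),
    20: parse("(ad)c(bc)(ae)"),
    30: parse("(ef)(ab)(df)cb"),
    40: parse("eg(af)cbc"),
}

print(support(DB, parse("(ab)c")))  # 2 (sequences 10 and 30)
print(support(DB, parse("ac")))     # 4 (every sequence)
```

Ordering matters here: the pattern `<ca>` is contained in only two of the four sequences, while `<ac>` is contained in all of them.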
Sequential Shopping Cart
[Figure: shopping-cart transactions (Transaction 1, 2, 3, 5) laid out as sequences (Sequence1 to Sequence4) of item categories such as biscuits, frozen foods, baking needs, cake, fruit, snack, salads, chickens, beef, lamb, vegetables, pet food, electrical, and brushware.]
Sample FTP Flow
Welcome to Microsoft FTP Server 3.4
USER anonymous
331 Guest login ok, send your complete e-mail address as password.
PASS <password>
230 Guest login ok, access restrictions apply.
TYPE I
200 Type set to I.
CWD xfig
250 CWD command successful.
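
One way to turn a control-channel transcript like the one above into command and reply-code pairs is sketched below. The deck does not show its data-preparation code, so the pairing heuristics (a three-digit prefix marks a reply, an all-uppercase first word marks a command) and the name `pair_commands` are assumptions made for illustration.

```python
import re

# A server reply starts with a three-digit code followed by a space or hyphen.
REPLY = re.compile(r"^(\d{3})[ -]")


def pair_commands(lines):
    """Pair each client command with the code of the next server reply.

    Assumed heuristic: a line whose first word is all uppercase letters
    is a client command; anything else that is not a reply is ignored.
    """
    rows, pending = [], None
    for line in lines:
        match = REPLY.match(line)
        if match:
            if pending is not None:
                rows.append((pending, match.group(1)))
                pending = None
        else:
            words = line.split()
            if words and words[0].isalpha() and words[0].isupper():
                pending = words[0]
    return rows


SAMPLE = [
    "Welcome to Microsoft FTP Server 3.4",
    "USER anonymous",
    "331 Guest login ok, send your complete e-mail address as password.",
    "PASS <password>",
    "230 Guest login ok, access restrictions apply.",
    "TYPE I",
    "200 Type set to I.",
    "CWD xfig",
    "250 CWD command successful.",
]

rows = pair_commands(SAMPLE)
print(rows)
# [('USER', '331'), ('PASS', '230'), ('TYPE', '200'), ('CWD', '250')]
```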
Data Preparation
Resulting Dataset
Source                Destination       APP Signature   COMMAND   CODE
4.251.189.14:33257    131.243.1.10:21   custom1         USER      331
4.251.189.14:33257    131.243.1.10:21   custom1         PASS      230
4.251.189.14:33257    131.243.1.10:21   custom1         REST      350
4.251.189.14:33257    131.243.1.10:21   custom1         TYPE      200
4.251.189.14:33257    131.243.1.10:21   custom1         CWD       250
4.251.189.14:33257    131.243.1.10:21   custom1         TYPE      200
140.114.97.25:33983   131.243.1.10:21   custom1         USER      331
140.114.97.25:33983   131.243.1.10:21   custom1         PASS      230
140.114.97.25:33983   131.243.1.10:21   custom1         SYST      215
140.114.97.25:33983   131.243.1.10:21   custom1         CWD       550
53.55.176.50:10011    131.243.1.10:21   custom1         USER      331
53.55.176.50:10011    131.243.1.10:21   custom1         PASS      230
53.55.176.50:10011    131.243.1.10:21   custom1         FEAT      500
53.55.176.50:10011    131.243.1.10:21   custom1         SYST      215
53.55.176.50:10011    131.243.1.10:21   custom1         PWD       257
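
Rows like these can be grouped into one sequence per flow before mining. The sketch below assumes that a flow is identified by its source and destination endpoints and that each row becomes one element holding the command and its reply code; the function name `rows_to_sequences` is hypothetical, and only two of the flows from the table are reproduced.

```python
from collections import defaultdict

# Two flows copied from the dataset above (whitespace-separated columns).
ROWS = """\
140.114.97.25:33983 131.243.1.10:21 custom1 USER 331
140.114.97.25:33983 131.243.1.10:21 custom1 PASS 230
140.114.97.25:33983 131.243.1.10:21 custom1 SYST 215
140.114.97.25:33983 131.243.1.10:21 custom1 CWD 550
53.55.176.50:10011 131.243.1.10:21 custom1 USER 331
53.55.176.50:10011 131.243.1.10:21 custom1 PASS 230
53.55.176.50:10011 131.243.1.10:21 custom1 FEAT 500
53.55.176.50:10011 131.243.1.10:21 custom1 SYST 215
53.55.176.50:10011 131.243.1.10:21 custom1 PWD 257
"""


def rows_to_sequences(text):
    """Group rows by (source, destination), keeping row order.

    Each row becomes one element: the set {COMMAND, CODE}.
    """
    sequences = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue
        source, destination, _signature, command, code = line.split()
        sequences[(source, destination)].append(frozenset((command, code)))
    return dict(sequences)


sequences = rows_to_sequences(ROWS)
for flow, elements in sequences.items():
    print(flow, [sorted(e) for e in elements])
```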
Result Sequence Rules
[1] <{USER}{PASS,230}{TYPE,200}{PASV,227}{RETR,150}> 6391
[2] <{USER}{PASS,230}{TYPE,200}{SIZE,213}{RETR,150}> 4853
[3] <{USER,331}{PASS}{TYPE,200}{PASV,227}{RETR,150}> 6391
[4] <{USER,331}{PASS}{TYPE,200}{SIZE,213}{RETR,150}> 4853
[5] <{USER,331}{PASS,230}{CWD,250}{TYPE,200}{150}> 4872
[6] <{USER,331}{PASS,230}{TYPE}{PASV,227}{RETR,150}> 6391
[7] <{USER,331}{PASS,230}{TYPE}{SIZE,213}{RETR,150}> 4853
[8] <{USER,331}{PASS,230}{TYPE,200}{PASV}{RETR,150}> 6392
[9] <{USER,331}{PASS,230}{TYPE,200}{PASV,227}{RETR}> 7927
[10] <{USER,331}{PASS,230}{TYPE,200}{PASV,227}{150}> 8342
[11] <{USER,331}{PASS,230}{TYPE,200}{SIZE}{RETR,150}> 5062
[12] <{USER,331}{PASS,230}{TYPE,200}{SIZE,213}{RETR}> 4893
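
The deck's title names GSP (Generalized Sequential Pattern) as the mining algorithm. The following is a deliberately simplified sketch of its level-wise idea, restricted to sequences with one item per element and omitting GSP's time constraints and candidate pruning: frequent k-sequences are joined into (k+1)-candidates when one sequence minus its first item equals the other minus its last item, and candidates below the minimum support are discarded. The toy flows and the name `gsp_single_items` are hypothetical and are not the LBNL data.

```python
def is_subsequence(pattern, sequence):
    """True if the pattern's items appear in the sequence in the same order."""
    remaining = iter(sequence)
    return all(item in remaining for item in pattern)


def support(pattern, db):
    """Number of sequences in db that contain the pattern."""
    return sum(is_subsequence(pattern, seq) for seq in db)


def gsp_single_items(db, min_sup):
    """Level-wise mining of frequent sequences (one item per element)."""
    items = sorted({item for seq in db for item in seq})
    level = {}
    for item in items:
        count = support((item,), db)
        if count >= min_sup:
            level[(item,)] = count
    frequent = {}
    while level:
        frequent.update(level)
        # Join step: a and b overlap on everything but a's first and b's last item.
        candidates = {a + b[-1:] for a in level for b in level if a[1:] == b[:-1]}
        level = {}
        for candidate in sorted(candidates):
            count = support(candidate, db)
            if count >= min_sup:
                level[candidate] = count
    return frequent


# Hypothetical toy flows, commands only.
FLOWS = [
    ("USER", "PASS", "TYPE", "RETR"),
    ("USER", "PASS", "CWD", "TYPE", "RETR"),
    ("USER", "PASS", "SYST", "QUIT"),
]

frequent = gsp_single_items(FLOWS, min_sup=2)
longest = max(frequent, key=len)
print(longest, frequent[longest])  # ('USER', 'PASS', 'TYPE', 'RETR') 2
```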
Abnormal Flows
• Commands in unmatched flows:
USER, 331, PASS, 230, PORT, 200, 500, QUIT, 221, 220, PWD,
257, SYST, 215, CWD, 550, PASV, 227, TYPE, SIZE, 213, RETR,
150, 226, MDTM, 250, LIST, 421, ABOR, 533, Udd20dfd1U,
U15030ab9U, U54668fafU, Udb6ef1c3U, U7694531dU,
PORTQUIT, U07c4edf9U, U8855979dU, Uab12679fU,
Uc2ca1083U, U5b79257aU, U5f561953U, Ud4a28da8U
• Signatures of FTP servers in unmatched flows:
wu2616121, custom1, wu2616120, proftpdrc2, general172125,
general8, msftp4, msftp, sunos41, sunos56, other, general5,
vxworks54, WarFTPd167
[Chart label: 7%]
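
A minimal sketch of how a flow might be checked against mined rules follows. It assumes that "unmatched" means the flow contains none of the frequent patterns as a subsequence; the deck does not spell out its matching criterion, so this is an interpretation. The two rules are copied from the Result Sequence Rules slide, the first flow is taken from the Resulting Dataset slide, and the second flow is hypothetical.

```python
import re


def parse_rule(text):
    """Parse "<{USER,331}{PASS,230}...>" into a list of item sets."""
    return [frozenset(part.split(",")) for part in re.findall(r"\{([^}]*)\}", text)]


def contains(flow, pattern):
    """True if each pattern element is a subset of a later flow element, in order."""
    pos = 0
    for element in pattern:
        while pos < len(flow) and not element <= flow[pos]:
            pos += 1
        if pos == len(flow):
            return False
        pos += 1
    return True


def is_unmatched(flow, patterns):
    """Assumed criterion: a flow is unmatched if it contains no mined pattern."""
    return not any(contains(flow, pattern) for pattern in patterns)


PATTERNS = [
    parse_rule("<{USER,331}{PASS,230}{TYPE,200}{PASV,227}{RETR}>"),
    parse_rule("<{USER,331}{PASS,230}{TYPE,200}{SIZE,213}{RETR}>"),
]

# Flow from the dataset slide (source 140.114.97.25:33983).
dataset_flow = [frozenset(p) for p in
                [("USER", "331"), ("PASS", "230"), ("SYST", "215"), ("CWD", "550")]]

# Hypothetical download flow, for contrast.
download_flow = [frozenset(p) for p in
                 [("USER", "331"), ("PASS", "230"), ("TYPE", "200"),
                  ("PASV", "227"), ("RETR", "150")]]

print(is_unmatched(dataset_flow, PATTERNS))   # True
print(is_unmatched(download_flow, PATTERNS))  # False
```

An unmatched flow is only a candidate for inspection under this criterion, since a short legitimate session can also fail to contain a long frequent pattern.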
Sequence Size and Support
