Deep Learning vs Multidimensional Classification in Human-Guided Text Mining
2015/07/04
Marat Zhanikeev
maratishe@gmail.com
GI Workshop (GI研) @ Tenjin IMS (天神イムズ)
PDF: http://bit.do/150704
Deep Learning vs MD Classifiers
• Deep Learning 08 10
◦ Feature-based: image → features → NN
◦ Raw/Pixels : image → raw pixels → NN
• Multi-Dimensional Classification 04 05
◦ assigning classes to items in multiple dimensions
• Human-Guided Text Mining 02
◦ Folksonomy + BigData
◦ learning from empty state with gradually diminishing human feedback
08 A.Nguyen+2 "Deep Neural Networks are Easily Fooled..." IEEE CVPR (2015)
10 G.Goos+2 "Neural Networks: Tricks of the Trade" Springer LNCS vol.7700, 2nd edition (2012)
04 X.Zhu+1 "Introduction to Semi-Supervised Learning" Morgan and Claypool Publishers (2009)
05 D.Koller+1 "Probabilistic Graphical Models: Principles and Techniques" MIT Press (2009)
02 myself+0 "Multidimensional Classification Automation with Human Interface based on Metromaps" 4th AAI (2015)
M.Zhanikeev -- maratishe@gmail.com -- Deep Learning vs Multidimensional Classification in Human-Guided Text Mining -- http://bit.do/150704 2/26
Deep Learning
Deep Learning (1) Feature-Based
• many feature extraction libraries, normally specific to environments/targets
• problem 1: wide range of errors, can be from 50% up to 96%
• problem 2: who decides on the features?
Deep Learning (2) Raw Pixels
• just feed the raw pixels to the Neural Network and let it sort it out for itself
Deep Learning (3) Google Faces
• a feature-based method, extremely specific, recently acquired by Google
Deep Learning (4) Google Cats
Deep Learning (5) Raw/Pixel Method
• a standard process for pixel-based learning 12
• CSV files are traditional, one image becomes one line
[Figure: handwriting → black-and-white pixel map → matrix in a CSV file (rows of 0/1 values) → deep learning (training, testing) → recognized digit "3"]
12 "MNIST Dataset of Handwritten Digits" http://yann.lecun.com/exdb/mnist/ (2015)
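The "one image becomes one line" step can be sketched as below. This is a minimal illustration, not the pipeline used in the talk: the function name `image_to_csv_row`, the threshold of 128, the "bright pixel = 1" convention, and placing the label in the first column are all assumptions.

```python
def image_to_csv_row(image, label, threshold=128):
    """Flatten a grayscale image (list of rows, values 0-255) into one CSV line.

    Assumptions (not from the slides): pixels at or above `threshold`
    become 1, all others 0, and the class label goes in the first column.
    """
    bits = [1 if px >= threshold else 0 for row in image for px in row]
    return ",".join(str(v) for v in [label] + bits)


# A hypothetical 3x3 "image" labelled as the digit 3
tiny = [[0, 200, 255],
        [0, 0, 180],
        [10, 220, 30]]
row = image_to_csv_row(tiny, label=3)
print(row)
```

Every image resized to the same dimensions yields a line of the same length, which is what gives the network its fixed-size input.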
Multi-Dimensional Classification
MDC : Binary Relevance (BR) Classes
• single dimension
• not practical today, when most things exist in multi-dimensional space 06
Training tuples:
#  x1   x2   Y1  Y2  Y3
1  0.7  0.4  1   1   0
2  0.6  0.2  1   1   0
3  0.1  0.9  0   0   1
4  0.3  0.1  0   0   0
Classifiers:
h1: X → Y1
h2: X → Y2
h3: X → Y3
06 J.Ortigosa-Hernandez+3 "A Semi-supervised Approach to Multi-dimensional Classification..." 6th TAMIDA (2010)
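A minimal sketch of binary relevance on the four training tuples above: one independent classifier per label column. The 1-nearest-neighbour base classifier and the names `nearest_neighbor` and `br_predict` are assumptions chosen to keep the example short; the slides do not say which base classifier is used.

```python
def nearest_neighbor(train_x, train_y, x):
    """Return the label of the training point closest to x (1-NN, squared Euclidean)."""
    best = min(range(len(train_x)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)))
    return train_y[best]


# Training tuples from the slide: features (x1, x2) and labels (Y1, Y2, Y3)
X = [(0.7, 0.4), (0.6, 0.2), (0.1, 0.9), (0.3, 0.1)]
Y = [(1, 1, 0), (1, 1, 0), (0, 0, 1), (0, 0, 0)]


def br_predict(x):
    """Binary relevance: h_j : X -> Y_j is trained and applied independently per label."""
    return tuple(nearest_neighbor(X, [y[j] for y in Y], x) for j in range(3))


print(br_predict((0.68, 0.35)))
print(br_predict((0.15, 0.80)))
```

Because each h_j sees only its own label column, any dependency between Y1, Y2 and Y3 is ignored.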
MDC : PairWise (PW) Sets
• define classes as pairs of base BR classes 06
• lower complexity, higher error rate
Training tuples (with the derived classes Z1, Z2):
#  x1   x2   Y1  Y2  Y3  Z1  Z2
1  0.7  0.4  1   1   0   1   0
2  0.6  0.2  1   1   0   0   1
3  0.1  0.9  0   0   1   0   0
4  0.3  0.1  0   0   0   0   0
Classifiers:
h1: X → Z1
h2: X → Z2
06 J.Ortigosa-Hernandez+3 "A Semi-supervised Approach to Multi-dimensional Classification..." 6th TAMIDA (2010)
MDC : Label Combination (LC) Method
• a class for all combinations of base BR 06
• very high complexity, still high error rate
Training tuples (with the combined class Z):
#  x1   x2   Y1  Y2  Y3  Z
1  0.7  0.4  1   1   0   1
2  0.6  0.2  1   1   0   0
3  0.1  0.9  0   0   1   0
4  0.3  0.1  0   0   0   0
Classifier:
h: X → Z
06 J.Ortigosa-Hernandez+3 "A Semi-supervised Approach to Multi-dimensional Classification..." 6th TAMIDA (2010)
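A sketch of the label-combination idea in its usual "label powerset" reading: every distinct combination of (Y1, Y2, Y3) becomes one class of a single multi-class problem. The class ids assigned here are an assumption and do not reproduce the Z column shown on the slide; the 1-nearest-neighbour base classifier and all function names are likewise illustrative.

```python
def nearest_neighbor(train_x, train_y, x):
    """Return the label of the training point closest to x (1-NN, squared Euclidean)."""
    best = min(range(len(train_x)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)))
    return train_y[best]


X = [(0.7, 0.4), (0.6, 0.2), (0.1, 0.9), (0.3, 0.1)]
Y = [(1, 1, 0), (1, 1, 0), (0, 0, 1), (0, 0, 0)]

# Each distinct label combination seen in training becomes one class id.
combos = sorted(set(Y))
class_of = {combo: k for k, combo in enumerate(combos)}
Z = [class_of[y] for y in Y]


def lc_predict(x):
    """Single classifier h : X -> Z, decoded back into a label tuple."""
    return combos[nearest_neighbor(X, Z, x)]


print(Z)
print(lc_predict((0.68, 0.35)))
```

With 3 binary labels there are up to 2^3 = 8 possible classes, and the count doubles with every added label, which is where the very high complexity comes from.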
MDC : The CC Method
• CC: Classifier Chains method 07 -- literally, a chain of BR classes
• controlled complexity, much better error rate, but the main problem is which order?
Training tuples:
#  x1   x2   Y1  Y2  Y3
1  0.7  0.4  1   1   0
2  0.6  0.2  1   1   0
3  0.1  0.9  0   0   1
4  0.3  0.1  0   0   0
Classifiers:
h1: X → Y1
h2: Y1 → Y2
h3: Y2 → Y3
Chain: h1 → h2 → h3
07 J.Read+3 "Classifier chains for multi-label classification" Machine Learning, Springer (2011)
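A sketch of a classifier chain on the same training tuples. It follows the usual formulation in which each link sees the original features plus the labels decided by earlier links (the slide abbreviates this as Y1 → Y2 → Y3). The 1-nearest-neighbour base classifier, the function names, and the `order` parameter are assumptions for illustration.

```python
def nearest_neighbor(train_x, train_y, x):
    """Return the label of the training point closest to x (1-NN, squared Euclidean)."""
    best = min(range(len(train_x)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)))
    return train_y[best]


X = [(0.7, 0.4), (0.6, 0.2), (0.1, 0.9), (0.3, 0.1)]
Y = [(1, 1, 0), (1, 1, 0), (0, 0, 1), (0, 0, 0)]


def cc_predict(x, order=(0, 1, 2)):
    """Predict labels one at a time, feeding each decision to the next link."""
    train_feats = [list(xi) for xi in X]
    query = list(x)
    preds = {}
    for j in order:
        preds[j] = nearest_neighbor(train_feats, [y[j] for y in Y], query)
        # Extend the feature space: true labels for training rows,
        # the predicted label for the query.
        for row, y in zip(train_feats, Y):
            row.append(y[j])
        query.append(preds[j])
    return tuple(preds[j] for j in range(len(order)))


print(cc_predict((0.68, 0.35)))
print(cc_predict((0.68, 0.35), order=(2, 1, 0)))
```

The `order` argument makes the open question from the slide explicit: a different chain order is a different model, and on less trivial data it can change the predictions.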
The MetroMap Classifier (MMC)
The Metromap Concept
• like a map of a train network 01
• main advantage: end-to-end (e2e) paths in (ontology) graphs
01 myself+0 "On Context Management Using Metro Maps" 7th SOCA (2014)
MMC : A Practical Setting
[Figure labels: human judgment, automatic judgment, folksonomy]
MMC : Processing Logic
• processing is based on a human-defined metromap; the function is similar to chaining BR classes, but with higher performance
[Flowchart labels: Input; Metromap Classifier; decision nodes Fuzzy?, Cold?, Hot? (Yes/No branches); Human Check (against the Metromap); Robot (Automatic Classification); Bad]
MDC (MMC) vs DL (pixels)
DL : graphics vs Text
• graphics
◦ pixels are already numeric
◦ images can be resized to provide same-size input -- DL needs fixed-size input
• text requires complex processing
1. tokenize text (words)
2. frequency distribution -- variable size
3. sample distribution -- finally, the same/fixed size!
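The three text steps above can be sketched as follows. The slides do not say how the frequency distribution is sampled, so the scheme here (sort the word counts in descending order, then read them at evenly spaced positions) is an assumption, as are the function name `text_to_fixed_vector` and the default size of 8.

```python
import re
from collections import Counter


def text_to_fixed_vector(text, size=8):
    """Turn free text into a fixed-length numeric vector.

    1. tokenize into lowercase words
    2. build the word-frequency distribution (its length varies per text)
    3. sample that distribution at `size` evenly spaced ranks (fixed length)
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    freqs = sorted(Counter(tokens).values(), reverse=True)
    if not freqs:
        return [0.0] * size
    total = sum(freqs)
    return [freqs[k * len(freqs) // size] / total for k in range(size)]


short = text_to_fixed_vector("a a a b", size=4)
longer = text_to_fixed_vector("deep learning versus classification of text in text mining")
print(short)
print(len(longer))
```

Texts of any length map to vectors of the same size, so they can be written as equal-length CSV lines in the same way as resized images.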
Experimental Setup (1) Humans
• 2 main cases: hot + cold = picked but not used, hot - cold = picked and used (black swans) 03
03 myself+0 "Black Swan Disaster Scenarios" IEICE PRMU Workshop (PRMU研) (2014)
Experimental Setup (2) Process
• text is not numeric by nature and has to be converted into a sampled frequency distribution
• calculations were done in R, using the h2o package 11 for deep learning
[Figure: processing pipeline. Labels: Text, Tokenize, Frequency Distribution, Sample, Matrix in a CSV file, Deep Learning, Bayes, Many (Chains, Metromap, etc.), Path 1, Path 2]
11 "H2O: R Package for Learning Algorithms" http://cran.r-project.org/web/packages/h2o (2015)
Results (1) MMC vs BR
[Figure: Hits on a timeline. Three panels (title; title:keywords; title:keywords:abstract), each plotting Goodcount against Time sequence for the Dumb Classifier and the Metromap Classifier]
02 myself+0 "Multidimensional Classification Automation with Human Interface based on Metromaps" 4th AAI (2015)
Results (2) DL Results
[Figure: four panels plotting Deep learning hits against Time sequence, each with the Diagonal/human line and the Deep learning line. Panel settings: keys(title), rule(cold#yes hot#yes); keys(title:keywords:abstract), rule(cold#yes hot#yes); keys(title:keywords:abstract), rule(cold#no hot#yes); keys(title:keywords:abstract), rule(cold#yes hot#yes)]
• compared to the x = y case (the diagonal, i.e. human judgment), DL performs very badly
• DL performs best when the abstract is used, and even then reaches only about 25% hits
• same performance for the hot + cold and hot - cold cases
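A sketch of how a "hits on a timeline" curve can be computed and compared with the x = y diagonal. It assumes the plots show a cumulative count of items on which the classifier agrees with the human decision; the slides do not state this explicitly, and the names and sample data below are purely illustrative.

```python
from itertools import accumulate


def hits_timeline(predicted, human):
    """Cumulative number of agreements with the human decision, item by item."""
    return list(accumulate(int(p == h) for p, h in zip(predicted, human)))


# Hypothetical decisions for four items, in time order
predicted = [1, 0, 1, 1]
human = [1, 1, 1, 0]

curve = hits_timeline(predicted, human)
diagonal = list(range(1, len(human) + 1))  # x = y: every item is a hit
hit_rate = curve[-1] / len(human)
print(curve, diagonal, hit_rate)
```

A classifier that always agrees with the human stays on the diagonal; the further the curve falls below it, the lower the hit rate.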
That’s all, thank you ...
MDC and Social Robotics Go Together
Social Robotics in Text Mining Context
[Figure labels, in order: Robot (careless); Input; Human; Human; {structure}; (pinpoint); Select; Browse (or use otherwise); Some Knowledge (folksonomies, knowledge bases, databases, indexes, ontologies, etc.); (metromaps)]
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 

Recently uploaded (20)

Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 

Deep Learning vs Multidimensional Classification in Human-Guided Text Mining

  • 1. Deep Learning vs Multidimensional Classification in Human-Guided Text Mining. Marat Zhanikeev (maratishe@gmail.com), 2015/07/04, GI研 (GI research meeting) @ 天神イムズ (Tenjin IMS). PDF: http://bit.do/150704
  • 2. Deep Learning vs MD Classifiers
      • Deep Learning [08, 10]
        ◦ Feature-based: image → features → NN
        ◦ Raw/Pixels: image → raw pixels → NN
      • Multi-Dimensional Classification [04, 05]
        ◦ assigning classes to items in multiple dimensions
      • Human-Guided Text Mining [02]
        ◦ Folksonomy + BigData
        ◦ learning from an empty state with gradually diminishing human feedback
      [08] A.Nguyen+2 "Deep Neural Networks are Easily Fooled..." IEEE CVPR (2015)
      [10] G.Goos+2 "Neural Networks: Tricks of the Trade" Springer LNCS vol.7700, 2nd edition (2012)
      [04] X.Zhu+1 "Introduction to Semi-Supervised Learning" Morgan and Claypool Publishers (2009)
      [05] D.Koller+1 "Probabilistic Graphical Models: Principles and Techniques" MIT Press (2009)
      [02] myself+0 "Multidimensional Classification Automation with Human Interface based on Metromaps" 4th AAI (2015)
  • 3. Deep Learning
  • 4. Deep Learning (1) Feature-Based
      • many feature extraction libraries exist, normally specific to environments/targets
      • problem 1: wide range of errors, anywhere from 50% up to 96%
      • problem 2: who decides on the features?
  • 5. Deep Learning (2) Raw Pixels
      • just feed the raw pixels to the Neural Network and let it sort things out for itself
  • 6. Deep Learning (3) Google Faces
      • a feature-based method, extremely specific, recently acquired by Google
  • 7. Deep Learning (4) Google Cats
  • 8. Deep Learning (5) Raw/Pixel Method
      • a standard process for pixel-based learning [12]
      • CSV files are traditional, one image becomes one line
      Figure (diagram): Handwriting → Black-n-white pixel map → Matrix in a CSV file → Deep Learning (Training, Testing)
      [12] "MNIST Dataset of Handwritten Digits" http://yann.lecun.com/exdb/mnist/ (2015)
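The "one image becomes one line" step on slide 8 can be sketched roughly as below. This is a minimal illustration, not the author's pipeline: the 3x3 image, the threshold of 128, and the label-first column order are all assumptions made for the example.

```python
import csv
import io

def image_to_row(pixels, label, threshold=128):
    """Flatten a 2D grayscale image into one CSV row: label first, then 0/1 pixels.

    The label-first layout and the threshold value are assumptions.
    """
    bits = [1 if value >= threshold else 0 for row in pixels for value in row]
    return [label] + bits

# Hypothetical 3x3 grayscale image (values 0-255), not real MNIST data.
image = [
    [0, 200, 0],
    [0, 255, 0],
    [0, 180, 0],
]

# Write the single row to an in-memory CSV "file".
buffer = io.StringIO()
csv.writer(buffer).writerow(image_to_row(image, label=1))
csv_line = buffer.getvalue().strip()
print(csv_line)
```

Because every image is resized to the same dimensions before flattening, every row has the same number of columns, which is what gives the network a fixed-size input.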
  • 9. Multi-Dimensional Classification
  • 10. MDC: Binary Relevance (BR) Classes
      • single dimension
      • not practical today, when most things exist in multi-dimensional space [06]
      Training tuples:
        #   x1   x2   Y1  Y2  Y3
        1   0.7  0.4  1   1   0
        2   0.6  0.2  1   1   0
        3   0.1  0.9  0   0   1
        4   0.3  0.1  0   0   0
      Classifiers: h1: X → Y1, h2: X → Y2, h3: X → Y3
      [06] J.Ortigosa-Hernandez+3 "A Semi-supervised Approach to Multi-dimensional Classification..." 6th TAMIDA (2010)
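As a rough sketch of Binary Relevance on the four training tuples from slide 10, the code below trains one independent classifier per label (h1, h2, h3). The 1-nearest-neighbour base learner and the query point are illustrative assumptions; the slides do not say which base learner is used.

```python
def nearest_label(train_x, train_y, query):
    """1-nearest-neighbour base learner (an illustrative choice, not from the slides)."""
    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    best = min(range(len(train_x)), key=lambda i: dist2(train_x[i], query))
    return train_y[best]

# Training tuples from the slide: features (x1, x2) and labels (Y1, Y2, Y3).
X = [(0.7, 0.4), (0.6, 0.2), (0.1, 0.9), (0.3, 0.1)]
Y = [(1, 1, 0), (1, 1, 0), (0, 0, 1), (0, 0, 0)]

def binary_relevance_predict(X, Y, query):
    """Predict each label with its own classifier h_j: X -> Y_j, ignoring other labels."""
    n_labels = len(Y[0])
    return tuple(
        nearest_label(X, [y[j] for y in Y], query) for j in range(n_labels)
    )

# Hypothetical query point close to training tuple 3.
prediction = binary_relevance_predict(X, Y, (0.2, 0.8))
print(prediction)
```

Each h_j sees only the features, so any dependency between Y1, Y2, and Y3 is invisible to the model.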
  • 11. MDC: PairWise (PW) Sets
      • define classes as pairs of base BR classes [06]
      • lower complexity, higher error rate
      Training tuples:
        #   x1   x2   Y1  Y2  Y3
        1   0.7  0.4  1   1   0
        2   0.6  0.2  1   1   0
        3   0.1  0.9  0   0   1
        4   0.3  0.1  0   0   0
      Classifiers: h1: X → Z1, h2: X → Z2
      Z1 Z2 values as shown on the slide: 1 0 / 0 1 / 0 0 / 0 0
      [06] J.Ortigosa-Hernandez+3 "A Semi-supervised Approach to Multi-dimensional Classification..." 6th TAMIDA (2010)
  • 12. MDC: Label Combination (LC) Method
      • a class for all combinations of base BR classes [06]
      • very high complexity, still high error rate
      Training tuples:
        #   x1   x2   Y1  Y2  Y3
        1   0.7  0.4  1   1   0
        2   0.6  0.2  1   1   0
        3   0.1  0.9  0   0   1
        4   0.3  0.1  0   0   0
      Classifier: h: X → Z
      Z values as shown on the slide: 1 0 0 0
      [06] J.Ortigosa-Hernandez+3 "A Semi-supervised Approach to Multi-dimensional Classification..." 6th TAMIDA (2010)
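The Label Combination idea from slide 12 can be sketched as an encoding step: every distinct label vector becomes one class of a single target Z, so one classifier h: X → Z replaces the per-label classifiers. The class numbering below (sorted order of the label vectors) is an assumption for illustration and does not attempt to reproduce the Z column printed on the slide.

```python
# Label vectors (Y1, Y2, Y3) of the four training tuples from the slide.
Y = [(1, 1, 0), (1, 1, 0), (0, 0, 1), (0, 0, 0)]

def encode_combinations(Y):
    """Map each distinct label combination to a single class id (hypothetical numbering)."""
    combos = sorted(set(Y))
    to_class = {combo: idx for idx, combo in enumerate(combos)}
    return to_class, combos

to_class, combos = encode_combinations(Y)
Z = [to_class[y] for y in Y]          # single-target training labels
decoded = [combos[z] for z in Z]      # a predicted class decodes back to a label vector

observed_classes = len(combos)        # combinations actually seen in training
possible_classes = 2 ** len(Y[0])     # upper bound for binary labels
print(Z, observed_classes, possible_classes)
```

The gap between observed and possible classes illustrates the complexity concern: the number of potential classes doubles with every extra binary label.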
  • 13. MDC: The CC Method
      • CC: Classifier Chains method [07]; literally, a chain of BR classes
      • controlled complexity, much better error rate, but the main problem is: which order?
      Training tuples:
        #   x1   x2   Y1  Y2  Y3
        1   0.7  0.4  1   1   0
        2   0.6  0.2  1   1   0
        3   0.1  0.9  0   0   1
        4   0.3  0.1  0   0   0
      Classifiers (chain h1 → h2 → h3): h1: X → Y1, h2: Y1 → Y2, h3: Y2 → Y3
      [07] J.Read+3 "Classifier chains for multi-label classification" Machine Learning, Springer (2011)
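A minimal sketch of a classifier chain on the same four tuples follows. It assumes the common formulation in which each link sees the original features plus the labels predicted earlier in the chain, which is slightly richer than the slide's shorthand (h2: Y1 → Y2). The 1-nearest-neighbour base learner, the query point, and the `order` parameter are illustrative assumptions.

```python
def nearest_label(train_x, train_y, query):
    """1-nearest-neighbour base learner (an illustrative choice, not from the slides)."""
    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    best = min(range(len(train_x)), key=lambda i: dist2(train_x[i], query))
    return train_y[best]

# Training tuples from the slide: features (x1, x2) and labels (Y1, Y2, Y3).
X = [(0.7, 0.4), (0.6, 0.2), (0.1, 0.9), (0.3, 0.1)]
Y = [(1, 1, 0), (1, 1, 0), (0, 0, 1), (0, 0, 0)]

def chain_predict(X, Y, query, order=(0, 1, 2)):
    """Predict labels one at a time, feeding earlier predictions to later links."""
    done = []        # label indices already predicted, in chain order
    predicted = {}
    for label in order:
        # Training features: original features plus the true earlier labels.
        train_x = [x + tuple(y[j] for j in done) for x, y in zip(X, Y)]
        # Query features: original features plus the predicted earlier labels.
        q = tuple(query) + tuple(predicted[j] for j in done)
        predicted[label] = nearest_label(train_x, [y[label] for y in Y], q)
        done.append(label)
    return tuple(predicted[j] for j in range(len(Y[0])))

forward = chain_predict(X, Y, (0.2, 0.8), order=(0, 1, 2))
backward = chain_predict(X, Y, (0.2, 0.8), order=(2, 1, 0))
print(forward, backward)
```

The `order` argument makes the open question from the slide explicit: the chain must commit to some label ordering, and on larger, noisier data different orderings can produce different predictions.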
  • 14. The MetroMap Classifier (MMC)
  • 15. The Metromap Concept
      • like a map of a train network [01]
      • main advantage: end-to-end (e2e) paths in (ontology) graphs
      [01] myself+0 "On Context Management Using Metro Maps" 7th SOCA (2014)
  • 16. MMC: A Practical Setting
      Figure (diagram) labels: Human judgment, Auto judgement, Folksonomy
  • 17. MMC: Processing Logic
      • processing is based on a human-defined metromap; the function is similar to chaining BR classes, but with higher performance
      Figure (flowchart) labels: Metromap Classifier, Human Check, Metromap, Robot (Automatic Classification), Bad Input; decisions: Fuzzy?, Cold?, Hot? (Yes/No branches)
  • 18. MDC (MMC) vs DL (pixels)
  • 19. DL: Graphics vs Text
      • graphics
        ◦ pixels are already numeric
        ◦ images can be resized to provide same-size input (DL needs fixed-size input)
      • text requires complex processing
        1. tokenize text (words)
        2. frequency distribution: variable size
        3. sample distribution: finally, the same/fixed size!
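The three text-processing steps on slide 19 could look roughly like the sketch below. The slides do not specify how the distribution is sampled, so `sample_fixed` is one possible reading (picking evenly spaced positions from the sorted frequency list) and should be treated as an assumption, as should the tokenizer regex and the example sentence.

```python
import re
from collections import Counter

def tokenize(text):
    """Step 1: split text into lowercase word tokens (simple regex, an assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def frequency_distribution(tokens):
    """Step 2: word counts sorted from most to least frequent (variable length)."""
    return sorted(Counter(tokens).values(), reverse=True)

def sample_fixed(freqs, size):
    """Step 3: sample the distribution at evenly spaced positions (fixed length).

    This sampling rule is hypothetical; the slides only say the result has a fixed size.
    """
    if not freqs:
        return [0] * size
    if size == 1:
        return [freqs[0]]
    last = len(freqs) - 1
    return [freqs[round(i * last / (size - 1))] for i in range(size)]

long_text = "deep learning needs fixed size input and text needs more processing than pixels"
short_text = "a a b"

long_vector = sample_fixed(frequency_distribution(tokenize(long_text)), 4)
short_vector = sample_fixed(frequency_distribution(tokenize(short_text)), 4)
print(long_vector, short_vector)
```

Both texts end up as vectors of the same length regardless of vocabulary size, which is the property the network needs.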
  • 20. Experimental Setup (1) Humans
      • 2 main cases [03]:
        ◦ hot + cold = picked but not used
        ◦ hot - cold = picked and used (black swans)
      [03] myself+0 "Black Swan Disaster Scenarios" IEICE PRMU研 (PRMU technical meeting) (2014)
  • 21. Experimental Setup (2) Process
      • text is not numeric by nature and has to be converted into a sampled frequency distribution
      • calculations were done in R, using the h2o package [11] for deep learning
      Figure (diagram): Text → Tokenize → Frequency Distribution → Sample → Matrix in a CSV file → Deep Learning; other labels: Bayes, Many (Chains, Metromap, etc.), Path 1, Path 2
      [11] "H2O: R Package for Learning Algorithms" http://cran.r-project.org/web/packages/h2o (2015)
  • 22. Results (1) MMC vs BR
      Figure: "Hits on a timeline", Dumb Classifier vs Metromap Classifier [02]; panels: title, title:keywords, title:keywords:abstract; x-axis: Time sequence; y-axis: Good count
      [02] myself+0 "Multidimensional Classification Automation with Human Interface based on Metromaps" 4th AAI (2015)
  • 23. Results (2) DL Results
      Figure: Deep learning vs Diagonal/human; x-axis: Time sequence; y-axis: Deep learning hits; panels: keys(title) rule(cold#yes hot#yes), keys(title:keywords:abstract) rule(cold#yes hot#yes), keys(title:keywords:abstract) rule(cold#no hot#yes), keys(title:keywords:abstract) rule(cold#yes hot#yes)
      • results are compared to the x = y (diagonal/human) case
      • DL performs very badly
      • it performs best when the abstract is used, and even then reaches only about 25% hits
      • same performance for the hot + cold and hot - cold cases
  • 24. That’s all, thank you ...
  • 25. MDC and Social Robotics Go Together
  • 26. Social Robotics in Text Mining Context
      Figure (diagram) labels: Robot (careless), Input, Human, Human {structure} (pinpoint), Select, Browse (or use otherwise), Some Knowledge (folksonomies, knowledge bases, databases, indexes, ontologies, etc.), (metromaps)