Document Classification with Neo4j 
(graphs)-[:are]->(everywhere) 
© All Rights Reserved 2014 | Neo Technology, Inc. 
@kennybastani 
Neo4j Developer Evangelist
Agenda 
• Introduction to Neo4j 
• Introduction to Graph-based Document Classification 
• Graph-based Hierarchical Pattern Recognition 
• Generating a Vector Space Model for Recommendations 
• Graphify for Neo4j 
• U.S. Presidential Speech Transcript Analysis 
Introduction to Neo4j 
The Property Graph Data Model 
Figure: an example property graph.
Nodes:
• John (name: John, age: 27)
• Sally (name: Sally, age: 32)
• Book (title: Graph Databases; authors: Ian Robinson, Jim Webber)
Relationships:
• FRIEND_OF (since: 01/09/2013), shown twice, once in each direction
• HAS_READ (on: 2/03/2013, rating: 5)
• HAS_READ (on: 02/09/2013, rating: 4)
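The property graph on this slide can be sketched in a few lines of Python to show how nodes, relationships, and their properties fit together. This is only an illustration of the data model, not how Neo4j stores data. The `Person` label, the class names, the `outgoing` helper, and the pairing of each HAS_READ relationship with a particular reader are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: list
    properties: dict


@dataclass
class Relationship:
    start: Node
    type: str
    end: Node
    properties: dict = field(default_factory=dict)


# Nodes carry labels and key/value properties.
john = Node(["Person"], {"name": "John", "age": 27})
sally = Node(["Person"], {"name": "Sally", "age": 32})
book = Node(["Book"], {"title": "Graph Databases",
                       "authors": ["Ian Robinson", "Jim Webber"]})

# Relationships are directed, typed, and can carry properties too.
# Which reader owns which HAS_READ relationship is assumed here.
relationships = [
    Relationship(john, "FRIEND_OF", sally, {"since": "01/09/2013"}),
    Relationship(sally, "FRIEND_OF", john, {"since": "01/09/2013"}),
    Relationship(john, "HAS_READ", book, {"on": "2/03/2013", "rating": 5}),
    Relationship(sally, "HAS_READ", book, {"on": "02/09/2013", "rating": 4}),
]


def outgoing(node, rels, rel_type):
    """Return the end nodes of all relationships of rel_type leaving node."""
    return [r.end for r in rels if r.start is node and r.type == rel_type]


friends_of_john = [n.properties["name"]
                   for n in outgoing(john, relationships, "FRIEND_OF")]
```

Here `friends_of_john` is built by following John's outgoing FRIEND_OF relationships, which is the kind of traversal a graph database is designed to make cheap.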
The Relational Table Model 
Figure: the relational tables Customers, Customer_Accounts, and Accounts.
The Neo4j Browser 
Neo4j Browser - finding help 
http://localhost:7474/ 
Execute Cypher, Visualize 
Introduction to Document Classification 
Document Classification 
Automatically assign a document to one or more classes 
Documents may be classified according to their subjects or 
according to other attributes 
Automatically classify unlabeled documents to a set of relevant 
classes using labeled training data 
Example Use Cases for Document Classification
Sentiment Analysis for Movie Reviews 
Scenario: A movie website allows users to submit reviews describing what they 
either liked or disliked about a particular movie. 
Problem: The user reviews are unstructured text. 
How do I automatically generate a score indicating whether the review was 
positive or negative? 
Solution: Train a natural language parsing model on a dataset of previous
reviews that have been labeled as either positive or negative.
Recommend Relevant Tags 
Scenario: A Q/A website allows users to submit questions and receive answers 
from other users. 
Problem: Users sometimes do not know which tags to apply to their questions to
make them easier to discover and answer.
Solution: Automatically recommend the most relevant tags for a question by
classifying its text with a model trained on previous questions.
Recommend Similar Articles 
Scenario: A news website provides hundreds of new articles a day to users on a 
broad range of topics. 
Problem: The site needs to increase user engagement and time spent on the site. 
Solution: Train natural language parsing models for daily articles in order to 
provide recommendations for highly relevant articles at the bottom of each page. 
How Automated Document Classification Works 
Step 1: Create a Training Dataset
Supervised Learning
Assign a set of labels that describes the document’s text
Figure: four documents, each linked to one or more of the labels X, Y, and Z.
Step 2: Train a Natural Language Parsing Model
Deep Learning
Deep feature representations are selected and learned using an evolutionary algorithm
State machines represent predicates that evaluate to 0 or 1 for a text match
State machines map to classes of document labels that matched text during training
Figure: a hierarchy of state machines (p = State Machine) connected to the classes X, Y, and Z.
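The idea of state machines as predicates that evaluate to 0 or 1 for a text match can be sketched as follows. This is a minimal illustration, not the model described in the talk: regular expressions stand in for the state machines, and the patterns, class names, and function names are all hypothetical.

```python
import re

# Hypothetical features: each pattern plays the role of one state machine,
# and maps to the classes whose training text it matched.
FEATURES = {
    r"\bgraph\s+database\b": {"databases"},
    r"\bdeep\s+learning\b": {"machine-learning"},
}


def match_features(text):
    """Evaluate every feature as a predicate: 1 on a match, otherwise 0."""
    return {
        pattern: int(re.search(pattern, text, re.IGNORECASE) is not None)
        for pattern in FEATURES
    }


def matched_classes(text):
    """Collect the classes attached to every feature that matched."""
    classes = set()
    for pattern, hit in match_features(text).items():
        if hit:
            classes |= FEATURES[pattern]
    return classes


hits = match_features("Neo4j is a graph database")
```

Each entry in `hits` is 0 or 1, so the set of matches for a document can later be laid out as a vector with one position per feature.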
Step 3: Classify Unlabeled Documents
The natural language parsing model is used to classify other unlabeled documents
Figure: an unlabeled document compared with the classes X, Y, and Z using cos(θ), with example scores of 0.99, 0.67, and 0.01.
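Step 3 can be sketched as ranking the classes by the cosine of the angle between the document's feature vector and each class's feature vector. The vectors and the function names below are hypothetical, and the zero-vector handling is a choice made for this example rather than something stated in the talk.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: a vector with no matches is similar to nothing
    return dot / (norm_a * norm_b)


def rank_classes(doc_vector, class_vectors):
    """Return (class, score) pairs, most similar class first."""
    scores = {
        label: cosine_similarity(doc_vector, vector)
        for label, vector in class_vectors.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical class vectors over a three-feature index.
class_vectors = {"X": [1, 0, 0], "Y": [1, 1, 0], "Z": [0, 0, 1]}
ranking = rank_classes([1, 0, 0], class_vectors)
```

The highest-scoring entries in `ranking` are the classes recommended for the unlabeled document.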
Hierarchical Pattern Recognition (HPR)
What is Hierarchical Pattern Recognition (HPR)? 
HPR is a graph-based deep learning algorithm I created that learns deep
feature representations in linear time.
The algorithm performs graph-based traversals using a hierarchy of finite
state machines (FSM).
Designed for scalable performance in polynomial (P) time: [expression not preserved]
Influences & Inspirations
Ray Kurzweil (Pattern Recognition Theory of Mind) + Jeff Hawkins (Hierarchical Temporal Memory) = Hierarchical Pattern Recognition
Figure: a hierarchy of p nodes above the classes X, Y, and Z.
How does feature extraction work?
Hierarchical Pattern Recognition
“Deep” feature representations are learned and associated with the labels of
the documents in which each feature was discovered.
The feature hierarchy is translated into a Vector Space Model so that feature
vectors generated from unlabeled text can be classified.
HPR uses a probabilistic model in combination with an evolutionary algorithm
to generate hierarchies of deep feature representations.
Figure: a hierarchy of p nodes above the classes X, Y, and Z.
Graph-based feature learning 
Learning new features from matches on training data
Cost Function for the Generations of Features
Reproduction occurs after a threshold of matches has been exceeded for a
feature.
After replication, the cost function is applied to increase that threshold
every time the feature reproduces.
The cost function is defined in terms of the current threshold on the feature
node and the minimum threshold, which I chose as 5 for new features.
Cost function: [equation not preserved]
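The reproduction rule can be sketched in Python to show the mechanics: a feature counts matches, reproduces once the count exceeds its threshold, and then has its threshold raised. The slide's actual cost function is not available here, so `placeholder_cost` is a stand-in assumption and not the formula from the talk. The class, field, and child-naming scheme are also hypothetical.

```python
from dataclasses import dataclass, field

MIN_THRESHOLD = 5  # minimum threshold for new features, as stated on the slide


def placeholder_cost(threshold, min_threshold):
    """Stand-in growth rule (an assumption, NOT the slide's cost function)."""
    return threshold + min_threshold


@dataclass
class Feature:
    pattern: str
    threshold: int = MIN_THRESHOLD
    matches: int = 0
    children: list = field(default_factory=list)

    def record_match(self):
        """Count one match; reproduce when the threshold has been exceeded."""
        self.matches += 1
        if self.matches > self.threshold:
            # New features start at the minimum threshold.
            child_name = f"{self.pattern}/child{len(self.children) + 1}"
            self.children.append(Feature(pattern=child_name))
            # Raise the bar so the next reproduction needs more matches.
            self.threshold = placeholder_cost(self.threshold, MIN_THRESHOLD)
            return True
        return False


feature = Feature("the")
reproduced = [feature.record_match() for _ in range(6)]
```

With this stand-in rule the first reproduction happens on the sixth match, after which the threshold has grown and further reproduction requires more evidence.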
Vector Space Model 
Generating Feature Vectors 
The natural language parsing model created during training can be 
turned into a global feature index. 
This global feature index is a list of Neo4j internal IDs for every feature 
in the hierarchy. 
Using that global feature index, a multi-dimensional vector space is 
created with a length equal to the number of features in the hierarchy. 
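The construction above can be sketched as follows: a global feature index assigns each feature ID a fixed position, and a document becomes a vector with one entry per feature. The IDs are made-up stand-ins for Neo4j internal IDs, the function names are hypothetical, and storing match counts (rather than 0/1 flags) is an assumption.

```python
def build_feature_index(feature_ids):
    """Map every feature ID to a fixed position in the vector space."""
    return {fid: pos for pos, fid in enumerate(sorted(set(feature_ids)))}


def to_feature_vector(matched_ids, index):
    """Build a vector whose length equals the number of indexed features."""
    vector = [0] * len(index)
    for fid in matched_ids:
        position = index.get(fid)
        if position is not None:  # ignore IDs that are not in the hierarchy
            vector[position] += 1
    return vector


# Hypothetical internal IDs for the features in the hierarchy.
index = build_feature_index([42, 7, 19, 7])
vector = to_feature_vector([19, 19, 42, 99], index)
```

Because every document is projected onto the same index, any two vectors have the same length and can be compared position by position.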
Relevance Rankings 
“Relevance rankings of documents in a keyword search can be 
calculated, using the assumptions of document similarities theory, by 
comparing the deviation of angles between each document vector and 
the original query vector where the query is represented as the same 
kind of vector as the documents.” - Wikipedia 
Vector-based Cosine Similarity Measure 
In practice, it is easier to calculate the cosine of the angle between the 
vectors, instead of the angle itself: 
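For reference, the standard definition of the cosine of the angle between two vectors is given below. The symbols are notation chosen for this note rather than taken from the slide: d is a document vector, q is the query vector, and n is the number of features.

```latex
\cos\theta
  = \frac{\mathbf{d} \cdot \mathbf{q}}{\lVert \mathbf{d} \rVert \, \lVert \mathbf{q} \rVert}
  = \frac{\sum_{i=1}^{n} d_i q_i}
         {\sqrt{\sum_{i=1}^{n} d_i^{2}} \; \sqrt{\sum_{i=1}^{n} q_i^{2}}}
```

The numerator is the dot product and the denominator is the product of the two vector lengths, so the result depends only on the direction of the vectors and not on their magnitude.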
Cosine Similarity & Vector Space Model 
Vector-based Cosine Similarity Measure 
“The resulting similarity ranges from -1 meaning exactly opposite, to 1 
meaning exactly the same, with 0 usually indicating independence, 
and in-between values indicating intermediate similarity or 
dissimilarity.” - Wikipedia
Graphify for Neo4j 
Graphify for Neo4j 
Graphify is a Neo4j unmanaged extension used for 
document and text classification using graph-based 
hierarchical pattern recognition. 
https://github.com/kbastani/graphify 
Example Project 
Head over to the GitHub project page and clone it to your 
local machine. 
Follow the directions listed in the README.md to install the 
extension. 
Navigate to the /examples directory of the project. 
Run: 
examples/graphify-examples-author/src/java/org/neo4j/nlp/examples/author/main.java 
U.S. Presidential Speech Transcript Analysis
Identify the Political Affiliation of a Presidential Speech 
In the training phase, this example ingests a set of presidential speech
texts, each labeled with the author of that speech. After the training models
are built, unlabeled presidential speeches are classified in the test phase.
The Presidents 
• Ronald Reagan 
• labels: liberal, republican, ronald-reagan 
• George H.W. Bush 
• labels: conservative, republican, bush41 
• Bill Clinton 
• labels: liberal, democrat, bill-clinton 
• George W. Bush 
• labels: conservative, republican, bush43 
• Barack Obama 
• labels: liberal, democrat, barack-obama 
Training 
Each of the presidents in the example has 6 speeches to analyze.
4 of the speeches are used to build a natural language parsing model. 
2 of the speeches are used to test the validity of that model. 
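The 4/2 split can be sketched as below, using the labels listed for each president. This is an illustration only: the function names and the placeholder speech texts are hypothetical, and taking the first four speeches for training is an assumption, since the slides do not say how the speeches were chosen.

```python
# Labels per president, copied from the slide listing the presidents.
PRESIDENT_LABELS = {
    "Ronald Reagan": ["liberal", "republican", "ronald-reagan"],
    "George H.W. Bush": ["conservative", "republican", "bush41"],
    "Bill Clinton": ["liberal", "democrat", "bill-clinton"],
    "George W. Bush": ["conservative", "republican", "bush43"],
    "Barack Obama": ["liberal", "democrat", "barack-obama"],
}


def split_speeches(speeches, n_train=4):
    """Use the first n_train speeches for training and the rest for testing."""
    return list(speeches[:n_train]), list(speeches[n_train:])


def build_datasets(speeches_by_president):
    """Return (training, testing) lists of (text, labels) pairs."""
    training, testing = [], []
    for president, speeches in speeches_by_president.items():
        train, test = split_speeches(speeches)
        labels = PRESIDENT_LABELS[president]
        training += [(text, labels) for text in train]
        # Test labels are kept only to check the classifier's output.
        testing += [(text, labels) for text in test]
    return training, testing


# Placeholder texts standing in for the real transcripts.
speeches = {
    name: [f"{name} speech {i}" for i in range(1, 7)]
    for name in PRESIDENT_LABELS
}
training, testing = build_datasets(speeches)
```

With five presidents and six speeches each, this yields 20 labeled training texts and 10 held-out texts for testing.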
Get Similar Labels/Classes 
Class Similarity: Ronald Reagan
Class          Similarity
republican     0.7182046285385341
liberal        0.644281223102398
democrat       0.4854114595950056
conservative   0.4133639188595147
bill-clinton   0.4057969121945167
barack-obama   0.323947855372623
bush41         0.3222644898334092
bush43         0.3161309849153592
Class Similarity: George H.W. Bush
Class          Similarity
conservative   0.7032274806766954
republican     0.6047256274615608
liberal        0.4439742461594541
democrat       0.39114918238853674
bill-clinton   0.3234223107986785
ronald-reagan  0.3222644898334092
barack-obama   0.2929260544514002
bush43         0.29106733975087984
Class Similarity: Bill Clinton
Class          Similarity
democrat       0.8375678825642422
liberal        0.7847858060182163
republican     0.5561860529059708
conservative   0.45365774896422445
barack-obama   0.4507676679770066
ronald-reagan  0.4057969121945167
bush43         0.365042482383354
bush41         0.3234223107986785
Class Similarity: George W. Bush
Class          Similarity
conservative   0.820636570272315
republican     0.7056890956512284
liberal        0.5075788396061254
democrat       0.4505424322086937
bill-clinton   0.365042482383354
barack-obama   0.33801949243378965
ronald-reagan  0.3161309849153592
bush41         0.29106733975087984
Class Similarity: Barack Obama
Class          Similarity
democrat       0.7668017370739147
liberal        0.7184792203867296
republican     0.4847680475425114
bill-clinton   0.4507676679770066
conservative   0.4149264161292232
bush43         0.33801949243378965
ronald-reagan  0.323947855372623
bush41         0.2929260544514002
Get involved in the Neo4j community 
http://stackoverflow.com/questions/tagged/neo4j 
http://groups.google.com/group/neo4j 
https://github.com/neo4j/neo4j/issues 
http://neo4j.meetup.com/ 
(Thank You) 
Get in touch
Twitter: www.twitter.com/kennybastani
LinkedIn: www.linkedin.com/in/kennybastani
GitHub: www.github.com/kbastani

More Related Content

What's hot

NOSQLEU - Graph Databases and Neo4j
NOSQLEU - Graph Databases and Neo4jNOSQLEU - Graph Databases and Neo4j
NOSQLEU - Graph Databases and Neo4jTobias Lindaaker
 
The Elements Of User Experience
The Elements Of User ExperienceThe Elements Of User Experience
The Elements Of User ExperienceJohn Chen, Jun
 
How Graph Data Science can turbocharge your Knowledge Graph
How Graph Data Science can turbocharge your Knowledge GraphHow Graph Data Science can turbocharge your Knowledge Graph
How Graph Data Science can turbocharge your Knowledge GraphNeo4j
 
What Is Hadoop? | What Is Big Data & Hadoop | Introduction To Hadoop | Hadoop...
What Is Hadoop? | What Is Big Data & Hadoop | Introduction To Hadoop | Hadoop...What Is Hadoop? | What Is Big Data & Hadoop | Introduction To Hadoop | Hadoop...
What Is Hadoop? | What Is Big Data & Hadoop | Introduction To Hadoop | Hadoop...Simplilearn
 
Knowledge Graphs & Graph Data Science, More Context, Better Predictions - Neo...
Knowledge Graphs & Graph Data Science, More Context, Better Predictions - Neo...Knowledge Graphs & Graph Data Science, More Context, Better Predictions - Neo...
Knowledge Graphs & Graph Data Science, More Context, Better Predictions - Neo...Neo4j
 
The Knowledge Graph Explosion
The Knowledge Graph ExplosionThe Knowledge Graph Explosion
The Knowledge Graph ExplosionNeo4j
 
Neo4j Bloom: Data Visualization for Everyone
Neo4j Bloom: Data Visualization for EveryoneNeo4j Bloom: Data Visualization for Everyone
Neo4j Bloom: Data Visualization for EveryoneNeo4j
 
Demystifying Graph Neural Networks
Demystifying Graph Neural NetworksDemystifying Graph Neural Networks
Demystifying Graph Neural NetworksNeo4j
 
Towards Digital Twin standards following an open source approach
Towards Digital Twin standards following an open source approachTowards Digital Twin standards following an open source approach
Towards Digital Twin standards following an open source approachFIWARE
 
9. Document Oriented Databases
9. Document Oriented Databases9. Document Oriented Databases
9. Document Oriented DatabasesFabio Fumarola
 
JSON Data Modeling in Document Database
JSON Data Modeling in Document DatabaseJSON Data Modeling in Document Database
JSON Data Modeling in Document DatabaseDATAVERSITY
 
Natural Language Processing and Search Intent Understanding C3 Conductor 2019...
Natural Language Processing and Search Intent Understanding C3 Conductor 2019...Natural Language Processing and Search Intent Understanding C3 Conductor 2019...
Natural Language Processing and Search Intent Understanding C3 Conductor 2019...Dawn Anderson MSc DigM
 
Knowledge Graphs and Graph Data Science: More Context, Better Predictions (Ne...
Knowledge Graphs and Graph Data Science: More Context, Better Predictions (Ne...Knowledge Graphs and Graph Data Science: More Context, Better Predictions (Ne...
Knowledge Graphs and Graph Data Science: More Context, Better Predictions (Ne...Neo4j
 
RWDG: Measuring Data Governance Performance
RWDG: Measuring Data Governance PerformanceRWDG: Measuring Data Governance Performance
RWDG: Measuring Data Governance PerformanceDATAVERSITY
 
Knowledge Graph for Machine Learning and Data Science
Knowledge Graph for Machine Learning and Data ScienceKnowledge Graph for Machine Learning and Data Science
Knowledge Graph for Machine Learning and Data ScienceCambridge Semantics
 
SERVIER Pegasus - Graphe de connaissances pour les phases primaires de recher...
SERVIER Pegasus - Graphe de connaissances pour les phases primaires de recher...SERVIER Pegasus - Graphe de connaissances pour les phases primaires de recher...
SERVIER Pegasus - Graphe de connaissances pour les phases primaires de recher...Neo4j
 
How to Build a Rock-Solid Analytics and Business Intelligence Strategy
How to Build a Rock-Solid Analytics and Business Intelligence StrategyHow to Build a Rock-Solid Analytics and Business Intelligence Strategy
How to Build a Rock-Solid Analytics and Business Intelligence StrategySAP Analytics
 
Introduction to Graph Databases.pdf
Introduction to Graph Databases.pdfIntroduction to Graph Databases.pdf
Introduction to Graph Databases.pdfNeo4j
 

What's hot (20)

NOSQLEU - Graph Databases and Neo4j
NOSQLEU - Graph Databases and Neo4jNOSQLEU - Graph Databases and Neo4j
NOSQLEU - Graph Databases and Neo4j
 
Graph Databases
Graph DatabasesGraph Databases
Graph Databases
 
Graph based data models
Graph based data modelsGraph based data models
Graph based data models
 
The Elements Of User Experience
The Elements Of User ExperienceThe Elements Of User Experience
The Elements Of User Experience
 
How Graph Data Science can turbocharge your Knowledge Graph
How Graph Data Science can turbocharge your Knowledge GraphHow Graph Data Science can turbocharge your Knowledge Graph
How Graph Data Science can turbocharge your Knowledge Graph
 
What Is Hadoop? | What Is Big Data & Hadoop | Introduction To Hadoop | Hadoop...
What Is Hadoop? | What Is Big Data & Hadoop | Introduction To Hadoop | Hadoop...What Is Hadoop? | What Is Big Data & Hadoop | Introduction To Hadoop | Hadoop...
What Is Hadoop? | What Is Big Data & Hadoop | Introduction To Hadoop | Hadoop...
 
Knowledge Graphs & Graph Data Science, More Context, Better Predictions - Neo...
Knowledge Graphs & Graph Data Science, More Context, Better Predictions - Neo...Knowledge Graphs & Graph Data Science, More Context, Better Predictions - Neo...
Knowledge Graphs & Graph Data Science, More Context, Better Predictions - Neo...
 
The Knowledge Graph Explosion
The Knowledge Graph ExplosionThe Knowledge Graph Explosion
The Knowledge Graph Explosion
 
Neo4j Bloom: Data Visualization for Everyone
Neo4j Bloom: Data Visualization for EveryoneNeo4j Bloom: Data Visualization for Everyone
Neo4j Bloom: Data Visualization for Everyone
 
Demystifying Graph Neural Networks
Demystifying Graph Neural NetworksDemystifying Graph Neural Networks
Demystifying Graph Neural Networks
 
Towards Digital Twin standards following an open source approach
Towards Digital Twin standards following an open source approachTowards Digital Twin standards following an open source approach
Towards Digital Twin standards following an open source approach
 
9. Document Oriented Databases
9. Document Oriented Databases9. Document Oriented Databases
9. Document Oriented Databases
 
JSON Data Modeling in Document Database
JSON Data Modeling in Document DatabaseJSON Data Modeling in Document Database
JSON Data Modeling in Document Database
 
Natural Language Processing and Search Intent Understanding C3 Conductor 2019...
Natural Language Processing and Search Intent Understanding C3 Conductor 2019...Natural Language Processing and Search Intent Understanding C3 Conductor 2019...
Natural Language Processing and Search Intent Understanding C3 Conductor 2019...
 
Knowledge Graphs and Graph Data Science: More Context, Better Predictions (Ne...
Knowledge Graphs and Graph Data Science: More Context, Better Predictions (Ne...Knowledge Graphs and Graph Data Science: More Context, Better Predictions (Ne...
Knowledge Graphs and Graph Data Science: More Context, Better Predictions (Ne...
 
RWDG: Measuring Data Governance Performance
RWDG: Measuring Data Governance PerformanceRWDG: Measuring Data Governance Performance
RWDG: Measuring Data Governance Performance
 
Knowledge Graph for Machine Learning and Data Science
Knowledge Graph for Machine Learning and Data ScienceKnowledge Graph for Machine Learning and Data Science
Knowledge Graph for Machine Learning and Data Science
 
SERVIER Pegasus - Graphe de connaissances pour les phases primaires de recher...
SERVIER Pegasus - Graphe de connaissances pour les phases primaires de recher...SERVIER Pegasus - Graphe de connaissances pour les phases primaires de recher...
SERVIER Pegasus - Graphe de connaissances pour les phases primaires de recher...
 
How to Build a Rock-Solid Analytics and Business Intelligence Strategy
How to Build a Rock-Solid Analytics and Business Intelligence StrategyHow to Build a Rock-Solid Analytics and Business Intelligence Strategy
How to Build a Rock-Solid Analytics and Business Intelligence Strategy
 
Introduction to Graph Databases.pdf
Introduction to Graph Databases.pdfIntroduction to Graph Databases.pdf
Introduction to Graph Databases.pdf
 

Viewers also liked

Natural language search using Neo4j
Natural language search using Neo4jNatural language search using Neo4j
Natural language search using Neo4jKenny Bastani
 
Natural Language Processing with Graph Databases and Neo4j
Natural Language Processing with Graph Databases and Neo4jNatural Language Processing with Graph Databases and Neo4j
Natural Language Processing with Graph Databases and Neo4jWilliam Lyon
 
Natural Language Processing with Neo4j
Natural Language Processing with Neo4jNatural Language Processing with Neo4j
Natural Language Processing with Neo4jKenny Bastani
 
Building a Graph-based Analytics Platform
Building a Graph-based Analytics PlatformBuilding a Graph-based Analytics Platform
Building a Graph-based Analytics PlatformKenny Bastani
 
Data Modeling with Neo4j
Data Modeling with Neo4jData Modeling with Neo4j
Data Modeling with Neo4jNeo4j
 
Open Source Big Graph Analytics on Neo4j with Apache Spark
Open Source Big Graph Analytics on Neo4j with Apache SparkOpen Source Big Graph Analytics on Neo4j with Apache Spark
Open Source Big Graph Analytics on Neo4j with Apache SparkKenny Bastani
 
Big Graph Analytics on Neo4j with Apache Spark
Big Graph Analytics on Neo4j with Apache SparkBig Graph Analytics on Neo4j with Apache Spark
Big Graph Analytics on Neo4j with Apache SparkKenny Bastani
 
Introduction to Graph Databases
Introduction to Graph DatabasesIntroduction to Graph Databases
Introduction to Graph DatabasesMax De Marzi
 
Graph database Use Cases
Graph database Use CasesGraph database Use Cases
Graph database Use CasesMax De Marzi
 
Neo4J Open Source Graph Database
Neo4J Open Source Graph DatabaseNeo4J Open Source Graph Database
Neo4J Open Source Graph DatabaseMark Maslyn
 
20141216 graph database prototyping ams meetup
20141216 graph database prototyping ams meetup20141216 graph database prototyping ams meetup
20141216 graph database prototyping ams meetupRik Van Bruggen
 
Dnc Day 4 – Obama Speech
Dnc Day 4 – Obama SpeechDnc Day 4 – Obama Speech
Dnc Day 4 – Obama Speechmkursh
 
The impact of language planning, terminology planning, and arabicization, on ...
The impact of language planning, terminology planning, and arabicization, on ...The impact of language planning, terminology planning, and arabicization, on ...
The impact of language planning, terminology planning, and arabicization, on ...Alexander Decker
 
Meryl streep took a stand against donald trump
Meryl streep took a stand against donald trumpMeryl streep took a stand against donald trump
Meryl streep took a stand against donald trumpSusana Gallardo
 
AP Invoice Processing for JD Edwards_Bottomline Technologies
AP Invoice Processing for JD Edwards_Bottomline TechnologiesAP Invoice Processing for JD Edwards_Bottomline Technologies
AP Invoice Processing for JD Edwards_Bottomline TechnologiesBottomline Technologies
 
Document Classification In PHP
Document Classification In PHPDocument Classification In PHP
Document Classification In PHPIan Barber
 
The war on terrorism
The war on terrorismThe war on terrorism
The war on terrorismalcatdubois
 
M893 & m894 seahawks contest
M893 & m894 seahawks contestM893 & m894 seahawks contest
M893 & m894 seahawks contestdthielen1
 
Adivina de _quienes_son_las_siguientes_cansiones[1]
Adivina de _quienes_son_las_siguientes_cansiones[1]Adivina de _quienes_son_las_siguientes_cansiones[1]
Adivina de _quienes_son_las_siguientes_cansiones[1]turnedspon8520
 

Viewers also liked (20)

Natural language search using Neo4j
Natural language search using Neo4jNatural language search using Neo4j
Natural language search using Neo4j
 
Natural Language Processing with Graph Databases and Neo4j
Natural Language Processing with Graph Databases and Neo4jNatural Language Processing with Graph Databases and Neo4j
Natural Language Processing with Graph Databases and Neo4j
 
Natural Language Processing with Neo4j
Natural Language Processing with Neo4jNatural Language Processing with Neo4j
Natural Language Processing with Neo4j
 
Building a Graph-based Analytics Platform
Building a Graph-based Analytics PlatformBuilding a Graph-based Analytics Platform
Building a Graph-based Analytics Platform
 
Data Modeling with Neo4j
Data Modeling with Neo4jData Modeling with Neo4j
Data Modeling with Neo4j
 
Open Source Big Graph Analytics on Neo4j with Apache Spark
Open Source Big Graph Analytics on Neo4j with Apache SparkOpen Source Big Graph Analytics on Neo4j with Apache Spark
Open Source Big Graph Analytics on Neo4j with Apache Spark
 
Big Graph Analytics on Neo4j with Apache Spark
Big Graph Analytics on Neo4j with Apache SparkBig Graph Analytics on Neo4j with Apache Spark
Big Graph Analytics on Neo4j with Apache Spark
 
Introduction to Graph Databases
Introduction to Graph DatabasesIntroduction to Graph Databases
Introduction to Graph Databases
 
Graph database Use Cases
Graph database Use CasesGraph database Use Cases
Graph database Use Cases
 
Neo4J Open Source Graph Database
Neo4J Open Source Graph DatabaseNeo4J Open Source Graph Database
Neo4J Open Source Graph Database
 
20141216 graph database prototyping ams meetup
20141216 graph database prototyping ams meetup20141216 graph database prototyping ams meetup
20141216 graph database prototyping ams meetup
 
Dnc Day 4 – Obama Speech
Dnc Day 4 – Obama SpeechDnc Day 4 – Obama Speech
Dnc Day 4 – Obama Speech
 
The impact of language planning, terminology planning, and arabicization, on ...
The impact of language planning, terminology planning, and arabicization, on ...The impact of language planning, terminology planning, and arabicization, on ...
The impact of language planning, terminology planning, and arabicization, on ...
 
Meryl streep took a stand against donald trump
Meryl streep took a stand against donald trumpMeryl streep took a stand against donald trump
Meryl streep took a stand against donald trump
 
AP Invoice Processing for JD Edwards_Bottomline Technologies
AP Invoice Processing for JD Edwards_Bottomline TechnologiesAP Invoice Processing for JD Edwards_Bottomline Technologies
AP Invoice Processing for JD Edwards_Bottomline Technologies
 
Document Classification In PHP
Document Classification In PHPDocument Classification In PHP
Document Classification In PHP
 
The war on terrorism
The war on terrorismThe war on terrorism
The war on terrorism
 
M893 & m894 seahawks contest
M893 & m894 seahawks contestM893 & m894 seahawks contest
M893 & m894 seahawks contest
 
Visual Resume
Visual ResumeVisual Resume
Visual Resume
 
Adivina de _quienes_son_las_siguientes_cansiones[1]
Adivina de _quienes_son_las_siguientes_cansiones[1]Adivina de _quienes_son_las_siguientes_cansiones[1]
Adivina de _quienes_son_las_siguientes_cansiones[1]
 

Similar to Document Classification with Neo4j

MSRA 2018: Intelligent Software Engineering: Synergy between AI and Software ...
MSRA 2018: Intelligent Software Engineering: Synergy between AI and Software ...MSRA 2018: Intelligent Software Engineering: Synergy between AI and Software ...
MSRA 2018: Intelligent Software Engineering: Synergy between AI and Software ...Tao Xie
 
xAPI: The Landscape
xAPI: The LandscapexAPI: The Landscape
xAPI: The LandscapeMegan Bowe
 
Software system design sample
Software system design sampleSoftware system design sample
Software system design sampleNorman K Ma
 
Data science workshop
Data science workshopData science workshop
Data science workshopHortonworks
 
An Empirical Comparison of Knowledge Graph Embeddings for Item Recommendation
An Empirical Comparison of Knowledge Graph Embeddings for Item RecommendationAn Empirical Comparison of Knowledge Graph Embeddings for Item Recommendation
An Empirical Comparison of Knowledge Graph Embeddings for Item RecommendationEnrico Palumbo
 
C# programming : Chapter One
C# programming : Chapter OneC# programming : Chapter One
C# programming : Chapter OneKhairi Aiman
 
See to believe: capturing insights using contextual inquiry
See to believe: capturing insights using contextual inquirySee to believe: capturing insights using contextual inquiry
See to believe: capturing insights using contextual inquiryDeirdre Costello
 
OWF14 - Big Data : The State of Machine Learning in 2014
OWF14 - Big Data : The State of Machine  Learning in 2014OWF14 - Big Data : The State of Machine  Learning in 2014
OWF14 - Big Data : The State of Machine Learning in 2014Paris Open Source Summit
 
Software Analytics - Achievements and Challenges
Software Analytics - Achievements and ChallengesSoftware Analytics - Achievements and Challenges
Software Analytics - Achievements and ChallengesTao Xie
 
Transferring Software Testing Tools to Practice
Transferring Software Testing Tools to PracticeTransferring Software Testing Tools to Practice
Transferring Software Testing Tools to PracticeTao Xie
 
Software craftsmanship - Imperative or Hype
Software craftsmanship - Imperative or HypeSoftware craftsmanship - Imperative or Hype
Software craftsmanship - Imperative or HypeSUGSA
 
Knowledge Graphs and Generative AI
Knowledge Graphs and Generative AIKnowledge Graphs and Generative AI
Knowledge Graphs and Generative AINeo4j
 
Sudipta_Mukherjee_Resume-Nov_2022.pdf
Sudipta_Mukherjee_Resume-Nov_2022.pdfSudipta_Mukherjee_Resume-Nov_2022.pdf
Sudipta_Mukherjee_Resume-Nov_2022.pdfSudipta Mukherjee
 
Building Large Sustainable Apps
Building Large Sustainable AppsBuilding Large Sustainable Apps
Building Large Sustainable AppsBuğra Oral
 

Similar to Document Classification with Neo4j (20)

History Of C Essay
History Of C EssayHistory Of C Essay
History Of C Essay
 

Document Classification with Neo4j

  • 1. Document Classification with Neo4j (graphs)-[:are]->(everywhere) © All Rights Reserved 2014 | Neo Technology, Inc. @kennybastani Neo4j Developer Evangelist
  • 2. Agenda: Introduction to Neo4j; Introduction to Graph-based Document Classification; Graph-based Hierarchical Pattern Recognition; Generating a Vector Space Model for Recommendations; Graphify for Neo4j; U.S. Presidential Speech Transcript Analysis
  • 3. Introduction to Neo4j
  • 4. The Property Graph Data Model
  • 5. John; Sally; Graph Databases (Book)
  • 6. name: John, age: 27; name: Sally, age: 32; FRIEND_OF (since: 01/09/2013); title: Graph Databases, authors: Ian Robinson, Jim Webber; HAS_READ (on: 2/03/2013, rating: 5); HAS_READ (on: 02/09/2013, rating: 4); FRIEND_OF (since: 01/09/2013)
  • 7. The Relational Table Model
  • 8. Customers, Customer_Accounts, Accounts
  • 9. The Neo4j Browser
  • 10. Neo4j Browser - finding help: http://localhost:7474/
  • 11. Execute Cypher, Visualize
  • 12. Introduction to Document Classification
  • 13. Document Classification: Automatically assign a document to one or more classes. Documents may be classified according to their subjects or according to other attributes. Automatically classify unlabeled documents to a set of relevant classes using labeled training data.
  • 14. Example Use Cases for Document Classification
  • 15. Sentiment Analysis for Movie Reviews. Scenario: A movie website allows users to submit reviews describing what they either liked or disliked about a particular movie. Problem: The user reviews are unstructured text. How do I automatically generate a score indicating whether the review was positive or negative? Solution: Train a natural language parsing model on a dataset of previous reviews that have been labeled as either positive or negative.
  • 16. Recommend Relevant Tags. Scenario: A Q/A website allows users to submit questions and receive answers from other users. Problem: Users sometimes do not know which tags to apply to their questions to make them more discoverable and so receive answers. Solution: Automatically recommend the most relevant tags for a question by classifying its text with a model trained on previous questions.
  • 17. Recommend Similar Articles. Scenario: A news website provides hundreds of new articles a day to users on a broad range of topics. Problem: The site needs to increase user engagement and time spent on the site. Solution: Train natural language parsing models for daily articles in order to provide recommendations for highly relevant articles at the bottom of each page.
  • 18. How Automated Document Classification Works
  • 19. Supervised Learning. Step 1: Create a Training Dataset. Assign a set of labels that describes the document’s text. (Diagram: documents mapped to labels X, Y, and Z.)
  • 20. Deep Learning. Step 2: Train a Natural Language Parsing Model. Deep feature representations are selected and learned using an evolutionary algorithm. State machines represent predicates that evaluate to 0 or 1 for a text match. State machines map to classes of document labels that matched text during training. (Diagram: p = state machine; state machines map to classes X, Y, and Z.)
  • 21. Step 3: Classify Unlabeled Documents. The natural language parsing model is used to classify other unlabeled documents. (Diagram: an unlabeled document compared by cos(θ) with classes X, Y, and Z, scoring 0.99, 0.67, and 0.01.)
  • 22. Hierarchical Pattern Recognition (HPR)
  • 23. What is Hierarchical Pattern Recognition (HPR)? HPR is a graph-based deep learning algorithm I created that learns deep feature representations in linear time. It performs graph-based traversals using a hierarchy of finite state machines (FSM). Designed for scalable performance in P time.
  • 24. Influences & Inspirations: Ray Kurzweil (Pattern Recognition Theory of Mind) + Jeff Hawkins (Hierarchical Temporal Memory) = Hierarchical Pattern Recognition
  • 25. How does feature extraction work? HPR uses a probabilistic model in combination with an evolutionary algorithm to generate hierarchies of deep feature representations. “Deep” feature representations are learned and associated with labels, which are mapped to the documents in which the feature was discovered. The feature hierarchy is translated into a Vector Space Model for classification on feature vectors generated from unlabeled text.
  • 26. Graph-based feature learning
  • 27. Learning new features from matches on training data
  • 28. Cost Function for the Generations of Features. Reproduction occurs after a threshold of matches has been exceeded for a feature. After replication, the cost function is applied to increase that threshold every time the feature reproduces. The cost function depends on the current threshold on the feature node and on the minimum threshold, which I chose as 5 for new features.
  • 30. Vector Space Model
  • 31. Generating Feature Vectors. The natural language parsing model created during training can be turned into a global feature index. This global feature index is a list of Neo4j internal IDs for every feature in the hierarchy. Using that global feature index, a multi-dimensional vector space is created with a length equal to the number of features in the hierarchy.
  • 32. Relevance Rankings. “Relevance rankings of documents in a keyword search can be calculated, using the assumptions of document similarities theory, by comparing the deviation of angles between each document vector and the original query vector where the query is represented as the same kind of vector as the documents.” - Wikipedia
  • 33. Vector-based Cosine Similarity Measure. In practice, it is easier to calculate the cosine of the angle between the vectors, instead of the angle itself.
  • 34. Cosine Similarity & Vector Space Model
  • 35. Vector-based Cosine Similarity Measure. “The resulting similarity ranges from -1 meaning exactly opposite, to 1 meaning exactly the same, with 0 usually indicating independence, and in-between values indicating intermediate similarity or dissimilarity.” via Wikipedia
  • 36. Graphify for Neo4j
  • 37. Graphify for Neo4j. Graphify is a Neo4j unmanaged extension used for document and text classification using graph-based hierarchical pattern recognition. https://github.com/kbastani/graphify
  • 38. Example Project. Head over to the GitHub project page and clone it to your local machine. Follow the directions listed in the README.md to install the extension. Navigate to the /examples directory of the project. Run: examples/graphify-examples-author/src/java/org/neo4j/nlp/examples/author/main.java
  • 39. U.S. Presidential Speech Transcript Analysis
  • 40. Identify the Political Affiliation of a Presidential Speech. In the training phase, this example ingests a set of texts from presidential speeches, each labeled with the author of that speech. After building the training models, unlabeled presidential speeches are classified in the test phase.
  • 41. The Presidents: Ronald Reagan (labels: liberal, republican, ronald-reagan); George H.W. Bush (labels: conservative, republican, bush41); Bill Clinton (labels: liberal, democrat, bill-clinton); George W. Bush (labels: conservative, republican, bush43); Barack Obama (labels: liberal, democrat, barack-obama)
  • 42. Training. Each of the presidents in the example has 6 speeches to analyze. 4 of the speeches are used to build a natural language parsing model. 2 of the speeches are used to test the validity of that model.
  • 43. Get Similar Labels/Classes
  • 44. Ronald Reagan, Class Similarity: republican 0.7182046285385341; liberal 0.644281223102398; democrat 0.4854114595950056; conservative 0.4133639188595147; bill-clinton 0.4057969121945167; barack-obama 0.323947855372623; bush41 0.3222644898334092; bush43 0.3161309849153592
  • 45. George H.W. Bush, Class Similarity: conservative 0.7032274806766954; republican 0.6047256274615608; liberal 0.4439742461594541; democrat 0.39114918238853674; bill-clinton 0.3234223107986785; ronald-reagan 0.3222644898334092; barack-obama 0.2929260544514002; bush43 0.29106733975087984
  • 46. Bill Clinton, Class Similarity: democrat 0.8375678825642422; liberal 0.7847858060182163; republican 0.5561860529059708; conservative 0.45365774896422445; barack-obama 0.4507676679770066; ronald-reagan 0.4057969121945167; bush43 0.365042482383354; bush41 0.3234223107986785
  • 47. George W. Bush, Class Similarity: conservative 0.820636570272315; republican 0.7056890956512284; liberal 0.5075788396061254; democrat 0.4505424322086937; bill-clinton 0.365042482383354; barack-obama 0.33801949243378965; ronald-reagan 0.3161309849153592; bush41 0.29106733975087984
  • 48. Barack Obama, Class Similarity: democrat 0.7668017370739147; liberal 0.7184792203867296; republican 0.4847680475425114; bill-clinton 0.4507676679770066; conservative 0.4149264161292232; bush43 0.33801949243378965; ronald-reagan 0.323947855372623; bush41 0.2929260544514002
  • 49. Get involved in the Neo4j community
  • 50. http://stackoverflow.com/questions/tagged/neo4j
  • 51. http://groups.google.com/group/neo4j
  • 52. https://github.com/neo4j/neo4j/issues
  • 53. http://neo4j.meetup.com/
  • 54. (Thank You)
  • 55. Get in touch: Twitter www.twitter.com/kennybastani; LinkedIn www.linkedin.com/in/kennybastani; GitHub www.github.com/kbastani
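
Slides 31 to 35 describe building a vector with one slot per feature in the global feature index and comparing vectors by the cosine of the angle between them, cos(θ) = (a · b) / (‖a‖ ‖b‖). The following is a minimal Python sketch of that idea, not Graphify's actual implementation. The feature IDs, the binary (matched or not matched) weighting, and the function names are all assumptions made for illustration.

```python
import math


def feature_vector(matched_ids, feature_index):
    """Build a vector with one slot per feature in the global index.

    Assumption: a slot is 1.0 if the feature matched the text, else 0.0.
    The slides do not say how Graphify weights each slot.
    """
    matched = set(matched_ids)
    return [1.0 if fid in matched else 0.0 for fid in feature_index]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # no matched features: treat as unrelated
    return dot / (norm_a * norm_b)


def rank_classes(document_vector, class_vectors):
    """Return (label, similarity) pairs, most similar class first."""
    scores = {
        label: cosine_similarity(document_vector, vector)
        for label, vector in class_vectors.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical feature IDs standing in for Neo4j internal node IDs.
feature_index = [101, 102, 103, 104]

# Hypothetical classes, each described by the features seen in training.
class_vectors = {
    "X": feature_vector([101, 102], feature_index),
    "Y": feature_vector([102, 103], feature_index),
    "Z": feature_vector([104], feature_index),
}

# An unlabeled document that matched features 101 and 102.
document = feature_vector([101, 102], feature_index)
ranking = rank_classes(document, class_vectors)

for label, score in ranking:
    print(f"{label}: {score:.2f}")
```

With non-negative weights like these, the similarity stays between 0 and 1, which is consistent with the class similarity scores listed on slides 44 to 48.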

Editor's Notes

  1. When we think about data, we tend to think about how things are connected. This is a natural part of how we talk about things, and also of the graph model. “This is also a graph, but with some data attached. Here: we’ve attached names to the nodes and described the type of the relationships.”
  2. “We can take this further, and attach arbitrary key/value pairs” This is the Property Graph Model, which has the following characteristics: It contains Nodes and Relationships, both of which can contain properties (key-value pairs). Relationships are always between exactly 2 nodes. They have a type, and they are directed. “There are other graph models, however everyone in the industry has converged on the idea that this model is the most obvious and the most useful for real humans and the application we’re building”
  3. Let’s review the relational table model, to see how it differs from the property graph model.
  4. Start with Customers and Accounts “We have a customer, Alice.” “She’s got 3 accounts” “To keep track of which accounts Alice owns, we need a 3rd table, to store the mapping. Typically called a join table.”
  5. Dashboard, for monitoring of key stats Node, Relationship and Property “counts” are just estimates (actually represent the allocated ID space for each graph entity)
  6. “The Console is where you can run graph queries, written in Cypher.” We’ll be using this starting... now.
  7. Disclaimer: This is a graph-based approach to text classification and pattern recognition. This can be done in many different ways, including SVMs, Bayesian networks, belief networks, and many other approaches. I chose to create this on top of Neo4j because, first, it’s a database and, second, it’s already formatted as a network. This gives me the advantage of not worrying about data storage.
  8. Explain how the genetic algorithm works.
  9. I chose this example project because it’s easy to get presidential speeches online and it seemed like a good example to get others going with Graphify.
  10. “Get involved with the community, attend meetups, browse our open source code libraries, including Neo4j, by visiting us on GitHub.”
  11. “Visit stackoverflow.com with the tag Neo4j to get fast answers to your questions. We have a very active community of contributors that provide thorough answers 24/7. If you get stuck, make sure you head there.”
  12. “The same goes for Google Groups, if you prefer that format over Stack Overflow.”
  13. “You can visit us on GitHub to submit or browse issues.”
  14. “Finally, I urge you to check out our website’s meetup page to find out where meetups are happening all around the world. Also we encourage you to share your experience with Neo4j, your applications, and your use cases by speaking at a local meetup. If you’re interested, please reach out to me, my contact details are in the next slide.”
  15. “Thank you for spending some time with me and learning about Neo4j and Cypher.”
  16. “Get in touch with me about meetups and Neo4j community events happening around the world.” “I’ll now open up the floor to questions.”
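
The Property Graph Model characteristics listed in note 2 (nodes and relationships both carry key-value properties, and every relationship is typed, directed, and connects exactly two nodes) can be sketched in plain Python. This is only an illustration of the data model, not how Neo4j stores data. The class names and the `outgoing` helper are hypothetical, and the pairing of each HAS_READ rating with John or Sally is an assumption, since the slide text does not say which reader gave which rating.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node holding arbitrary key-value properties."""
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A typed, directed edge between exactly two nodes, with properties."""
    start: Node
    end: Node
    rel_type: str
    properties: dict = field(default_factory=dict)


# Values taken from slide 6.
john = Node({"name": "John", "age": 27})
sally = Node({"name": "Sally", "age": 32})
book = Node({"title": "Graph Databases",
             "authors": ["Ian Robinson", "Jim Webber"]})

relationships = [
    Relationship(john, sally, "FRIEND_OF", {"since": "01/09/2013"}),
    Relationship(sally, john, "FRIEND_OF", {"since": "01/09/2013"}),
    # Assumption: which reader gave which rating is not stated on the slide.
    Relationship(john, book, "HAS_READ", {"on": "2/03/2013", "rating": 5}),
    Relationship(sally, book, "HAS_READ", {"on": "02/09/2013", "rating": 4}),
]


def outgoing(node, rel_type):
    """Follow relationships of one type that start at the given node."""
    return [r for r in relationships
            if r.start is node and r.rel_type == rel_type]


for rel in outgoing(john, "HAS_READ"):
    print(john.properties["name"], "rated",
          rel.end.properties["title"], rel.properties["rating"])
```

Because direction is part of each relationship, the mutual friendship between John and Sally is stored as two FRIEND_OF relationships, one in each direction, matching the two FRIEND_OF entries on slide 6.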