Graphs for AI and ML
Dr. Jim Webber
Chief Scientist, Neo4j
@jimwebber
● Some definitions
● Accidental Skynet
● Graph theory
● Contemporary graph ML
● The future of graph AI
Overview
● ML - Machine Learning
○ Finding functions from historical data to guide future interactions within a given domain
● AI - Artificial Intelligence
○ The property of a system that it appears intelligent to its users
○ Often, but not always, using ML techniques
○ Or ML implementations that can be cheaply retrained to address neighbouring domains
A Bluffer’s Guide to AI-cronyms
● Predictive analytics
○ Use past data to predict the future
● General purpose AI
○ ML with transfer learning such that learned experiences in one domain can be applied elsewhere
● Human-like AI
Often conflated with
ML all the things
Where are we today?
Extract all the features!
• What do we do? Turn it into vectors and pump it through a classification or regression model
• That’s actually not a bad thing
• But we can do so much before we even get to ML…
• … if we have graph data
Credit: Graph Algorithms, Hodler and Needham, O’Reilly 2019
http://www.bbc.co.uk/london/travel/downloads/tube_map.html
• Nodes with optional properties and optional labels
• Named, directed relationships with optional properties
• Relationships have exactly one start and end node
• Which may be the same node
Labeled Property graph model
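The four rules above can be sketched as a pair of small data classes. This is a minimal illustration only, not how Neo4j stores data; the class names, the `KNOWS` and `FOLLOWS` relationship names and the sample people are all assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node with optional labels and optional properties."""
    labels: set = field(default_factory=set)
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A named, directed relationship with exactly one start and one end node."""
    name: str
    start: Node
    end: Node
    properties: dict = field(default_factory=dict)


# Hypothetical example data, not taken from the slides.
alice = Node(labels={"Person"}, properties={"name": "Alice"})
bob = Node(labels={"Person"}, properties={"name": "Bob"})

# A directed, named relationship carrying its own property.
knows = Relationship("KNOWS", start=alice, end=bob, properties={"since": 2019})

# The start and end node may be the same node (a self-loop).
self_loop = Relationship("FOLLOWS", start=alice, end=alice)
```

Making `start` and `end` required constructor arguments is one way to enforce "exactly one start and end node": a relationship cannot be built dangling.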
Fearless querying
MATCH path = (:author {name: 'Jim Webber'})
-[*]->(:character {name: 'The Doctor'})
RETURN path
OR
MATCH (me:author {name: 'Jim Webber'}),
(doc:character {name: 'The Doctor'}),
path = shortestPath((me)-[*]->(doc))
RETURN path
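For readers who want to see what a shortest-path search does under the hood, here is a rough breadth-first sketch in Python. It is not Neo4j's implementation of `shortestPath`; the adjacency list and the intermediate node names ("Book A", "Talk B", "Publisher C") are invented purely for illustration.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Return one shortest directed path from start to goal, or None if unreachable."""
    parents = {start: None}          # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the parent pointers back to the start, then reverse.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbour in graph.get(node, []):
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return None


# Hypothetical adjacency list: each key points at the nodes it has outgoing relationships to.
graph = {
    "Jim Webber": ["Book A", "Talk B"],
    "Book A": ["Publisher C"],
    "Talk B": ["The Doctor"],
    "Publisher C": ["The Doctor"],
}

path = shortest_path(graph, "Jim Webber", "The Doctor")
print(path)
```

Breadth-first search explores the graph one hop at a time, so the first time it reaches the goal it has found a path with the fewest relationships.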
Take a step back
We can be smarter about this
Realtime Predictive Analytics
(circa 2008)
Not AI, but extremely effective
Credit: https://medium.com/basecs/breaking-down-breadth-first-search-cebe696709d9
Credit: https://www.networkworld.com/article/3211410/lan-wan/the-10-most-powerful-companies-in-enterprise-networking.html
Toolkit matures into proper database
• Cypher and Neo4j server make real time graph analytical patterns simple to apply
• Amazing and humane to implement
Firstname: Mickey
Surname: Smith
DoB: 19781006
SKU: 5e175641
Product: Badgers Nadgers Ale
SKU: 2555f258
Product: Peewee Pilsner
Category: beer
SKU: 49d102bc
Product: Baby Dry Nights
Category: nappies
Category: baby
Category: alcoholic drinks
SKU: 49d102bc
Product: XBox 360
Category: consumer electronics
Category: console
Relationships: BOUGHT, MEMBER_OF
Firstname: *
Surname: *
DoB: 1996 > x > 1972
Category: beer
Category: nappies
BOUGHT
Category: game console
Young fathers pattern
Firstname: *
Surname: *
DoB: 1996 > x > 1972
Category: beer
Category: nappies
!BOUGHT
Category: game console
Business opportunity
(beer) (nappies)
(console)
(daddy)
() ()
()
(d)-[:BOUGHT]->()-[:MEMBER_OF]->(n)
(d)-[:BOUGHT]->()-[:MEMBER_OF]->(b)
(d)-[:BOUGHT]->()-[:MEMBER_OF]->(c)
Flatten the graph
(d:Person)-[:BOUGHT]->()-[:MEMBER_OF]->(n:Category)
(d:Person)-[:BOUGHT]->()-[:MEMBER_OF]->(b:Category)
(d:Person)-[:BOUGHT]->()-[:MEMBER_OF]->(c:Category)
Include any labels
MATCH (d:Person)-[:BOUGHT]->()-[:MEMBER_OF]->(n:Category),
(d:Person)-[:BOUGHT]->()-[:MEMBER_OF]->(b:Category)
Add a MATCH clause
MATCH (d:Person)-[:BOUGHT]->()-[:MEMBER_OF]->(n:Category),
(d:Person)-[:BOUGHT]->()-[:MEMBER_OF]->(b:Category),
(c:Category)
WHERE NOT((d)-[:BOUGHT]->()-[:MEMBER_OF]->(c))
Constrain the Pattern
MATCH (d:Person)-[:BOUGHT]->()-[:MEMBER_OF]->(n:Category),
(d:Person)-[:BOUGHT]->()-[:MEMBER_OF]->(b:Category),
(c:Category)
WHERE n.category = "nappies" AND
b.category = "beer" AND
c.category = "console" AND
NOT((d)-[:BOUGHT]->()-[:MEMBER_OF]->(c))
Add property constraints
MATCH (d:Person)-[:BOUGHT]->()-[:MEMBER_OF]->(n:Category),
(d:Person)-[:BOUGHT]->()-[:MEMBER_OF]->(b:Category),
(c:Category)
WHERE n.category = "nappies" AND
b.category = "beer" AND
c.category = "console" AND
NOT((d)-[:BOUGHT]->()-[:MEMBER_OF]->(c))
RETURN DISTINCT d AS daddy
Profit!
==> +---------------------------------------------+
==> | daddy |
==> +---------------------------------------------+
==> | Node[15]{name:"Rory Williams",dob:19880121} |
==> +---------------------------------------------+
==> 1 row
==> 0 ms
==>
neo4j-sh (0)$
Results
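The same "bought beer and nappies, but no console" pattern can be sketched outside the database with plain Python sets. This is only an illustration of the logic, not a replacement for the Cypher query; the dictionaries mirror the sample data listed in the editor's notes, and the function names are assumptions.

```python
# Who bought which products (sample data from the editor's notes).
bought = {
    "Mickey Smith": {"Peewee Pilsner", "Badgers Nadgers Ale", "Baby Dry Nights", "XBox 360"},
    "Rose Tyler": {"Shiraz", "Baby Dry Nights"},
    "Rory Williams": {"Peewee Pilsner", "Baby Dry Nights"},
}

# Direct MEMBER_OF relationship from each product to its category.
member_of = {
    "Peewee Pilsner": "beer",
    "Badgers Nadgers Ale": "beer",
    "Baby Dry Nights": "nappies",
    "XBox 360": "console",
    "Shiraz": "alcoholic drinks",
}


def categories_bought(person):
    """Categories reachable via BOUGHT then one MEMBER_OF hop."""
    return {member_of[product] for product in bought[person]}


def find_targets(must_have, must_not_have):
    """People who bought from every must_have category and from no must_not_have category."""
    return sorted(
        person
        for person in bought
        if must_have <= categories_bought(person)
        and not (must_not_have & categories_bought(person))
    )


daddies = find_targets({"beer", "nappies"}, {"console"})
print(daddies)
```

Like the query on the slides, this follows exactly one MEMBER_OF hop and does not apply the date-of-birth range shown in the pattern diagram.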
Which sushi restaurants
in NYC do my friends
like?
Facebook Graph Search
See http://maxdemarzi.com/
Graph Structure
Simple Query, Intelligent Results
MATCH (:Person {name: 'Jim'})
-[:IS_FRIEND_OF]->(:Person)
-[:LIKES]->(restaurant:Restaurant)
-[:LOCATED_IN]->(:Place {location: 'New York'}),
(restaurant)-[:SERVES]->(:Cuisine {cuisine: 'Sushi'})
RETURN restaurant
Search structure
Graph Theory
• Rich knowledge of how graphs
operate in many domains
• Off the shelf algorithms to
process those graphs for
information, insight, predictions
• Low barrier to entry
• Amazingly powerful
Triadic Closure
name: Kyle
name: Stan name: Kenny
Triadic Closure
name: Kyle
name: Stan name: Kenny
name: Kyle
name: Stan name: Kenny
FRIEND
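Triadic closure can be used as a simple link predictor: find pairs of people who share a friend but are not yet connected. The sketch below assumes Kyle is friends with both Stan and Kenny (the slide text only gives the names, so the existing edges are an assumption), and the function name is hypothetical.

```python
from itertools import combinations


def open_triads(friends):
    """Return (a, b, via) triples where a and b share friend `via` but are not friends."""
    result = []
    for via, neighbours in friends.items():
        for a, b in combinations(sorted(neighbours), 2):
            if b not in friends[a]:
                result.append((a, b, via))
    return result


# Undirected friendships stored symmetrically; the edges are assumed for illustration.
friends = {
    "Kyle": {"Stan", "Kenny"},
    "Stan": {"Kyle"},
    "Kenny": {"Kyle"},
}

candidates = open_triads(friends)
print(candidates)
```

Each returned triple is a triangle with one side missing, which is where triadic closure predicts a new FRIEND relationship is likely to form.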
Structural Balance
name: Cartman
name: Craig name: Tweek
Structural Balance
name: Cartman
name: Craig name: Tweek
FRIEND
Structural Balance
name: Cartman
name: Craig name: Tweek
ENEMY
Structural Balance
name: Kyle
name: Stan name: Kenny
FRIEND
Structural Balance is a key
predictive technique
And it’s domain-agnostic
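A compact way to express structural balance is to give FRIEND the sign +1 and ENEMY the sign -1: a triangle is balanced when the product of its three signs is positive, that is, when it has three friendships or exactly one. The sketch below assumes that encoding; which pairs are friends or enemies in the example is an assumption, since the slide text only shows the names.

```python
FRIEND, ENEMY = 1, -1


def is_balanced(sign_ab, sign_bc, sign_ac):
    """A triangle is balanced when the product of its three edge signs is positive."""
    return sign_ab * sign_bc * sign_ac > 0


def predicted_sign(sign_ab, sign_bc):
    """Sign the missing edge A-C needs for the triangle to end up balanced."""
    return sign_ab * sign_bc


# Assumed for illustration: Cartman is an enemy of both Craig and Tweek.
cartman_craig, cartman_tweek = ENEMY, ENEMY
craig_tweek = predicted_sign(cartman_craig, cartman_tweek)
print("FRIEND" if craig_tweek == FRIEND else "ENEMY")
```

The prediction rule encodes the familiar sayings: a friend of a friend is a friend, an enemy of an enemy is a friend, and a friend of an enemy is an enemy.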
Allies and Enemies
UK
Germany France
Russia Italy
Austria
Predicting WWI
[Easley and Kleinberg]
If a node has strong relationships to two neighbours, then these
neighbours must have at least a weak relationship between them.
[Wikipedia]
Strong Triadic Closure
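The property can be checked mechanically: for every node, each pair of its strong neighbours must be joined by some edge, strong or weak. The following is a small sketch under assumed data structures (a dict of strong neighbours and a set of all edges); the function name and the choice of which ties are strong are assumptions.

```python
from itertools import combinations


def stc_violations(strong, all_edges):
    """Return (node, a, b) triples where node has strong ties to a and b, but a-b has no edge.

    strong: dict mapping each node to the set of its strong neighbours.
    all_edges: set of frozenset({x, y}) pairs covering strong and weak edges.
    """
    violations = []
    for node, neighbours in strong.items():
        for a, b in combinations(sorted(neighbours), 2):
            if frozenset((a, b)) not in all_edges:
                violations.append((node, a, b))
    return violations


# Assumed for illustration: Kenny has strong ties to both Stan and Cartman.
strong = {"Kenny": {"Stan", "Cartman"}, "Stan": {"Kenny"}, "Cartman": {"Kenny"}}
edges = {frozenset(("Kenny", "Stan")), frozenset(("Kenny", "Cartman"))}

before = stc_violations(strong, edges)

# Adding even a weak Stan-Cartman relationship satisfies the property.
edges.add(frozenset(("Stan", "Cartman")))
after = stc_violations(strong, edges)
```

A violation is also a prediction: it names the pair of neighbours between whom at least a weak relationship is expected to appear.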
Triadic Closure
(weak relationship)
name: Kenny
name: Stan name: Cartman
Triadic Closure
(weak relationship)
name: Kenny
name: Stan name: Cartman
name: Kenny
name: Stan name: Cartman
FRIEND 50%
• Relationships can have “strength” as well as intent
• Think: weighting on a relationship in a property graph
• Weak links play another super-important structural role in graph
theory
• They bridge neighbourhoods
Weak relationships
Local Bridges
FRIEND
name: Kenny
name: Stan name: Kyle
FRIEND
FRIEND
name: Sally
name: Bebe name: Wendy
FRIEND
FRIEND 50%
name: Cartman
FRIEND
ENEMY
“If a node A in a network satisfies the Strong Triadic Closure Property
and is involved in at least two strong relationships, then any local
bridge it is involved in must be a weak relationship.”
[Easley and Kleinberg]
Local Bridge Property
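Using Easley and Kleinberg's definition, an edge is a local bridge when its two endpoints have no neighbours in common. Here is a minimal sketch that finds such edges; the two triangles and the Stan-Wendy link between them are an assumed layout for illustration, not a transcription of the slide diagram.

```python
def local_bridges(adj):
    """Return edges (a, b) whose endpoints share no common neighbour."""
    bridges = []
    for a in sorted(adj):
        for b in sorted(adj[a]):
            # a < b visits each undirected edge once.
            if a < b and not (adj[a] & adj[b]):
                bridges.append((a, b))
    return bridges


# Assumed layout: two tightly knit triangles joined by a single Stan-Wendy relationship.
adj = {
    "Kenny": {"Stan", "Kyle"},
    "Stan": {"Kenny", "Kyle", "Wendy"},
    "Kyle": {"Kenny", "Stan"},
    "Sally": {"Bebe", "Wendy"},
    "Bebe": {"Sally", "Wendy"},
    "Wendy": {"Sally", "Bebe", "Stan"},
}

bridges = local_bridges(adj)
print(bridges)
```

By the Local Bridge Property, if Stan and Wendy each also have at least two strong relationships and satisfy Strong Triadic Closure, the bridge found here must be a weak relationship.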
University Karate Club
• (NP) Hard problem
• Repeatedly remove the spanning links between dense regions
• Or recursively merge nodes into ever larger “subgraph” nodes
• Choose your algorithm carefully – some are better than others for
a given domain
• Can use to (almost exactly) predict the
break up of the karate club!
Graph Partitioning
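"Repeatedly remove the spanning links between dense regions" can be sketched with edge betweenness, in the style of the Girvan-Newman method: count how many shortest paths cross each edge and cut the busiest one until the graph splits. This is a heuristic sketch for small, unweighted, undirected graphs without self-loops, not necessarily the algorithm used for the karate club result; the toy graph and function names are assumptions.

```python
from collections import deque


def edge_betweenness(adj):
    """Brandes-style edge betweenness; each undirected edge is scored from both directions."""
    score = {}
    for source in adj:
        dist, paths = {source: 0}, {source: 1}
        preds = {node: [] for node in adj}
        order, queue = [], deque([source])
        while queue:  # BFS that also counts shortest paths
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    paths[w] = 0
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        delta = {node: 0.0 for node in order}
        for w in reversed(order):  # accumulate from the farthest nodes back
            for v in preds[w]:
                share = paths[v] / paths[w] * (1.0 + delta[w])
                edge = frozenset((v, w))
                score[edge] = score.get(edge, 0.0) + share
                delta[v] += share
    return score


def components(adj):
    """Connected components as a list of sets, ordered by smallest member."""
    seen, parts = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        part, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in part:
                part.add(node)
                stack.extend(adj[node] - part)
        seen |= part
        parts.append(part)
    return parts


def split_once(adj):
    """Remove highest-betweenness edges until the graph gains one more component."""
    adj = {node: set(nbrs) for node, nbrs in adj.items()}  # work on a copy
    target = len(components(adj)) + 1
    while len(components(adj)) < target:
        score = edge_betweenness(adj)
        if not score:
            break  # no edges left to cut
        a, b = max(score, key=score.get)
        adj[a].discard(b)
        adj[b].discard(a)
    return components(adj)


# Toy graph: two triangles joined by the single spanning link 3-4.
toy = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
parts = split_once(toy)
print(parts)
```

Calling `split_once` again on each resulting part would continue the partitioning into smaller communities.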
University Karate Clubs
(predicted by Graph Theory)
University Karate Clubs
(what actually happened!)
Clustering
• Label Propagation
• Union Find / Weakly Connected Components
• Strongly Connected Components
• Triangle-Count / Clustering Coefficient
Centrality
• PageRank
• Betweenness
• Closeness
• Degree
Path Finding
• Breadth-first search
• Depth-first search
• Single-source shortest path
• All-pairs shortest path
• Minimum weight spanning tree
Graph Algorithms in Neo4j
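To give a feel for one of the centrality algorithms listed, here is a plain power-iteration sketch of PageRank. It is a textbook-style illustration, not Neo4j's implementation; the damping factor of 0.85, the iteration count and the three-node example graph are assumptions, and every link target is assumed to appear as a key.

```python
def pagerank(out_links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict of node -> list of nodes it links to."""
    nodes = list(out_links)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node receives the same small "random jump" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, targets in out_links.items():
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                for target in nodes:
                    new_rank[target] += damping * rank[node] / n
        rank = new_rank
    return rank


# Hypothetical graph: A and B both link to C, and C links back to A.
links = {"A": ["C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(links)
for node in sorted(ranks, key=ranks.get, reverse=True):
    print(node, round(ranks[node], 3))
```

C ends up ranked highest because two nodes point at it, and A outranks B because it is endorsed by the highly ranked C while nothing links to B.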
Amazing Native Graph Performance
Credit: https://reezocar.blob.core.windows.net/blog/2015/09/k2000.jpg
Find and stop spammers
Extract graph structure over time
Not message content!
(Fakhraei et al, KDD 2015)
Learning to stop bad guys
Result: find and classify 70% of spammers with 90% accuracy
Much of modern graph ML is still about turning graphs to vectors
Graph2Vec and friends
Highly complementary techniques
Mixing structural data and features gives better results
Better data into the model, better results out
But we don’t have to always vectorize graphs...
Graph ML
Knowledge Graphs
• Semantic domain knowledge for
inference and understanding
• E.g. eBay Google Assistant
• What’s the next best question to ask
when a potential customer says they
want a bag?
• Price? Function? Colour?
• Depends on context! Demographic,
history, user journey.
• Richly connected data makes the
system seem intelligent
• But it’s “just” data and algorithms in
reality
Graph Convolutional
Neural Networks
A general architecture for
predicting node and relationship
attributes in graphs.
(Kipf and Welling, ICLR 2017)
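The core of the Kipf and Welling architecture is a layer that averages each node's features with its neighbours' before applying a learned transformation: H' = ReLU(D^-1/2 (A + I) D^-1/2 H W). The sketch below implements one such layer in plain Python lists for a toy graph; the inputs and weights are made-up values, and a real model would use a numerical library and train W.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gcn_layer(adjacency, features, weights):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    n = len(adjacency)
    # Add self-loops so each node keeps its own features: A_hat = A + I.
    a_hat = [
        [adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    degree = [sum(row) for row in a_hat]
    # Symmetric normalisation by the degrees of both endpoints.
    norm = [
        [a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
        for i in range(n)
    ]
    aggregated = matmul(norm, features)        # mix neighbour features
    transformed = matmul(aggregated, weights)  # learned linear map
    return [[max(0.0, x) for x in row] for row in transformed]  # ReLU


# Toy input (assumed): two connected nodes, one feature each, identity weight.
adjacency = [[0.0, 1.0], [1.0, 0.0]]
features = [[1.0], [3.0]]
weights = [[1.0]]

output = gcn_layer(adjacency, features, weights)
print(output)
```

With these inputs each node ends up with the average of its own and its neighbour's feature, which shows how graph structure flows into the learned representation.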
Credit: Andrew Docherty (CSIRO), YowData 2017
https://www.youtube.com/watch?v=Gmxz41L70Fg
Graph Networks for
Structured Causal Models
• Position paper from Google,
MIT, Edinburgh
• Structured representations and
computations (graphs) are key
• Goal: generalize beyond direct
experience
• Like human infants can
https://arxiv.org/pdf/1806.01261.pdf
credit: @markhneedham
Thanks for listening
Ask the experts session tomorrow 14:30
@jimwebber


Editor's Notes

  1. Focus: is on graph analytics in this talk.
  2. ML - this is what nerds do. Sometimes ML is so compelling that it seems intelligent, but in reality it’s data and algorithms. AI - train a system to classify animals, and it might also work on shoes. See: hot dog; not hot dog! GP-AI - systems like AlphaGo might be an architecture to support this in future, but we’re not there today.
  3. GP-AI - systems like AlphaGo might be an architecture to support this in future, but we’re not there today
  4. Here’s where we are mostly today. Row-oriented data. Maybe some documents, maybe some columns, but mostly rows of data from arcane data models.
  5. You already know graphs
  6. People talk about Codd’s relational model being mature because it was proposed in 1969 – 49 years old. Euler’s graph theory was proposed in 1736 – 282 years old. Now we use the labelled property graph model. A very simple set of idioms that can build very sophisticated models.
  7. Graphs are the most natural way to model most domains. You already know this because you draw graphs on a whiteboard, but you’ve never had the opportunity to take that down into the database before. Nodes are a bit like documents, but they’re flat at present in Neo4j. You pour data into your nodes and then connect them – easy peasy. This enables high fidelity domain modeling because this is how your domains work. And you don’t have to do this stuff in your application code – it’s right there in the database. Let’s prove it by exploring a fun domain…
  8. Graphs are the most natural way to model most domains. You already know this because you draw graphs on a whiteboard, but you’ve never had the opportunity to take that down into the database before. Nodes are a bit like documents, but they’re flat at present in Neo4j. You pour data into your nodes and then connect them – easy peasy. This enables high fidelity domain modeling because this is how your domains work. And you don’t have to do this stuff in your application code – it’s right there in the database. Let’s prove it by exploring a fun domain…
  9. If you want to know who followed Matt Smith, easy! Traversing the regenerated (or any) relationship takes about 1/40 millionth of a second on this mac in a steady state database
  10. What if you want to know who preceded Matt Smith? Easy. Traverse the regenerated rels in the other way. Cost? About 1/40 millionth of a second on this laptop in a steady state database.
  11. Joins are super cheap for good graph DBs. On my laptop, I can get to 40M traversals/sec in a steady state DB. You can explore a lot of data very quickly, which makes it a good fit for data intensive applications like ML.
  12. My shortest path to Doctor Who?
  13. But before we get to ML, let’s take a step back into my history building smart systems
  14. All the way back to Autumn 2008
  15. November 2007: met Emil at Øredev in Malmö, Sweden. Java and Maven build-your-own-DBMS toolkit called Neo4j. Java Core API only. Long afternoon of loading data and writing a recommendation query...
  16. Find the current customer. Find things they own. Find things that depend on the things they own. Sell. Repeat. All we did at first was understand the dependencies between products and bundles. We never tried to upsell something incompatible. Never tried to sell them something they already owned. Never undersold them. And it opened a world of possibilities to combine other graphs: demographic, social, geographical, municipal, network... The system made intelligent suggestions, but it was not ML or AI, just graph queries. It was good.
  17. Unexpectedly powerful. Solved a problem in a long afternoon that was meant to take years with off-the-shelf software. Applied the same pattern to PoS retail recommendations, fraud detection… in subsequent months. Still amazed! Effect: joined Neo4j as Chief Scientist in 2010. So let’s get into graphs.
  18. Realtime retail recommendations. Historical anecdote about beer and nappies.
  19. Large UK retailer. We had a data model: some of it taxonomical, some of it stock-centric, some transactional.
  20. START n=node(*) MATCH n-[r?]->() DELETE n,r
      CREATE (daddy1:Person { name: 'Mickey Smith', dob: 19781006 })
      CREATE (alcohol:Category { category : 'alcoholic drinks'})
      CREATE (beer:Category { category : 'beer'})
      CREATE beer-[:MEMBER_OF]->alcohol
      CREATE (peeweePilsner:Product { sku: '2555f258', product: 'Peewee Pilsner' })
      CREATE (badgersNadgers:Product { sku: '5e175641', product: 'Badgers Nadgers Ale' })
      CREATE peeweePilsner-[:MEMBER_OF]->beer
      CREATE badgersNadgers-[:MEMBER_OF]->beer
      CREATE daddy1-[:BOUGHT]->peeweePilsner
      CREATE daddy1-[:BOUGHT]->badgersNadgers
      CREATE (baby:Category { category: 'baby' })
      CREATE (nappies:Category { category: 'nappies' })
      CREATE nappies-[:MEMBER_OF]->baby
      CREATE (babyDryNights:Product { sku: '49d102bc', product: 'Baby Dry Nights'})
      CREATE babyDryNights-[:MEMBER_OF]->nappies
      CREATE daddy1-[:BOUGHT]->babyDryNights
      CREATE (consumerElectronics:Category { category: 'consumer electronics' })
      CREATE (console:Category { category: 'console' })
      CREATE (xbox:Product { sku: '49d102bc', product: 'XBox 360' })
      CREATE xbox-[:MEMBER_OF]->(console)-[:MEMBER_OF]->consumerElectronics
      CREATE daddy1-[:BOUGHT]->xbox
      CREATE (mummy1:Person { name: 'Rose Tyler', dob: 19800317 })
      CREATE (wine:Product { sku:'3a3f22bc', product: 'Shiraz' })
      CREATE wine-[:MEMBER_OF]->alcohol
      CREATE mummy1-[:BOUGHT]->wine
      CREATE mummy1-[:BOUGHT]->babyDryNights
      CREATE (daddy2:Person { name: 'Rory Williams', dob: 19880121 })
      CREATE daddy2-[:BOUGHT]->peeweePilsner
      CREATE daddy2-[:BOUGHT]->babyDryNights

      // Cypher 1.0 query
      START beer=node(2), nappies=node(7), xbox=node(11)
      MATCH (daddy)-[:BOUGHT]->()-[:MEMBER_OF]->(beer),
            (daddy)-[:BOUGHT]->()-[:MEMBER_OF]->(nappies),
            (daddy)-[b?:BOUGHT]->(xbox)
      WHERE b is null
      RETURN distinct daddy

      // Cypher 2.0 query
      MATCH (d:Person)-[:BOUGHT]->()-[:MEMBER_OF]->(n:Category),
            (d:Person)-[:BOUGHT]->()-[:MEMBER_OF]->(b:Category),
            (c:Category)
      WHERE n.category = "nappies" AND b.category = "beer" AND c.category = "console"
        AND NOT((d)-[:BOUGHT]->()-[:MEMBER_OF]->(c))
      RETURN DISTINCT d AS daddy
  21. The insight here is that we have a typical young father who buys beer, nappies and a game console, found simply by reducing the subgraph. We have a pattern to search for.
  22. We knew it was young fathers, but I bet your model would classify them as lazy, drunken gamers, right?
  23. Now we look for young fathers – implied by beer and nappies purchases – who haven’t bought a game console.
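The same pattern can be sketched outside the database. A minimal sketch in plain Python, using the people and purchases from the sample data set in these notes; the dictionary layout and the `categories_bought` helper are assumptions for illustration, and the Cypher queries remain the real thing.

```python
# Purchases and product categories copied from the sample data set in these notes.
bought = {
    "Mickey Smith": {"Peewee Pilsner", "Badgers Nadgers Ale", "Baby Dry Nights", "XBox 360"},
    "Rose Tyler": {"Shiraz", "Baby Dry Nights"},
    "Rory Williams": {"Peewee Pilsner", "Baby Dry Nights"},
}
member_of = {
    "Peewee Pilsner": "beer",
    "Badgers Nadgers Ale": "beer",
    "Baby Dry Nights": "nappies",
    "XBox 360": "console",
    "Shiraz": "alcoholic drinks",
}


def categories_bought(person):
    """Follow BOUGHT, then MEMBER_OF, one hop each."""
    return {member_of[product] for product in bought[person]}


# Beer and nappies buyers who have not yet bought a console.
targets = sorted(
    person
    for person in bought
    if {"beer", "nappies"} <= categories_bought(person)
    and "console" not in categories_bought(person)
)
```

On this data the only match is Rory Williams: Mickey Smith already owns the console, and Rose Tyler bought wine rather than beer.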
  24. Turn it to text. And…
  25. Neo4j 2.0:
      MATCH (u:User), (n:ProductType), (b:ProductType), (x:ProductType)
      WHERE n.name = "nappies" AND b.name = "Beer" AND x.name = "Xbox"
        AND (u)-[:BOUGHT]->()<-[:MEMBER_OF]-(n)
        AND (u)-[:BOUGHT]->()<-[:MEMBER_OF]-(b)
        AND NOT((u)-[:BOUGHT]->()<-[:MEMBER_OF]-(x))
      RETURN u
  29. This is fast: query latency is proportional to the amount of graph searched
  30. Now called “network science”
  31. First we need to talk about some local properties
  32. A triadic closure is a local property of (social) graphs whereby if two nodes are connected via a path involving a third node, there is an increased likelihood that the two nodes will become directly connected in future. This is a familiar enough situation for us in a social setting: if we happen to be friends with two people, there's an increased chance that those people will ultimately become direct friends too, since being our friend in the first place is an indication of social similarity and suitability. It’s called triadic closure because we try to close the triangle.
  33. We see this all the time – it’s likely that if we have two friends, they will also become at least acquaintances and potentially friends themselves! In general, if a node A has relationships to B and C then a relationship between B and C is likely to form – especially if the existing relationships are both strong. This is an incredibly strong assertion and will not typically be upheld by all subgraphs in a graph. Nonetheless it is sufficiently commonplace (particularly in social networks) to be trusted as a predictive aid.
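Using triadic closure as a predictive aid can be sketched as counting open triangles. A minimal sketch, assuming an undirected friendship graph held as adjacency sets; the node names and the `predicted_closures` helper are hypothetical.

```python
from itertools import combinations

# Hypothetical undirected friendship graph as adjacency sets.
friends = {
    "A": {"B", "C"},
    "B": {"A"},
    "C": {"A", "D"},
    "D": {"C"},
}


def predicted_closures(adj):
    """Map each unconnected pair to the number of friends they share."""
    scores = {}
    for node, neighbours in adj.items():
        # Every pair of this node's neighbours forms a triangle that may close.
        for b, c in combinations(sorted(neighbours), 2):
            if c not in adj[b]:  # the triangle is still open
                scores[(b, c)] = scores.get((b, c), 0) + 1
    return scores


candidates = predicted_closures(friends)
```

Pairs with more shared friends are the likelier links to form. Weighting each shared friend by relationship strength would follow the "especially if both are strong" caveat, which this sketch ignores.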
  34. Sentiment plays a role in how closures form too – there is a notion of balance.
  35. From a triadic closure perspective this is OK, but intuitively it seems odd. Cartman’s friends shouldn’t be friends with his enemies. Nor should Cartman’s enemies be friends with his friends.
  36. This makes sense – Cartman’s friend Craig is also an enemy of Cartman’s enemy Tweek. Two negative sentiments and one positive sentiment is a balanced structure – and it makes sense too, since we gang up with our friends on our poor beleaguered enemy.
  37. Is this true? Yes. Is it nice? No. Is it realistic? Oh yes.
  38. Another balanced – and more pleasant – arrangement is for three positive sentiments, in this case mutual friends.
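The balance rule used in these notes reduces to one line: code a friendship as +1 and an enmity as -1, and a triangle is balanced when the product of its three signs is positive. A minimal sketch; the `is_balanced` name and the sign encoding are assumptions for illustration.

```python
FRIEND, ENEMY = +1, -1


def is_balanced(sign_ab, sign_bc, sign_ac):
    """A signed triangle is balanced when the product of its edge signs is positive."""
    return sign_ab * sign_bc * sign_ac > 0


# Cartman and Craig are friends; both are enemies of Tweek.
cartman_craig_tweek = is_balanced(FRIEND, ENEMY, ENEMY)

# Three mutual friends.
mutual_friends = is_balanced(FRIEND, FRIEND, FRIEND)

# Two friendships and one enmity: the friend of my friend is my enemy.
awkward = is_balanced(FRIEND, FRIEND, ENEMY)
```

Under this rule +++ and --+ are balanced, while ++- and --- are not.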
  39. The Three Emperors’ League: a starting point for a network of friends and enemies, 100 years on from the armistice. Red links indicate an enemy-of relationship; black links indicate a friend-of relationship.
  40. Italy forms the Triple Alliance with Austria and Germany – a balanced +++ triadic closure. If Italy had made only a single alliance (or enemy) it would have been unstable and another relationship would be likely to form anyway!
  41. German-Russian Lapse: Russia becomes hostile to Austria and Germany – a balanced --+ triadic closure – and becomes agnostic towards France.
  42. French-Russian Alliance: the French and Russians ally, forming a balanced --+ triadic closure with the UK.
  43. The UK and France enter into the famous Entente Cordiale. This produces an unbalanced ++- triadic closure with Russia, and the graph doesn’t like it.
  44. British-Russian Alliance: the British and Russians form an alliance, thereby changing their previously unbalanced triadic closure into a balanced one. Other local pressures on the graph make other closures form. Italy becomes hostile to Russia, forming a balanced --+ closure with France, and another balanced --+ closure with the UK. Germany and the UK become hostile, forming a balanced --+ closure with Austria and another balanced --+ closure with Italy.
  45. That WWI can be predicted without domain knowledge by iterating a graph and applying local structural constraints is nothing short of astonishing to me. Note how the network slides into a balanced labeling — and into World War I.
  46. A very surprising result: graphs don’t know about human conflicts.
  47. In this case the strong triadic closure property still holds – though it is a weak link that characterises the relationship between Stan and Cartman. Given a starting graph, we can apply this simple local principle to see how it would evolve.
  49. A local bridge acts as a link – perhaps the only realistic link – between two otherwise distant (or separate) subgraphs. Local bridges are semantically rich – they provide conduits for information flow between otherwise independent groups. In this case DATING is a local bridge – it must also be a weak relationship according to our definition of a local bridge. Intuitively this makes sense – your girl/boyfriend is rather less important at age 8 than your regular friends, IIRC.
  50. How do we identify local bridges? Any weak link whose removal would cause a component of the graph to become disconnected. Being able to identify local bridges is important – in this case it’s the only known conduit that allows the girls and boys to communicate. In real life, local bridges are apparent in your organisation as experts (or managers), and appear as a nexus in fraud cases.
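The removal test can be sketched directly: drop each link in turn and check whether its two ends can still reach each other. A minimal sketch under stated assumptions: the friendship groups below are hypothetical (only the DATING link between the groups comes from these notes), link strength is ignored, and the helper names are illustrative.

```python
from collections import deque


def reachable(adj, start, skip_edge):
    """Breadth-first search from start that refuses to cross skip_edge."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            if {node, neighbour} == set(skip_edge):
                continue  # pretend this link has been removed
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def bridges(edges):
    """Links whose removal disconnects their two endpoints."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return [(a, b) for a, b in edges if b not in reachable(adj, a, (a, b))]


# Hypothetical groups: three boys, three girls, one DATING link between them.
edges = [
    ("Stan", "Kyle"), ("Kyle", "Cartman"), ("Cartman", "Stan"),
    ("Wendy", "Bebe"), ("Bebe", "Heidi"), ("Heidi", "Wendy"),
    ("Stan", "Wendy"),  # DATING
]
found = bridges(edges)
```

This brute-force check reruns a search per link, which is fine for a whiteboard-sized graph; large graphs would want a single-pass bridge-finding algorithm instead.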
  51. Zachary, in the Journal of Anthropological Research, 1977. Intuitively we can see “clumps” in this graph. But how do we separate them out? It’s called minimum cut.
  52. What’s interesting is that it’s mechanical – no domain knowledge is necessary. There’s only one failure with the method Zachary chose to partition the graph: node 9 should have gone to the instructor’s club but instead went with the original president of the club (node 34). Why? Because the student was three weeks away from completing a four-year quest to obtain a black belt, which he could only do with the instructor (node 1). Other minimum cut approaches might deliver slightly different results, but on the whole it’s amazing you get such insight from an algorithm!
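One mechanical way to compute a minimum cut is to push maximum flow between the two leaders and see which nodes stay reachable from the source afterwards. A minimal sketch, assuming unit-strength undirected ties and a shortest-augmenting-path (Edmonds-Karp style) search. The toy graph is hypothetical, not Zachary's data, and this is not claimed to be his exact procedure.

```python
from collections import defaultdict, deque


def min_cut(edges, source, sink):
    """Return (cut size, nodes left on the source side) for unit-capacity ties."""
    capacity = defaultdict(lambda: defaultdict(int))
    for a, b in edges:
        capacity[a][b] += 1
        capacity[b][a] += 1

    def search():
        # Breadth-first search over edges that still have spare capacity.
        parents = {source: None}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour, spare in capacity[node].items():
                if spare > 0 and neighbour not in parents:
                    parents[neighbour] = node
                    queue.append(neighbour)
        return parents

    flow = 0
    parents = search()
    while sink in parents:
        # Walk back from the sink to recover the augmenting path.
        path = []
        node = sink
        while parents[node] is not None:
            path.append((parents[node], node))
            node = parents[node]
        pushed = min(capacity[u][v] for u, v in path)
        for u, v in path:
            capacity[u][v] -= pushed
            capacity[v][u] += pushed
        flow += pushed
        parents = search()
    # Whatever the source can still reach is its side of the cut.
    return flow, set(parents)


# Hypothetical club: two tight triangles joined by a single tie (3-4).
toy_edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (5, 6), (4, 6)]
cut_size, source_side = min_cut(toy_edges, source=1, sink=6)
```

On the toy graph the cut severs the single 3-4 tie and leaves nodes 1, 2 and 3 with the source, with no domain knowledge involved.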
  53. But is there enough information in the graph itself to predict the schism?
  55. Actually Neo4j already has a bunch of these algorithms. Call them easily from Cypher. Emergent intelligence from the graph!
  56. Efficiency for graph operations is paramount. You don’t need huge macho clusters to do this.
  57. Large payment provider, transaction history. A 300M node, ~18B rel graph pageranked with 20 iterations in less than 2 hours using the graph algos. On commodity hardware.
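For a feel of what one of those 20 iterations does, here is a minimal sketch of PageRank by power iteration in plain Python. It is not the Neo4j graph algorithms implementation; the damping factor of 0.85 is the conventional default and the four-node graph is hypothetical.

```python
def pagerank(out_links, damping=0.85, iterations=20):
    """Power-iteration PageRank; every link target must also be a key of out_links."""
    nodes = list(out_links)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1 - damping) / n + damping * dangling / n for node in nodes}
        for node in nodes:
            for target in out_links[node]:
                # Each node shares its rank equally among the nodes it links to.
                new_rank[target] += damping * rank[node] / len(out_links[node])
        rank = new_rank
    return rank


# Hypothetical payment graph: who pays whom.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(links)
```

Each iteration touches every relationship once, so the cost per iteration grows with the number of relationships.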
  58. Contemporary AI
  59. Graph structure itself is rich. In this example we don’t need to know the content of the messages to know they’re spam at high confidence, just their position in the graph. Mine a vector of graph features, feed it into the trained model. Graphs have a key advantage: structural context. Where is the node in the graph? Who are its neighbours? Etc. That richness feeds into the model and makes it better, more accurate, more dependable. PageRank, Degree, Neighbourhood, Colour, etc are all features that improve your ML outcomes but are only available from graphs.
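Mining a vector of graph features can be sketched as follows. A minimal sketch, assuming an undirected graph as adjacency sets; the choice of four features, the node names and the `structural_features` helper are illustrative, and a real pipeline would add PageRank, community labels and so on before handing the vector to a model.

```python
def structural_features(adj, node):
    """Return [degree, triangles, local clustering, two-hop neighbourhood size]."""
    neighbours = adj[node]
    degree = len(neighbours)
    # Count pairs of neighbours that are themselves connected.
    triangles = sum(
        1 for a in neighbours for b in neighbours if a < b and b in adj[a]
    )
    possible = degree * (degree - 1) / 2
    clustering = triangles / possible if possible else 0.0
    # Nodes exactly two hops away.
    two_hop = {far for near in neighbours for far in adj[near]} - neighbours - {node}
    return [degree, triangles, clustering, len(two_hop)]


# Hypothetical message graph: one account contacting everyone, few ties among the rest.
adj = {
    "spammer": {"u1", "u2", "u3", "u4"},
    "u1": {"spammer", "u2"},
    "u2": {"spammer", "u1", "u3"},
    "u3": {"spammer", "u2"},
    "u4": {"spammer"},
}
feature_vector = structural_features(adj, "u2")
```

None of these numbers looks at message content; they describe only where the node sits in the graph.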
  60. ICLR = International Conference on Learning Representations. Graph of movies that a user liked: feed into neural net. Graph of users who rated one of those movies: feed into neural net. Recurse through the data until you get to all the movies and all the users, which are just embedding vectors (fancy hashes that place like near like in a vector space). [Can change these vectors for features to avoid cold-starts, without changing overall architecture.] Graph of back-propagated trained neural nets. Incremental: scalable for both training and prediction. Extensible: bring in other graph layers! Better than collaborative filtering because it can work on any graph, not just bipartite user-likes-movies graphs. E.g. user likes actor in movies with genre – much richer! A bipartite graph, also called a bigraph, is a set of graph vertices decomposed into two disjoint sets such that no two graph vertices within the same set are adjacent. I.e. users don’t connect to users, only to movies. This is already happening - it’s YouTube’s recommender algorithm.
  61. A growing realisation from leaders in the AI community: graph networks as the foundational building block for human-like AI. They argue that combinatorial generalization must be a top priority for AI to achieve human-like abilities: it must be able to compose a finite set of elements in infinite ways (e.g. like language). We draw analogies by aligning the relational structure between two domains and drawing inferences about one based on corresponding knowledge about the other (Gentner and Markman, 1997; Hummel and Holyoak, 2003). Hierarchies are critical. Inductive bias: how the algorithm prioritises solutions. Relational inductive biases guide deep learning about entities, relations, and rules for composing them. I.e. the learning understands graphs.
  62. All this might seem hard at first – we’re used to tables, and our toolkits expect them. Graphs change this for the better. Once you get graphs, all the other things seem hard.
  63. “a vast gap between human and machine intelligence remains, especially with respect to efficient, generalizable learning”. 70% of graph ML today is still turning graphs to vectors, e.g. DeepWalk: random walk through the graph, assigning a vector to each node when encountered, based on its neighborhood. 30% is truly graph AI, e.g. the “differentiable neural computer”: discern patterns that users can’t; write sophisticated algorithms (fraud, shortest path, etc) from incentive declarations. E.g. we no longer need a human expert to discover the “young father” pattern in our data; the machine learns it’s a valuable query in some contexts. So enjoy using graphs for AI, but please remember graphs for good!