This is a novice-track talk, so all concepts and examples are kept simple
1. Basic graph theory concepts and definitions
2. A few real-world scenarios framed as graph data
3. Working with graphs in Python
The overall goal of this talk is to spark your interest in graphs and show you
what’s out there, as a jumping-off point for going deeper
Graph: “A structure amounting to a set of objects in which some
pairs of the objects are in some sense ‘related’. The objects
correspond to mathematical abstractions called vertices (also called
nodes or points) and each of the related pairs of vertices is called an
edge (also called an arc or line)” – Richard Trudeau, Introduction to
Graph Theory (1st edition, 1993)
Graph Analytics: “Analysis of data structured as a graph
(sometimes also part of network analysis or link analysis depending
on scope and context)” – Me, talking to a stress ball as I made these
slides
• We see two vertices joined by
a single edge
• Vertex 1 is adjacent to vertex 2
• The neighborhood of vertex 1
is all adjacent vertices (vertex
2 in this case)
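As a rough illustration of adjacency and neighborhoods, here is a minimal plain-Python sketch. It assumes the graph is stored as a dictionary mapping each vertex to the set of its neighbors; the helper names `are_adjacent` and `neighborhood` are made up for this example and do not come from any library.

```python
# Undirected graph stored as an adjacency mapping: vertex -> set of neighbors.
# This is the two-vertex graph from the slide: one edge joining 1 and 2.
adjacency = {1: {2}, 2: {1}}


def are_adjacent(graph, u, v):
    """Return True if an edge joins u and v."""
    return v in graph[u]


def neighborhood(graph, v):
    """Return the set of vertices adjacent to v."""
    return set(graph[v])


print(are_adjacent(adjacency, 1, 2))  # True
print(neighborhood(adjacency, 1))     # {2}
```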
• We see that there is a loop on
vertex a
• Vertices a and b have multiple
edges between them
• Vertex c has a degree of 3
• There exists a path from vertex a
to vertex e
• Vertices f, g, and h form a 3-cycle
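Loops and multiple edges change how degree is counted, which a small sketch may make concrete. The edge list below is hypothetical (the slide's drawing is not reproduced here); it is only chosen so that a has a loop, a and b share two edges, and c ends up with degree 3. The usual convention that a loop adds 2 to its vertex's degree is assumed.

```python
from collections import Counter

# Hypothetical multigraph as an edge list: a repeated pair is a multiple edge,
# and a pair such as ("a", "a") is a loop.
edges = [("a", "a"), ("a", "b"), ("a", "b"), ("b", "c"), ("c", "d"), ("c", "e")]


def degrees(edge_list):
    """Count edge endpoints per vertex; a loop contributes 2 to its vertex."""
    degree = Counter()
    for u, v in edge_list:
        degree[u] += 1
        degree[v] += 1
    return degree


deg = degrees(edges)
print(deg["c"])  # 3
print(deg["a"])  # 4: the loop counts twice, plus the two edges to b
```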
• We have no single cut vertex or cut
edge (one that would create more
disjoint vertex/edge sets if
removed)
• We can separate this graph into two
disconnected sets:
1) Vertex Set 1 = {a, b, c, d, e}
2) Vertex Set 2 = {f, g, h}
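One way to find such disconnected sets is a breadth-first search from each unvisited vertex. The sketch below is plain Python; the edges inside {a, b, c, d, e} are an assumption made for illustration, since only the two vertex sets and the f-g-h cycle are stated on the slides.

```python
from collections import deque

# Hypothetical edges: a 5-cycle on a..e and the 3-cycle on f, g, h.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "a"),
         ("f", "g"), ("g", "h"), ("h", "f")]


def connected_components(edge_list):
    """Return the vertex sets of the connected components, found by BFS."""
    graph = {}
    for u, v in edge_list:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)

    seen, components = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            current = queue.popleft()
            for neighbor in graph[current]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    component.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return components


components = connected_components(edges)
print(len(components))  # 2
```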
• Imagine the same vertex
labels along the top and
left-hand side of the
matrix
• A one in a particular slot
tells us that the two
vertices are adjacent
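To make the matrix idea concrete, here is a small sketch that builds an adjacency matrix as a list of lists. The four-vertex graph is hypothetical and chosen only to keep the output short; for an undirected graph the matrix is symmetric, so each edge sets two entries.

```python
# Hypothetical undirected graph on four vertices.
vertices = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "c")]


def adjacency_matrix(vertices, edges):
    """Return a square 0/1 matrix; entry [i][j] is 1 when i and j are adjacent."""
    index = {v: i for i, v in enumerate(vertices)}
    matrix = [[0] * len(vertices) for _ in vertices]
    for u, v in edges:
        matrix[index[u]][index[v]] = 1
        matrix[index[v]][index[u]] = 1  # undirected: mirror the entry
    return matrix


matrix = adjacency_matrix(vertices, edges)
for label, row in zip(vertices, matrix):
    print(label, row)
```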
• In this graph two vertices are
joined by a single directed
edge
• There is a dipath from vertex 1
to vertex 2 but not from vertex
2 to vertex 1
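Direction matters for reachability, and a short sketch can show it. The function `has_dipath` below is a made-up name for a plain breadth-first search that only follows arcs from tail to head; it is applied to the single arc from the slide.

```python
from collections import deque


def has_dipath(arcs, source, target):
    """Return True if target can be reached from source along directed arcs."""
    successors = {}
    for tail, head in arcs:
        successors.setdefault(tail, []).append(head)

    seen, queue = {source}, deque([source])
    while queue:
        current = queue.popleft()
        if current == target:
            return True
        for nxt in successors.get(current, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


arcs = [(1, 2)]  # one directed edge from vertex 1 to vertex 2
print(has_dipath(arcs, 1, 2))  # True
print(has_dipath(arcs, 2, 1))  # False
```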
• Every vertex has ‘played’ every
other vertex
• We can see that there is no clear
winner (every vertex has
indegree and outdegree of 2)
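A tournament like this can be checked by counting arcs into and out of each vertex. The results below are an assumption for illustration: five players on a circle, each beating the next two, which is one way to get indegree and outdegree 2 everywhere. An arc (winner, loser) points from the winner to the loser.

```python
from collections import Counter

players = [0, 1, 2, 3, 4]

# Hypothetical results: each player beats the next two players around a circle.
results = [(p, (p + k) % 5) for p in players for k in (1, 2)]

outdegree = Counter(winner for winner, _ in results)  # games won
indegree = Counter(loser for _, loser in results)     # games lost

for p in players:
    print(p, "wins:", outdegree[p], "losses:", indegree[p])
```

With every player on two wins and two losses, degree counts alone cannot pick a winner.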
• Vertices from Set 1 = {a, b, c, d} are
only adjacent to vertices from Set 2
= {e, f, g, h}
• This can be extended to tripartite
graphs (3 sets) or as many sets as we
like (n-partite graphs)
• Can we pair vertices from each set
together?
We can pair every vertex
from one set to a vertex
from the other using only
existing edges
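Such a pairing can be searched for with augmenting paths. The sketch below is one standard approach (often attributed to Kuhn) written in plain Python; the edges between Set 1 and Set 2 are hypothetical, and `maximum_matching` is a name chosen for this example.

```python
# Hypothetical edges from Set 1 = {a, b, c, d} to Set 2 = {e, f, g, h}.
edges = {
    "a": ["e", "f"],
    "b": ["e"],
    "c": ["f", "g"],
    "d": ["g", "h"],
}


def maximum_matching(adjacent):
    """Pair left vertices with right vertices using augmenting paths."""
    match = {}  # right vertex -> left vertex currently paired with it

    def try_assign(left, visited):
        for right in adjacent[left]:
            if right in visited:
                continue
            visited.add(right)
            # Take a free right vertex, or re-route its current partner.
            if right not in match or try_assign(match[right], visited):
                match[right] = left
                return True
        return False

    for left in adjacent:
        try_assign(left, set())
    return {left: right for right, left in match.items()}


pairing = maximum_matching(edges)
print(sorted(pairing.items()))
# [('a', 'f'), ('b', 'e'), ('c', 'g'), ('d', 'h')]
```

Here b can only pair with e, so a is pushed to f, and the rest follows; every vertex ends up paired.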
• We can assign weights to edges
of a graph
• As we follow a path through the
graph, these weights accumulate
• For example, the path a -> b -> c
has an associated weight of
0.5 + 0.4 = 0.9
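The accumulation of weights along a path can be sketched in a few lines. The dictionary layout (arc as key, weight as value) and the name `path_weight` are assumptions for this example; the weights are the ones from the slide.

```python
# Edge weights keyed by (from, to), using the values from the slide.
weights = {("a", "b"): 0.5, ("b", "c"): 0.4}


def path_weight(path, weights):
    """Sum the weights of the consecutive edges along a path."""
    return sum(weights[(u, v)] for u, v in zip(path, path[1:]))


total = path_weight(["a", "b", "c"], weights)
print(round(total, 2))  # 0.9
```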
• We can assign colors to vertices
• The graph we see here has a
proper coloring (no two vertices
of the same color are adjacent)
• We can also color edges!
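Checking whether a vertex coloring is proper only requires looking at each edge once. In the sketch below the graph and the colors are hypothetical, and `is_proper_coloring` is a name made up for the example.

```python
# Hypothetical graph: a triangle a-b-c with a pendant vertex d attached to c.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
coloring = {"a": "red", "b": "green", "c": "blue", "d": "red"}


def is_proper_coloring(edges, coloring):
    """Return True if no edge joins two vertices of the same color."""
    return all(coloring[u] != coloring[v] for u, v in edges)


print(is_proper_coloring(edges, coloring))  # True

# Giving d the same color as its neighbor c breaks the coloring.
bad_coloring = dict(coloring, d="blue")
print(is_proper_coloring(edges, bad_coloring))  # False
```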
• Are we focused more on objects or the relationships/interactions
between them?
• Are we looking at transition states?
• Is orientation important?
If you can imagine a graph to represent it, it’s probably worth giving it a
shot, if only for your own learning and exploration!
• If the lines represent
connections, what can we say
about the people highlighted
in red?
• What kinds of questions might
a graph be able to answer?
• e and d have the highest
degree
• What might the c-d-e cycle
tell us?
• What can we say about cut
vertices?
If we have page view
data with timestamps
how might we
represent this as a
graph?
• What might loops or multiple edges
between vertices represent?
• What types of data might we want to
use as values on the edges?
• What might comparing indegrees and
outdegrees on different vertices
represent?
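One possible answer, sketched under assumptions: treat pages as vertices and each consecutive pair of views by the same user as a directed edge, with the count of that transition as the edge value. The log format (user, timestamp in seconds, page) and its contents are invented for this example.

```python
from collections import Counter
from itertools import groupby

# Hypothetical page view log: (user, timestamp in seconds, page).
page_views = [
    ("u1", 10, "home"), ("u1", 25, "search"), ("u1", 40, "product"),
    ("u2", 12, "home"), ("u2", 30, "search"), ("u2", 55, "search"),
    ("u1", 70, "home"),
]


def transition_counts(views):
    """Count page -> page transitions per user, ordered by timestamp."""
    ordered = sorted(views, key=lambda view: (view[0], view[1]))
    counts = Counter()
    for _, visits in groupby(ordered, key=lambda view: view[0]):
        pages = [page for _, _, page in visits]
        counts.update(zip(pages, pages[1:]))
    return counts


transitions = transition_counts(page_views)
for (source, target), count in sorted(transitions.items()):
    print(source, "->", target, count)
```

In this toy log the search -> search entry is a loop (a repeated view of the same page), and the count of 2 on home -> search plays the role of multiple edges.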
If we have to regularly pick up a
load at the train station, make
deliveries to every factory and
then return to the garage how can
a graph help us find an optimal
route?
• We can assign weights to each edge to
represent distance, travel time, fuel cost,
etc.
• The route with the lowest total weight is
the shortest, cheapest, or fastest one,
depending on what the weights measure
• Note that edge weights are only
displayed for f-e and f-a
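For a handful of stops, the lowest-weight route can be found by simply trying every delivery order. The sketch below does that with `itertools.permutations`; the place names and all distances are hypothetical, and brute force like this only scales to small graphs.

```python
from itertools import permutations

# Hypothetical symmetric distances between the garage, the station,
# and three factories.
distance = {
    frozenset(("garage", "station")): 4,
    frozenset(("station", "factory_a")): 3,
    frozenset(("station", "factory_b")): 6,
    frozenset(("station", "factory_c")): 5,
    frozenset(("factory_a", "factory_b")): 2,
    frozenset(("factory_a", "factory_c")): 7,
    frozenset(("factory_b", "factory_c")): 3,
    frozenset(("garage", "factory_a")): 8,
    frozenset(("garage", "factory_b")): 5,
    frozenset(("garage", "factory_c")): 2,
}


def route_length(route):
    """Total weight of the edges between consecutive stops."""
    return sum(distance[frozenset((u, v))] for u, v in zip(route, route[1:]))


def best_route(factories):
    """Try every factory order: garage -> station -> factories -> garage."""
    candidates = (
        ("garage", "station", *order, "garage")
        for order in permutations(factories)
    )
    return min(candidates, key=route_length)


best = best_route(["factory_a", "factory_b", "factory_c"])
print(best, route_length(best))
```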
If the following people want to
attend the following talks (a-h),
what’s the minimum number of
sessions we need to satisfy
everyone?
• We can use the talks as
vertices and add edges
between talks that have the
same person interested
• The minimum number of
colors needed for a proper
coloring shows us the
minimum number of
sessions we need to satisfy
everyone
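The scheduling idea can be sketched as a conflict graph plus a greedy coloring. The attendee names and their interests below are invented for illustration. Note that a greedy coloring only gives an upper bound on the minimum number of sessions in general; in this toy data it happens to be tight, because talks a, b, and c all conflict with each other.

```python
from itertools import combinations

# Hypothetical interests: person -> talks they want to attend.
interests = {
    "Ana": ["a", "b", "c"],
    "Ben": ["c", "d"],
    "Cy": ["d", "e", "a"],
    "Di": ["f", "g"],
    "Ed": ["g", "h", "f"],
}


def conflict_graph(interests):
    """Talks are vertices; an edge joins two talks wanted by the same person."""
    graph = {talk: set() for talks in interests.values() for talk in talks}
    for talks in interests.values():
        for u, v in combinations(talks, 2):
            graph[u].add(v)
            graph[v].add(u)
    return graph


def greedy_coloring(graph):
    """Give each talk the lowest session number unused by its neighbors."""
    session = {}
    for talk in sorted(graph):
        taken = {session[n] for n in graph[talk] if n in session}
        session[talk] = next(s for s in range(len(graph)) if s not in taken)
    return session


graph = conflict_graph(interests)
sessions = greedy_coloring(graph)
print(sessions)
print(len(set(sessions.values())))  # 3
```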
https://github.com/igraph/python-igraph https://github.com/networkx
https://graph-tool.skewed.de
• GraphML (XML-based)
• GML (ASCII-based)
• NetworkX has built-in functions to work with a pandas DataFrame or a
NumPy array/matrix
import networkx as nx
import matplotlib.pyplot as plt
# Build an undirected graph on the vertices 1 through 5
G = nx.Graph()
vertices = []
for x in range(1, 6):
    vertices.append(x)
G.add_nodes_from(vertices)
G.add_edges_from([(1, 2), (2, 3), (5, 4),
                  (4, 2), (1, 3), (5, 1), (5, 2), (3, 4)])
# Compute one layout, then draw nodes, edges, and labels on it
pos = nx.spring_layout(G)
nx.draw_networkx_nodes(G, pos, node_size=20)
nx.draw_networkx_edges(G, pos, width=5)
nx.draw_networkx_labels(G, pos, font_size=14)
nx.draw(G, pos)
plt.show()
import networkx as nx
import matplotlib.pyplot as plt
# Build a weighted triangle on the vertices a, b, c
G = nx.Graph()
G.add_nodes_from(['a', 'b', 'c'])
G.add_edge('a', 'b', weight=0.5)
G.add_edge('b', 'c', weight=0.2)
G.add_edge('c', 'a', weight=0.7)
# Draw nodes, edges, vertex labels, and edge attribute labels
pos = nx.spring_layout(G)
nx.draw_networkx_nodes(G, pos, node_size=500)
nx.draw_networkx_edges(G, pos, width=6)
nx.draw_networkx_labels(G, pos, font_size=14)
nx.draw_networkx_edge_labels(G, pos, font_size=14)
nx.draw(G, pos)
plt.show()
>>> G.nodes()
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 13, 14, 15, 16, 17, 18, 19, 20]
>>> nx.shortest_path(G, 1, 18)
[1, 3, 18]
>>> G.degree()
{1: 4, 2: 3, 3: 4, 4: 4, 5: 4, 6: 3,
7: 3, 8: 3, 9: 4, 10: 3, 11: 2,
12: 2, 13: 2, 14: 4, 15: 3, 16: 3,
17: 2, 18: 3, 19: 3, 20: 3}
>>> nx.greedy_color(G)
{'d': 0, 'a': 0, 'e': 1, 'b': 1,
'c': 1, 'f': 2, 'h': 1, 'g': 0}
>>> temp = nx.greedy_color(G)
>>> len(set(temp.values()))
3
import networkx as nx
import matplotlib.pyplot as plt
# Build a directed graph straight from a list of (tail, head) arcs
G = nx.DiGraph([(1, 2), (1, 3), (4, 1),
                (1, 5), (2, 3), (2, 4), (2, 5), (3, 4),
                (3, 5), (4, 5)])
pos = nx.circular_layout(G)
nx.draw_networkx_nodes(G, pos, node_size=200)
nx.draw_networkx_edges(G, pos)
nx.draw_networkx_labels(G, pos, font_size=14)
plt.show()
>>> nx.has_path(G, 1, 5)
True
>>> nx.has_path(G, 5, 1)
False
>>> nx.shortest_path(G, 1, 4)
[1, 2, 4]
>>> nx.maximal_matching(G)
{(1, 4), (5, 2), (6, 3)}
• There’s a NetworkX tutorial tomorrow!
• In-browser Graphviz: webgraphviz.com
• Free graph theory textbook: An Introduction to Combinatorics and
Graph Theory, David Guichard
• Open problems in graph theory: openproblemgarden.org
• Graph databases
• Association for Computational Linguistics (ACL) 2010 Workshop on
Graph-based Methods for Natural Language Processing
• Free papers: researchgate.net

Graph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma
