
Graphs for AI. A Path for Data Science

Amy Hodler


- 1. Graphs & AI: A Path for Data Science. Amy E. Hodler, Director, Graph Analytics & AI Programs, Neo4j (@amyhodler)
- 2. It’s Not What You Know
- 3. It’s Who You Know And Where They Are
- 4. Whose pay will increase the most?
- 5. Network structure is highly predictive of pay and promotions: • People Near Structural Holes • Organizational Misfits. Sources: “Organizational Misfits and the Origins of Brokerage in Intrafirm Networks,” A. Kleinbaum; “Structural Holes and Good Ideas,” R. Burt. (Photo by Helena Lopes on Unsplash)
- 6. Relationships and Network Structure: Strongest Predictors of Behavior & Complex Outcomes. “Research into networks reveal that, surprisingly, the most connected people inside a tight group within a single industry are less valuable than the people who span the gaps ...” “…jumping from ladder to ladder is a more effective strategy, and that lateral or even downward moves across an organization are more promising in the longer run . . .”
- 7. It’s a counter-intuitive notion
- 8. Which is why network science is so powerful
- 9. Overview: • Network Structure and Predictions • Neo4j for Graph Data Science • Steps of Graph Data Science
- 10. Relationships: The Strongest Predictors of Behavior! “Increasingly we're learning that you can make better predictions about people by getting all the information from their friends and their friends’ friends than you can from the information you have about the person themselves” (James Fowler)
- 11. Graph Is Accelerating AI Innovation. [Chart: AI research papers featuring graph technology, by year, 2010 to 2019; labeled values 823, 1607, 2439, 3765, 5824. Search terms: graph neural network, graph convolutional, graph embedding, graph learning, graph attention, graph kernel, graph completion. Source: Dimensions knowledge system]
- 12. Better Predictions with Graphs: Using the Data You Already Have • Current data science models ignore network structure • Graphs add highly predictive features to ML models, increasing accuracy • Otherwise unattainable predictions based on relationships [Diagram: Machine Learning Pipeline]
- 13. Neo4j for Graph Analytics, AI and Data Science. Caterpillar’s AI Supply Chain & Maintenance: • 27 million warranty & service documents parsed for text to knowledge graph • Graph is context for AI to learn “prime examples” and anticipate maintenance • Improves satisfaction and equipment lifespan. German Center for Diabetes Research (DZD): • Connecting 50 research databases, 100k’s of Excel workbooks, 30 bio-sample databases • Bytes 4 Diabetes Award for use of a knowledge graph, graph analytics, and AI • Customized views for flexible research angles. Financial Fraud Detection & Recovery (Top 10 Bank): • Almost 70% of credit card fraud was missed • ~1B nodes and +1B relationships to analyze • Graph analytics with queries & algorithms help find $ millions of fraud in 1st year
- 14. Graph Data Science Applications (just a few examples…): • Predictive Maintenance • Churn Prediction • Fraud Detection • Life Sciences • Recommendations • Cybersecurity • Customer Segmentation • Search/MDM
- 15. A Path for Graph Data Science
- 16. The Steps of Graph Data Science: • Knowledge Graphs • Graph Analytics • Graph Feature Engineering • Graph Embeddings • Graph Networks. Stages: Decision Support, Graph Based Predictions, Graph Native Learning
- 17. The Steps of Graph Data Science. Highlighted step: Knowledge Graphs (graph search and queries; support domain experts). Other steps: Graph Analytics, Graph Feature Engineering, Graph Embeddings, Graph Networks
- 18. Knowledge Graph. Connecting the Dots has become... • Multiple graph layers of financial information • Includes corporate data with cross-relationships and external news
- 19. Knowledge Graph with Queries. Connecting the Dots has become... Dashboards and tools: • Credit risk • Investment risk • Portfolio news recommendations • Typical analyst portfolio is 200 companies • Custom relative weights. 1 Week Snapshot: 800,000 shortest path calculations for the ranked newsfeed. Each calculation optimized to take approximately 10 ms.
- 20. The Steps of Graph Data Science. Highlighted step: Graph Analytics (graph queries & algorithms for offline analysis; understanding structures). Other steps: Knowledge Graphs, Graph Feature Engineering, Graph Embeddings, Graph Networks
- 21. Local Patterns: Query (e.g. Cypher). Fast, local decisioning and pattern matching. You know what you’re looking for and making a decision. Global Computation: Graph Algorithms (e.g. Neo4j Algorithms Library). Global analysis and iterations. You’re learning the overall structure of a network, updating data, and predicting.
- 22. Graph Analytics via Queries: Detecting Financial Fraud. Improving existing pipelines to identify fraud via heuristics. Deceptively simple queries: • How many flagged accounts are in the applicant’s network 4+ hops out? • How many login / account variables in common? • Add these metrics to your approval process • Difficult for an RDBMS over 3 hops
- 23. Graph Analytics via Algorithms (Generally Unsupervised). A subset of data science algorithms that come from network science, Graph Algorithms enable reasoning about network structure. • Pathfinding and Search • Centrality (Importance) • Community Detection • Heuristic Link Prediction • Similarity
- 24. Graph Algorithms & Functions in Neo4j. Pathfinding & Search: • Shortest Path • Single-Source Shortest Path • All Pairs Shortest Path • A* Shortest Path • Yen’s K Shortest Path • Minimum Weight Spanning Tree • K-Spanning Tree (MST) • Random Walk. Centrality / Importance: • Degree Centrality • Closeness Centrality • CC Variations: Harmonic, Dangalchev, Wasserman & Faust • Betweenness Centrality & Approximate • PageRank • Personalized PageRank • ArticleRank • Eigenvector Centrality. Community Detection: • Triangle Count • Clustering Coefficients • Connected Components (Union Find) • Strongly Connected Components • Label Propagation • Louvain Modularity • K-1 Coloring • Balanced Triad (identification). Similarity: • Euclidean Distance • Cosine Similarity • Node Similarity (Jaccard) • Overlap Similarity • Pearson Similarity • Approximate KNN. Link Prediction: • Adamic Adar • Common Neighbors • Preferential Attachment • Resource Allocations • Same Community • Total Neighbors. Auxiliary Functions: • Random graph generation • One hot encoding • Distributions & metrics
- 25. Graph Algorithms: Detecting Financial Fraud. Graph algorithms enable reasoning about network structure: • Louvain to identify communities that frequently interact • PageRank to measure influence and transaction volumes • Connected components to identify disjoint groups sharing identifiers • Jaccard to measure account similarity
- 26. The Steps of Graph Data Science. Highlighted step: Graph Feature Engineering (graph algorithms & queries for machine learning; improve prediction accuracy). Other steps: Knowledge Graphs, Graph Analytics, Graph Embeddings, Graph Networks
- 27. Graph Feature Engineering (EXTRACTION). Feature engineering combines and processes data to create new, more meaningful features, such as clustering or connectivity metrics. Source: PaySim Dataset.

  | Client | Betweenness Centrality | Unique Shared Identifiers | Weighted Score | Known Fraudster? |
  | --- | --- | --- | --- | --- |
  | Jacob Olsen | 0 | 1 | 1 | No |
  | Kaylee Roach | 32 | 2 | 4 | Yes |
  | Mackenzie Burns | 0 | 0 | 0 | No |
  | Kayla Knowles | 192 | 3 | 4 | Yes |
  | Nicholas Jones | 0 | 1 | 2 | No |
  | John Smith | 0.08 | 2 | 10 | Yes |
- 28. Graph Feature Engineering. Feature engineering combines and processes data to create new, more meaningful features, such as clustering or connectivity metrics. Apply machine learning to this table to build a predictive model.

  | Client | Betweenness Centrality | Shared Identifiers | Weighted PageRank | Known Fraudster? |
  | --- | --- | --- | --- | --- |
  | Jacob Olsen | 0 | 1 | 1 | No |
  | Kaylee Roach | 32 | 2 | 4 | Yes |
  | Mackenzie Burns | 0 | 0 | 0 | No |
  | Kayla Knowles | 192 | 3 | 4 | Yes |
  | Nicholas Jones | 0 | 1 | 2 | No |
  | John Smith | 0.08 | 2 | 10 | Yes |
- 29. The Steps of Graph Data Science: • Knowledge Graphs • Graph Analytics • Graph Feature Engineering • Graph Embeddings • Graph Networks. Stages: Decision Support, Graph Based Predictions, Graph Native Learning. (Slide label: FUTURE)
- 30. Neo4j GDS Library (Preview): Evolving the Graph Algorithms Library for Data Scientists • Run optimized, parallel algorithms over tens of billions of nodes • Production features like seeding for consistency • Scalable in-memory graph model that loads in parallel and can flexibly aggregate & reshape underlying data models • Simplified syntax & API with easy-to-understand guides, warnings, & error messages • Extensive documentation with examples, tips, and browser guides
- 31. For Enterprise Graph Data Science: • Neo4j Graph Data Science Library: Practical, Scalable Graph Data Science • Neo4j Database: Native Graph Creation & Persistence • Neo4j Bloom: Graph Exploration & Prototyping (Slide label: Preview)
- 32. Business: neo4j.com/use-cases/artificial-intelligence-analytics/ • Data Scientists: neo4j.com/sandbox • Developers: neo4j.com/download • neo4j.com/graph-algorithms-book (Free Until April 15)
- 33. “AI is not all about Machine Learning. Context, structure, and reasoning are necessary ingredients, and Knowledge Graphs and Linked Data are key technologies for this.” Wais Bashir, Managing Editor, Onyx Advisory
- 34. Amy E. Hodler, @amyhodler, amy.hodler@neo4j.com
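
As an illustration of the fraud heuristics on slides 22 and 25, the sketch below counts flagged accounts within a given number of hops of an applicant and scores two accounts by the Jaccard similarity of their identifiers. It is a minimal plain-Python sketch, not Neo4j or Cypher code: the toy graph, the account names, and the functions `flagged_within_hops` and `jaccard` are hypothetical and do not come from the talk, and the hop question is read here as "within at most N hops".

```python
from collections import deque


def flagged_within_hops(graph, flagged, start, max_hops):
    """Count flagged accounts reachable from `start` in at most `max_hops` hops.

    `graph` is an adjacency dict (hypothetical toy data, not a Neo4j graph).
    Breadth-first search visits each account once, at its shortest hop distance.
    """
    seen = {start}
    queue = deque([(start, 0)])
    count = 0
    while queue:
        node, dist = queue.popleft()
        if node != start and node in flagged:
            count += 1
        if dist == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return count


def jaccard(a, b):
    """Jaccard similarity of two identifier sets: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical toy network: a chain-like neighborhood around one applicant.
graph = {
    "applicant": ["a1", "a2"],
    "a1": ["applicant", "a3"],
    "a2": ["applicant"],
    "a3": ["a1", "a4"],
    "a4": ["a3", "a5"],
    "a5": ["a4"],
}
flagged = {"a3", "a5"}  # assumed known-fraud accounts

print(flagged_within_hops(graph, flagged, "applicant", 2))
print(flagged_within_hops(graph, flagged, "applicant", 4))
print(jaccard({"phone:1", "email:x"}, {"phone:1", "email:y"}))
```

On this toy graph the applicant has one flagged account within 2 hops and two within 4 hops. Numbers like these could become columns in a feature table of the kind shown on slide 27; on a large graph the same traversal would normally be left to the database rather than written by hand.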
