
GraphTour 2020 Amsterdam

Jim Webber, Neo4j


- 1. Graphs & AI: A Path for Data Science. Dr. Jim Webber, Chief Scientist, Neo4j, @jimwebber
- 2. It’s Not What You Know
- 3. It’s Who You Know And Where They Are
- 4. Whose pay will increase the most?
- 5. Network Structure Is Highly Predictive of Pay, Promotions and Positive Reviews • People Near Structural Holes • Organizational Misfits. Sources: “Organizational Misfits and the Origins of Brokerage in Intrafirm Networks”, A. Kleinbaum; “Structural Holes and Good Ideas”, R. Burt. (Photo by Helena Lopes on Unsplash)
- 6. Relationships and Network Structure: Strongest Predictors of Behavior & Complex Outcomes. “Research into networks reveal that, surprisingly, the most connected people inside a tight group within a single industry are less valuable than the people who span the gaps ...” “…jumping from ladder to ladder is a more effective strategy, and that lateral or even downward moves across an organization are more promising in the longer run . . .”
- 7. These are counter-intuitive notions
- 8. Which I hope have piqued your interest
- 9. Overview: Network Structure and Predictions; Neo4j for Graph Data Science; Steps of Graph Data Science
- 10. Relationships: The Strongest Predictors of Behavior! “Increasingly we're learning that you can make better predictions about people by getting all the information from their friends and their friends’ friends than you can from the information you have about the person themselves”
- 11. Predicting Financial Contagion: From Global to Local
- 12. Graph Is Accelerating AI Innovation. [Chart: AI Research Papers Featuring Graph, by year, 2010 to 2019; labelled values 823, 1607, 2439, 3765, 5824. Source: Dimension Knowledge System.] Graph Technology terms shown: graph neural network, graph convolutional, graph embedding, graph learning, graph attention, graph kernel, graph completion
- 13. Graph Data Science Applications: Predictive Maintenance, Churn Prediction, Fraud Detection, Life Sciences, Recommendations, Cybersecurity, Customer Segmentation, Search/MDM
- 14. Better Predictions with Graphs: Using the Data You Already Have • Current data science models ignore network structure • Graphs add highly predictive features to ML models, increasing accuracy • Otherwise unattainable predictions based on relationships. [Diagram: Machine Learning Pipeline]
- 15. Steps of Graph Data Science
- 16. Goals of Graph Data Science: Better Decisions; Higher Accuracy; New Learning and More Trust
- 17. The Steps of Graph Data Science: Knowledge Graphs, Graph Analytics, Graph Feature Engineering, Graph Embeddings, Graph Networks. Outcomes shown: Decision Support, Graph Based Predictions, Graph Native Learning
- 18. The Steps of Graph Data Science. Knowledge Graphs: graph search and queries; support domain experts. Other steps shown: Graph Analytics, Graph Feature Engineering, Graph Embeddings, Graph Networks
- 19. Knowledge Graph with Queries: Connecting the Dots. Multiple graph layers of financial information. Includes corporate data with cross-relationships and external news. [Slide graphic: "has become..."]
- 20. Knowledge Graph with Queries: Connecting the Dots. Dashboards and tools • Credit risk • Investment risk • Portfolio news recommendations • Typical analyst portfolio is 200 companies • Custom relative weights. 1 Week Snapshot: 800,000 shortest path calculations for the ranked newsfeed. Each calculation optimized to take approximately 10 ms. [Slide graphic: "has become..."]
- 21. The Steps of Graph Data Science. Graph Analytics: graph queries & algorithms for offline analysis; understanding structures. Other steps shown: Knowledge Graphs, Graph Feature Engineering, Graph Embeddings, Graph Networks
- 22. Local Patterns: Query (e.g. Cypher). Fast, local decisioning and pattern matching. You know what you’re looking for and making a decision. Global Computation: Graph Algorithms (e.g. Neo4j Algorithms Library). Global analysis and iterations. You’re learning the overall structure of a network, updating data, and predicting.
- 23. Graph Analytics via Queries: Detecting Financial Fraud. Improving existing pipelines to identify fraud via heuristics. Deceptively simple queries: How many flagged accounts are in the applicant’s network 4+ hops out? How many login / account variables do they have in common? Add these metrics to your approval process. These queries are difficult for an RDBMS beyond 3 hops.
- 24. Graph Analytics via Algorithms (Generally Unsupervised). A subset of data science algorithms that come from network science, Graph Algorithms enable reasoning about network structure. Categories: Pathfinding and Search; Centrality (Importance); Community Detection; Heuristic Link Prediction; Similarity
- 25. 45+ Graph Algorithms in Neo4j. Pathfinding and Search: Parallel BFS, Parallel DFS, Shortest Path, Single Source Shortest Path, All Pairs Shortest Path, Minimum Spanning Tree, A* Shortest Path, Yen’s K-Shortest Path, Random Walk. Centrality (Importance): Degree Centrality, Closeness Centrality (inc. harmonic, Dangalchev, Wasserman & Faust), Betweenness Centrality, Approx. Betweenness Centrality, PageRank, Personalized PageRank, ArticleRank, Eigenvector Centrality. Community Detection: Triangle Count, Clustering Coefficients, Connected Components (aka Union Find), Strongly Connected Components, Label Propagation, Louvain Modularity, Balanced Triad. Heuristic Link Prediction: Adamic Adar, Common Neighbours, Preferential Attachment, Resource Allocations, Same Community, Total Neighbours. Similarity: Euclidean Distance, Cosine Similarity, Jaccard Similarity, Overlap Similarity, Pearson Similarity, Approximate KNN
- 26. The Steps of Graph Data Science. Graph Feature Engineering: graph algorithms & queries for machine learning; improve prediction accuracy. Other steps shown: Knowledge Graphs, Graph Analytics, Graph Embeddings, Graph Networks
- 27. Graph Feature Engineering. Feature Engineering is how we combine and process the data to create new, more meaningful features, such as clustering or connectivity metrics. Graph features add more dimensions to machine learning. [Diagram label: EXTRACTION]
- 28. Feature Engineering using Graph Queries: Telecom-churn prediction. Churn prediction research has found that simple hand-engineered features are highly predictive • How many calls/texts has an account made? • How many of their contacts have churned?
- 29. Feature Engineering using Graph Queries: Telecom-churn prediction. Add graph features based on graph queries to ML data. Pipeline stages shown: Raw Data: Call Detail Records (Caller ID, Receiver ID, Time, Duration, Location, …); Random Sample Selection; Input Data: CDR Sample (Caller ID, Receiver ID, Time, Duration, Location, …); Feature Engineering; Test/Training Data (Call Stats by: Incoming, Outgoing, Per day, Short durations, In-network, Centrality, SMSs, …); Identify Early Predictors: select simple, interpretable metrics that are highly correlated w/churn; Churn Score: supervised learning to predict binary & continuous measures of churn; Output/Results
- 30. Feature Engineering using Graph Queries: Telecom-churn prediction. 89.4% Accuracy in Subscriber Churn Prediction. Pipeline stages shown: Raw Data: Call Detail Records (Caller ID, Receiver ID, Time, Duration, Location, …); Random Sample Selection; Input Data: CDR Sample (Caller ID, Receiver ID, Time, Duration, Location, …); Feature Engineering; Test/Training Data (Call Stats by: Incoming, Outgoing, Per day, Short durations, In-network, Centrality, SMSs, …); Identify Early Predictors: select simple, interpretable metrics that are highly correlated w/churn; Churn Score: supervised learning to predict binary & continuous measures of churn; Output/Results. Source: Behavioral Modeling for Churn Prediction by Khan et al, 2015
- 31. Feature Engineering using Graph Algorithms: Detecting Financial Fraud. Using Structure to Improve ML Predictions • Connected components to identify disjoint groups sharing identifiers • PageRank to measure influence and transaction volumes • Louvain to identify communities that frequently interact • Jaccard to measure account similarity
- 32. The Steps of Graph Data Science: Knowledge Graphs, Graph Analytics, Graph Feature Engineering, Graph Embeddings, Graph Networks. Outcomes shown: Decision Support, Graph Based Predictions, Graph Native Learning. [Label: FUTURE]
- 33. for Enterprise-Ready Graph Data Science. Practical, Scalable Graph Data Science: harness the natural power of relationships and network structures to infer behavior (Neo4j Graph Algorithms). Native Graph Creation & Persistence: get all the graph you can eat with an integrated database built to store and protect relationships (Neo4j Database). Graph Exploration & Prototyping: explore results visually, quickly prototype and collaborate with different groups (Neo4j Desktop and Browser, Neo4j Bloom)
- 34. A Neo4j Graph Data Science Library (Preview). Data scientists are under pressure to add more value, faster. That means putting predictive models into production quickly with the data they already have. Evolving the Neo4j Graph Algorithms Library to focus on Data Scientists • Practical, easy-to-use graph data science and analytics • Use network structures to increase predictive accuracy • Enterprise-grade features and scale
- 35. Typical Experience vs. Graph Data Science (Streamlined & Supported). Data Modeling: “How do I represent my data as a graph?” vs. “We’re a graph database, your data are already in the right shape.” Which Algorithms?: “Which library? How do I know what this algorithm is telling me?” vs. “We support high value algorithms that are well documented.” Learn Syntax: “Pick library that seems easy, learn syntax and fight esoteric error messages.” vs. “Our syntax is standardized and simplified across our library!” Reshape: “What!? I have to convert my data into different format myself? Did I get it right?” vs. “Our graph loaders seamlessly reshape your data.” What Now?: “How the $#@! do I get it into production?” vs. “It’s easy to write your results and move straight to production!”
- 36. Business: neo4j.com/use-cases/artificial-intelligence-analytics/ Data Scientists: neo4j.com/sandbox Developers: neo4j.com/download Book: https://neo4j.com/graph-algorithms-book (Free Until April 15)
- 37. Business: neo4j.com/use-cases/artificial-intelligence-analytics/ Data Scientists: neo4j.com/sandbox Developers: neo4j.com/download Book: https://neo4j.com/graph-databases-book/ (Free Forever)
- 38. “AI is not all about Machine Learning. Context, structure, and reasoning are necessary ingredients, and Knowledge Graphs and Linked Data are key technologies for this.” Wais Bashir, Managing Editor, Onyx Advisory
- 39. Dr. Jim Webber, Chief Scientist, Neo4j, @jimwebber
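
Slide 23 asks how many flagged accounts sit within a given number of hops of an applicant. The sketch below shows one way that count could be computed with a breadth-first traversal in plain Python. It is only an illustration: the function name `flagged_within_hops`, the adjacency-dictionary representation and the toy accounts are all assumptions, and the talk itself points to graph queries (e.g. Cypher) rather than hand-written traversal code.

```python
from collections import deque


def flagged_within_hops(graph, start, flagged, max_hops):
    """Count flagged accounts reachable from `start` in at most `max_hops` hops.

    `graph` is a hypothetical adjacency dict: account -> list of linked accounts.
    """
    seen = {start}
    frontier = deque([(start, 0)])
    count = 0
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                if neighbour in flagged:
                    count += 1
                frontier.append((neighbour, depth + 1))
    return count


# Toy data, invented for illustration only.
graph = {
    "applicant": ["a", "b"],
    "a": ["applicant", "c"],
    "b": ["applicant", "d"],
    "c": ["a", "e"],
    "d": ["b"],
    "e": ["c", "f"],
    "f": ["e"],
}
flagged = {"d", "f"}

two_hop_count = flagged_within_hops(graph, "applicant", flagged, 2)
four_hop_count = flagged_within_hops(graph, "applicant", flagged, 4)
```

In this toy network one flagged account lies within two hops and a second only appears at four hops, which is the kind of metric the slide suggests adding to an approval process.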

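Slide 28 names two simple hand-engineered churn features: how many calls an account has made, and how many of its contacts have churned. A minimal sketch of deriving both from call records follows, assuming each record is a (caller, receiver) pair. The function name `churn_features`, the record shape and the sample accounts are hypothetical and are not taken from the talk or from the cited study.

```python
from collections import defaultdict


def churn_features(call_records, churned):
    """Build per-account features from (caller, receiver) call records.

    Returns a dict: account -> {"calls_made", "churned_contacts"}.
    """
    calls_made = defaultdict(int)
    contacts = defaultdict(set)
    for caller, receiver in call_records:
        calls_made[caller] += 1
        # Treat a call as an undirected contact relationship.
        contacts[caller].add(receiver)
        contacts[receiver].add(caller)
    return {
        account: {
            "calls_made": calls_made.get(account, 0),
            "churned_contacts": len(neighbours & churned),
        }
        for account, neighbours in contacts.items()
    }


# Toy call detail records, invented for illustration only.
records = [
    ("ann", "bob"),
    ("ann", "cat"),
    ("bob", "cat"),
    ("dan", "ann"),
    ("ann", "bob"),
]
churned = {"cat", "dan"}

features = churn_features(records, churned)
```

Each resulting row could then be joined to the other columns of a training set, in the spirit of the "add graph features to ML data" step on slide 29.
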