
GraphTour 2020 - Graphs & AI: A Path for Data Science

GraphTour 2020 Amsterdam
Jim Webber, Neo4j

  1. 1. Graphs & AI: A Path for Data Science. Dr. Jim Webber, Chief Scientist, Neo4j, @jimwebber
  2. 2. It’s Not What You Know
  3. 3. It’s Who You Know And Where They Are
  4. 4. Whose pay will increase the most?
  5. 5. Network Structure is Highly Predictive of Pay, Promotions and Positive Reviews • People Near Structural Holes • Organizational Misfits. Sources: “Organizational Misfits and the Origins of Brokerage in Intrafirm Networks”, A. Kleinbaum; “Structural Holes and Good Ideas”, R. Burt. Photo by Helena Lopes on Unsplash.
  6. 6. Relationships and Network Structure: Strongest Predictors of Behavior & Complex Outcomes “Research into networks reveal that, surprisingly, the most connected people inside a tight group within a single industry are less valuable than the people who span the gaps ...” “…jumping from ladder to ladder is a more effective strategy, and that lateral or even downward moves across an organization are more promising in the longer run . . .”
  7. 7. These are counter-intuitive notions
  8. 8. Which I hope have piqued your interest
  9. 9. Overview: Network Structure and Predictions • Neo4j for Graph Data Science • Steps of Graph Data Science
  10. 10. Relationships: The Strongest Predictors of Behavior! “Increasingly we're learning that you can make better predictions about people by getting all the information from their friends and their friends’ friends than you can from the information you have about the person themselves”
  11. 11. Predicting Financial Contagion: From Global to Local
  12. 12. Graph Is Accelerating AI Innovation. [Chart: AI Research Papers Featuring Graph, by year, 2010 to 2019; data labels shown: 823, 1607, 2439, 3765, 5824] Graph Technology: graph neural network, graph convolutional, graph embedding, graph learning, graph attention, graph kernel, graph completion. Source: Dimension Knowledge System
  13. 13. Graph Data Science Applications: Predictive Maintenance, Churn Prediction, Fraud Detection, Life Sciences, Recommendations, Cybersecurity, Customer Segmentation, Search/MDM
  14. 14. Better Predictions with Graphs: Using the Data You Already Have • Current data science models ignore network structure • Graphs add highly predictive features to ML models, increasing accuracy • Otherwise unattainable predictions based on relationships [Diagram: Machine Learning Pipeline]
  15. 15. Steps of Graph Data Science
  16. 16. Goals of Graph Data Science: Better Decisions, Higher Accuracy, New Learning and more Trust
  17. 17. The Steps of Graph Data Science: Knowledge Graphs, Graph Analytics, Graph Feature Engineering, Graph Embeddings, Graph Networks. Groupings shown: Decision Support, Graph Based Predictions, Graph Native Learning
  18. 18. The Steps of Graph Data Science (Knowledge Graphs, Graph Analytics, Graph Feature Engineering, Graph Embeddings, Graph Networks). Highlighted step: Knowledge Graphs. Graph search and queries. Support domain experts
  19. 19. Knowledge Graph with Queries: Connecting the Dots. has become... Multiple graph layers of financial information. Includes corporate data with cross-relationships and external news
  20. 20. Knowledge Graph with Queries: Connecting the Dots. has become... Dashboards and tools • Credit risk • Investment risk • Portfolio news recommendations • Typical analyst portfolio is 200 companies • Custom relative weights. 1 Week Snapshot: 800,000 shortest path calculations for the ranked newsfeed. Each calculation optimized to take approximately 10 ms.
  21. 21. The Steps of Graph Data Science (Knowledge Graphs, Graph Analytics, Graph Feature Engineering, Graph Embeddings, Graph Networks). Highlighted step: Graph Analytics. Graph queries & algorithms for offline analysis. Understanding Structures
  22. 22. Local Patterns: Query (e.g. Cypher). Fast, local decisioning and pattern matching. You know what you’re looking for and making a decision. Global Computation: Graph Algorithms (e.g. Neo4j Algorithms Library). Global analysis and iterations. You’re learning the overall structure of a network, updating data, and predicting.
  23. 23. Graph Analytics via Queries: Detecting Financial Fraud. Improving existing pipelines to identify fraud via heuristics. Deceptively Simple Queries: How many flagged accounts are in the applicant’s network 4+ hops out? How many login / account variables in common? Add these metrics to your approval process. Difficult for an RDBMS over 3 hops.
  24. 24. Graph Analytics via Algorithms: Generally Unsupervised. A subset of data science algorithms that come from network science, Graph Algorithms enable reasoning about network structure. Categories: Pathfinding and Search; Centrality (Importance); Community Detection; Heuristic Link Prediction; Similarity
  25. 25. 45+ Graph Algorithms in Neo4j. Pathfinding and Search: Parallel BFS, Parallel DFS, Shortest Path, Single Source Shortest Path, All Pairs Shortest Path, Minimum Spanning Tree, A* Shortest Path, Yen’s K-Shortest Path, Random Walk. Centrality (Importance): Degree Centrality, Closeness Centrality (inc. harmonic, Dangalchev, Wasserman & Faust), Betweenness Centrality, Approx. Betweenness Centrality, Page Rank, Personalized Page Rank, ArticleRank, Eigenvector Centrality. Community Detection: Triangle Count, Clustering Coefficients, Connected Components (aka Union Find), Strongly Connected Components, Label Propagation, Louvain Modularity, Balanced Triad. Heuristic Link Prediction: Adamic Adar, Common Neighbours, Preferential Attachment, Resource Allocations, Same Community, Total Neighbours. Similarity: Euclidean Distance, Cosine Similarity, Jaccard Similarity, Overlap Similarity, Pearson Similarity, Approximate KNN
  26. 26. The Steps of Graph Data Science (Knowledge Graphs, Graph Analytics, Graph Feature Engineering, Graph Embeddings, Graph Networks). Highlighted step: Graph Feature Engineering. Graph algorithms & queries for machine learning. Improve Prediction Accuracy
  27. 27. Graph Feature Engineering: Feature Engineering is how we combine and process the data to create new, more meaningful features, such as clustering or connectivity metrics. Graph features add more dimensions to machine learning. [Diagram label: EXTRACTION]
  28. 28. Feature Engineering using Graph Queries: Telecom-churn prediction. Churn prediction research has found that simple hand-engineered features are highly predictive • How many calls/texts has an account made? • How many of their contacts have churned?
  29. 29. Feature Engineering using Graph Queries: Telecom-churn prediction. Add graph features based on graph queries to ML data. Pipeline stages shown: Raw Data: Call Detail Records (Caller ID, Receiver ID, Time, Duration, Location, …); Input Data: CDR Sample (Caller ID, Receiver ID, Time, Duration, Location, …); Call Stats by: Incoming, Outgoing, Per day, Short durations, In-network, Centrality, SMS’s, …; Test/Training Data; Identify Early Predictors: Select simple, interpretable metrics that are highly correlated w/churn; Churn Score: Supervised learning to predict binary & continuous measures of churn; Output/Results. Step labels: Random Sample Selection, Feature Engineering
  30. 30. Feature Engineering using Graph Queries: Telecom-churn prediction. 89.4% Accuracy in Subscriber Churn Prediction. Pipeline stages shown: Raw Data: Call Detail Records (Caller ID, Receiver ID, Time, Duration, Location, …); Input Data: CDR Sample (Caller ID, Receiver ID, Time, Duration, Location, …); Call Stats by: Incoming, Outgoing, Per day, Short durations, In-network, Centrality, SMS’s, …; Test/Training Data; Identify Early Predictors: Select simple, interpretable metrics that are highly correlated w/churn; Churn Score: Supervised learning to predict binary & continuous measures of churn; Output/Results. Step labels: Random Sample Selection, Feature Engineering. Source: Behavioral Modeling for Churn Prediction by Khan et al, 2015
  31. 31. Feature Engineering using Graph Algorithms: Detecting Financial Fraud. Using Structure to Improve ML Predictions • Connected components to identify disjoint groups sharing identifiers • PageRank to measure influence and transaction volumes • Louvain to identify communities that frequently interact • Jaccard to measure account similarity
  32. 32. The Steps of Graph Data Science: Knowledge Graphs, Graph Analytics, Graph Feature Engineering, Graph Embeddings, Graph Networks. Groupings shown: Decision Support, Graph Based Predictions, Graph Native Learning. [Diagram label: FUTURE]
  33. 33. for Enterprise-Ready, Graph Data Science. Practical, Scalable Graph Data Science: Harness the natural power of relationships and network structures to infer behavior (Neo4j Graph Algorithms). Native Graph Creation & Persistence: Get all the graph you can eat with an integrated database built to store and protect relationships (Neo4j Database). Graph Exploration & Prototyping: Explore results visually, quickly prototype and collaborate with different groups (Neo4j Desktop and Browser, Neo4j Bloom)
  34. 34. A Neo4j Graph Data Science Library (Preview). Data scientists are under pressure to add more value, faster. That means putting predictive models into production quickly with the data they already have. Evolving the Neo4j Graph Algorithms Library to focus on Data Scientists • Practical, easy-to-use graph data science and analytics • Use network structures to increase predictive accuracy • Enterprise-grade features and scale
  35. 35. Typical Experience vs. Graph Data Science (Streamlined & Supported). Data Modeling: “How do I represent my data as a graph?” vs. We’re a graph database, your data are already in the right shape. Which Algorithms?: “Which library? How do I know what this algorithm is telling me?” vs. We support high value algorithms that are well documented. Learn Syntax: “Pick library that seems easy, learn syntax and fight esoteric error messages.” vs. Our syntax is standardized and simplified across our library! Reshape: “What!? I have to convert my data into different format myself? Did I get it right?” vs. Our graph loaders seamlessly reshape your data. What Now?: “How the $#@! do I get it into production?” vs. It’s easy to write your results and move straight to production!
  36. 36. Business: neo4j.com/use-cases/artificial-intelligence-analytics/ • Data Scientists: neo4j.com/sandbox • Developers: neo4j.com/download • https://neo4j.com/graph-algorithms-book (Free Until April 15)
  37. 37. Business: neo4j.com/use-cases/artificial-intelligence-analytics/ • Data Scientists: neo4j.com/sandbox • Developers: neo4j.com/download • https://neo4j.com/graph-databases-book/ (Free Forever)
  38. 38. “AI is not all about Machine Learning. Context, structure, and reasoning are necessary ingredients, and Knowledge Graphs and Linked Data are key technologies for this.” Wais Bashir, Managing Editor, Onyx Advisory
  39. 39. Dr. Jim Webber, Chief Scientist, Neo4j, @jimwebber
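
Slide 23 asks how many flagged accounts sit within a few hops of an applicant, and slide 31 names connected components and Jaccard similarity as fraud features. As a rough sketch of what such features compute, the plain-Python functions below work on a small in-memory undirected graph. This is illustrative only and is not how Neo4j or Cypher implement these operations; the toy graph, the account names, and the function names are all hypothetical and do not come from the talk.

```python
from collections import deque

# Hypothetical toy graph: undirected adjacency sets, every node present as a key.
graph = {
    "applicant": {"a"},
    "a": {"applicant", "b"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"c"},
    "x": {"y"},
    "y": {"x"},
}
flagged = {"b", "d", "y"}  # hypothetical set of accounts already flagged


def flagged_within_hops(graph, start, flagged, max_hops):
    """Count flagged accounts reachable from `start` in at most `max_hops` hops."""
    seen = {start}
    queue = deque([(start, 0)])
    count = 0
    while queue:
        node, dist = queue.popleft()
        if node != start and node in flagged:
            count += 1
        if dist < max_hops:  # only expand while we are inside the hop limit
            for nbr in graph.get(node, ()):
                if nbr not in seen:
                    seen.add(nbr)
                    queue.append((nbr, dist + 1))
    return count


def jaccard(a, b):
    """Jaccard similarity of two identifier collections: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def connected_components(graph):
    """Return the connected components of an undirected graph as a list of sets."""
    seen = set()
    components = []
    for node in graph:
        if node in seen:
            continue
        component = set()
        stack = [node]
        seen.add(node)
        while stack:
            current = stack.pop()
            component.add(current)
            for nbr in graph.get(current, ()):
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        components.append(component)
    return components


if __name__ == "__main__":
    print(flagged_within_hops(graph, "applicant", flagged, 2))
    print(flagged_within_hops(graph, "applicant", flagged, 4))
    print(jaccard({"device1", "ip1"}, {"ip1", "phone1"}))
    print(sorted(len(c) for c in connected_components(graph)))
```

Each function returns a single number or grouping per account, which is the shape a tabular ML pipeline expects: the values can be appended as extra feature columns alongside existing per-account data.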