
GraphTour - Workday: Tracking activity with Neo4j (English Version)

François Ritaly - Workday (English Version)

  1. Tracking Activity with Neo4j
  2. Who Am I?
     • Build Engineer (Build Engineering Team)
     • Located in Paris, France
     • Responsibilities
       ‒ Development of reusable Gradle plugins
       ‒ Administration of Artifactory
       ‒ Development of custom tools
       ‒ Support for engineering teams (mainly build-related)
       ‒ Sentinel (server that tracks activity)
     Workday Confidential
  3. The Build Engineering Mission
  4. Build Engineering – Our Mission
     • Define policies for engineering teams (dependency locking, artifact promotion, artifact metadata)
     • Provide reusable tooling (Gradle plugins and other custom tools)
     • Administer shared services (Artifactory)
     • Provide assistance to engineering teams
  5. Why We Need Monitoring
     • To ensure policies are followed
     • Engineers enjoy a lot of freedom at Workday!
       ‒ Netflix: The Paved Road
     • Is our tooling relevant?
     • Gain insight into how development teams are working
     → We need answers to these questions!
  6. We’re interested in …
     • Artifacts (JARs, RPMs, “Deliveries”, etc.)
     • CI builds (in Bamboo, TeamCity and Jenkins)
     • SCM changes (in Bitbucket, GitHub, Gerrit, etc.)
     • Dependencies (between artifacts, builds, etc.)
     • JIRA issues (tracking of code)
     • Promotions (of artifacts)
     • Metadata in general
  7. Problem: The Data is Everywhere
     • No unified system of record with all this information!
     • The data is scattered across different systems (AF, CI, JIRA…)
     • … is secured with different credentials (AD, LDAP)
     • … is stored in different formats (JSON, XML, CSV, etc.)
     • … is not always easily accessible
     • Accessing one data source is (usually) easy
     • Accessing two data sources is already a bit trickier
     • No unified query language for joining the aggregated data
  8. Requirements
  9. Requirements
     • Simple access to the information
     • Unified and intuitive data model
     • Powerful query language
     • Data as accurate as possible
       → Frequent updates to the data
       → Updates must be fast (performance)
     • Ability to easily refactor the data model
     • Ability to expose this information to engineering teams (automation)
  10. But first of all!
      • We don’t want to rely on users to provide the information (unless we have no other choice)!
      • The information we need usually already exists or can be derived, so let’s use it!
  11. Sentinel – Architecture Overview
  12. Architecture Overview
      A foundation to solve current and future problems
      [Diagram] Data sources (JIRA, Artifactory, Bamboo, Bitbucket, …) → Data Miner (aggregation, sanitization, normalization) → Neo4j → REST API and Web UI
  13. The Data Miner
      • Command line tool (written in Groovy)
      • Executable fat jar
      • Runs from Bamboo every 15 minutes
      • Scans the data sources containing the information we need
      • Preemptively extracts, sanitizes and normalizes the data
      • Detects incremental changes (optimized for performance)
      • Crash-proof
      • A run executes 59 commands in sequence
      • Scan time: 8 minutes (minimum), 23 minutes (average)
  14. The Neo4j Database
      • A (NoSQL) graph database
      • The graph paradigm fits our needs
      • Very flexible and easy to use
      • Schema-less
      • Excellent performance
      • All the useful data in one place
      • Cypher (query language)!
  15. The Services We Expose
      • UI made of HTML dashboards and dynamic charts
      • REST API
      • Spring Boot, Thymeleaf, D3.js, Swagger
  16. Neo4j in a Nutshell
  17. Neo4j - Nodes
      • Nodes have properties (comparable to a Map<String, ?>)
      • … can have 0-N labels (typing, polymorphism)
      Example node: core 1.0.5 jar
      Labels: Artifact, ArtifactoryFile, Workday
      Properties: id = com.workday:core, group = com.workday, artifact = core, version = 1.0.5, created = 1458713182201
  18. Neo4j - Relationships
      • Relationships represent an edge between 2 nodes
      • … have a name
      • … can be directed
      • … can have properties
      Example: (Artifact core 1.0.5 jar) -[HAS_COMMIT]-> (Git Commit core 5ce1f767)
  19. Neo4j - Cypher
      Neo4j query language
      A node: ()
      A labeled node: (:Person)
      A relationship between 2 nodes: ()--()
      A directed labeled relationship: ()-[:PARENT_OF]->()
      MATCH (parent:Person)-[:PARENT_OF]->(child:Person)
      RETURN parent.name, COLLECT(child.name)
  20. Extracting the Data
  21. Everything Starts with Artifactory
      • Official repository of artifacts, RPMs and Docker images
      • REST API to detect new artifacts in repositories
  22. Step 1: Artifacts
      • URI: com/workday/core/1.0.5/core-1.0.5-javadoc.jar
        → Group: com.workday
        → Module: core
        → Version: 1.0.5
        → Type: jar
        → Classifier: javadoc
        → ID: “com.workday:core:1.0.5:javadoc@jar”
      Resulting node: Artifact core 1.0.5 jar javadoc
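The decomposition in Step 1 can be sketched in a few lines. This is a minimal illustration, not the Data Miner's actual code (which the slides say is written in Groovy). It assumes the standard Maven repository layout `group/path/module/version/module-version[-classifier].type`; the names `ArtifactCoordinates` and `parse_artifact_path` are hypothetical, and the ID form without a classifier is an assumption, since the slide only shows the classifier form.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ArtifactCoordinates:
    group: str
    module: str
    version: str
    type: str
    classifier: Optional[str] = None

    @property
    def id(self) -> str:
        # With a classifier this reproduces the ID format shown on the slide.
        # The classifier-less form (group:module:version@type) is an assumption.
        base = f"{self.group}:{self.module}:{self.version}"
        if self.classifier:
            base += f":{self.classifier}"
        return f"{base}@{self.type}"


def parse_artifact_path(path: str) -> ArtifactCoordinates:
    """Split a Maven-layout repository path into artifact coordinates."""
    parts = path.strip("/").split("/")
    if len(parts) < 4:
        raise ValueError(f"not a Maven-layout path: {path!r}")
    *group_parts, module, version, filename = parts

    prefix = f"{module}-{version}"
    if not filename.startswith(prefix):
        raise ValueError(f"file name {filename!r} does not start with {prefix!r}")

    # What is left is "[-classifier].type", e.g. "-javadoc.jar" or ".jar".
    # Multi-part extensions such as "tar.gz" are not handled by this sketch.
    stem, dot, extension = filename[len(prefix):].rpartition(".")
    if not dot or not extension or (stem and not stem.startswith("-")):
        raise ValueError(f"unexpected file name: {filename!r}")

    return ArtifactCoordinates(
        group=".".join(group_parts),
        module=module,
        version=version,
        type=extension,
        classifier=stem[1:] or None,
    )


coords = parse_artifact_path("com/workday/core/1.0.5/core-1.0.5-javadoc.jar")
print(coords.id)
```

Keeping the ID derivation on the coordinates object means the same value can serve as the unique key of the Artifact node, whatever the source of the path.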
  23. Step 2: Module Versions
      • The artifact relates to a “Module Version”
        → Group: com.workday
        → Module: core
        → Version: 1.0.5
        → ID: “com.workday:core:1.0.5”
      Resulting node: ModuleVersion core 1.0.5
  24. Step 3: Modules
      • The module version relates to a “Module”
        → Group: com.workday
        → Module: core
        → ID: “com.workday:core”
      Resulting node: Module core
  25. All Together With Relationships
      [Diagram] Artifacts (jar, javadoc jar, sources jar) -[ARTIFACT_OF]-> Versions (1.0.5, 1.0.6, 1.0.7) -[VERSION_OF]-> Module core
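To make the chain Artifact → ModuleVersion → Module concrete, here is a small sketch that derives the two parent IDs from an artifact ID and emits the relationships as plain triples. The function names are hypothetical, the artifact ID format follows Step 1, and the classifier-less form `group:module:version@type` is an assumption. Collecting the triples in a set mimics the "create only if missing" behavior one would want when loading them into the graph.

```python
def module_version_id(artifact_id: str) -> str:
    """'group:module:version[:classifier]@type' -> 'group:module:version'."""
    coordinates, _, _type = artifact_id.partition("@")
    group, module, version = coordinates.split(":")[:3]
    return f"{group}:{module}:{version}"


def module_id(mv_id: str) -> str:
    """'group:module:version' -> 'group:module'."""
    group, module, _version = mv_id.split(":")
    return f"{group}:{module}"


def build_relationships(artifact_ids):
    """Return de-duplicated (start, relationship, end) triples."""
    edges = set()
    for artifact_id in artifact_ids:
        mv_id = module_version_id(artifact_id)
        edges.add((artifact_id, "ARTIFACT_OF", mv_id))
        edges.add((mv_id, "VERSION_OF", module_id(mv_id)))
    return edges


edges = build_relationships([
    "com.workday:core:1.0.5@jar",
    "com.workday:core:1.0.5:javadoc@jar",
    "com.workday:core:1.0.5:sources@jar",
])
for edge in sorted(edges):
    print(edge)
```

Three artifacts of the same version yield three ARTIFACT_OF triples but a single VERSION_OF triple, which matches the fan-in shape of the diagram.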
  26. Step 4: Artifact Dependencies
      • Maven / Ivy descriptors → Dependencies
      • Dependencies → DEPENDS_ON relationships
      [Diagram] services 2.0.0, gson 2.2.2 and core 1.0.5, linked by DEPENDS_ON relationships
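As an illustration of Step 4, the sketch below reads the direct dependencies out of a Maven POM with the standard library and turns them into DEPENDS_ON triples. It is only a sketch under simplifying assumptions: the POM declares its own groupId, artifactId and version (no inheritance from a parent POM), versions are literal (no property placeholders), and Ivy descriptors are not covered. `SAMPLE_POM` is made-up data, including the direction of the dependencies, which the slide's diagram does not make explicit.

```python
import xml.etree.ElementTree as ET

# Namespace used by Maven POM 4.0.0 documents.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}

# Made-up descriptor for illustration only.
SAMPLE_POM = """
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <groupId>com.workday</groupId>
  <artifactId>services</artifactId>
  <version>2.0.0</version>
  <dependencies>
    <dependency>
      <groupId>com.google.code.gson</groupId>
      <artifactId>gson</artifactId>
      <version>2.2.2</version>
    </dependency>
    <dependency>
      <groupId>com.workday</groupId>
      <artifactId>core</artifactId>
      <version>1.0.5</version>
    </dependency>
  </dependencies>
</project>
""".strip()


def _coordinates(element) -> str:
    """Build 'group:module:version' from the direct children of an element."""
    parts = []
    for tag in ("groupId", "artifactId", "version"):
        node = element.find(f"m:{tag}", POM_NS)
        text = (node.text or "").strip() if node is not None else ""
        if not text:
            raise ValueError(f"missing <{tag}>")
        parts.append(text)
    return ":".join(parts)


def depends_on_edges(pom_xml: str):
    """Return one (dependent, 'DEPENDS_ON', dependency) triple per direct dependency."""
    root = ET.fromstring(pom_xml)
    source = _coordinates(root)
    # Only <dependencies> directly under <project>; <dependencyManagement> is ignored.
    dependencies = root.findall("m:dependencies/m:dependency", POM_NS)
    return [(source, "DEPENDS_ON", _coordinates(dep)) for dep in dependencies]


for edge in depends_on_edges(SAMPLE_POM):
    print(edge)
```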
  27. Step 5: Artifact Metadata
      • Populated at build time (by a custom Gradle plugin)
      • Captures information about
        ‒ Gradle, JDK, build machine
        ‒ CI builds
        ‒ SCM changes
      • Makes artifacts “self-documenting”
  28. Manifest Metadata – SCM Info
      • WD-Git-Origin: ssh://git@bitbucket.acme.com/core/core.git
      • WD-Git-Commit: e28a60b96f452680c57cb76798def09fd171011f
      Example: (Artifact core 1.0.5 jar) -[HAS_COMMIT]-> (Git Commit core e28a60…)
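Reading those two headers back out of a JAR manifest could look like the sketch below. It parses only the main section of a `MANIFEST.MF` text, handling the continuation lines (a leading space) that the JAR format uses for long values. The header names `WD-Git-Origin` and `WD-Git-Commit` come from the slide; the function names and the shape of the returned triple are hypothetical.

```python
def parse_manifest(text: str) -> dict:
    """Parse the main section of a JAR manifest into a dict of headers."""
    headers = {}
    last_key = None
    for line in text.lstrip().splitlines():
        if not line.strip():
            break  # a blank line ends the main section
        if line.startswith(" ") and last_key is not None:
            headers[last_key] += line[1:]  # continuation of the previous value
            continue
        key, separator, value = line.partition(": ")
        if not separator:
            raise ValueError(f"malformed manifest line: {line!r}")
        headers[key] = value
        last_key = key
    return headers


def has_commit_edge(artifact_id: str, manifest_text: str):
    """Return an (artifact, 'HAS_COMMIT', commit) triple, or None if no commit is recorded."""
    headers = parse_manifest(manifest_text)
    sha = headers.get("WD-Git-Commit")
    if not sha:
        return None
    commit = {"origin": headers.get("WD-Git-Origin"), "sha": sha}
    return (artifact_id, "HAS_COMMIT", commit)


manifest = (
    "Manifest-Version: 1.0\n"
    "WD-Git-Origin: ssh://git@bitbucket.acme.com/core/core.git\n"
    "WD-Git-Commit: e28a60b96f452680c57cb76798def09fd171011f\n"
)
print(has_commit_edge("com.workday:core:1.0.5@jar", manifest))
```

In practice the manifest text would come from the `META-INF/MANIFEST.MF` entry of the jar, for example via the standard `zipfile` module.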
  29. Concrete Examples
  30. List of all Workday Artifacts
      Public dashboard accessible with the latest information (automatically up to date)
      Group        Module        Latest Version  Age (days)  SCM URL           SCM change  Build URL  Latest JIRAs
      com.workday  core          1.0.5           120.2       core.git          e28a60b9    URL        CORE-120
      com.workday  foo-services  1.3.0           29.1        foo-services.git  146ae135    URL        FOO-57
      com.workday  bar-services  2.2.8           54.8        bar-services.git  b538c156    URL        BAR-70
      …            …             …               …           …                 …           …          …
      → Where’s the build of this jar file?
      → Where are the sources for this jar file?
  31. Identifying Direct Dependents
      MATCH (dependent:ModuleVersion)-[:DEPENDS_ON]->(dependency:ModuleVersion)
      WHERE dependency.id = "com.workday:core:1.0.5"
      RETURN dependent.id AS dependent
      → Service in the Sentinel REST API
      Dependent
      com.workday:foo-services:1.3.0
      com.workday:foo-services:1.2.5
      com.workday:bar-services:2.2.8
      com.workday:bar-services:2.2.7
      …
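For readers without a Neo4j instance at hand, the following sketch reproduces what the query above computes on a plain list of DEPENDS_ON pairs. The edge list is made-up sample data shaped after the result table. `PARAMETERIZED_QUERY` shows the same query with a `$dependency_id` parameter instead of a literal ID; whether the Sentinel REST service passes the ID that way is an assumption.

```python
# The slide's query with a Cypher parameter in place of the literal ID.
PARAMETERIZED_QUERY = """
MATCH (dependent:ModuleVersion)-[:DEPENDS_ON]->(dependency:ModuleVersion)
WHERE dependency.id = $dependency_id
RETURN dependent.id AS dependent
"""

# Made-up (dependent, dependency) pairs standing in for DEPENDS_ON relationships.
DEPENDS_ON = [
    ("com.workday:foo-services:1.3.0", "com.workday:core:1.0.5"),
    ("com.workday:foo-services:1.2.5", "com.workday:core:1.0.5"),
    ("com.workday:bar-services:2.2.8", "com.workday:core:1.0.5"),
    ("com.workday:bar-services:2.2.8", "com.workday:foo-services:1.3.0"),
    ("com.workday:foo-services:1.3.0", "com.google.code.gson:gson:2.2.2"),
]


def direct_dependents(edges, dependency_id):
    """Module versions with a DEPENDS_ON edge pointing directly at dependency_id."""
    return sorted({dependent for dependent, dependency in edges
                   if dependency == dependency_id})


for dependent in direct_dependents(DEPENDS_ON, "com.workday:core:1.0.5"):
    print(dependent)
```

Only one hop is followed: a module version that reaches core solely through foo-services would not be listed, just as in the single-relationship MATCH pattern.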
  32. Build Orchestration
      Producing build: (Bamboo Build CORE) -[BUILT]-> (Artifact core 1.0.5 jar) -[ARTIFACT_OF]-> (ModuleVersion core 1.0.5)
      Consuming build: (Bamboo Build FOO-SERVICES) -[BUILT]-> (Artifact foo-services 1.3.0 jar) -[ARTIFACT_OF]-> (ModuleVersion foo-services 1.3.0)
      DEPENDS_ON appears twice in the diagram, linking the consuming side to the producing side
  33. Automated Release Notes
      [Diagram] Version 1: Bamboo Build CORE #11, Artifact core v1 jar, Git Commit core 5ce1f767
      [Diagram] Version 2: Bamboo Build CORE #12, Artifact core v2 jar, Git Commit core ee2a0e22
      Relationships: BUILT, HAS_REVISION, PARENT_REVISION (between the commits), LINKS_TO (JIRA Issue CORE-120)
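One way to read this diagram: release notes for version 2 are the JIRA issues linked to the commits that lie between the commit of version 1 and the commit of version 2. The sketch below walks that chain on plain dictionaries. The direction of PARENT_REVISION (child to parent), the single-parent history (merge commits are ignored), the intermediate commit and the function names are all assumptions made for illustration.

```python
def commits_between(parents, newer, older):
    """Commits from `newer` back to, but excluding, `older`, following parent links."""
    commits = []
    seen = set()
    current = newer
    while current is not None and current != older:
        if current in seen:
            raise ValueError("cycle in parent links")
        seen.add(current)
        commits.append(current)
        current = parents.get(current)
    if current is None:
        raise ValueError(f"{older} is not an ancestor of {newer}")
    return commits


def release_notes(parents, links, newer, older):
    """JIRA issue keys linked to the commits between two releases, without duplicates."""
    issues = []
    for commit in commits_between(parents, newer, older):
        for issue in links.get(commit, []):
            if issue not in issues:
                issues.append(issue)
    return issues


# Hypothetical history: PARENT_REVISION as child -> parent, LINKS_TO as commit -> issues.
PARENTS = {"ee2a0e22": "5954ff88", "5954ff88": "5ce1f767", "5ce1f767": None}
LINKS = {"ee2a0e22": ["CORE-120"], "5954ff88": ["CORE-120"]}

print(release_notes(PARENTS, LINKS, newer="ee2a0e22", older="5ce1f767"))
```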
  34. Identify SCM Changes per JIRA
      Find all mentions of a JIRA issue in commit messages
      Input: JIRA issue
      Output: Set of SCM changes
      Example: JIRA Issue CORE-120 → Git Commit core 5ce1f767, Git Commit core 5954ff88, Git Commit core ee2a0e22
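A common way to find such mentions is a regular expression over commit messages. The sketch below indexes commits by the issue keys they mention; the pattern (uppercase project key, dash, number) is an assumption about the key format, and the commit messages are made up.

```python
import re
from collections import defaultdict

# Assumed key format: uppercase project key, dash, positive number (e.g. CORE-120).
# Strings such as "UTF-8" also match, so real use would filter on known project keys.
JIRA_KEY = re.compile(r"\b[A-Z][A-Z0-9]+-[1-9][0-9]*\b")


def commits_per_issue(commits):
    """Map each JIRA key to the set of commit hashes whose message mentions it."""
    index = defaultdict(set)
    for sha, message in commits:
        for key in JIRA_KEY.findall(message):
            index[key].add(sha)
    return dict(index)


# Made-up commit messages for illustration.
COMMITS = [
    ("5ce1f767", "CORE-120: add retry to artifact upload"),
    ("5954ff88", "Fix review comments (CORE-120)"),
    ("ee2a0e22", "CORE-120, FOO-57: align API with foo-services"),
    ("146ae135", "Bump version"),
]

index = commits_per_issue(COMMITS)
print(sorted(index["CORE-120"]))
```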
  35. Detection of Rule Violations
      Rule: “No dynamic dependencies in Maven / Ivy files”
      Rationale: Builds must be reproducible
      Dynamic versions: 1.+, LATEST, [1.0, 2.0[
      HTML dashboard listing the latest violations
      Example: (ModuleVersion baz 4.2.10) -[DEPENDS_ON]-> (ModuleVersion pmd-checks 1.+)
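The rule above boils down to a predicate on the version string of each dependency. Here is a minimal sketch of such a check; it covers the three forms listed on the slide and, as an addition not taken from the slides, Maven's `RELEASE` and Ivy's `latest.*` revisions. The function names and the sample dependency list are hypothetical, and the actual Sentinel check may differ.

```python
def is_dynamic_version(version: str) -> bool:
    """True if the version can resolve to different artifacts over time."""
    v = version.strip()
    if v.endswith("+"):  # prefix match such as 1.+
        return True
    if v in {"LATEST", "RELEASE"} or v.startswith("latest."):
        return True
    # Ranges such as [1.0, 2.0[ or (1.0,2.0]; a comma is required so that
    # a pinned version written as [1.0] is not flagged.
    return len(v) >= 2 and v[0] in "[](" and v[-1] in "[])" and "," in v


def find_violations(dependencies):
    """Keep the (dependent, dependency, version) rows whose version is dynamic."""
    return [row for row in dependencies if is_dynamic_version(row[2])]


# Made-up rows: (dependent module version, dependency module, requested version).
DEPENDENCIES = [
    ("com.workday:baz:4.2.10", "com.workday:pmd-checks", "1.+"),
    ("com.workday:baz:4.2.10", "com.workday:core", "1.0.5"),
    ("com.workday:foo-services:1.3.0", "com.workday:core", "[1.0, 2.0["),
]

for violation in find_violations(DEPENDENCIES):
    print(violation)
```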
  36. Conclusion
      • Service rolled out internally
      • Neo4j is the perfect tool for capturing the data we’re interested in
        ‒ Very easy to refactor / enrich the data
      • Cypher gives us insight from the aggregated data
      • Solid foundation for future services
        ‒ Difficult part: Capturing the data
        ‒ Easy part: Leveraging the data by creating new queries
      • Decisions based on facts, not (educated) guesses
      • Holistic reporting
  37. Q & A
      Thanks for attending