GraphConnect 2014 SF: The Business Graph

presented by Kurt Freytag, Head of Product and Engineering, CrunchBase

  1. SAN FRANCISCO | 10.22.2014 | THE BUSINESS GRAPH
     The Business Graph (Why we chose Neo4j to rebuild CrunchBase)
  2. Who Am I?
     Kurt Freytag, Head of Product, CrunchBase
     kurt@crunchbase.com | 415.891.7761 | @kfreytag
     5’10”, 155 lbs. Coding since 1977
  3. What am I Talking About?
     • Concise History of CrunchBase
     • Our Vision
     • Why Neo4j?
     • Building w/ Neo4j & The Web
     • Q&A
  4. History of CrunchBase - In One Slide
     • Started in 2007 by Michael Arrington
     • Zero dedicated staff from 2007-2013
     • Organically became source of truth for Startup Ecosystem
     • Millions of Monthly Users
     • Ran on two crappy AWS servers
     [Labels on slide: MySQL 5.0, Rails 2.0]
  5. The Vision of CrunchBase
     • The Complete Graph of the Connected Business World
       • Entities: people, products, companies
       • Activities: fundings, acquisitions, job changes
       • Connections: how everything relates
       • Time: the lifecycle of every element
     • World’s Most Powerful Startup Community
     • Open to all
  6. Why Neo4j?
     • A natural way of modeling data
     [Diagram nodes: Neotechnologies (Company), Neo4j Enterprise Edition (Product), Seed Round (Funding), Sunstone Capital (Investor), Connor Venture Partners (Investor), Emil Eifrem (Founder), Lars Nordwall (COO), Philip Rathle (VP of Products), GraphConnect 2014 (Event), Kurt Freytag (Speaker)]
  7. Why Neo4j?
     • A natural way of modeling data
     • Adapts easily to changing requirements
     [Diagram nodes: Neotechnologies (Company), Seed Round (Funding), Investment, Investment, Sunstone Capital (Investor), Connor Venture Partners (Investor), John Smith (Lead Investor), John Smith (Lead Investor)]
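The change shown on slide 7, where an Investment node appears between the funding round and each investor, can be sketched with a toy in-memory graph. This is only an illustration in plain Ruby, not the Neo4j API: `Graph`, `relate`, and `out` are hypothetical helpers, and the relationship names `has_investment` and `made_by` are assumptions (only `has_funding_round` appears in the deck).

```ruby
# Toy in-memory property graph (hypothetical helper, not Neo4j).
Node = Struct.new(:label, :name)
Edge = Struct.new(:from, :type, :to)

class Graph
  attr_reader :edges

  def initialize
    @edges = []
  end

  # Add a typed, directed relationship between two nodes.
  def relate(from, type, to)
    @edges << Edge.new(from, type, to)
    to
  end

  # Nodes reachable from `node` over outgoing relationships of `type`.
  def out(node, type)
    @edges.select { |e| e.from.equal?(node) && e.type == type }.map(&:to)
  end
end

graph    = Graph.new
company  = Node.new(:Company, "Neotechnologies")
round    = Node.new(:Funding, "Seed Round")
sunstone = Node.new(:Investor, "Sunstone Capital")
connor   = Node.new(:Investor, "Connor Venture Partners")

graph.relate(company, :has_funding_round, round)

# New requirement: each investor takes part through its own Investment node.
# Existing nodes are left untouched; only new nodes and relationships are added.
[sunstone, connor].each do |investor|
  investment = Node.new(:Investment, "#{investor.name} investment")
  graph.relate(round, :has_investment, investment) # assumed relationship name
  graph.relate(investment, :made_by, investor)     # assumed relationship name
end

# Traverse company -> funding rounds -> investments -> investors.
investors = graph.out(company, :has_funding_round)
                 .flat_map { |r| graph.out(r, :has_investment) }
                 .flat_map { |i| graph.out(i, :made_by) }
                 .map(&:name)
puts investors.inspect
```

The point of the sketch is that the new requirement is met by adding nodes and relationships, with no change to the Company, Funding, or Investor nodes that already exist.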
  8. Why Neo4j?
     • A natural way to model data
     • Adapts easily to changing requirements
     • Built-In Business Intelligence
       • Very specific or very general questions
       • We don’t know the questions in advance
     [SQL shown on slide, cut off at the slide edges, above the label "EXPLAIN PLAN":]
       select if (tg.described_count > 1, 'complex', 'basic') dup
         o.normalized_name,
         concat('=hyperlink("http://www.crunchbase.com', o.p
         ifnull(o.domain, '') domain,
         ifnull(o.homepage_url, '') homepage_url,
         if(o.status = 'unknown', '', o.status) status,
         o.permalink,
         ifnull(o.investment_rounds, '') investment_rounds,
         ifnull(o.funding_rounds, '') funding_rounds,
         ifnull(o.relationships, '') relationships,
         ifnull(o.milestones, '') milestones,
         if( o.logo_url is null, '', 'Yes') has_logo,
         length(ifnull(o.overview, '')) overview_length,
         ifnull(o.created_by, '') created_by,
         date_format(o.created_at, '%Y-%m-%d %H:%i:%s') crea
         UNIX_TIMESTAMP(o.created_at) ts,
         ( ifnull(o.investment_rounds, 0)*20 + ifnull(o.funding_rounds, 0)*20 + ifnull(o.relationships, 0)*10 + ifnull(o.milestones, 0) + length(ifnull(o.overview, '')) + if( o.logo_url is null, 0, 50)) entity_rank,
         o.entity_type,
         o.entity_id
       from cb_objects o
       join t_duplicate_objects td on td.object_id = o.id
       join t_duplicate_groups tg on tg.id = td.duplicate_
  9. Why Neo4j?
     • A natural way of modeling data
     • Adapts easily to changing requirements
     • Built-In Business Intelligence
       • Very specific or very general questions
       • We don’t know the questions in advance
     • Directly maps to our OO thinking
     [Code shown on slide, each class next to its diagram element:]
       Neotechnologies (Company):
         class Organization < BaseEntity
           relationship :has_funding_round,
           relationship :has_customer,
           relationship :sponsors_event,
           ...
         end
       Seed Round (Funding):
         class FundingRound < BaseActivity
           attribute :announced_on,
           attribute :closed_on,
           attribute :funding_type,
           attribute :series,
           attribute :money_raised,
           attribute :post_money_valuation,
           ...
         end
       has_funding_round (relationship):
         class HasFundingRound < BaseRelationship
           relationship :has_funding_round,
           relationship :has_customer,
           relationship :sponsors_event,
           ...
         end
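Slide 9's class definitions suggest class-level `attribute` and `relationship` macros. The deck does not show how CrunchBase implemented them, so the following is only a guess at one way such macros could work in plain Ruby. The bodies of `BaseEntity` and `BaseActivity`, the `relate` method, and the `name` attribute on `Organization` are all assumptions made for illustration.

```ruby
# Hypothetical base class: one possible way to back the slide's
# `attribute` / `relationship` macros. Not CrunchBase's actual code.
class BaseEntity
  class << self
    # Attribute names declared on this class.
    def attributes
      @attributes ||= []
    end

    # Relationship names declared on this class.
    def relationships
      @relationships ||= []
    end

    # Declare a property and generate its reader and writer.
    def attribute(name)
      attributes << name
      attr_accessor name
    end

    # Declare a relationship and generate a reader for the related objects.
    def relationship(name)
      relationships << name
      define_method(name) { @related[name] }
    end
  end

  def initialize(attrs = {})
    @related = Hash.new { |hash, key| hash[key] = [] }
    attrs.each { |name, value| public_send("#{name}=", value) }
  end

  # Connect this object to another through a declared relationship.
  def relate(name, other)
    unless self.class.relationships.include?(name)
      raise ArgumentError, "unknown relationship #{name}"
    end
    @related[name] << other
    self
  end
end

# Assumed to share the same machinery as BaseEntity.
class BaseActivity < BaseEntity; end

class Organization < BaseEntity
  attribute :name # assumption: not listed on the slide
  relationship :has_funding_round
end

class FundingRound < BaseActivity
  attribute :funding_type
  attribute :announced_on
end

org  = Organization.new(name: "Neotechnologies")
seed = FundingRound.new(funding_type: "seed")
org.relate(:has_funding_round, seed)
puts org.has_funding_round.map(&:funding_type).inspect
```

Under these assumptions the class declarations double as the data model: a node's properties and its allowed relationships are both read off the Ruby class.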
  10. Why Neo4j?
      • A natural way of modeling data
      • Adapts easily to changing requirements
      • Built-In Business Intelligence
        • Very specific or very general questions
        • We don’t know the questions in advance
      • Directly maps to our OO thinking
      • We move faster
        • Just launched CrunchBase Events @ TC Disrupt London
        • Design, development, QA, and release was 2 weeks
  11. Okay, if Neo’s so awesome, why doesn’t everybody use it?
  12. Databases & the Web - A Brief History
      • CGI
        • design a data model
        • roll-your-own database connection
        • manually write all your queries
      • ORM (Hibernate, Doctrine)
        • design a data model
        • build the objects
        • map ‘em through configuration
  13. Database as a Commodity
      • Today’s languages use datastores as dumb repos
        • Generate schemas from code
        • Isolate developer from writing queries
        • Focus on business logic, not data
      • Couple of Problems
        • The DBA role existed for a reason
        • Data modeling is the foundation of a scalable architecture
        • Generated queries can easily be 1,000x less efficient
        • Quick development can lead to slow applications
  14. Means that… [slide graphic: + = ?]
      • Neo4j is tough to adopt
        • Languages don’t support it out-of-the-box
        • The tools / drivers that exist are immature
        • Neo4j is not plug-n-play
      • However…
        • Neo4j is ideal for Object-Oriented development
        • Graphs are a natural fit for many use cases
        • We need to make Neo4j as easy to choose as MySQL
  15. “Deja”
      • ActiveRecord for Neo4j
      • Implements a lot of ActiveModel
        • Validations
        • Serialization
        • Callbacks
      • Handles all Marshalling / UnMarshalling
      • “Feels” like ActiveRecord
      • Makes Neo4j plug-n-play for Rails
      • We Will Open Source It
  16. Thanks. Enjoy.
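The deck does not show Deja's API, so the sketch below is not Deja. It is a toy plain-Ruby model that illustrates the four features slide 15 lists: validations, callbacks, serialization, and marshalling to and from a node's stored properties. Every name in it (`MiniModel`, `validates_presence_of`, `before_save`, `to_properties`, `from_properties`, `Company`) is hypothetical, and `save` only returns the property hash instead of writing to a database.

```ruby
require "json"

# Hypothetical mini-model, for illustration only (not Deja, not ActiveModel).
class MiniModel
  class << self
    def attributes
      @attributes ||= []
    end

    def required
      @required ||= []
    end

    def callbacks
      @callbacks ||= []
    end

    def attribute(name)
      attributes << name
      attr_accessor name
    end

    # Validation: the named attribute must not be blank.
    def validates_presence_of(name)
      required << name
    end

    # Callback: run the named instance method before saving.
    def before_save(method_name)
      callbacks << method_name
    end

    # Unmarshal: build an object from stored node properties.
    def from_properties(props)
      new(props.transform_keys(&:to_sym))
    end
  end

  def initialize(attrs = {})
    attrs.each { |name, value| public_send("#{name}=", value) }
  end

  def valid?
    self.class.required.none? { |name| public_send(name).to_s.strip.empty? }
  end

  # Marshal: flatten the object into a string-keyed property hash.
  def to_properties
    self.class.attributes.to_h { |name| [name.to_s, public_send(name)] }
  end

  # Serialization to JSON.
  def to_json(*args)
    to_properties.to_json(*args)
  end

  # Returns the properties that would be written, or false if invalid.
  def save
    return false unless valid?
    self.class.callbacks.each { |callback| send(callback) }
    to_properties
  end
end

class Company < MiniModel
  attribute :name
  attribute :permalink
  validates_presence_of :name
  before_save :set_permalink

  private

  def set_permalink
    self.permalink ||= name.to_s.downcase.strip.gsub(/[^a-z0-9]+/, "-")
  end
end

company = Company.new(name: "Neo Technology")
props   = company.save
copy    = Company.from_properties(props)
puts copy.permalink
puts Company.new.save.inspect
```

A real layer of this kind would replace the returned hash with a write to the graph, but the shape a Rails developer sees (declare attributes, validate, hook callbacks, save) is what the slide means by “feels” like ActiveRecord.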
