SlideShare a Scribd company logo
1 of 132
Scaling Tribal Knowledge
CHRIS WILLIAMS / JOHN BODLEY / FEB 8, 2017 / BIG DATA APPLICATION MEETUP
The problem
tribal knowledge |ˈtrībəl ˈnäləj |
noun
Tribal knowledge is any unwritten information that is not commonly
known by others within a company
As Airbnb grows so do the challenges around the volume,
complexity, and obscurity of data
In a large and complex organization, with a sea of data
resources, users struggle to find the right data
Data is often siloed and lacks context
I’m a recovering Data Scientist who wants to democratize
data; automate common workflows, surface relevant
information, and provide context
Tables in our Hive data warehouse
100k
Data resources
Beyond the data warehouse
> 6,000
Superset charts and
dashboards
Data resources
Beyond the data warehouse
> 6,000
Superset charts and
dashboards
> 5,000
Experiments and
metrics
Data resources
Beyond the data warehouse
> 6,000
Superset charts and
dashboards
> 5,000
Experiments and
metrics
> 4,000
Tableau dashboards
and workbooks
Data resources
Beyond the data warehouse
> 6,000
Superset charts and
dashboards
> 5,000
Experiments and
metrics
> 4,000
Tableau dashboards
and workbooks
> 1,000
Knowledge posts
Data resources
Beyond the data warehouse
With many more data sources
and data types to love
With many more data sources
and data types to love
and most importantly…
> 3,000 Airbnb employees
Portland
San Francisco
Los Angeles
Toronto
New York
Miami
Sao Paulo
Dublin
London
Paris
Barcelona
Berlin
Milan
Copenhagen
New Delhi
Seoul
Beijing
Tokyo
Sydney
Singapore
Washington, DC
> 20
Offices around the world
The mandate
To democratize data and empower Airbnb employees to be data-
informed by aiding with data exploration, discovery, and trust
The concept
Search…
It should be fairly evident what we feed into the search indices
But are we missing something?
The relevancy of relationships
Nodes and relationships have equal standing
created consumedSpoke 3
The graph
created
consumed
associated
associated
consumed
consum
ed
created
consum
ed
The graph
created
consumed
associated
associated
consumed
consum
ed
created
consum
ed
The graph
created
associated
associated
consumed
consum
ed
created
consum
ed
consumed
The graph
consumed
associated
associated
consumed
consum
ed
consum
ed
created
created
The graph
created
consumed
associated
associated
consumed
created
consum
ed
consum
ed
The graph
created
consumed
associated
associated consum
ed
created
consum
ed
consumed
The graph
created
consumed
consumed
consum
ed
created
consum
ed
associated
associated
The construction
Databases
5
APIs
3
Airflow DAG
1
Databases
5
APIs
3
Airflow DAG
1
We leverage all these data resources to build a graph comprising of
nodes and relationships
The Airflow DAG is run everyday and the output is stored in Hive
We gather over 10,000 thumbnails from the Tableau API,
Knowledge Repo database, and Superset screenshots
The winding data path
Airflow
Data transfer
Python
Graph datastore
py2neo
Python Neo4j
driver
Neo4j
Graph database
GraphAware
Neo4j/Elasticsearch plugin
Elasticsearch
Search engine
Flask
Python web framework
Hive
Data warehouse
The winding data path
Airflow
Data transfer
Python
Graph datastore
py2neo
Python Neo4j
driver
Neo4j
Graph database
GraphAware
Neo4j/Elasticsearch plugin
Elasticsearch
Search engine
Flask
Python web framework
Hive
Data warehouse
The winding data path
Airflow
Data transfer
Python
Graph datastore
py2neo
Python Neo4j
driver
Neo4j
Graph database
GraphAware
Neo4j/Elasticsearch plugin
Elasticsearch
Search engine
Flask
Python web framework
Hive
Data warehouse
The winding data path
Airflow
Data transfer
Python
Graph datastore
py2neo
Python Neo4j
driver
Neo4j
Graph database
GraphAware
Neo4j/Elasticsearch plugin
Elasticsearch
Search engine
Flask
Python web framework
Hive
Data warehouse
The winding data path
Airflow
Data transfer
Python
Graph datastore
py2neo
Python Neo4j
driver
Neo4j
Graph database
GraphAware
Neo4j/Elasticsearch plugin
Elasticsearch
Search engine
Flask
Python web framework
Hive
Data warehouse
The winding data path
Airflow
Data transfer
Python
Graph datastore
py2neo
Python Neo4j
driver
Neo4j
Graph database
GraphAware
Neo4j/Elasticsearch plugin
Elasticsearch
Search engine
Flask
Python web framework
Hive
Data warehouse
The winding data path
Airflow
Data transfer
Python
Graph datastore
py2neo
Python Neo4j
driver
Neo4j
Graph database
GraphAware
Neo4j/Elasticsearch plugin
Elasticsearch
Search engine
Flask
Python web framework
Hive
Data warehouse
The winding data path
Airflow
Data transfer
Python
Graph datastore
py2neo
Python Neo4j
driver
Neo4j
Graph database
GraphAware
Neo4j/Elasticsearch plugin
Elasticsearch
Search engine
Flask
Python web framework
Hive
Data warehouse
The winding data path
Airflow
Data transfer
Python
Graph datastore
py2neo
Python Neo4j
driver
Neo4j
Graph database
GraphAware
Neo4j/Elasticsearch plugin
Elasticsearch
Search engine
Flask
Python web framework
Hive
Data warehouse
The winding data path
Airflow
Data transfer
Python
Graph datastore
py2neo
Python Neo4j
driver
Neo4j
Graph database
GraphAware
Neo4j/Elasticsearch plugin
Elasticsearch
Search engine
Flask
Python web framework
Hive
Data warehouse
The winding data path
Airflow
Data transfer
Python
Graph datastore
py2neo
Python Neo4j
driver
Neo4j
Graph database
GraphAware
Neo4j/Elasticsearch plugin
Elasticsearch
Search engine
Flask
Python web framework
Hive
Data warehouse
The winding data path
Airflow
Data transfer
Python
Graph datastore
py2neo
Python Neo4j
driver
Neo4j
Graph database
GraphAware
Neo4j/Elasticsearch plugin
Elasticsearch
Search engine
Flask
Python web framework
Hive
Data warehouse
The winding data path
Airflow
Data transfer
Python
Graph datastore
py2neo
Python Neo4j
driver
Neo4j
Graph database
GraphAware
Neo4j/Elasticsearch plugin
Elasticsearch
Search engine
Flask
Python web framework
Hive
Data warehouse
The winding data path
Airflow
Data transfer
Python
Graph datastore
py2neo
Python Neo4j
driver
Neo4j
Graph database
GraphAware
Neo4j/Elasticsearch plugin
Elasticsearch
Search engine
Flask
Python web framework
Hive
Data warehouse
The winding data path
Airflow
Data transfer
Python
Graph datastore
py2neo
Python Neo4j
driver
Neo4j
Graph database
GraphAware
Neo4j/Elasticsearch plugin
Elasticsearch
Search engine
Flask
Python web framework
Hive
Data warehouse
Why we choose Neo4j for our database
The main reasons
Logical
Given our data is
represented as a graph
it is logical to use a
graph database to
store the data
Why we choose Neo4j for our database
The main reasons
Logical
Given our data is
represented as a graph
it is logical to use a
graph database to
store the data
Nimble
Performance wins
when dealing with
connected data versus
relational databases
Why we choose Neo4j for our database
The main reasons
Logical
Given our data is
represented as a graph
it is logical to use a
graph database to
store the data
Nimble
Performance wins
when dealing with
connected data versus
relational databases
Popular
It is the world’s leading
graph database and
the community edition
is free
Why we choose Neo4j for our database
The main reasons
Logical
Given our data is
represented as a graph
it is logical to use a
graph database to
store the data
Nimble
Performance wins
when dealing with
connected data versus
relational databases
Popular
It is the world’s leading
graph database and
the community edition
is free
Integrative
It integrates well with
Python and
Elasticsearch
Why we choose Neo4j for our database
The main reasons
The Neo4j and Elasticsearch symbiotic relationship
Courtesy of two GraphAware plugins
The Neo4j and Elasticsearch symbiotic relationship
Courtesy of two GraphAware plugins
Neo4j plugin
Provides bi-directional integration which transparently and asynchronously replicate data from
Neo4j to Elasticsearch
The Neo4j and Elasticsearch symbiotic relationship
Courtesy of two GraphAware plugins
Neo4j plugin
Provides bi-directional integration which transparently and asynchronously replicate data from
Neo4j to Elasticsearch
Elasticsearch plugin
Enables Elasticsearch to consult with the Neo4j database during a search query to enrich the
search rankings by leveraging the graph topology
The schema
createds t
(s:Entity)-[r:CREATED]->(t:Entity)
:Entity
:Org
:Group :User
:Superset
:Slice:Dashboard
Node label hierarchy
:Hive
:Schema :Table
(:Entity:Org:User {id: ‘jane_doe’})
(:Entity:Hive:Table {id: ‘core_data.dim_users’})
(:Entity:Superset:Dashboard {id: 123})
Efficient data retrieval and uniqueness
Restrictions and workarounds with the Neo4j schema
Efficient data retrieval and uniqueness
Restrictions and workarounds with the Neo4j schema
Indexes
Neo4j provides indexes for efficient data retrieval similar to a RDMS, however they are only
defined for a single label
Efficient data retrieval and uniqueness
Restrictions and workarounds with the Neo4j schema
Indexes
Neo4j provides indexes for efficient data retrieval similar to a RDMS, however they are only
defined for a single label
Uniqueness Constraints
Ensures that properties are unique for all nodes for a specific single label
Efficient data retrieval and uniqueness
Restrictions and workarounds with the Neo4j schema
Indexes
Neo4j provides indexes for efficient data retrieval similar to a RDMS, however they are only
defined for a single label
Uniqueness Constraints
Ensures that properties are unique for all nodes for a specific single label
GraphAware UUID plugin
Transparently assigns a globally unique UUID property to newly created elements which
cannot be changed or deleted
(:Entity {uuid: ‘<UUID>’})
The web app
The web app
Designing the user experience and interface of 

a data tool should not be an afterthought
Designing the user experience and interface of 

a data tool should not be an afterthought
Technical data power
user; the epitome of a
tribal knowledge
holder
Daphne Data
User personas
Less data literate;
needs to keep tabs on
her team’s resources
Manager Mel
New employee or 

new team; has no idea
what’s going on
Nathan New
Designing for data exploration, discovery, and trust
Company dataSearch
Resource details

&meta-data
User data Group data
Company dataSearch User data Group data
Resource details

&meta-data
Company dataSearch User data Group data
Resource details

&meta-data
Search
Resource details 

&meta-data
Company dataUser data Group data
Search
Resource details 

&meta-data
Company dataUser data Group data
Google-esque search filters
Search
Resource details 

&meta-data
Company dataUser data Group data
Google-esque search filters
Resource details & meta-data
Search
Resource details 

&meta-data
Company dataUser data Group data
Google-esque search filters
Resource details & meta-data
Context, context, & context
Search
Resource details 

&meta-data
Company dataUser data Group data
Search
Resource details 

&meta-data
Company dataUser data Group data
Description, external link, social
Search
Resource details 

&meta-data
Company dataUser data Group data
Meta-data & consumption
Description, external link, social
Search
Resource details 

&meta-data
Company dataUser data Group data
Surface relationships,
everything’s a link to promote
exploration
Meta-data & consumption
Description, external link, social
Column details & value distributions
Table lineage
Enrich meta-data on the fly
Search
Resource details 

&meta-data
Company dataUser data Group data
Column details & value distributions
Table lineage
Enrich meta-data on the fly
Search
Resource details 

&meta-data
Company dataUser data Group data
Search
Resource details 

&meta-data
Company dataUser data Group data
Search
Resource details 

&meta-data
Company dataUser data Group data
Search
Resource details 

&meta-data
Company dataUser data Group data
Search
Resource details 

&meta-data
Company dataUser data Group data
User details & 

meta-data
Search
Resource details 

&meta-data
Company dataUser data Group data
User details & 

meta-data
What they make, 

what they consume
Search
Resource details 

&meta-data
Company dataUser data Group data
Former employees also 

hold tribal knowledge
Search
Resource details 

&meta-data
Company dataUser data Group data
Search
Resource details 

&meta-data
Company dataUser data Group data
Group overview
Search
Resource details 

&meta-data
Company dataUser data Group data
Group overview
Search
Resource details 

&meta-data
Company dataUser data Group data
Pinterest-like curation
Group overview
Search
Resource details 

&meta-data
Company dataUser data Group data
Basic organization functionality
Pinterest-like curation
Search
Resource details 

&meta-data
Company dataUser data Group data
Curated + Popular content
Search
Resource details 

&meta-data
Company dataUser data Group data
Curated + Popular content
Thumbnails for maximum context
Search
Resource details 

&meta-data
Company dataUser data Group data
Pinning flow from resource page
Edit mode / draggable grid
Search
Resource details 

&meta-data
Company dataUser data Group data
Pinning flow from resource page
Edit mode / draggable grid
???? ??
Employees can feel disconnected
from Company-level metrics
Search
Resource details 

&meta-data
Company dataUser data Group data
The technology stack
Application +
dependencies
DOM Testing
eslint
enzyme
mocha
chai
Application
state
Styling
khan/aphrodite
The challenges
The challenges
The challenges
Complex
dependencies
An umbrella data tool is
vulnerable to changes
in upstream resource
dependencies
The challenges
Complex
dependencies
An umbrella data tool is
vulnerable to changes
in upstream resource
dependencies
Data-dense design
Balancing simplicity and
functionality is hard;
most internal design
resources are not made
for data-rich apps
The challenges
Complex
dependencies
An umbrella data tool is
vulnerable to changes
in upstream resource
dependencies
Data-dense design
Balancing simplicity and
functionality is hard;
most internal design
resources are not made
for data-rich apps
Graph merging
Non-trivial Git-like
merging of (daily or real-
time) graph updates
The challenges
Complex
dependencies
An umbrella data tool is
vulnerable to changes
in upstream resource
dependencies
Data-dense design
Balancing simplicity and
functionality is hard;
most internal design
resources are not made
for data-rich apps
Graph flickering
Transient relationships
should not create
“flickering” artifacts
Graph merging
Non-trivial Git-like
merging of (daily or real-
time) graph updates
The future
The future
The future
New resource types
A/B tests, logging
schemas, SQL queries,
etc.
The future
New resource types
A/B tests, logging
schemas, SQL queries,
etc.
Certified content
Use certification to build
trust and enable users to
filter through a sea of
stale content
The future
New resource types
A/B tests, logging
schemas, SQL queries,
etc.
Certified content
Use certification to build
trust and enable users to
filter through a sea of
stale content
Alerts&
recommendations
Move from active
exploration to deliver
relevant updates and
content suggestions
The future
New resource types
A/B tests, logging
schemas, SQL queries,
etc.
Certified content
Use certification to build
trust and enable users to
filter through a sea of
stale content
Game-ification
Provide content
producers with a sense
of value
Alerts&
recommendations
Move from active
exploration to deliver
relevant updates and
content suggestions
The team
The Dataportal team
Analytics&Experimentation Products
John Bodley
Software Engineer
Eli Brumbaugh
Experience Designer
Jeff Feng
Product Manager
Michelle Thomas
Software Engineer
Chris Williams
Data Visualization
The Dataportal team
Analytics&Experimentation Products
John Bodley
Software Engineer
Eli Brumbaugh
Experience Designer
Jeff Feng
Product Manager
Michelle Thomas
Software Engineer
Chris Williams
Data Visualization
Thank you
john.bodley@airbnb.com
chris.williams@airbnb.com

More Related Content

What's hot

GraphDB Cloud: Enterprise Ready RDF Database on Demand
GraphDB Cloud: Enterprise Ready RDF Database on DemandGraphDB Cloud: Enterprise Ready RDF Database on Demand
GraphDB Cloud: Enterprise Ready RDF Database on DemandOntotext
 
Graphs for Finance - AML with Neo4j Graph Data Science
Graphs for Finance - AML with Neo4j Graph Data Science Graphs for Finance - AML with Neo4j Graph Data Science
Graphs for Finance - AML with Neo4j Graph Data Science Neo4j
 
Neo4j-Databridge: Enterprise-scale ETL for Neo4j
Neo4j-Databridge: Enterprise-scale ETL for Neo4jNeo4j-Databridge: Enterprise-scale ETL for Neo4j
Neo4j-Databridge: Enterprise-scale ETL for Neo4jNeo4j
 
A whirlwind tour of graph databases
A whirlwind tour of graph databasesA whirlwind tour of graph databases
A whirlwind tour of graph databasesjexp
 
Should a Graph Database Be in Your Next Data Warehouse Stack?
Should a Graph Database Be in Your Next Data Warehouse Stack?Should a Graph Database Be in Your Next Data Warehouse Stack?
Should a Graph Database Be in Your Next Data Warehouse Stack?Cambridge Semantics
 
Scalable, Fast Analytics with Graph - Why and How
Scalable, Fast Analytics with Graph - Why and HowScalable, Fast Analytics with Graph - Why and How
Scalable, Fast Analytics with Graph - Why and HowCambridge Semantics
 
GraphTour - Neo4j Platform Overview
GraphTour - Neo4j Platform OverviewGraphTour - Neo4j Platform Overview
GraphTour - Neo4j Platform OverviewNeo4j
 
NoSQL: what does it mean, how did we get here, and why should I care? - Hugo ...
NoSQL: what does it mean, how did we get here, and why should I care? - Hugo ...NoSQL: what does it mean, how did we get here, and why should I care? - Hugo ...
NoSQL: what does it mean, how did we get here, and why should I care? - Hugo ...South London Geek Nights
 
Strata sf - Amundsen presentation
Strata sf - Amundsen presentationStrata sf - Amundsen presentation
Strata sf - Amundsen presentationTao Feng
 
RDBMS to Graph Webinar
RDBMS to Graph WebinarRDBMS to Graph Webinar
RDBMS to Graph WebinarNeo4j
 
What you need to know to start an AI company?
What you need to know to start an AI company?What you need to know to start an AI company?
What you need to know to start an AI company?Mo Patel
 
Graphs for AI & ML, Jim Webber, Neo4j
Graphs for AI & ML, Jim Webber, Neo4jGraphs for AI & ML, Jim Webber, Neo4j
Graphs for AI & ML, Jim Webber, Neo4jNeo4j
 
Network and IT Operations
Network and IT OperationsNetwork and IT Operations
Network and IT OperationsNeo4j
 
Microsoft R Server for Data Sciencea
Microsoft R Server for Data ScienceaMicrosoft R Server for Data Sciencea
Microsoft R Server for Data ScienceaData Science Thailand
 
Neo4j Graph Platform Overview, Kurt Freytag, Neo4j
Neo4j Graph Platform Overview, Kurt Freytag, Neo4jNeo4j Graph Platform Overview, Kurt Freytag, Neo4j
Neo4j Graph Platform Overview, Kurt Freytag, Neo4jNeo4j
 
Disrupting Data Discovery
Disrupting Data DiscoveryDisrupting Data Discovery
Disrupting Data Discoverymarkgrover
 
Exploring the Great Olympian Graph
Exploring the Great Olympian GraphExploring the Great Olympian Graph
Exploring the Great Olympian GraphNeo4j
 
schema.org, Linked Data's Gateway Drug
schema.org, Linked Data's Gateway Drugschema.org, Linked Data's Gateway Drug
schema.org, Linked Data's Gateway DrugConnected Data World
 

What's hot (20)

GraphDB Cloud: Enterprise Ready RDF Database on Demand
GraphDB Cloud: Enterprise Ready RDF Database on DemandGraphDB Cloud: Enterprise Ready RDF Database on Demand
GraphDB Cloud: Enterprise Ready RDF Database on Demand
 
Graphs for Finance - AML with Neo4j Graph Data Science
Graphs for Finance - AML with Neo4j Graph Data Science Graphs for Finance - AML with Neo4j Graph Data Science
Graphs for Finance - AML with Neo4j Graph Data Science
 
Neo4j-Databridge: Enterprise-scale ETL for Neo4j
Neo4j-Databridge: Enterprise-scale ETL for Neo4jNeo4j-Databridge: Enterprise-scale ETL for Neo4j
Neo4j-Databridge: Enterprise-scale ETL for Neo4j
 
A whirlwind tour of graph databases
A whirlwind tour of graph databasesA whirlwind tour of graph databases
A whirlwind tour of graph databases
 
Should a Graph Database Be in Your Next Data Warehouse Stack?
Should a Graph Database Be in Your Next Data Warehouse Stack?Should a Graph Database Be in Your Next Data Warehouse Stack?
Should a Graph Database Be in Your Next Data Warehouse Stack?
 
Scalable, Fast Analytics with Graph - Why and How
Scalable, Fast Analytics with Graph - Why and HowScalable, Fast Analytics with Graph - Why and How
Scalable, Fast Analytics with Graph - Why and How
 
GraphTour - Neo4j Platform Overview
GraphTour - Neo4j Platform OverviewGraphTour - Neo4j Platform Overview
GraphTour - Neo4j Platform Overview
 
NoSQL: what does it mean, how did we get here, and why should I care? - Hugo ...
NoSQL: what does it mean, how did we get here, and why should I care? - Hugo ...NoSQL: what does it mean, how did we get here, and why should I care? - Hugo ...
NoSQL: what does it mean, how did we get here, and why should I care? - Hugo ...
 
Strata sf - Amundsen presentation
Strata sf - Amundsen presentationStrata sf - Amundsen presentation
Strata sf - Amundsen presentation
 
RDBMS to Graph Webinar
RDBMS to Graph WebinarRDBMS to Graph Webinar
RDBMS to Graph Webinar
 
The Power of Machine Learning and Graphs
The Power of Machine Learning and GraphsThe Power of Machine Learning and Graphs
The Power of Machine Learning and Graphs
 
What you need to know to start an AI company?
What you need to know to start an AI company?What you need to know to start an AI company?
What you need to know to start an AI company?
 
Graphs for AI & ML, Jim Webber, Neo4j
Graphs for AI & ML, Jim Webber, Neo4jGraphs for AI & ML, Jim Webber, Neo4j
Graphs for AI & ML, Jim Webber, Neo4j
 
Network and IT Operations
Network and IT OperationsNetwork and IT Operations
Network and IT Operations
 
Microsoft R Server for Data Sciencea
Microsoft R Server for Data ScienceaMicrosoft R Server for Data Sciencea
Microsoft R Server for Data Sciencea
 
Neo4j Graph Platform Overview, Kurt Freytag, Neo4j
Neo4j Graph Platform Overview, Kurt Freytag, Neo4jNeo4j Graph Platform Overview, Kurt Freytag, Neo4j
Neo4j Graph Platform Overview, Kurt Freytag, Neo4j
 
Disrupting Data Discovery
Disrupting Data DiscoveryDisrupting Data Discovery
Disrupting Data Discovery
 
Exploring the Great Olympian Graph
Exploring the Great Olympian GraphExploring the Great Olympian Graph
Exploring the Great Olympian Graph
 
Spark in 15 min
Spark in 15 minSpark in 15 min
Spark in 15 min
 
schema.org, Linked Data's Gateway Drug
schema.org, Linked Data's Gateway Drugschema.org, Linked Data's Gateway Drug
schema.org, Linked Data's Gateway Drug
 

Similar to 2017-01-08-scaling tribalknowledge

Democratizing Data at Airbnb
Democratizing Data at AirbnbDemocratizing Data at Airbnb
Democratizing Data at AirbnbNeo4j
 
Modernizing Your Data Warehouse using APS
Modernizing Your Data Warehouse using APSModernizing Your Data Warehouse using APS
Modernizing Your Data Warehouse using APSStéphane Fréchette
 
Planning your Next-Gen Change Data Capture (CDC) Architecture in 2019 - Strea...
Planning your Next-Gen Change Data Capture (CDC) Architecture in 2019 - Strea...Planning your Next-Gen Change Data Capture (CDC) Architecture in 2019 - Strea...
Planning your Next-Gen Change Data Capture (CDC) Architecture in 2019 - Strea...Impetus Technologies
 
Graphs & Big Data - Philip Rathle and Andreas Kollegger @ Big Data Science Me...
Graphs & Big Data - Philip Rathle and Andreas Kollegger @ Big Data Science Me...Graphs & Big Data - Philip Rathle and Andreas Kollegger @ Big Data Science Me...
Graphs & Big Data - Philip Rathle and Andreas Kollegger @ Big Data Science Me...Neo4j
 
the Data World Distilled
the Data World Distilledthe Data World Distilled
the Data World DistilledRTTS
 
Data Discovery at Databricks with Amundsen
Data Discovery at Databricks with AmundsenData Discovery at Databricks with Amundsen
Data Discovery at Databricks with AmundsenDatabricks
 
Apache CarbonData+Spark to realize data convergence and Unified high performa...
Apache CarbonData+Spark to realize data convergence and Unified high performa...Apache CarbonData+Spark to realize data convergence and Unified high performa...
Apache CarbonData+Spark to realize data convergence and Unified high performa...Tech Triveni
 
Neo4j Database and Graph Platform Overview
Neo4j Database and Graph Platform OverviewNeo4j Database and Graph Platform Overview
Neo4j Database and Graph Platform OverviewNeo4j
 
Tapping into Scientific Data with Hadoop and Flink
Tapping into Scientific Data with Hadoop and FlinkTapping into Scientific Data with Hadoop and Flink
Tapping into Scientific Data with Hadoop and FlinkMichael Häusler
 
Testing Big Data: Automated Testing of Hadoop with QuerySurge
Testing Big Data: Automated  Testing of Hadoop with QuerySurgeTesting Big Data: Automated  Testing of Hadoop with QuerySurge
Testing Big Data: Automated Testing of Hadoop with QuerySurgeRTTS
 
An Intro to Elasticsearch and Kibana
An Intro to Elasticsearch and KibanaAn Intro to Elasticsearch and Kibana
An Intro to Elasticsearch and KibanaObjectRocket
 
Flink Forward Europe 2019 - Berlin
Flink Forward Europe 2019 - BerlinFlink Forward Europe 2019 - Berlin
Flink Forward Europe 2019 - BerlinDavid Morin
 
Introduction Big Data
Introduction Big DataIntroduction Big Data
Introduction Big DataFrank Kienle
 
Intro to Neo4j Webinar
Intro to Neo4j WebinarIntro to Neo4j Webinar
Intro to Neo4j WebinarNeo4j
 
Designing the Next Generation of Data Pipelines at Zillow with Apache Spark
Designing the Next Generation of Data Pipelines at Zillow with Apache SparkDesigning the Next Generation of Data Pipelines at Zillow with Apache Spark
Designing the Next Generation of Data Pipelines at Zillow with Apache SparkDatabricks
 
Hive @ Hadoop day seattle_2010
Hive @ Hadoop day seattle_2010Hive @ Hadoop day seattle_2010
Hive @ Hadoop day seattle_2010nzhang
 
OVH-Change Data Capture in production with Apache Flink - Meetup Rennes 2019-...
OVH-Change Data Capture in production with Apache Flink - Meetup Rennes 2019-...OVH-Change Data Capture in production with Apache Flink - Meetup Rennes 2019-...
OVH-Change Data Capture in production with Apache Flink - Meetup Rennes 2019-...Yann Pauly
 
Rennes Meetup 2019-09-26 - Change data capture in production
Rennes Meetup 2019-09-26 - Change data capture in productionRennes Meetup 2019-09-26 - Change data capture in production
Rennes Meetup 2019-09-26 - Change data capture in productionDavid Morin
 

Similar to 2017-01-08-scaling tribalknowledge (20)

Democratizing Data at Airbnb
Democratizing Data at AirbnbDemocratizing Data at Airbnb
Democratizing Data at Airbnb
 
Modernizing Your Data Warehouse using APS
Modernizing Your Data Warehouse using APSModernizing Your Data Warehouse using APS
Modernizing Your Data Warehouse using APS
 
Datalake Architecture
Datalake ArchitectureDatalake Architecture
Datalake Architecture
 
Planning your Next-Gen Change Data Capture (CDC) Architecture in 2019 - Strea...
Planning your Next-Gen Change Data Capture (CDC) Architecture in 2019 - Strea...Planning your Next-Gen Change Data Capture (CDC) Architecture in 2019 - Strea...
Planning your Next-Gen Change Data Capture (CDC) Architecture in 2019 - Strea...
 
Graphs & Big Data - Philip Rathle and Andreas Kollegger @ Big Data Science Me...
Graphs & Big Data - Philip Rathle and Andreas Kollegger @ Big Data Science Me...Graphs & Big Data - Philip Rathle and Andreas Kollegger @ Big Data Science Me...
Graphs & Big Data - Philip Rathle and Andreas Kollegger @ Big Data Science Me...
 
the Data World Distilled
the Data World Distilledthe Data World Distilled
the Data World Distilled
 
Apache hive
Apache hiveApache hive
Apache hive
 
Data Discovery at Databricks with Amundsen
Data Discovery at Databricks with AmundsenData Discovery at Databricks with Amundsen
Data Discovery at Databricks with Amundsen
 
Apache CarbonData+Spark to realize data convergence and Unified high performa...
Apache CarbonData+Spark to realize data convergence and Unified high performa...Apache CarbonData+Spark to realize data convergence and Unified high performa...
Apache CarbonData+Spark to realize data convergence and Unified high performa...
 
Neo4j Database and Graph Platform Overview
Neo4j Database and Graph Platform OverviewNeo4j Database and Graph Platform Overview
Neo4j Database and Graph Platform Overview
 
Tapping into Scientific Data with Hadoop and Flink
Tapping into Scientific Data with Hadoop and FlinkTapping into Scientific Data with Hadoop and Flink
Tapping into Scientific Data with Hadoop and Flink
 
Testing Big Data: Automated Testing of Hadoop with QuerySurge
Testing Big Data: Automated  Testing of Hadoop with QuerySurgeTesting Big Data: Automated  Testing of Hadoop with QuerySurge
Testing Big Data: Automated Testing of Hadoop with QuerySurge
 
An Intro to Elasticsearch and Kibana
An Intro to Elasticsearch and KibanaAn Intro to Elasticsearch and Kibana
An Intro to Elasticsearch and Kibana
 
Flink Forward Europe 2019 - Berlin
Flink Forward Europe 2019 - BerlinFlink Forward Europe 2019 - Berlin
Flink Forward Europe 2019 - Berlin
 
Introduction Big Data
Introduction Big DataIntroduction Big Data
Introduction Big Data
 
Intro to Neo4j Webinar
Intro to Neo4j WebinarIntro to Neo4j Webinar
Intro to Neo4j Webinar
 
Designing the Next Generation of Data Pipelines at Zillow with Apache Spark
Designing the Next Generation of Data Pipelines at Zillow with Apache SparkDesigning the Next Generation of Data Pipelines at Zillow with Apache Spark
Designing the Next Generation of Data Pipelines at Zillow with Apache Spark
 
Hive @ Hadoop day seattle_2010
Hive @ Hadoop day seattle_2010Hive @ Hadoop day seattle_2010
Hive @ Hadoop day seattle_2010
 
OVH-Change Data Capture in production with Apache Flink - Meetup Rennes 2019-...
OVH-Change Data Capture in production with Apache Flink - Meetup Rennes 2019-...OVH-Change Data Capture in production with Apache Flink - Meetup Rennes 2019-...
OVH-Change Data Capture in production with Apache Flink - Meetup Rennes 2019-...
 
Rennes Meetup 2019-09-26 - Change data capture in production
Rennes Meetup 2019-09-26 - Change data capture in productionRennes Meetup 2019-09-26 - Change data capture in production
Rennes Meetup 2019-09-26 - Change data capture in production
 

Recently uploaded

Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Seán Kennedy
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]📊 Markus Baersch
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Cathrine Wilhelmsen
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfchwongval
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)jennyeacort
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理e4aez8ss
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsVICTOR MAESTRE RAMIREZ
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPTBoston Institute of Analytics
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Colleen Farrelly
 
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...GQ Research
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryJeremy Anderson
 
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...ssuserf63bd7
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectBoston Institute of Analytics
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfgstagge
 
Machine learning classification ppt.ppt
Machine learning classification  ppt.pptMachine learning classification  ppt.ppt
Machine learning classification ppt.pptamreenkhanum0307
 

Recently uploaded (20)

Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdf
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data Story
 
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis Project
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdf
 
Machine learning classification ppt.ppt
Machine learning classification  ppt.pptMachine learning classification  ppt.ppt
Machine learning classification ppt.ppt
 
Call Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort ServiceCall Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort Service
 

2017-01-08-scaling tribalknowledge