SlideShare a Scribd company logo
1 of 58
Download to read offline
Architecting for change:
LinkedIn's new data ecosystem
Sept 28, 2016
Shirshanka Das, Principal Staff Engineer, LinkedIn
Yael Garten, Director of Data Science, LinkedIn
@shirshanka, @yaelgarten
Design for change. Expect it. Embrace it.
Product Change Technology Culture
&
Process
Learnings
The Product Change: 

Launch a completely rewritten LinkedIn mobile app
What does this impact?
Data driven
product
Tracking data records user activity
InvitationClickEvent()
Tracking data records user activity
InvitationClickEvent()
Scale fact:

~ 1000 tracking event types, 

~ Double-digit TB per day, 

hundreds of metrics & data
products
user
engagement
tracking data
metric scripts
production code
Tracking Data Lifecycle
TransportProduce Consume
Member facing
data products
Business facing
decision making
Tracking Data Lifecycle & Teams
TransportProduce Consume
Product or App teams:
PMs, Developers, TestEng
Infra teams:
Hadoop, Kafka, DWH, ...
Data teams: 

Analytics, Relevance Engineers,...
user
engagement
tracking data
metric scripts
production code
Member facing
data products
Business facing
decision making
How do we calculate a metric: ProfileViews
PageViewEvent
	Record	1:	
{	
		"header"	:	{	
				"memberId"	:	12345,	
				"time"	:	1454745292951,	
				"appName"	:	{	
						"string"	:	"LinkedIn"	
				"pageKey"	:	"profile_page"	
				},	
		},	
		"trackingInfo"	:	{	
				["vieweeID"	:	"23456"],	
	 ...	
		}	
}	
Metric: 

ProfileViews = sum(PageViewEvent)

where pageKey = profile_page

PageViewEvent
	Record	1:	
{	
		"header"	:	{	
				"memberId"	:	12345,	
				"time"	:	1454745292951,	
				"appName"	:	{	
						"string"	:	"LinkedIn"	
				"pageKey"	:	"new_profile_page"	
				},	
		},	
		"trackingInfo"	:	{	
				["vieweeID"	:	"23456"],	
	 ...	
		}	
}	
or new_profile_page
PageViewEvent
	Record	1:	
{	
		"header"	:	{	
				"memberId"	:	12345,	
				"time"	:	1454745292951,	
				"appName"	:	{	
						"string"	:	"LinkedIn"	
				"pageKey"	:	"profile_page"	
				},	
		},	
		"trackingInfo"	:	{	
				["vieweeID"	:	"23456"],	
	 ...	
		}	
}	
CASE	
		WHEN	trackingInfo["profileIds"]	...	
		WHEN	trackingInfo["profileid"]	...	
		WHEN	trackingInfo["profileId"]	...	
		WHEN	trackingInfo["url$profileIds"]	...	
		WHEN	trackingInfo["11"]	LIKE	'%profileIds=%'	THEN	SUBSTRING(trackingInfo["11"],9,60)	
		WHEN	trackingInfo["12"]	LIKE	'%priceIds=%'	THEN	SUBSTRING(trackingInfo["12"],9,60)	
		ELSE	NULL	
END	AS	profile_id
Evolution as we mature and grow...
Metric: ProfileViews = sum(PageViewEvent

where 

pagekey = profile_page and
memberID != trackinginfo[vieweeID] )
Eventually… unmaintainable
get_tracking_codes	=	foreach	get_domain_rolled_up	generate	
		..entry_domain_rollup,	
	 (		(tracking_code	matches	'eml-ced.*'	or	tracking_code	matches	'eml-b2_content_ecosystem
_digest.*'	
				or	(referer	is	not	null	and	(referer	matches	'.*touch.linkedin.com.*trk=eml-ced.*'		
				or	referer	matches	'.*touch.linkedin.com.*trk=eml-b2_content_ecosystem_digest.*'))	?	'Email	-	CED'	:						
(tracking_code	matches	'eml-.*'	or	(referer	is	not	null	and	referer	matches	'.*touch.linkedin.com.*trk=eml-.*')	
or	entry_domain_rollup	==	'Email'	?	'Email	-	Other'	:	
								(tracking_code	==	'hp-feed-article-title-hpm'	and	entry_domain_rollup	==	'Linkedin'	?	'Homepage	Pulse	
Module'	:	((tracking_code	matches	'hp-feed-.*'	and	entry_domain_rollup	==	'Linkedin')	or	(std_user_interface	
matches	'(phone	app|tablet	app|phone	browser|tablet	browser)'	and	tracking_code	==	'v-feed')			or	
(tracking_code	==	'Organic	Traffic'	and	entry_domain_rollup	==	'Linkedin'	and	(referer	==	'https://
www.linkedin.com/nhome'	or	referer	==	'http://www.linkedin.com/nhome'))	?	'Feed'	:	
												(tracking_code	matches	'hb_ntf_MEGAPHONE_.*'	and	entry_domain_rollup	==	'Linkedin'	?	'Desktop	
Notifications'	:			(tracking_code	==	'm_sim2_native_reader_swipe_right'	?	'Push	Notification'	:		(tracking_code	==	
'pulse_dexter_stream_scroll'		and	entry_domain_rollup	==	'Linkedin'	?	'Pulse	-	Infinite	Scroll	on	Dexter'	:	--infinite	
scroll	on	dexter	
			((tracking_code	==	'pulse_dexter_nav_click'	or	tracking_code	==	'pulse-det-nav_art')	and	entry_domain_rollup	==	
'Linkedin'	?	'Pulse	-	Left	Rail	Click	on	Dexter'	:	--left	rail	click	on	dexter	
																				(tracking_code	==	'Organic	Traffic'	and	referer	is	not	null	and	referer	matches	'.*linkedin.com/pulse
/article.*'	?	'Publishing	Platform'	:	
																						'None	Found	Yet'))))))))))	as	entry_point;	
Homepage
team
Push Notification
team
Email team
Long form post
team
PageViewEvent
	Record	1:	
{	
		"header"	:	{	
				"memberId"	:	12345,	
				"time"	:	1454745292951,	
				"appName"	:	{	
						"string"	:	"LinkedIn"	
				"pageKey"	:	"profile_page"	
				},	
		},	
		"trackingInfo"	:	{	
				["vieweeID"	:	"23456"],	
	 ...	
		}	
}	
We wanted to move to better data models
LI_ProfileViewEvent
	Record	1:	
{	
		"header"	:	{	
				"memberId"	:	12345,	
				"time"	:	4745292951145,	
				"appName"	:	{	
						"string"	:	"LinkedIn"	
				"pageKey"	:	"profile_page"	
				},	
		},	
"entityView"	:	{	
					"viewType"	:	"profile-view",	
					"viewerId"	:	“12345”,		
"vieweeId"	:	“23456”,		
		},	
}
Two options:

1. Keep the old tracking:
a. Cost: producers (try to) replicate it (write bad old code
from scratch),
b. Save: consumers avoid migrating.



2. Evolve.
a. Cost: time on data modeling, and on consumer
migration,
b. Save: pays down data modeling tech debt
How much work would it be?
How much work would it be?
Two options:
1. Keep the old tracking:
a. Cost: producers (try to) replicate it (write bad old code
from scratch),
b. Save: consumers avoid migrating.



2. Evolve.
a. Cost: time on data modeling, and on consumer
migration,
b. Save: pays down data modeling tech debt
2000 daysEstimated cost to update consumers to new tracking with clean, committee-approved data models
Estimated cost for producers to attempt to replicate old tracking
5000 days
#AnalyticsHappiness
The Task and Opportunity
Must do: So we will do the data modeling, and rewrite all the metrics to
account for the changes happening upstream… but…



Extra credit points: How do we make sure that the cost is not this high the
next time?




How do we handle evolution in a principled way?
Product Change Technology Culture
&
Process
Learnings
Metrics ecosystem at LinkedIn: 3 yrs ago
Operational Challenges
Diminished Trust due to multiple sources of truth
Data Stages
Ingest Process Serve VisualizeCreate
Ingest Process Serve VisualizeCreate
Tracking
Kafka
Espresso
…
Tracking Architecture
SDKs in different frameworks
(server, client)
Tracking front-end
Monitoring Tools
Components
KafkaClient-side
Tracking
Tracking
Frontend
Services
Tools
Create
Data Stages
Ingest Process Serve VisualizeCreate
Unified Ingestion with
Hundreds of TB / day
Thousands of datasets
80+% of data ingest
Ingest
In production @ LinkedIn, Intel, Swisscom, NerdWallet, PayPal
Ingest
Ingest Process Serve VisualizeCreate
Hadoop
Processing engines @ LinkedInProcess
Ingest Process Serve VisualizeCreate
Pinot
Pinot
Kafka Hadoop
Samza Jobs
Pinot
minutes
hour +
Distributed Multi-dimensional OLAP
Columnar + indexes
No joins
Latency: low ms to sub-second
Serve
Site-facing	Apps Reporting	dashboards Monitoring
In production @
LinkedIn, Uber
Serve
Ingest Process VisualizeCreate
Hadoop Pinot Raptor
Serve
Ingest Process VisualizeCreate
Unified Metrics Platform (UMP)
Hadoop Pinot Raptor
Serve
Unified Metrics Platform
Metrics Logic
Raw
Data
Pinot
UMP Harness
Incremental
Aggregate
Backfill
Auto-join
Raptor
dashboards
HDFS
Aggregated
Data
Experiment
Analysis
Relevance
...
HDFS
Ad-hoc
Ingest Process Serve VisualizeCreate
RaptorKafka

Espresso

…
Hadoop Pinot
Tracking Unified Metrics Platform (UMP)
How do we handle old and new?
PageViewEvent
ProfileViewEvent
Producers Consumers
old
new
Relevance
Analytics
The Big Challenge
load “/data/tracking/PageViewEvent” using AvroStorage()
(Pig scripts)
My Raw Data
Our scripts were doing ….
My Raw Data
My Data API
We need “microservices" for Data
The Database community solved this
decades ago...
Views!
We had been working on something that could help...
A Data Access Layer for Linkedin
Abstract away underlying physical details to allow users to
focus solely on the logical concerns
Logical Tables + Views
Logical FileSystem
Solving
With
Views
Producers
LinkedInProfileView
PageViewEvent
ProfileViewEvent
new
old
Consumers
pagekey==
profile
1:1
Relevance
Analytics
Views
ecosystem
41
Producers Consumers
LinkedInProfileView
JSAProfileView
Job Seeker App
(JSA)
LinkedIn App
UnifiedProfileView
Data Catalog +
Discovery
(DALI)
DaliFileSystem Client
Data Source
(HDFS)
Data Sink
(HDFS)
Processing Engine
(MapReduce, Spark)
DALI Datasets (Tables + Views)
Query Layers
(Hive, Pig, Spark)
View Defs +
UDFs
(Artifactory, Git)
Dataflow APIs
(MR, Spark,
Scalding)
DALI CLI
Dali: Implementation Details in Context
From
load ‘/data/tracking/PageViewEvent’
using AvroStorage();
To
load ‘tracking.UnifiedProfileView’ using
DaliStorage();
One small step for a script
A Few Hard Problems
Versioning
Views and UDFs
Mapping to Hive metastore entities
Development lifecycle
Git as source of truth
Gradle for build
LinkedIn tooling integration for deployment
Early experiences with Dali views
How we executed
Lots of work to get the infra ready
Closed beta model
Tons of training and education (hand holding) for all
Governance body
Feedback from analysts is overwhelmingly positive:
+ Much simpler to share and standardize data cleansing code with peers
+ Provides effective insulation to scripts from upstream changes
- Harder to debug where problems are due to additional layer
State of the world today
~100 producer views
~200 consumer views
~30% of UMP metrics use Dali data
sources
~80 unique tracking event data sources
ProfileViews
MessagesSent
Searches
InvitationsSent
ArticlesRead
JobApplications
...
What’s next for Dali?
Real-time Views on streaming data
Selective materialization
Hive is an implementation detail, not a long term bet
Open source
Data Quality Framework
Product Change Technology Culture
&
Process
Learnings
Infrastructure enables, but culture really preserves
get_tracking_codes	=	foreach	get_domain_rolled_up	generate	
		..entry_domain_rollup,	
	 (		(tracking_code	matches	'eml-ced.*'	or	tracking_code	matches	'eml-b2_content
_ecosystem_digest.*'	
				or	(referer	is	not	null	and	(referer	matches	'.*touch.linkedin.com.*trk=eml-ced.*'		
				or	referer	matches	'.*touch.linkedin.com.*trk=eml-b2_content_ecosystem_digest.*'))	?	'Email	
-	CED'	:						(tracking_code	matches	'eml-.*'	or	(referer	is	not	null	and	referer	matches	'.*touch.linkedin
.com.*trk=eml-.*')	or	entry_domain_rollup	==	'Email'	?	'Email	-	Other'	:	
								(tracking_code	==	'hp-feed-article-title-hpm'	and	entry_domain_rollup	==	'Linkedin'	?	'Homepage	
Pulse	Module'	:	((tracking_code	matches	'hp-feed-.*'	and	entry_domain_rollup	==	'Linkedin')	or	
(std_user_interface	matches	'(phone	app|tablet	app|phone	browser|tablet	browser)'	and	tracking_code	
==	'v-feed')			or	(tracking_code	==	'Organic	Traffic'	and	entry_domain_rollup	==	'Linkedin'	and	(referer	==	
'https://www.linkedin.com/nhome'	or	referer	==	'http://www.linkedin.com/nhome'))	?	'Feed'	:	
												(tracking_code	matches	'hb_ntf_MEGAPHONE_.*'	and	entry_domain_rollup	==	'Linkedin'	?	
'Desktop	Notifications'	:			(tracking_code	==	'm_sim2_native_reader_swipe_right'	?	'Push	Notification'	:		
(tracking_code	==	'pulse_dexter_stream_scroll'		and	entry_domain_rollup	==	'Linkedin'	?	'Pulse	-
For a great data ecosystem that can handle change:
1. Standardize core data entities

2. Create clear maintainable contracts between data producers
& consumers

3. Ensure dialogue between data producers & consumers

1. Standardize core data entities
• Event types and names: Page, Action, Impression
• Framework level client side tracking: views, clicks, flows
• For all else (custom) - guide when to create a new Event or Dali view

Navigation
Page View
Control Interaction
2. Create clear maintainable contracts
1
1. Tracking specification with monitoring: clear, visual, consistent contract
Need tooling to support culture shift

Tracking specification Tool
2
2. Dali dataset specification with data quality rules
3. Ensure dialogue between Producers & Consumers
• Awareness: Train about end-to-end data pipeline, data modeling
• Instill communication & collaborative ownership process between all: a step-by-step
playbook for who & how to develop and own tracking

PM → Analyst → Engineer → All3 → TestEng → Analyst
user engagement
tracking data
metric 

scripts
production

code
Member facing

data products
Business facing
decision making
Product Change Technology Culture
&
Process
Learnings
Our Learnings
Culture and Process
● Spend time to identify what needs culture & process, and 

what needs tools & tech
● Big changes can mean big opportunities
● Very hard to massively change things like data culture or data tech debt; never
a good time to invest in “invisible” behind-the-scenes change

→ Make it non-invisible -- try to clarify or size out the cost of NOT doing it

→ needed strong leaders, and a village

Tech
● Must build tooling to support that culture change otherwise culture will revert
● Work hard to make any new layer as frictionless as possible
● Virtual views on Hadoop data can work at scale! (Dali views)
For a great data ecosystem that can handle change:
1. Standardize core data entities

2. Create clear maintainable contracts between data producers & consumers

3. Ensure dialogue between data producers & consumers

Design for change. Expect it. Embrace it.
Did we succeed? We just handled another huge change!
#AnalyticsHappiness
Thank you.
@shirshanka, @yaelgarten

More Related Content

What's hot

Hudi architecture, fundamentals and capabilities
Hudi architecture, fundamentals and capabilitiesHudi architecture, fundamentals and capabilities
Hudi architecture, fundamentals and capabilitiesNishith Agarwal
 
JanusGraph DataBase Concepts
JanusGraph DataBase ConceptsJanusGraph DataBase Concepts
JanusGraph DataBase ConceptsSanil Bagzai
 
Virtual Flink Forward 2020: Netflix Data Mesh: Composable Data Processing - J...
Virtual Flink Forward 2020: Netflix Data Mesh: Composable Data Processing - J...Virtual Flink Forward 2020: Netflix Data Mesh: Composable Data Processing - J...
Virtual Flink Forward 2020: Netflix Data Mesh: Composable Data Processing - J...Flink Forward
 
(BDT317) Building A Data Lake On AWS
(BDT317) Building A Data Lake On AWS(BDT317) Building A Data Lake On AWS
(BDT317) Building A Data Lake On AWSAmazon Web Services
 
Making Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta LakeMaking Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta LakeDatabricks
 
Modernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureModernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureDatabricks
 
LinkedIn Data Infrastructure (QCon London 2012)
LinkedIn Data Infrastructure (QCon London 2012)LinkedIn Data Infrastructure (QCon London 2012)
LinkedIn Data Infrastructure (QCon London 2012)Sid Anand
 
Master the Multi-Clustered Data Warehouse - Snowflake
Master the Multi-Clustered Data Warehouse - SnowflakeMaster the Multi-Clustered Data Warehouse - Snowflake
Master the Multi-Clustered Data Warehouse - SnowflakeMatillion
 
Polyglot persistence @ netflix (CDE Meetup)
Polyglot persistence @ netflix (CDE Meetup) Polyglot persistence @ netflix (CDE Meetup)
Polyglot persistence @ netflix (CDE Meetup) Roopa Tangirala
 
Deep Dive into the New Features of Apache Spark 3.0
Deep Dive into the New Features of Apache Spark 3.0Deep Dive into the New Features of Apache Spark 3.0
Deep Dive into the New Features of Apache Spark 3.0Databricks
 
Apache Iceberg Presentation for the St. Louis Big Data IDEA
Apache Iceberg Presentation for the St. Louis Big Data IDEAApache Iceberg Presentation for the St. Louis Big Data IDEA
Apache Iceberg Presentation for the St. Louis Big Data IDEAAdam Doyle
 
Databricks Delta Lake and Its Benefits
Databricks Delta Lake and Its BenefitsDatabricks Delta Lake and Its Benefits
Databricks Delta Lake and Its BenefitsDatabricks
 
Data lineage and observability with Marquez - subsurface 2020
Data lineage and observability with Marquez - subsurface 2020Data lineage and observability with Marquez - subsurface 2020
Data lineage and observability with Marquez - subsurface 2020Julien Le Dem
 
The columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache ArrowThe columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache ArrowDataWorks Summit
 
The evolution of Netflix's S3 data warehouse (Strata NY 2018)
The evolution of Netflix's S3 data warehouse (Strata NY 2018)The evolution of Netflix's S3 data warehouse (Strata NY 2018)
The evolution of Netflix's S3 data warehouse (Strata NY 2018)Ryan Blue
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guideParquet performance tuning: the missing guide
Parquet performance tuning: the missing guideRyan Blue
 
Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)James Serra
 
Making Data Timelier and More Reliable with Lakehouse Technology
Making Data Timelier and More Reliable with Lakehouse TechnologyMaking Data Timelier and More Reliable with Lakehouse Technology
Making Data Timelier and More Reliable with Lakehouse TechnologyMatei Zaharia
 

What's hot (20)

Hudi architecture, fundamentals and capabilities
Hudi architecture, fundamentals and capabilitiesHudi architecture, fundamentals and capabilities
Hudi architecture, fundamentals and capabilities
 
JanusGraph DataBase Concepts
JanusGraph DataBase ConceptsJanusGraph DataBase Concepts
JanusGraph DataBase Concepts
 
Virtual Flink Forward 2020: Netflix Data Mesh: Composable Data Processing - J...
Virtual Flink Forward 2020: Netflix Data Mesh: Composable Data Processing - J...Virtual Flink Forward 2020: Netflix Data Mesh: Composable Data Processing - J...
Virtual Flink Forward 2020: Netflix Data Mesh: Composable Data Processing - J...
 
The delta architecture
The delta architectureThe delta architecture
The delta architecture
 
(BDT317) Building A Data Lake On AWS
(BDT317) Building A Data Lake On AWS(BDT317) Building A Data Lake On AWS
(BDT317) Building A Data Lake On AWS
 
Making Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta LakeMaking Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta Lake
 
Modernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureModernizing to a Cloud Data Architecture
Modernizing to a Cloud Data Architecture
 
LinkedIn Data Infrastructure (QCon London 2012)
LinkedIn Data Infrastructure (QCon London 2012)LinkedIn Data Infrastructure (QCon London 2012)
LinkedIn Data Infrastructure (QCon London 2012)
 
Master the Multi-Clustered Data Warehouse - Snowflake
Master the Multi-Clustered Data Warehouse - SnowflakeMaster the Multi-Clustered Data Warehouse - Snowflake
Master the Multi-Clustered Data Warehouse - Snowflake
 
Polyglot persistence @ netflix (CDE Meetup)
Polyglot persistence @ netflix (CDE Meetup) Polyglot persistence @ netflix (CDE Meetup)
Polyglot persistence @ netflix (CDE Meetup)
 
Deep Dive into the New Features of Apache Spark 3.0
Deep Dive into the New Features of Apache Spark 3.0Deep Dive into the New Features of Apache Spark 3.0
Deep Dive into the New Features of Apache Spark 3.0
 
Apache Iceberg Presentation for the St. Louis Big Data IDEA
Apache Iceberg Presentation for the St. Louis Big Data IDEAApache Iceberg Presentation for the St. Louis Big Data IDEA
Apache Iceberg Presentation for the St. Louis Big Data IDEA
 
Big Data Architectural Patterns
Big Data Architectural PatternsBig Data Architectural Patterns
Big Data Architectural Patterns
 
Databricks Delta Lake and Its Benefits
Databricks Delta Lake and Its BenefitsDatabricks Delta Lake and Its Benefits
Databricks Delta Lake and Its Benefits
 
Data lineage and observability with Marquez - subsurface 2020
Data lineage and observability with Marquez - subsurface 2020Data lineage and observability with Marquez - subsurface 2020
Data lineage and observability with Marquez - subsurface 2020
 
The columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache ArrowThe columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache Arrow
 
The evolution of Netflix's S3 data warehouse (Strata NY 2018)
The evolution of Netflix's S3 data warehouse (Strata NY 2018)The evolution of Netflix's S3 data warehouse (Strata NY 2018)
The evolution of Netflix's S3 data warehouse (Strata NY 2018)
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guideParquet performance tuning: the missing guide
Parquet performance tuning: the missing guide
 
Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)
 
Making Data Timelier and More Reliable with Lakehouse Technology
Making Data Timelier and More Reliable with Lakehouse TechnologyMaking Data Timelier and More Reliable with Lakehouse Technology
Making Data Timelier and More Reliable with Lakehouse Technology
 

Viewers also liked

Databus: LinkedIn's Change Data Capture Pipeline SOCC 2012
Databus: LinkedIn's Change Data Capture Pipeline SOCC 2012Databus: LinkedIn's Change Data Capture Pipeline SOCC 2012
Databus: LinkedIn's Change Data Capture Pipeline SOCC 2012Shirshanka Das
 
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...Shirshanka Das
 
Strata SG 2015: LinkedIn Self Serve Reporting Platform on Hadoop
Strata SG 2015: LinkedIn Self Serve Reporting Platform on Hadoop Strata SG 2015: LinkedIn Self Serve Reporting Platform on Hadoop
Strata SG 2015: LinkedIn Self Serve Reporting Platform on Hadoop Shirshanka Das
 
Apache Gobblin: Bridging Batch and Streaming Data Integration. Big Data Meetu...
Apache Gobblin: Bridging Batch and Streaming Data Integration. Big Data Meetu...Apache Gobblin: Bridging Batch and Streaming Data Integration. Big Data Meetu...
Apache Gobblin: Bridging Batch and Streaming Data Integration. Big Data Meetu...Shirshanka Das
 
Taming the ever-evolving Compliance Beast : Lessons learnt at LinkedIn [Strat...
Taming the ever-evolving Compliance Beast : Lessons learnt at LinkedIn [Strat...Taming the ever-evolving Compliance Beast : Lessons learnt at LinkedIn [Strat...
Taming the ever-evolving Compliance Beast : Lessons learnt at LinkedIn [Strat...Shirshanka Das
 

Viewers also liked (7)

Databus: LinkedIn's Change Data Capture Pipeline SOCC 2012
Databus: LinkedIn's Change Data Capture Pipeline SOCC 2012Databus: LinkedIn's Change Data Capture Pipeline SOCC 2012
Databus: LinkedIn's Change Data Capture Pipeline SOCC 2012
 
Aksyon radyo
Aksyon radyoAksyon radyo
Aksyon radyo
 
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
 
Strata SG 2015: LinkedIn Self Serve Reporting Platform on Hadoop
Strata SG 2015: LinkedIn Self Serve Reporting Platform on Hadoop Strata SG 2015: LinkedIn Self Serve Reporting Platform on Hadoop
Strata SG 2015: LinkedIn Self Serve Reporting Platform on Hadoop
 
Apache Gobblin: Bridging Batch and Streaming Data Integration. Big Data Meetu...
Apache Gobblin: Bridging Batch and Streaming Data Integration. Big Data Meetu...Apache Gobblin: Bridging Batch and Streaming Data Integration. Big Data Meetu...
Apache Gobblin: Bridging Batch and Streaming Data Integration. Big Data Meetu...
 
Taming the ever-evolving Compliance Beast : Lessons learnt at LinkedIn [Strat...
Taming the ever-evolving Compliance Beast : Lessons learnt at LinkedIn [Strat...Taming the ever-evolving Compliance Beast : Lessons learnt at LinkedIn [Strat...
Taming the ever-evolving Compliance Beast : Lessons learnt at LinkedIn [Strat...
 
SlideShare 101
SlideShare 101SlideShare 101
SlideShare 101
 

Similar to Strata 2016 - Architecting for Change: LinkedIn's new data ecosystem

Building a healthy data ecosystem around Kafka and Hadoop: Lessons learned at...
Building a healthy data ecosystem around Kafka and Hadoop: Lessons learned at...Building a healthy data ecosystem around Kafka and Hadoop: Lessons learned at...
Building a healthy data ecosystem around Kafka and Hadoop: Lessons learned at...Yael Garten
 
Backstage 2019 - The Atlassian Journey with Amplitude - Itzik Feldman
Backstage 2019 - The Atlassian Journey with Amplitude - Itzik FeldmanBackstage 2019 - The Atlassian Journey with Amplitude - Itzik Feldman
Backstage 2019 - The Atlassian Journey with Amplitude - Itzik FeldmanAmplitude
 
Digital Twin: A radical new approach to IoT
Digital Twin: A radical new approach to IoTDigital Twin: A radical new approach to IoT
Digital Twin: A radical new approach to IoTDimitri Volkmann
 
Data Preparation vs. Inline Data Wrangling in Data Science and Machine Learning
Data Preparation vs. Inline Data Wrangling in Data Science and Machine LearningData Preparation vs. Inline Data Wrangling in Data Science and Machine Learning
Data Preparation vs. Inline Data Wrangling in Data Science and Machine LearningKai Wähner
 
Agile Testing Days 2017 Intoducing AgileBI Sustainably - Excercises
Agile Testing Days 2017 Intoducing AgileBI Sustainably - ExcercisesAgile Testing Days 2017 Intoducing AgileBI Sustainably - Excercises
Agile Testing Days 2017 Intoducing AgileBI Sustainably - ExcercisesRaphael Branger
 
Accelerating Data Lakes and Streams with Real-time Analytics
Accelerating Data Lakes and Streams with Real-time AnalyticsAccelerating Data Lakes and Streams with Real-time Analytics
Accelerating Data Lakes and Streams with Real-time AnalyticsArcadia Data
 
Running Data Platforms Like Products
Running Data Platforms Like ProductsRunning Data Platforms Like Products
Running Data Platforms Like ProductsVMware Tanzu
 
Big Data Paris - A Modern Enterprise Architecture
Big Data Paris - A Modern Enterprise ArchitectureBig Data Paris - A Modern Enterprise Architecture
Big Data Paris - A Modern Enterprise ArchitectureMongoDB
 
Dynatrace: Davis - Hololens - AI update - Cloud announcements - Self driving IT
Dynatrace: Davis - Hololens - AI update - Cloud announcements - Self driving ITDynatrace: Davis - Hololens - AI update - Cloud announcements - Self driving IT
Dynatrace: Davis - Hololens - AI update - Cloud announcements - Self driving ITDynatrace
 
Real-time big data analytics based on product recommendations case study
Real-time big data analytics based on product recommendations case studyReal-time big data analytics based on product recommendations case study
Real-time big data analytics based on product recommendations case studydeep.bi
 
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022StreamNative
 
DataAquitaine February 2022
DataAquitaine February 2022DataAquitaine February 2022
DataAquitaine February 2022Yves Caseau
 
High-performance database technology for rock-solid IoT solutions
High-performance database technology for rock-solid IoT solutionsHigh-performance database technology for rock-solid IoT solutions
High-performance database technology for rock-solid IoT solutionsClusterpoint
 
AI Solutions with Macnica.ai - AI Expo 2018 Tokyo Japan
AI Solutions with Macnica.ai - AI Expo 2018 Tokyo JapanAI Solutions with Macnica.ai - AI Expo 2018 Tokyo Japan
AI Solutions with Macnica.ai - AI Expo 2018 Tokyo JapanAvkash Chauhan
 
Microsoft cloud big data strategy
Microsoft cloud big data strategyMicrosoft cloud big data strategy
Microsoft cloud big data strategyJames Serra
 
Creating a Modern Data Architecture for Digital Transformation
Creating a Modern Data Architecture for Digital TransformationCreating a Modern Data Architecture for Digital Transformation
Creating a Modern Data Architecture for Digital TransformationMongoDB
 
Applying linear regression and predictive analytics
Applying linear regression and predictive analyticsApplying linear regression and predictive analytics
Applying linear regression and predictive analyticsMariaDB plc
 
Microsoft Graph: The API for Microsoft 365
Microsoft Graph: The API for Microsoft 365Microsoft Graph: The API for Microsoft 365
Microsoft Graph: The API for Microsoft 365Mayur Tendulkar
 

Similar to Strata 2016 - Architecting for Change: LinkedIn's new data ecosystem (20)

Building a healthy data ecosystem around Kafka and Hadoop: Lessons learned at...
Building a healthy data ecosystem around Kafka and Hadoop: Lessons learned at...Building a healthy data ecosystem around Kafka and Hadoop: Lessons learned at...
Building a healthy data ecosystem around Kafka and Hadoop: Lessons learned at...
 
Backstage 2019 - The Atlassian Journey with Amplitude - Itzik Feldman
Backstage 2019 - The Atlassian Journey with Amplitude - Itzik FeldmanBackstage 2019 - The Atlassian Journey with Amplitude - Itzik Feldman
Backstage 2019 - The Atlassian Journey with Amplitude - Itzik Feldman
 
Digital Twin: A radical new approach to IoT
Digital Twin: A radical new approach to IoTDigital Twin: A radical new approach to IoT
Digital Twin: A radical new approach to IoT
 
Data Preparation vs. Inline Data Wrangling in Data Science and Machine Learning
Data Preparation vs. Inline Data Wrangling in Data Science and Machine LearningData Preparation vs. Inline Data Wrangling in Data Science and Machine Learning
Data Preparation vs. Inline Data Wrangling in Data Science and Machine Learning
 
Agile Testing Days 2017 Intoducing AgileBI Sustainably - Excercises
Agile Testing Days 2017 Intoducing AgileBI Sustainably - ExcercisesAgile Testing Days 2017 Intoducing AgileBI Sustainably - Excercises
Agile Testing Days 2017 Intoducing AgileBI Sustainably - Excercises
 
Scaling Legacy
Scaling LegacyScaling Legacy
Scaling Legacy
 
Modern Thinking área digital MSKM 21/09/2017
Modern Thinking área digital MSKM 21/09/2017Modern Thinking área digital MSKM 21/09/2017
Modern Thinking área digital MSKM 21/09/2017
 
Accelerating Data Lakes and Streams with Real-time Analytics
Accelerating Data Lakes and Streams with Real-time AnalyticsAccelerating Data Lakes and Streams with Real-time Analytics
Accelerating Data Lakes and Streams with Real-time Analytics
 
Running Data Platforms Like Products
Running Data Platforms Like ProductsRunning Data Platforms Like Products
Running Data Platforms Like Products
 
Big Data Paris - A Modern Enterprise Architecture
Big Data Paris - A Modern Enterprise ArchitectureBig Data Paris - A Modern Enterprise Architecture
Big Data Paris - A Modern Enterprise Architecture
 
Dynatrace: Davis - Hololens - AI update - Cloud announcements - Self driving IT
Dynatrace: Davis - Hololens - AI update - Cloud announcements - Self driving ITDynatrace: Davis - Hololens - AI update - Cloud announcements - Self driving IT
Dynatrace: Davis - Hololens - AI update - Cloud announcements - Self driving IT
 
Real-time big data analytics based on product recommendations case study
Real-time big data analytics based on product recommendations case studyReal-time big data analytics based on product recommendations case study
Real-time big data analytics based on product recommendations case study
 
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
 
DataAquitaine February 2022
DataAquitaine February 2022DataAquitaine February 2022
DataAquitaine February 2022
 
High-performance database technology for rock-solid IoT solutions
High-performance database technology for rock-solid IoT solutionsHigh-performance database technology for rock-solid IoT solutions
High-performance database technology for rock-solid IoT solutions
 
AI Solutions with Macnica.ai - AI Expo 2018 Tokyo Japan
AI Solutions with Macnica.ai - AI Expo 2018 Tokyo JapanAI Solutions with Macnica.ai - AI Expo 2018 Tokyo Japan
AI Solutions with Macnica.ai - AI Expo 2018 Tokyo Japan
 
Microsoft cloud big data strategy
Microsoft cloud big data strategyMicrosoft cloud big data strategy
Microsoft cloud big data strategy
 
Creating a Modern Data Architecture for Digital Transformation
Creating a Modern Data Architecture for Digital TransformationCreating a Modern Data Architecture for Digital Transformation
Creating a Modern Data Architecture for Digital Transformation
 
Applying linear regression and predictive analytics
Applying linear regression and predictive analyticsApplying linear regression and predictive analytics
Applying linear regression and predictive analytics
 
Microsoft Graph: The API for Microsoft 365
Microsoft Graph: The API for Microsoft 365Microsoft Graph: The API for Microsoft 365
Microsoft Graph: The API for Microsoft 365
 

Recently uploaded

办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptSonatrach
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)jennyeacort
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPTBoston Institute of Analytics
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...soniya singh
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一fhwihughh
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhijennyeacort
 
MK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxMK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxUnduhUnggah1
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...Florian Roscheck
 
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSINGmarianagonzalez07
 
IMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxIMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxdolaknnilon
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdfHuman37
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanMYRABACSAFRA2
 

Recently uploaded (20)

办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
Call Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort ServiceCall Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort Service
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 
MK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxMK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docx
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
 
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
 
IMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxIMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptx
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population Mean
 

Strata 2016 - Architecting for Change: LinkedIn's new data ecosystem