SlideShare a Scribd company logo
1 of 30
Download to read offline
The Curse of the
Data Lake Monster
Kiran Prakash and Lucy Chambers
We have a
problem
@lucyfedia
So what is a data lake?
● Democratisation of Data
● Centralized and Monolithic
● Domain Agnostic
● Structured and unstructured data
@kiran_p
https://martinfowler.com/articles/data-monolith-to-mesh.html @kiran_p
Why do Data
Lakes Fail?
@kiran_p
Build it they will come!
● Seen primarily as an infrastructure problem
● Pinning down uses cases & value stream is hard
● Analysis paralysis & overengineering
@kiran_p
Centralised and Monolithic
https://martinfowler.com/articles/data-monolith-to-mesh.html @kiran_p
Functional Decomposition
@kiran_phttps://martinfowler.com/articles/data-monolith-to-mesh.html
Axis of change
@kiran_phttps://martinfowler.com/articles/data-monolith-to-mesh.html
@kiran_p
Data Swamps
Focus on initiatives which
align with business outcomes.
Structure teams around
business capabilities.
Product
Thinking
Self service platform for
storage, catalogue,
computation, access rights
and pipelines etc.
Autonomous teams with clear
bounded context building and
running products
independently.
Platform
Thinking
Domain Driven
Design
The Data Mesh Paradigm
@kiran_p
Product Thinking
For Data Projects
@lucyfedia
Project vs Product
Project Mode Product Mode
START Solution (often) defined at outset.
Problem identified at outset.
Solution developed iteratively and tested.
STOP Team moves on when solution delivered. Team moves on when problem verifiably fixed.
FOCUS Features delivered in a given time & budget.
Progress made on key business goals
(measured by metrics).
HAS FIXED SCOPE? Usually. Almost never.
@lucyfedia
Product teams have two jobs and two customers
● Deliver business capabilities
- External User
● Expose their domain’s data for others to consume
- (often) Internal User
@lucyfedia
● Discoverable
● Addressable
● Trustworthy
● Self-describing
● Interoperable
● Secure
A data product is:
@lucyfedia
Data Swamps
“If a tree falls in a wood, and
no-one is around to hear it,
does it make a sound?”
- Some philosopher
@lucyfedia
“If someone puts data into
a data lake, and no-one
can find it, is it even there?”
- Me
@lucyfedia
Data Mesh
Architecture
Domain Driven
Design
Self Service
Platforms
@kiran_p
Distributed Pipelines
@kiran_phttps://martinfowler.com/articles/data-monolith-to-mesh.html
@kiran_p
Self service platforms for:
● Storage
● Data pipeline
● Discovery & Catalogue
● Access control
● Archiving
● Encryption
Data Mesh
@kiran_phttps://martinfowler.com/articles/data-monolith-to-mesh.html
Example:
a fictional insurance company
Reduce fraud by
5% per year
Identify
fraudulent
claims
Reduce vehicle damage
claims by 2% per year
Increase conversion
rate by 2%
Predict Weather
Patterns
Upselling
Insurance
Products
The Use-Cases
@lucyfedia
@lucyfedia
Fraud Detection
Customer Claims
Customer
Health
Customer
Vehicle
Claims
Health
Claims
Vehicle
Customer
House
Claims
House
Lake Shore
Marts
Data Lake
(for Raw Data)
@lucyfedia
Fraud Detection
Customer Claims
Customer
Health
Customer
Vehicle
Claims
Health
Claims
Vehicle
Customer
House
Claims
House
Lake Shore
Marts
Data Lake
(for Raw Data)
Upselling
Customer Products
Products
@lucyfedia
Fraud Detection
Customer Claims
Customer
Health
Customer
Vehicle
Claims
Health
Claims
Vehicle
Customer
House
Claims
House
Lake Shore
Marts
Data Lake
(for Raw Data)
Upselling
Customer Products
Alert
Customer Weather
Products Weather
Not a technology problem
Becoming data-driven
is usually an
organisational problem
Work with cross functional
product teams and real use-
cases to deliver business
value.
Build by autonomous cross
functional teams using data
platforms instead of
centralized data lake
Domain data
is a product
Distributed
Data Mesh
Key Takeaways
@kiran_p & @lucyfedia
Kiran Prakash
@kiran_p
Thank you
Lucy Chambers
@lucyfedia
How to Move Beyond a
Monolithic Data Lake to
a Distributed Data Mesh
martinfowler.com/articles/
data-monolith-to-mesh.html
The Curse of the
Data Lake Monster
thoughtworks.com/insights/
blog/curse-data-lake-monster

More Related Content

What's hot

The Role of the Logical Data Fabric in a Unified Platform for Modern Analytics
The Role of the Logical Data Fabric in a Unified Platform for Modern AnalyticsThe Role of the Logical Data Fabric in a Unified Platform for Modern Analytics
The Role of the Logical Data Fabric in a Unified Platform for Modern Analytics
Denodo
 
Powering Self Service Business Intelligence with Hadoop and Data Virtualization
Powering Self Service Business Intelligence with Hadoop and Data VirtualizationPowering Self Service Business Intelligence with Hadoop and Data Virtualization
Powering Self Service Business Intelligence with Hadoop and Data Virtualization
Denodo
 

What's hot (20)

Data virtualization an introduction
Data virtualization an introductionData virtualization an introduction
Data virtualization an introduction
 
Simplifying Cloud Architectures with Data Virtualization
Simplifying Cloud Architectures with Data VirtualizationSimplifying Cloud Architectures with Data Virtualization
Simplifying Cloud Architectures with Data Virtualization
 
To mesh or mess up your data organisation - Jochem van Grondelle (Prosus/OLX ...
To mesh or mess up your data organisation - Jochem van Grondelle (Prosus/OLX ...To mesh or mess up your data organisation - Jochem van Grondelle (Prosus/OLX ...
To mesh or mess up your data organisation - Jochem van Grondelle (Prosus/OLX ...
 
Why Data Virtualization? An Introduction.
Why Data Virtualization? An Introduction.Why Data Virtualization? An Introduction.
Why Data Virtualization? An Introduction.
 
Data Mesh @ Yelp - 2019
Data Mesh @ Yelp - 2019Data Mesh @ Yelp - 2019
Data Mesh @ Yelp - 2019
 
Big Data Fabric for At-Scale Real-Time Analysis by Edwin Robbins
 Big Data Fabric for At-Scale Real-Time Analysis by Edwin Robbins Big Data Fabric for At-Scale Real-Time Analysis by Edwin Robbins
Big Data Fabric for At-Scale Real-Time Analysis by Edwin Robbins
 
Maximizing Data Lake ROI with Data Virtualization: A Technical Demonstration
Maximizing Data Lake ROI with Data Virtualization: A Technical DemonstrationMaximizing Data Lake ROI with Data Virtualization: A Technical Demonstration
Maximizing Data Lake ROI with Data Virtualization: A Technical Demonstration
 
Data Virtualization: From Zero to Hero
Data Virtualization: From Zero to HeroData Virtualization: From Zero to Hero
Data Virtualization: From Zero to Hero
 
Apache Kafka® and the Data Mesh
Apache Kafka® and the Data MeshApache Kafka® and the Data Mesh
Apache Kafka® and the Data Mesh
 
Continuous Intelligence: Keeping your AI Application in Production
Continuous Intelligence: Keeping your AI Application in ProductionContinuous Intelligence: Keeping your AI Application in Production
Continuous Intelligence: Keeping your AI Application in Production
 
The Role of the Logical Data Fabric in a Unified Platform for Modern Analytics
The Role of the Logical Data Fabric in a Unified Platform for Modern AnalyticsThe Role of the Logical Data Fabric in a Unified Platform for Modern Analytics
The Role of the Logical Data Fabric in a Unified Platform for Modern Analytics
 
Minimizing the Complexities of Machine Learning with Data Virtualization
Minimizing the Complexities of Machine Learning with Data VirtualizationMinimizing the Complexities of Machine Learning with Data Virtualization
Minimizing the Complexities of Machine Learning with Data Virtualization
 
An Introduction to Data Virtualization in 2018
An Introduction to Data Virtualization in 2018An Introduction to Data Virtualization in 2018
An Introduction to Data Virtualization in 2018
 
3 Reasons Data Virtualization Matters in Your Portfolio
3 Reasons Data Virtualization Matters in Your Portfolio3 Reasons Data Virtualization Matters in Your Portfolio
3 Reasons Data Virtualization Matters in Your Portfolio
 
Powering Self Service Business Intelligence with Hadoop and Data Virtualization
Powering Self Service Business Intelligence with Hadoop and Data VirtualizationPowering Self Service Business Intelligence with Hadoop and Data Virtualization
Powering Self Service Business Intelligence with Hadoop and Data Virtualization
 
Datamesh community meetup 28th jan 2021
Datamesh community meetup 28th jan 2021Datamesh community meetup 28th jan 2021
Datamesh community meetup 28th jan 2021
 
How to Build the Data Mesh Foundation: A Principled Approach | Zhamak Dehghan...
How to Build the Data Mesh Foundation: A Principled Approach | Zhamak Dehghan...How to Build the Data Mesh Foundation: A Principled Approach | Zhamak Dehghan...
How to Build the Data Mesh Foundation: A Principled Approach | Zhamak Dehghan...
 
Fast Data Strategy Houston Roadshow Presentation
Fast Data Strategy Houston Roadshow PresentationFast Data Strategy Houston Roadshow Presentation
Fast Data Strategy Houston Roadshow Presentation
 
Denodo Data Virtualization - IT Days in Luxembourg with Oktopus
Denodo Data Virtualization - IT Days in Luxembourg with OktopusDenodo Data Virtualization - IT Days in Luxembourg with Oktopus
Denodo Data Virtualization - IT Days in Luxembourg with Oktopus
 
Denodo DataFest 2016: Comparing and Contrasting Data Virtualization With Data...
Denodo DataFest 2016: Comparing and Contrasting Data Virtualization With Data...Denodo DataFest 2016: Comparing and Contrasting Data Virtualization With Data...
Denodo DataFest 2016: Comparing and Contrasting Data Virtualization With Data...
 

Similar to The Curse of the Data Lake Monster

DataEd Slides: Data Architecture vs. Data Modeling – Compare and Contrast
DataEd Slides: Data Architecture vs. Data Modeling – Compare and ContrastDataEd Slides: Data Architecture vs. Data Modeling – Compare and Contrast
DataEd Slides: Data Architecture vs. Data Modeling – Compare and Contrast
DATAVERSITY
 

Similar to The Curse of the Data Lake Monster (20)

2020 | Metadata Day | LinkedIn
2020 | Metadata Day | LinkedIn2020 | Metadata Day | LinkedIn
2020 | Metadata Day | LinkedIn
 
Data Warehousing Trends
Data Warehousing TrendsData Warehousing Trends
Data Warehousing Trends
 
Big Data overview
Big Data overviewBig Data overview
Big Data overview
 
DataEd Slides: Data Architecture vs. Data Modeling – Compare and Contrast
DataEd Slides: Data Architecture vs. Data Modeling – Compare and ContrastDataEd Slides: Data Architecture vs. Data Modeling – Compare and Contrast
DataEd Slides: Data Architecture vs. Data Modeling – Compare and Contrast
 
Machine Learning - It's the Data, Stupid
Machine Learning - It's the Data, StupidMachine Learning - It's the Data, Stupid
Machine Learning - It's the Data, Stupid
 
Using Data Platforms That Are Fit-For-Purpose
Using Data Platforms That Are Fit-For-PurposeUsing Data Platforms That Are Fit-For-Purpose
Using Data Platforms That Are Fit-For-Purpose
 
Schema.org Structured data the What, Why, & How
Schema.org Structured data the What, Why, & HowSchema.org Structured data the What, Why, & How
Schema.org Structured data the What, Why, & How
 
Citizens Bank: Data Lake Implementation – Selecting BigInsights ViON Spark/Ha...
Citizens Bank: Data Lake Implementation – Selecting BigInsights ViON Spark/Ha...Citizens Bank: Data Lake Implementation – Selecting BigInsights ViON Spark/Ha...
Citizens Bank: Data Lake Implementation – Selecting BigInsights ViON Spark/Ha...
 
Future of Data Strategy (ASEAN)
Future of Data Strategy (ASEAN)Future of Data Strategy (ASEAN)
Future of Data Strategy (ASEAN)
 
The New Database Frontier: Harnessing the Cloud
The New Database Frontier: Harnessing the CloudThe New Database Frontier: Harnessing the Cloud
The New Database Frontier: Harnessing the Cloud
 
Building the Artificially Intelligent Enterprise
Building the Artificially Intelligent EnterpriseBuilding the Artificially Intelligent Enterprise
Building the Artificially Intelligent Enterprise
 
Driving Business Value Through Agile Data Assets
Driving Business Value Through Agile Data AssetsDriving Business Value Through Agile Data Assets
Driving Business Value Through Agile Data Assets
 
ADV Slides: 2021 Trends in Enterprise Analytics
ADV Slides: 2021 Trends in Enterprise AnalyticsADV Slides: 2021 Trends in Enterprise Analytics
ADV Slides: 2021 Trends in Enterprise Analytics
 
Adopting a Logical Data Architecture for Today's Data and Analytics Requirements
Adopting a Logical Data Architecture for Today's Data and Analytics RequirementsAdopting a Logical Data Architecture for Today's Data and Analytics Requirements
Adopting a Logical Data Architecture for Today's Data and Analytics Requirements
 
GlobalLogic Java Community Webinar #16 “Zaloni’s Architecture for Data-Driven...
GlobalLogic Java Community Webinar #16 “Zaloni’s Architecture for Data-Driven...GlobalLogic Java Community Webinar #16 “Zaloni’s Architecture for Data-Driven...
GlobalLogic Java Community Webinar #16 “Zaloni’s Architecture for Data-Driven...
 
Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...
Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...
Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...
 
Memory Database Technology is Driving a New Cycle of Business Innovation
Memory Database Technology is Driving a New Cycle of Business InnovationMemory Database Technology is Driving a New Cycle of Business Innovation
Memory Database Technology is Driving a New Cycle of Business Innovation
 
Big Data Pitfalls
Big Data PitfallsBig Data Pitfalls
Big Data Pitfalls
 
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
 
What is the future of data strategy?
What is the future of data strategy?What is the future of data strategy?
What is the future of data strategy?
 

More from Thoughtworks

More from Thoughtworks (20)

Design System as a Product
Design System as a ProductDesign System as a Product
Design System as a Product
 
Designers, Developers & Dogs
Designers, Developers & DogsDesigners, Developers & Dogs
Designers, Developers & Dogs
 
Cloud-first for fast innovation
Cloud-first for fast innovationCloud-first for fast innovation
Cloud-first for fast innovation
 
More impact with flexible teams
More impact with flexible teamsMore impact with flexible teams
More impact with flexible teams
 
Culture of Innovation
Culture of InnovationCulture of Innovation
Culture of Innovation
 
Dual-Track Agile
Dual-Track AgileDual-Track Agile
Dual-Track Agile
 
Developer Experience
Developer ExperienceDeveloper Experience
Developer Experience
 
When we design together
When we design togetherWhen we design together
When we design together
 
Hardware is hard(er)
Hardware is hard(er)Hardware is hard(er)
Hardware is hard(er)
 
Customer-centric innovation enabled by cloud
 Customer-centric innovation enabled by cloud Customer-centric innovation enabled by cloud
Customer-centric innovation enabled by cloud
 
Amazon's Culture of Innovation
Amazon's Culture of InnovationAmazon's Culture of Innovation
Amazon's Culture of Innovation
 
When in doubt, go live
When in doubt, go liveWhen in doubt, go live
When in doubt, go live
 
Don't cross the Rubicon
Don't cross the RubiconDon't cross the Rubicon
Don't cross the Rubicon
 
Error handling
Error handlingError handling
Error handling
 
Your test coverage is a lie!
Your test coverage is a lie!Your test coverage is a lie!
Your test coverage is a lie!
 
Docker container security
Docker container securityDocker container security
Docker container security
 
Redefining the unit
Redefining the unitRedefining the unit
Redefining the unit
 
Technology Radar Webinar UK - Vol. 22
Technology Radar Webinar UK - Vol. 22Technology Radar Webinar UK - Vol. 22
Technology Radar Webinar UK - Vol. 22
 
A Tribute to Turing
A Tribute to TuringA Tribute to Turing
A Tribute to Turing
 
Rsa maths worked out
Rsa maths worked outRsa maths worked out
Rsa maths worked out
 

Recently uploaded

Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
chiefasafspells
 
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
VictoriaMetrics
 
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
masabamasaba
 
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
masabamasaba
 

Recently uploaded (20)

WSO2CON 2024 - How to Run a Security Program
WSO2CON 2024 - How to Run a Security ProgramWSO2CON 2024 - How to Run a Security Program
WSO2CON 2024 - How to Run a Security Program
 
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
 
WSO2CON2024 - It's time to go Platformless
WSO2CON2024 - It's time to go PlatformlessWSO2CON2024 - It's time to go Platformless
WSO2CON2024 - It's time to go Platformless
 
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
 
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
 
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
 
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
 
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
 
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
 
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open Source
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open SourceWSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open Source
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open Source
 
%in Soweto+277-882-255-28 abortion pills for sale in soweto
%in Soweto+277-882-255-28 abortion pills for sale in soweto%in Soweto+277-882-255-28 abortion pills for sale in soweto
%in Soweto+277-882-255-28 abortion pills for sale in soweto
 
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
 
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
 
Artyushina_Guest lecture_YorkU CS May 2024.pptx
Artyushina_Guest lecture_YorkU CS May 2024.pptxArtyushina_Guest lecture_YorkU CS May 2024.pptx
Artyushina_Guest lecture_YorkU CS May 2024.pptx
 
WSO2Con2024 - Enabling Transactional System's Exponential Growth With Simplicity
WSO2Con2024 - Enabling Transactional System's Exponential Growth With SimplicityWSO2Con2024 - Enabling Transactional System's Exponential Growth With Simplicity
WSO2Con2024 - Enabling Transactional System's Exponential Growth With Simplicity
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
 
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
 

The Curse of the Data Lake Monster