Time for a
Change Stream
By Leigha Mitchell & Edward Robinson
@LeighaNotLeia @earobinson
Change streams and
using them to version
your data
The hubba stack
● MEANR - Mongo, Express, Angular, Node, React
● Many Services (Payments, Products, Users, etc)
● Three Engineering teams (one Python, two JS)
● AWS, GCP, MongoDB Atlas, RabbitMQ, Redis, etc
● Mongoose, Mongo Native Driver
What is a change stream?
Change streams allow applications to access real-time data changes
without the complexity and risk of tailing the oplog. Applications can
use change streams to subscribe to all data changes on a collection and
immediately react to them. --
https://docs.mongodb.com/manual/changeStreams/
Why are we here?
Hubba
● 6-year-old company
● Networking for Brands and Buyers
● Microservices
● Decided to launch ordering
Ordering must-haves
The exact product you ordered must be
delivered to you
(Images: "This" / "Not this")
What exactly is a version?
A particular form of something differing in certain
respects from an earlier form or other forms of the
same type of thing. --
https://www.google.com?q=define+version
1. Options
➔ Not versioning
Previously wasn’t needed; continue to
not use it
➔ Make copy
How do we know what to copy?
➔ In-app versioning
Lean into Mongoose and version the
app
➔ Oplog versioning
Write new service to consume the
oplog
Not versioning
Would have locked
orderable products
Make a copy
Knowing what to copy is
hard
In-app Versioning
Works great, if you’re
building from scratch
model.save()
vs
db.getCollection('users').update({}, {$set: {name: 'frodo'}})
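The contrast above is the weak spot of in-app versioning: a version bump that lives in application code only runs when a write goes through that code. The sketch below illustrates this with a hypothetical in-memory store; `users`, `save`, and `rawUpdate` are illustrative names, not Mongoose or MongoDB driver APIs.

```javascript
// Hypothetical in-memory stand-in for a collection; not Mongoose or the MongoDB driver.
const users = new Map();

// In-app versioning: every write that goes through save() bumps the version.
function save(doc) {
  const previous = users.get(doc._id);
  const version = previous ? previous.version + 1 : 0;
  const stored = { ...doc, version };
  users.set(doc._id, stored);
  return stored;
}

// A raw update writes fields directly and never touches the versioning logic.
function rawUpdate(fields) {
  for (const [id, doc] of users) {
    users.set(id, { ...doc, ...fields });
  }
}

save({ _id: 1, name: 'Frodo' });         // stored with version 0
save({ _id: 1, name: 'Frodo Baggins' }); // stored with version 1
rawUpdate({ name: 'frodo' });            // name changes, version is not bumped
```

After the raw update the name has changed but the version has not, so the change is invisible to any history built on the save path.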
Denormalization:
Denormalization allows you to avoid some application-level joins, at the
expense of having more complex and expensive updates. Denormalizing one
or more fields makes sense if those fields are read much more often
than they are updated. --
https://www.mongodb.com/blog/post/6-rules-of-thumb-for-mongodb-schema-design-part-3
Denormalization Example - Pre bearer of the ring
Users / Hobbits
{
_id : 1,
name : 'Frodo',
occupation: 'unemployed'
}, {
_id : 2,
name : 'Sam',
occupation: 'unemployed'
}
Messages
{
_id : 93,
from : 2,
to: 1,
fromOccupation: 'unemployed',
toOccupation: 'unemployed',
message: 'What do you call a hobbit party?'
}, {
_id : 94,
from : 1,
to: 2,
fromOccupation: 'unemployed',
toOccupation: 'unemployed',
message: 'A little get together.'
}
Denormalization Example - Post bearer of the ring
Users / Hobbits
{
_id : 1,
name : 'Frodo',
occupation: 'Bearer of the ring'
}, {
_id : 2,
name : 'Sam',
occupation: 'Protector of Frodo'
}
Messages
{
_id : 93,
from : 2,
to: 1,
fromOccupation: 'Protector of Frodo',
toOccupation: 'Bearer of the ring',
message: 'What do you call a hobbit party?'
}, {
_id : 94,
from : 1,
to: 2,
fromOccupation: 'Bearer of the ring',
toOccupation: 'Protector of Frodo',
message: 'A little get together.'
}
message.update({to: 1}, {$set: {toOccupation: 'Bearer of the ring'}}, {multi: true})
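The multi-update above covers only the `to` side; keeping the denormalized copies in sync also needs a matching update for the `from` side. A minimal sketch of that fan-out, run against an in-memory copy of the example messages (`updateMany` and `propagateOccupation` are hypothetical helpers that only mimic simple equality filters, not driver calls):

```javascript
// In-memory copy of the slide's messages; a hypothetical stand-in for the collection.
const messages = [
  { _id: 93, from: 2, to: 1, fromOccupation: 'unemployed', toOccupation: 'unemployed' },
  { _id: 94, from: 1, to: 2, fromOccupation: 'unemployed', toOccupation: 'unemployed' },
];

// Mimics update(filter, {$set: fields}, {multi: true}) for simple equality filters.
function updateMany(docs, filter, fields) {
  let modified = 0;
  for (const doc of docs) {
    const matches = Object.entries(filter).every(([key, value]) => doc[key] === value);
    if (matches) {
      Object.assign(doc, fields);
      modified += 1;
    }
  }
  return modified;
}

// The occupation is copied into both sides of a message, so one change to
// the user needs one multi-update per denormalized field.
function propagateOccupation(docs, userId, occupation) {
  const asRecipient = updateMany(docs, { to: userId }, { toOccupation: occupation });
  const asSender = updateMany(docs, { from: userId }, { fromOccupation: occupation });
  return asRecipient + asSender;
}

const touched = propagateOccupation(messages, 1, 'Bearer of the ring');
```

Raw multi-updates like these are exactly the writes that bypass model-level hooks, which is why they matter for the versioning options discussed here.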
Oplog Versioning
So what did we choose?
None of them!
Along comes:
MongoDB 3.6
Now with
Change Streams!
2. Why Mongo 3.6
➔ Easy Versioning
Ability to do versioning without
significant architecture changes
➔ Raw Queries
Allowed us to use existing raw queries
without altering to support versioning
➔ Many Sources
Non-front-end data consumers did not
need to be altered
Hubba Services
(Architecture diagram: Website, Product Service, Orders Service, History Service, MongoDB)
Creating a Product
(Sequence diagram. Participants: Website, Products Service, MongoDB, History Service.
Messages shown: "Create Version 0" (three times), "Created Version 0" (twice).)
Ordering a Product
(Sequence diagram. Participants: Website, Orders Service, History Service, MongoDB.
Messages shown: "Order Version 0" (three times), "Yes You Can!", "Ordered Version 0".)
But what about the raw
queries??
The version numbers are
Human Readable : SHA
42:c3f42aeb1c3d85b5a1594a5d0a727fcdf58a33ac
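A version string in this "human readable : SHA" shape could be produced by a small generator like the one sketched below. The slides show the format but not what is hashed, so hashing the JSON-serialized document with SHA-1 (which yields the 40 hex characters seen in the example) is an assumption, and `makeVersion` and `parseVersion` are hypothetical names.

```javascript
const crypto = require('crypto');

// Assumption: the SHA part is a SHA-1 digest of the serialized document.
// The slides show the "counter:sha" format but not what is hashed.
function makeVersion(counter, doc) {
  const sha = crypto.createHash('sha1').update(JSON.stringify(doc)).digest('hex');
  return `${counter}:${sha}`;
}

// Splits a version string back into its readable counter and its hash.
function parseVersion(version) {
  const separator = version.indexOf(':');
  return {
    counter: Number(version.slice(0, separator)),
    sha: version.slice(separator + 1),
  };
}

const version = makeVersion(42, { _id: 1, name: 'Frodo' });
const parsed = parseVersion(version);
```

Note that `JSON.stringify` depends on key order, so a real generator would need a canonical serialization if the same version must be reproduced from other languages.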
Anatomy of a Change Stream
{
"_id" : { <BSON Object> },
"operationType" : "<operation>",
"fullDocument" : { <document> },
"ns" : {
"db" : "<database>",
"coll" : "<collection>"
},
"documentKey" : { "_id" : <ObjectId> },
"updateDescription" : {
"updatedFields" : { <document> },
"removedFields" : [ "<field>", ... ]
}
}
Operation types:
Insert
Delete
Update
Replace
Invalidate
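A consumer typically branches on `operationType` and reads only the fields that are present for that kind of event. The sketch below is one way to do that using only the fields from the event shape shown above; `describeChange` is a hypothetical helper, and the sample event is made up for illustration.

```javascript
// Summarizes a change event using only fields from the event shape shown above.
function describeChange(event) {
  const { operationType, ns, documentKey } = event;
  const target = ns ? `${ns.db}.${ns.coll}` : 'unknown namespace';
  switch (operationType) {
    case 'insert':
    case 'replace':
      return `${operationType} in ${target}: store fullDocument for ${documentKey._id}`;
    case 'update': {
      const { updatedFields, removedFields } = event.updateDescription;
      const changed = [...Object.keys(updatedFields), ...removedFields];
      return `update in ${target}: ${documentKey._id} changed ${changed.join(', ')}`;
    }
    case 'delete':
      return `delete in ${target}: ${documentKey._id} removed`;
    case 'invalidate':
      return 'invalidate: the stream is closed and must be reopened';
    default:
      return `unhandled operation: ${operationType}`;
  }
}

const summary = describeChange({
  operationType: 'update',
  ns: { db: 'hubba', coll: 'messages' },
  documentKey: { _id: '5b0b6c7adce137d2655f7efe' },
  updateDescription: { updatedFields: { message: 'hi', version: 1 }, removedFields: [] },
});
```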
Change Streams in Action
Insert a message
Mongo Query
db.getCollection('messages').insert({from: 2, to: 1, fromOccupation: 'unemployed', toOccupation:
'unemployed', message: 'What do you call a habbit party?', version: 0})
Document
{
"_id" : ObjectId("5b0b6c7adce137d2655f7efe"),
"from" : 2,
"to" : 1,
"fromOccupation" : "unemployed",
"toOccupation" : "unemployed",
"message" : "What do you call a habbit party?",
"version" : 0
}
Change Stream
{
"_id":{
"_data":"glsLbHoAAAABRmRfaWQAZFsLbHrc4TfSZV9+/gBaEASgyrHmfadIaLKbJfTNB8BgBA=="
},
"operationType":"insert",
"fullDocument":{...},
"ns":{ "db":"hubba", "coll":"messages" },
"documentKey":{
"_id":"5b0b6c7adce137d2655f7efe"
}
}
Insert History Document
Update a message
Mongo Query
db.getCollection('messages').update({_id: ObjectId("5b0b6c7adce137d2655f7efe")}, {$set: {version: 1,
message : "What do you call a hobbit party?"}})
Document
{
"_id" : ObjectId("5b0b6c7adce137d2655f7efe"),
"from" : 2,
"to" : 1,
"fromOccupation" : "unemployed",
"toOccupation" : "unemployed",
"message" : "What do you call a hobbit party?",
"version" : 1
}
Change Stream
{
"_id": {…},
"operationType":"update",
"fullDocument":{...},
"ns":{"db":"hubba","coll":"messages"},
"documentKey": {... },
"updateDescription":{
"updatedFields":{
"message":"What do you call a hobbit party?",
"version":1
},
"removedFields":[]
}
}
Update History Document
Implementation
- Proof of concept
- Used full document to consume all changes
from Mongo and write them to our DB
- Successfully mirrored actual documents
Gotcha #1
Principle of eventual consistency
- Full document wouldn’t always represent the
document as it existed in the DB when the change was made
Lesson: If you’re using change streams for event sourcing,
you can source attributes out of the update description, but
not out of the full document
Solution:
Be more like a database
Change stream events are guaranteed to be delivered in
the order that the changes happen. Source the
events as they happen, and apply the update
description to the previous version
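Applying an update description to the previous version can be sketched as a small pure function. This is an illustration rather than the speakers' implementation: `applyUpdateDescription`, `setPath`, and `unsetPath` are hypothetical names, and the deep copy assumes JSON-safe values (no ObjectId or Date instances). Dotted keys are handled because change events report nested fields in dot notation.

```javascript
// Sets a value at a dotted path such as "address.city".
function setPath(target, path, value) {
  const keys = path.split('.');
  let node = target;
  for (const key of keys.slice(0, -1)) {
    if (typeof node[key] !== 'object' || node[key] === null) node[key] = {};
    node = node[key];
  }
  node[keys[keys.length - 1]] = value;
}

// Deletes the value at a dotted path; missing parents are ignored.
function unsetPath(target, path) {
  const keys = path.split('.');
  let node = target;
  for (const key of keys.slice(0, -1)) {
    if (typeof node[key] !== 'object' || node[key] === null) return;
    node = node[key];
  }
  delete node[keys[keys.length - 1]];
}

// Returns a new version built from the previous one; the input is not mutated.
function applyUpdateDescription(previous, updateDescription) {
  const next = JSON.parse(JSON.stringify(previous)); // deep copy; assumes JSON-safe values
  for (const [path, value] of Object.entries(updateDescription.updatedFields || {})) {
    setPath(next, path, value);
  }
  for (const path of updateDescription.removedFields || []) {
    unsetPath(next, path);
  }
  return next;
}

const previous = { _id: 'abc', message: 'What do you call a habbit party?', version: 0 };
const next = applyUpdateDescription(previous, {
  updatedFields: { message: 'What do you call a hobbit party?', version: 1 },
  removedFields: [],
});
```

Because each version is derived from the stored predecessor plus the delta, the history no longer depends on what the full document happens to look like when the event is read.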
Gotcha #2
How do we bootstrap versions into the DB?
- Without insert events for records that already exist, how do we get
them into the DB?
Solution:
Use full document
If we can’t find a previous event, just use full
document
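The fallback can be sketched as follows. It assumes the stream was opened with the `updateLookup` option so that update events carry `fullDocument`; the `history` Map and `recordChange` are hypothetical stand-ins for the history service's own storage and handler. Only top-level fields are merged here, and delete handling is left out to keep the sketch short.

```javascript
// history: Map from document id to the latest stored version
// (a hypothetical stand-in for the history service's own storage).
function recordChange(history, event) {
  const id = String(event.documentKey._id);
  const previous = history.get(id);
  let next;
  if (event.operationType === 'update' && previous) {
    // Normal path: replay the delta onto the last version we stored.
    next = { ...previous, ...event.updateDescription.updatedFields };
    for (const field of event.updateDescription.removedFields) delete next[field];
  } else {
    // Bootstrap path: no earlier version on record (or an insert/replace),
    // so take fullDocument as the baseline.
    next = { ...event.fullDocument };
  }
  history.set(id, next);
  return next;
}

const history = new Map();
// The document predates the history service, so the first event seen is an update.
const first = recordChange(history, {
  operationType: 'update',
  documentKey: { _id: 1 },
  fullDocument: { _id: 1, name: 'Frodo', occupation: 'Bearer of the ring' },
  updateDescription: { updatedFields: { occupation: 'Bearer of the ring' }, removedFields: [] },
});
// Later updates are applied as deltas on top of the stored baseline.
const second = recordChange(history, {
  operationType: 'update',
  documentKey: { _id: 1 },
  fullDocument: { _id: 1, name: 'Frodo', occupation: 'Retired' },
  updateDescription: { updatedFields: { occupation: 'Retired' }, removedFields: [] },
});
```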
Gotcha #3
How do you ensure you record each write only once?
Solution:
Change streams guarantee that each
change will be delivered in order once
and only once.
Gotcha #4
Large documents can cause issues
Solution:
Don’t do it.
Seriously, we don’t. 16 MB is a lot of data!
Gotcha #5
We fell off the Oplog
If your oplog can hold 100 documents, and I fill it up with
messages, you will not be able to resume the products
change stream
Solution:
We got a bigger oplog and periodically update our
versioned collections
We built a script that runs every hour and randomly
updates one of the documents in every versioned
collection
Cons
➔ Histories has no context
Histories is unable to validate the data
it gets; it’s just a blind data store
➔ Spinning up a whole service
We could have solved this with in-app
versioning; now we maintain an extra
service
➔ Issues upgrading
We had a few road bumps with
performance when we first released
MongoDB 3.6 and the new drivers
Benefits
➔ Histories is isolated
As long as our data is persisted we
have a history of it
➔ We get to keep our raw queries
Our denormalization strategies
continue to work
➔ Language support
We do not need to implement history
support for every language we use, just
a version generator if we want access.
Metrics
Conclusion:
1. Change streams are a great way to follow updates to
your documents
2. Using change streams for event sourcing would be
amazing
3. If you are versioning data in a legacy app, change
streams may be for you
Demo
Thank you!

MongoDB World 2018: Time for a Change Stream - Using MongoDB Change Streams to Version Your Database

  • 1. Time for a Change Stream By Leigha Mitchell & Edward Robinson @LeighaNotLeia @earobinson
  • 2. Change streams and using them to version your data
  • 3.
  • 4. The hubba stack ● MEANR - Mongo, Express, Angular, Node, React ● Many Services (Payments, Products, Users, etc) ● Three Engineering teams (one Python, two JS) ● AWS, GCP, MongoDB Atlas, RabbitMQ, Redis, etc ● Mongoose, Mongo Native Driver
  • 5. What is a change stream? Change streams allow applications to access real- time data changes without the complexity and risk of tailing the oplog. Applications can use change streams to subscribe to all data changes on a collection and immediately react to them. -- https://docs.mongodb.com/manual/changeStrea ms/
  • 6. Why are we here? Hubba ● 6 year old company ● Networking for Brands and Buyers ● Microservices ● Decided to launch ordering
  • 7. The exact product you ordered must be delivered to you Ordering must haves This Not this
  • 8. What exactly is a version? A particular form of something differing in certain respects from an earlier form or other forms of the same type of thing. -- https://www.google.com?q=define+version
  • 9. 1. Options ➔ Not versioning Previously wasn’t needed, continue to not us this ➔ Make copy How do we know what to copy? ➔ In-app versioning Lean into Mongoose and version the app ➔ Oplog versioning Write new service to consume the oplog
  • 10. Not versioning Would have locked orderable products
  • 11. Make a copy Knowing what to copy is hard
  • 12. In-app Versioning Works great, if you’re building from scratch
  • 14. Denormalization: Denormalization allows you to avoid some application- level joins, at the expense of having more complex and expensive updates. Denormalizing one or more fields makes sense if those fields are read much more often than they are updated. -- https://www.mongodb.com/blog/post/6-rules-of- thumb-for-mongodb-schema-design-part-3
  • 15. Denormalization Example - Pre bearer of the ring Users / Hobbits { _id : 1, name : Frodo, occupation: unemployed }, { _id : 2, name : Sam, occupation: unemployed }, Messages { _id : 93, from : 2, to: 1, fromOccupation: unemployed, toOccupation: unemployed, message: What do you call a hobbit party? }, { _id : 94, from : 1, to: 2, fromOccupation: unemployed, toOccupation: unemployed, message: A little get together. }
  • 16. Denormalization Example - Post bearer of the ring Users / Hobbits { _id : 1, name : Frodo, occupation: Bearer of the ring }, { _id : 2, name : Sam, occupation: Protector of Frodo }, Messages { _id : 93, from : 2, to: 1, fromOccupation: Protector of Frodo, toOccupation: Bearer of the ring, message: What do you call a hobbit party? }, { _id : 94, from : 1, to: 2, fromOccupation: Bearer of the ring, toOccupation: Protector of Frodo, message: A little get together. } message.update({to: ObjectId: 1}, {$set: {toOccupation: Bearer of the ring}}, {multi: true})
  • 18. So what did we choose?
  • 19. So what did we choose? None of them!
  • 20. Along comes: MongoDB 3.6 Now with Change Streams!
  • 21. 2. Why Mongo 3.6 ➔ Easy Versioning Ability to do versioning without significant architecture changes ➔ Raw Queries Allowed us to use existing raw queries without altering them to support versioning ➔ Many Sources Front-end data consumers did not need to be altered
  • 23. Creating a Product [Sequence diagram. Participants: Website, Products Service, MongoDB, History Service. Messages: "Create Version 0" (three times), then "Created Version 0" (twice)]
  • 24. Ordering a Product [Sequence diagram. Participants: Website, Orders Service, History Service, MongoDB. Messages: "Order Version 0" (three times), "Yes You Can!", "Ordered Version 0"]
  • 25. But what about the raw queries??
  • 26. But what about the raw queries?? The version numbers have the form <human-readable number>:<SHA>, for example 42:c3f42aeb1c3d85b5a1594a5d0a727fcdf58a33ac
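
The version format on this slide can be sketched in a few lines. This is a minimal illustration, not the History service's actual code: the slides do not say how the SHA is derived, so hashing the JSON-serialized document with SHA-1 (which happens to match the 40 hex characters shown) is an assumption, and `makeVersion` and `parseVersion` are hypothetical helper names.

```javascript
const crypto = require('crypto');

// Hypothetical helper: builds "<counter>:<sha1>" for a document snapshot.
// Assumption: the SHA is a SHA-1 of the JSON-serialized document. Note that
// JSON.stringify depends on key insertion order, so a real implementation
// would need a canonical serialization.
function makeVersion(counter, doc) {
  const sha = crypto
    .createHash('sha1')
    .update(JSON.stringify(doc))
    .digest('hex');
  return `${counter}:${sha}`;
}

// Splits a version string back into its human-readable counter and its SHA.
function parseVersion(version) {
  const [counter, sha] = version.split(':');
  return { counter: Number(counter), sha };
}
```

The counter is the part a person can read and compare at a glance, while the SHA lets a consumer check that two references point at identical content.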
  • 29. Anatomy of a Change Stream { "_id" : { <BSON Object> }, "operationType" : "<operation>", "fullDocument" : { <document> }, "ns" : { "db" : "<database>", "coll" : "<collection>" }, "documentKey" : { "_id" : <ObjectId> }, "updateDescription" : { "updatedFields" : { <document> }, "removedFields" : [ "<field>", ... ] } }
  • 30. Anatomy of a Change Stream { "_id" : { <BSON Object> }, "operationType" : "<operation>", "fullDocument" : { <document> }, "ns" : { "db" : "<database>", "coll" : "<collection>" }, "documentKey" : { "_id" : <ObjectId> }, "updateDescription" : { "updatedFields" : { <document> }, "removedFields" : [ "<field>", ... ] } } Operation types: insert, delete, update, replace, invalidate
  • 36. Insert a message Mongo Query db.getCollection('messages').insert({from: 2, to: 1, fromOccupation: 'unemployed', toOccupation: 'unemployed', message: 'What do you call a habbit party?', version: 0}) Document { "_id" : ObjectId("5b0b6c7adce137d2655f7efe"), "from" : 2, "to" : 1, "fromOccupation" : "unemployed", "toOccupation" : "unemployed", "message" : "What do you call a habbit party?", "version" : 0 } Change Stream { "_id":{ "_data":"glsLbHoAAAABRmRfaWQAZFsLbHrc 4TfSZV9+/gBaEASgyrHmfadIaLKbJfTNB8BgB A==" }, "operationType":"insert", "fullDocument":{...}, "ns":{ "db":"hubba", "coll":"messages" }, "documentKey":{ "_id":"5b0b6c7adce137d2655f7efe" } }
  • 38. Update a message Mongo Query db.getCollection('messages').update({_id: ObjectId("5b0b6c7adce137d2655f7efe")}, {$set: {version: 1, message : "What do you call a hobbit party?"}}) Document { "_id" : ObjectId("5b0b6c7adce137d2655f7efe"), "from" : 2, "to" : 1, "fromOccupation" : "unemployed", "toOccupation" : "unemployed", "message" : "What do you call a hobbit party?", "version" : 1 } Change Stream { "_id": {…}, "operationType":"update", "fullDocument":{...}, "ns":{"db":"hubba","coll":"messages"}, "documentKey": {... }, "updateDescription":{ "updatedFields":{ "message":"What do you call a hobbit party?", "version":1 }, "removedFields":[] } }
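
To make the event shape concrete, here is a small sketch of a consumer that reads the fields shown on the slides and branches on `operationType`. The function name `summarizeChange` and the shape of its return value are hypothetical; only the event fields themselves come from the slides. It treats `invalidate` separately on the assumption that such an event carries no document-level fields.

```javascript
// Hypothetical helper: reduces a change event to the parts a versioning
// consumer would care about. Field names on `event` follow the slides.
function summarizeChange(event) {
  const kind = event.operationType;

  // Assumption: an invalidate event has no ns / documentKey to read.
  if (kind === 'invalidate') {
    return { kind, where: null, id: null, changed: [], removed: [] };
  }

  const where = `${event.ns.db}.${event.ns.coll}`;
  const id = String(event.documentKey._id);

  if (kind === 'update') {
    // Updates describe the delta: which fields were set and which removed.
    const { updatedFields, removedFields } = event.updateDescription;
    return { kind, where, id, changed: Object.keys(updatedFields), removed: [...removedFields] };
  }

  if (kind === 'insert' || kind === 'replace') {
    // Inserts and replaces carry the whole document.
    return { kind, where, id, changed: Object.keys(event.fullDocument), removed: [] };
  }

  // delete: only the document key is available.
  return { kind, where, id, changed: [], removed: [] };
}
```

Fed the update event from the slide above, this would report `message` and `version` as changed in `hubba.messages`.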
  • 40. Implementation - Proof of concept - Used full document to consume all changes from Mongo and write them to our DB - Successfully mirrored actual documents
  • 41. Gotcha #1 Principle of eventual consistency - Full document wouldn’t always represent the document that existed in the DB Lesson: If you’re using change streams for event sourcing, you can source attributes out of the update description, but not out of the full document
  • 42. Solution: Be more like a database Change Streams are guaranteed to be delivered in the order that the change happens. Source the events as they happen, and apply update description to the previous version
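
Applying an update description to the previous version can be sketched as a pure function. This is an illustration under stated assumptions rather than the History service's code: `applyUpdateDescription`, `setPath` and `unsetPath` are hypothetical names, field paths are assumed to use dot notation for nested fields, and the deep copy via JSON only suits plain JSON data (it would lose BSON types such as ObjectId or Date).

```javascript
// Sets a possibly dotted path ("meta.a") on target, creating parents as needed.
function setPath(target, path, value) {
  const keys = path.split('.');
  let node = target;
  for (const key of keys.slice(0, -1)) {
    if (node[key] === null || typeof node[key] !== 'object') {
      node[key] = {};
    }
    node = node[key];
  }
  node[keys[keys.length - 1]] = value;
}

// Removes a possibly dotted path from target; missing parents are ignored.
function unsetPath(target, path) {
  const keys = path.split('.');
  let node = target;
  for (const key of keys.slice(0, -1)) {
    if (node[key] === null || typeof node[key] !== 'object') {
      return;
    }
    node = node[key];
  }
  delete node[keys[keys.length - 1]];
}

// Returns a new snapshot: the previous version with the delta applied.
// The previous snapshot is left untouched so it can stay in the history.
function applyUpdateDescription(previous, updateDescription) {
  const next = JSON.parse(JSON.stringify(previous)); // plain-JSON deep copy
  for (const [path, value] of Object.entries(updateDescription.updatedFields || {})) {
    setPath(next, path, value);
  }
  for (const path of updateDescription.removedFields || []) {
    unsetPath(next, path);
  }
  return next;
}
```

Because each snapshot is built only from the prior snapshot and the ordered deltas, it does not depend on what the full document happened to look like when the event was read.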
  • 43. Gotcha #2 How do we Bootstrap versions into DB - With lack of insert events for all records, how do we get them into the DB?
  • 44. Solution: Use full document If we can’t find a previous event, just use full document
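
The bootstrap fallback can be sketched as follows: if no earlier version of a document is known, seed the history from `fullDocument`; otherwise apply the delta to the last known version. This is a simplified illustration with an in-memory `Map` standing in for the History store, and `recordChange` is a hypothetical name. It handles only top-level fields and assumes update events include `fullDocument`, which in MongoDB requires opening the stream with the `fullDocument: 'updateLookup'` option.

```javascript
// history: Map from document id (string) to an array of snapshots, oldest first.
function recordChange(history, event) {
  const key = String(event.documentKey._id);
  const versions = history.get(key) || [];
  const previous = versions[versions.length - 1];

  let snapshot;
  if (previous === undefined || event.operationType !== 'update') {
    // No earlier version known (or an insert/replace): trust the full document.
    snapshot = { ...event.fullDocument };
  } else {
    // Known history: apply the delta to the last snapshot (top-level fields only).
    snapshot = { ...previous, ...event.updateDescription.updatedFields };
    for (const field of event.updateDescription.removedFields) {
      delete snapshot[field];
    }
  }

  versions.push(snapshot);
  history.set(key, versions);
  return snapshot;
}
```

A document that predates the History service therefore gets its first version from the first update that touches it, and every later version is derived from deltas.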
  • 45. Gotcha #3 How do you ensure you record each write only once?
  • 46. Solution: Change streams guarantee that each change will be delivered in order once and only once.
  • 47. Gotcha #4 Large documents can cause issues
  • 48. Solution: Don’t do it. Seriously, we don’t. 16 MB is a lot of data!
  • 49. Gotcha #5 We fell off the Oplog If your oplog can hold 100 documents, and I fill it up with messages, you will not be able to resume the products change stream
  • 50. Solution: We got a bigger oplog and updated our versioned collections We built a script to run every hour and randomly update one of the documents in every versioned collection
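
The hourly script might look roughly like the sketch below. It is an assumption-laden illustration, not the script from the talk: `db` is taken to be a connected `Db` handle from the official `mongodb` Node.js driver, `touchVersionedCollections` is a hypothetical name, and `heartbeatAt` is a hypothetical field chosen only so the write produces a change event.

```javascript
// Touches one random document in each versioned collection so that every
// change stream keeps seeing recent events in the oplog.
async function touchVersionedCollections(db, collectionNames) {
  let touched = 0;
  for (const name of collectionNames) {
    const collection = db.collection(name);
    // $sample asks the server for one randomly chosen document.
    const [doc] = await collection.aggregate([{ $sample: { size: 1 } }]).toArray();
    if (!doc) {
      continue; // empty collection: nothing to touch
    }
    // `heartbeatAt` is a hypothetical field; any small write would do.
    await collection.updateOne({ _id: doc._id }, { $set: { heartbeatAt: new Date() } });
    touched += 1;
  }
  return touched;
}
```

Scheduling is left out on purpose; a cron job or an hourly timer calling this function with the list of versioned collections would match the cadence described on the slide.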
  • 51. Cons ➔ Histories has no context Histories is unable to validate the data it gets, it’s just a blind data store ➔ Spinning up a whole service We could have solved this with in-app versioning, now we maintain an extra service ➔ Issues upgrading We had a few road bumps with performance when we first rolled out Mongo 3.6 and the new drivers
  • 52. Benefits ➔ Histories is isolated As long as our data is persisted we have a history of it ➔ We get to keep our raw queries Our denormalization strategies continue to work ➔ Language support We do not need to implement history support for every language we use, just a version generator, if we want access.
  • 54. Conclusion: 1. Change streams are a great way to follow updates to your documents 2. Using change streams for event sourcing would be amazing 3. If you are versioning data in a legacy app, change streams may be for you
  • 55. Demo

Editor's Notes

  1. L: Introductions to who we are My name is Leigha, I am a developer at hubba, this is Edward, also a developer at hubba
  2. L: This presentation will be about mongo change streams, and how we use them at hubba to version our data
  3. L
  4. L
  5. ED: Change streams were introduced in mongo 3.6. They are a replacement for using the oplog to subscribe to changes to the documents in your database, and they can be used for all sorts of fancy things. At hubba we currently use them to version our data, but would like to use them for event sourcing moving forward. Better explanation about change streams. Explain the oplog.
  6. L: Explain Hubba in terms of buyers and brands, introducing ordering. We need to focus more on hubba and why it's important.
  7. L: The key part of this is that the exact product you ordered must be delivered to you, so when the product description changes, we need to know that the description changed, and capture that in our system.
  8. ED: A version is a reference to a particular form of something, this makes versioning the act of creating those references to a particular form of something. As developers we all use versioning every day, every time you use git, push a release or install a package.
  9. L: We learned about the oplog last time at mongo world Before this, what is versioning
  10. L: Maybe not the best user experience
  11. ED: When you land on a page to order a product, that product can change before the user adds it to the cart
  12. L: Different teams, using different ORMs, raw mongo queries
  13. L: Different teams, using different ORMs, raw mongo queries
  14. Leigha: Help us make this slide better!!!!!!!!!!!! Explain Raw mongo queries Highlight keywords here!
  15. L:
  16. L:
  17. ED: We learned about a few people using the oplog for event sourcing and versioning last year at MongoDB World! Rolled back
  18. L:
  19. L:
  20. L: we were on Mongo 3.4 at the time
  21. L:
  22. ED: Our Changestream design Explain architecture Explain process of adding product to your cart
  23. ED:6 fields update
  24. ED:6 fields
  25. L: Show an example of a version number Show an example of incrementing the version number
  26. L: Show an example of a version number. Show an example of incrementing the version number. We need to explain this better! (I need to set this up better)
  27. L: Show an example of a version number. Show an example of incrementing the version number. We need to explain this better! (I need to set this up better)
  28. L: Show an example of a version number. Show an example of incrementing the version number. We need to explain this better! (I need to set this up better)
  29. ED:6 fields
  30. Delete. Explain invalidate better (we don't do it, this happens when a collection is dropped)
  31. Highlight the info
  32. E
  33. ED
  34. L: This happens when a document is updated very quickly many times. This is because change streams can be configured to only send when the document is durable, however the full document only comes from a node in your cluster. Explain more, how it affected our user. History service would corrupt documents because it would overlap.
  35. Quorum
  36. There’s previous data. Since we can’t rely on the insert events for this older data because it doesn’t have them, how are we going to get this data into our shiny new History collection? Explain more, how did it affect us, implications
  37. E: How did bootstrapping affect us
  38. But how are we sure that it’s only going to write once and in order?
  39. But what are the chances that the documents will be too large and cause issues?
  40. Example of our large document solution Phrase this as it’s super unlikely because you have so much space
  41. ED
  42. L
  43. ED
  44. ED
  45. Leigha