Data Modeling on NoSQL
Bryce Cottam
Principal Architect, Think Big a Teradata Company
Agenda
• Where we came from (RDBMS Modeling)
• Migrate Existing Data Model to NoSQL
• Questions
Anti-Agenda
What we are NOT going to cover:
• Migrate a SQL based solution to NoSQL
• NoSQL Smack-Down (Battle of the NoSQL Bands)
Where We Came From
(RDBMS Modeling)
SQL Backdrop
123 Tony Soprano true 1963-04-15
124 Carmella Soprano false 1968-12-02
125 Johnny Sacrimoni true 1959-01-11
158 Paulie Gualtieri false 1960-08-04
159 Silvio Dante false 1965-10-11
162 Ralph Cifaretto false 1969-03-28
164 Christopher Moltisanti false 1974-01-11
165 Adriana La Cerva false 1976-11-02
• Column Order
• Column Names
• Column Width
• Data Types
Metadata Raw Data
• Save space
• Consistent format
• Familiar syntax (ANSI SQL Standard)
Issues at Scale
UI Presentation
Where We Came From
User: id, email, name, profile_image_url, access_level, created_date
Bid: id, user_id, auction_id, amount, timestamp
Auction: id, title, image_url, current_price, high_bidder, end_time
Payment: id, auction_id, timestamp, card_type, confirmation_number
Data Models
public class User {
private long id;
private String email;
private String name;
private String profileImageUrl;
// AccessLevel is an enum
private AccessLevel accessLevel;
private Date createdDate;
private List<Auction> auctions;
private List<Bid> bids;
...
}
public class Auction {
private long id;
private String title;
private String imageUrl;
private BigDecimal currentPrice;
private User highBidder;
private Date endTime;
private List<Bid> bids;
private Payment payment;
...
}
public class Bid {
private long id;
private User user;
private Auction auction;
private BigDecimal amount;
private Date timestamp;
...
}
public class Payment {
private long id;
private Auction auction;
private Date timestamp;
// Visa, MasterCard, AmEx etc.
private String cardType;
private String confirmationNumber;
...
}
Support Queries
Get all Bids for a given Auction:
select a.*, b.*
from auction a
join bid b
on a.id = b.auction_id
where a.id = 12345
order by b.timestamp desc
• Either manual SQL or ORM-generated SQL will wind up joining a few tables to get the desired results
• Joins are not supported by most NoSQL solutions
Support Queries
Count all Bids for a User:
select count(*) from bid where user_id = 554422
Get average final price of all Auctions:
select avg(current_price) from auction
Get the User with the most Bids:
-- "limit" is MySQL/PostgreSQL syntax, not ANSI SQL
select u.name, s.bid_count as bids
from (select user_id, count(*) as bid_count
from bid group by user_id) as s
join user u on u.id = s.user_id
order by s.bid_count desc
limit 1
• Aggregates in NoSQL are usually not supported
• If they are supported, they often have performance or memory issues
Adapt to your Data Store
Model
• Most web app developers think in terms of tables, columns, queries
• Many times the schema is simply mirrored in the application layer model objects
• (Not a bad thing, but hard to change)
• The most successful/scalable applications embrace the features and limitations of their chosen datastore
Schema DAO Application
Patterns defined here affect application behavior for data interaction
Model
Access Pattern / Storage Details
Model
Encouraging Scalable Access Patterns
Common:
public class BidDao {
    // Common API structure, loads all in memory
    // Also requires that the full User object is available
    public List<Bid> getBids(User user) {…}
    ...
}
Alternative:
public class BidDao {
    // Paging is a good option to avoid memory issues
    public List<Bid> getBids(String userId, int offset, int limit) {…}
    // A streaming API encourages streaming processing
    public Iterator<Bid> getBids(String userId) {…}
    ...
}
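The streaming option can itself be backed by paging, so that only one page sits in memory at a time. A minimal sketch, assuming a hypothetical `PageFetcher` interface standing in for the paged DAO call (neither `PageFetcher` nor `PagingIterator` appears in the slides):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical page source: returns at most `limit` items starting at `offset`.
interface PageFetcher<T> {
    List<T> fetch(int offset, int limit);
}

// Iterator that keeps only the current page in memory.
class PagingIterator<T> implements Iterator<T> {
    private final PageFetcher<T> fetcher;
    private final int pageSize;
    private List<T> page = new ArrayList<>();
    private int indexInPage = 0;
    private int nextOffset = 0;
    private boolean exhausted = false;

    PagingIterator(PageFetcher<T> fetcher, int pageSize) {
        this.fetcher = fetcher;
        this.pageSize = pageSize;
    }

    @Override
    public boolean hasNext() {
        if (indexInPage < page.size()) {
            return true;
        }
        if (exhausted) {
            return false;
        }
        // The current page is used up: replace it with the next one.
        page = fetcher.fetch(nextOffset, pageSize);
        nextOffset += page.size();
        indexInPage = 0;
        if (page.size() < pageSize) {
            exhausted = true; // a short page means nothing follows it
        }
        return !page.isEmpty();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return page.get(indexInPage++);
    }
}
```

Callers see a plain `Iterator`, while the previous page becomes eligible for garbage collection as soon as the next one is loaded.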
Encouraging Scalable Access Patterns
[Diagram: memory required by each DAO style. Common: memory required for the whole result. Streaming: small buffer. Paging: memory required for the current page; earlier pages are garbage collected.]
Adapt to your Data Store
Application
SQL-NoSQL Adapter
DAO DAO DAO
Danger!! If you mask your true datastore semantics, you risk your scalability
• DataNucleus is a good option if used with discipline
• Provides JDO/JPA support
NoSQL Store
Top level concepts to embrace
• Denormalization
• Intelligent Key Design
• Counters
• Sharding
Denormalization
Identify Conceptually Immutable Fields
public class User {
private long id;
private String email;
private String name;
private String profileImageUrl;
// AccessLevel is an enum
private AccessLevel accessLevel;
private Date createdDate;
private List<Auction> auctions;
private List<Bid> bids;
...
}
public class Auction {
private long id;
private String title;
private String imageUrl;
private BigDecimal currentPrice;
private User highBidder;
private Date endTime;
private List<Bid> bids;
private Payment payment;
...
}
public class UserReference {
private long id;
private String name;
private String profileImageUrl;
...
}
public class AuctionReference {
private long id;
private String title;
private String imageUrl;
...
}
Modified Data Structures
public class User {
// Changed ids to Strings
// (more on that soon)
private String id;
private String email;
private String name;
private String profileImageUrl;
private AccessLevel accessLevel;
private Date createdDate;
private List<Auction> auctions;
private List<Bid> bids;
...
}
public class Auction {
private String id;
private String title;
private String imageUrl;
private BigDecimal currentPrice;
private UserReference highBidder;
private Date endTime;
private List<Bid> bids;
private Payment payment;
...
}
public class Bid {
private String id;
private UserReference user;
private AuctionReference auction;
private BigDecimal amount;
private Date timestamp;
...
}
public class Payment {
private String id;
private AuctionReference auction;
private Date timestamp;
// Visa, MasterCard, AmEx etc.
private String cardType;
private String confirmationNumber;
...
}
Modified Data Models
public class Bid {
// the @Embedded annotation (both JDO and JPA)
// indicates that this is not an FK relationship:
@Embedded
private UserReference user;
@Embedded
private AuctionReference auction;
...
}
Under the hood in the data store:
…/d288-4af3-8821-27a37269ec0c {amount:”14.00”, user_id:”abc123”, user_name:”Ralph Cifaretto”, user_profile_image:”http://…”, …}
…/d288-4af3-8821-27a37283af10 {amount:”240.00”, user_id:”abc123”, user_name:”Ralph Cifaretto”, user_profile_image:”http://…”, …}
Bid: id, user_id, user_name, user_profile_image, amount, timestamp, auction_title, …
• JDO/JPA configuration is certainly not required
• We’re making a copy of the conceptually immutable properties of the user
• When we read a Bid record now, we don’t need to go fetch the User record
• Nor do we need a join
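The copy happens at write time. A minimal sketch, assuming trimmed stand-ins for the slide's `User` and `UserReference` classes and a hypothetical `BidRow.toColumns` helper that flattens a bid into the column layout shown above:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Trimmed stand-in for the slide's User class.
class User {
    final String id;
    final String name;
    final String profileImageUrl;
    final String email; // not treated as immutable, so it is not copied into bids

    User(String id, String name, String profileImageUrl, String email) {
        this.id = id;
        this.name = name;
        this.profileImageUrl = profileImageUrl;
        this.email = email;
    }
}

// The conceptually immutable subset that is copied into each Bid.
class UserReference {
    final String id;
    final String name;
    final String profileImageUrl;

    UserReference(String id, String name, String profileImageUrl) {
        this.id = id;
        this.name = name;
        this.profileImageUrl = profileImageUrl;
    }

    static UserReference of(User user) {
        return new UserReference(user.id, user.name, user.profileImageUrl);
    }
}

// Hypothetical helper: flattens a bid into denormalized columns.
class BidRow {
    static Map<String, String> toColumns(String amount, UserReference user) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("amount", amount);
        row.put("user_id", user.id);
        row.put("user_name", user.name);
        row.put("user_profile_image", user.profileImageUrl);
        return row;
    }
}
```

Everything needed to render a bid is in the row itself, so a read never touches the User table.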
Manual Marshaling
public class BidDao {
    public Bid read(String id) {
        // This is an HBase-like API, but the idea is the same for almost all
        // NoSQL datastore native APIs:
        Result result = openConnection().get("bid", id);
        Bid bid = new Bid();
        bid.setId(result.getValue("id"));
        ...
        String userId = result.getValue("user_id");
        String userName = result.getValue("user_name");
        String profileUrl = result.getValue("user_profile_image");
        UserReference user = new UserReference(userId, userName, profileUrl);
        bid.setUser(user);
        ...
        return bid;
    }
    ...
}
// To access user information:
UserReference user = bid.getUser();
String userName = user.getName();
We support access pattern without joins
[Screenshot: list of bids, each row showing auction_title and auction_image]
Bid: id, user_id, user_name, user_profile_image, amount, timestamp, auction_id, auction_title, auction_image_url
Click on Auction image or name and go to details for Auction
Data is duplicated many (many) times
Bid
id amount user_id user_name user_profile_image auction_id auction_title . . .
124 14.00 5432 Gustavo ‘Gus’ Fring http://nj.boss.com… 555111222 Barrel Methylamine . . .
125 13.00 1234 Walter White http://dead.users… 555111222 Barrel Methylamine . . .
126 12.00 2223 Hank Schrader http://dea.bro.com… 555111222 Barrel Methylamine . . .
127 11.00 1234 Walter White http://dead.users… 555111222 Barrel Methylamine . . .
128 10.00 1112 Jesse Pinkman http://facebook.com… 555111222 Barrel Methylamine . . .
129 9.00 2223 Hank Schrader http://dea.bro.com… 555111222 Barrel Methylamine . . .
130 8.00 1234 Walter White http://dead.users… 555111222 Barrel Methylamine . . .
131 7.00 1112 Jesse Pinkman http://facebook.com… 555111222 Barrel Methylamine . . .
132 6.00 1234 Walter White http://dead.users… 555111222 Barrel Methylamine . . .
User
id name profile_image email created_date . . .
5432 Gustavo ‘Gus’ Fring http://nj.boss.com… tony@breakingbad.com 2008-01-01 . . .
1234 Walter White http://chem.users… walter@breakingbad.com 2008-02-02 . . .
2223 Hank Schrader http://dea.bro.com… hank@breakingbad.com 2009-01-12 . . .
1112 Jesse Pinkman http://facebook.com… jessie@breakingbad.com 2008-11-16 . . .
What about updates?
[Timeline diagram: Name Change Request → Edge Node → NoSQL]
Response sent to user
Async request to Backend Node(s) to change all Bid records related to this user
Use workers to modify affected records (possibly minutes)
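The update flow can be sketched with in-memory maps standing in for the User and Bid tables. This is an illustration under assumptions: `NameChangeService` and its fields are hypothetical, and a real system would more likely enqueue work for separate backend nodes than use an in-process executor.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;

class NameChangeService {
    // In-memory stand-ins for the User table and the denormalized Bid table.
    final Map<String, String> userNames = new ConcurrentHashMap<>(); // userId -> name
    final Map<String, String[]> bids = new ConcurrentHashMap<>();    // bidId -> {userId, userName}

    // Updates the User record right away, then rewrites the copies in the background.
    CompletableFuture<Void> changeName(String userId, String newName, Executor workers) {
        userNames.put(userId, newName); // visible before the response is sent
        return CompletableFuture.runAsync(
            () -> bids.replaceAll((bidId, bid) ->
                bid[0].equals(userId) ? new String[] {userId, newName} : bid),
            workers);
    }
}
```

The caller can respond to the user as soon as `changeName` returns; the returned future completes once every affected Bid copy has been rewritten.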
Denormalization Observations
• We don’t always need ACID compliance
• Strict FK enforcement not always required
• MySQL’s MyISAM storage works fine for many situations
• Users are getting used to change latency
• There is a trade-off between horizontal scalability in your app and the patterns we’ve been trained to rely on
Intelligent Key Design
Sample NoSQL Storage Layout
Server 1
key001 ...data...
key002 ...data...
key003 ...data...
key004 ...data...
key005 ...data...
key006 ...data...
key007 ...data...
key008 ...data...
key009 ...data...
key010 ...data...
…
Server 2
key011 ...data...
key012 ...data...
key013 ...data...
key014 ...data...
key015 ...data...
key016 ...data...
key017 ...data...
key018 ...data...
key019 ...data...
key020 ...data...
Server 3
key021 ...data...
key022 ...data...
key023 ...data...
key024 ...data...
key025 ...data...
key026 ...data...
key027 ...data...
key028 ...data...
key029 ...data...
key030 ...data...
Server n
key091 ...data...
key092 ...data...
key093 ...data...
key094 ...data...
key095 ...data...
key096 ...data...
key097 ...data...
key098 ...data...
key099 ...data...
key100 ...data...
• This scan is “get everything from key016 through key022”
• A key-range scan that returns N rows runs in linear time, O(N), regardless of the number of rows in the table
• This is often not true for relational queries, whose cost can grow with table size unless a matching index exists
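A sorted in-memory map gives a feel for how a key-range scan behaves. A minimal sketch, assuming a `TreeMap` as a stand-in for one sorted table (a real store spreads the key space across servers; `KeyRangeScan` is a hypothetical name):

```java
import java.util.NavigableMap;
import java.util.SortedMap;
import java.util.TreeMap;

class KeyRangeScan {
    // Returns every row whose key lies between startKey and endKey, inclusive.
    static SortedMap<String, String> scan(NavigableMap<String, String> table,
                                          String startKey, String endKey) {
        return table.subMap(startKey, true, endKey, true);
    }

    // Builds a table with keys key001 through key100.
    static NavigableMap<String, String> sampleTable() {
        NavigableMap<String, String> table = new TreeMap<>();
        for (int i = 1; i <= 100; i++) {
            table.put(String.format("key%03d", i), "...data...");
        }
        return table;
    }
}
```

Calling `KeyRangeScan.scan(KeyRangeScan.sampleTable(), "key016", "key022")` touches only the seven rows in the range, however large the table grows.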
Intelligent Key Design
abc123 {…}
abc124 {name:”Tony Soprano”, createdDate:”2011-01-12”, email:”tony@sopranos.com”, role:”BOSS”}
abc125 {name:”Salvator Bonpensiero”, createdDate:”2014-10-02”, email:”bonpensiero@sopranos.com”, role:”CAPO”}
abc126 {name:”Christopher Moltisanti”, createdDate:”2012-10-02”, email:”christopher@sopranos.com”, role:”SOLDIER”}
abc2 {name:”Carmella Soprano”, createdDate:”2011-10-02”, email:”carmella@sopranos.com”, favoriteCar:”BMW”}
abc20 {name:”Meadow Soprano”, createdDate:”2012-01-02”, email:”meadow@sopranos.com”, favoriteCar:12.25}
abc21 {someField:”some value”, averageScore:5.75, someOtherDate:”2011-10-02”}
abc22 {…}
bcd1 {…}
bcd12 {…}
Key ordering is lexical
Records can have different schemas
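Lexical ordering explains why abc2 sorts after abc126 in the listing above. A minimal sketch, using Java's natural `String` ordering as a stand-in for the store's key ordering; the `padded` helper and its width of 6 are illustrative assumptions:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class LexicalKeys {
    // Sorts keys the way a lexically ordered store would lay them out.
    static List<String> sorted(List<String> keys) {
        List<String> copy = new ArrayList<>(keys);
        Collections.sort(copy); // natural String order compares character by character
        return copy;
    }

    // Zero-padding a numeric suffix to a fixed width makes lexical order match numeric order.
    static String padded(String prefix, long n) {
        return String.format("%s%06d", prefix, n);
    }
}
```

If numeric order matters, fixed-width keys such as `padded("abc", 2)` keep 2 ahead of 126.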
Ascending Timestamp
Bid/2014-10-26T09:00:00.000 {…}
Bid/2014-10-26T09:00:12.975 {…}
Bid/2014-10-26T09:00:14.221 {…}
Bid/2014-10-26T09:00:18.005 {…}
Bid/2014-10-26T09:00:35.572 {…}
Bid/2014-10-26T09:00:40.003 {…}
Bid/2014-10-26T09:00:41.123 {…}
Bid/2014-10-26T09:00:41.124 {…}
Bid/2014-10-26T09:00:41.150 {…}
Bid/2014-10-26T09:00:41.218 {…}
yyyy-MM-ddTHH:mm:ss.SSS is a pretty standard timestamp format, and its lexical order is chronological
• Great for time-series data
• Timeline tracking (viewing data in the order it was processed etc.)
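Building such a key takes one formatter. A minimal sketch, assuming timestamps are rendered in UTC (the slides do not state a time zone) and a hypothetical `AscendingBidKey` helper:

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

class AscendingBidKey {
    // Fixed-width pattern, so lexical order equals chronological order.
    private static final DateTimeFormatter FORMAT =
        DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSS").withZone(ZoneOffset.UTC);

    static String forTimestamp(long epochMillis) {
        return "Bid/" + FORMAT.format(Instant.ofEpochMilli(epochMillis));
    }
}
```

Using a single fixed zone for every key matters: mixing zones would break the chronological ordering.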
Older → Newer
UI Presentation
Descending Order
Descending Timestamp
Bid/9223370622642200431 {…}
Bid/9223370622642200478 {…}
Bid/9223370622642200512 {…}
Bid/9223370622642203021 {…}
Bid/9223370622642203897 {…}
Bid/9223370622642204112 {…}
Bid/9223370622642204559 {…}
Bid/9223370622642207054 {…}
Bid/9223370622642215431 {…}
Bid/9223370622642235500 {…}
public class User {
    // This will yield some ridiculous value like: 9223370622642200431
    // Number of milliseconds in a 365-day year: 31,536,000,000
    // This computation will not reach 0 for roughly 292 million years
    long descendingTimestamp = Long.MAX_VALUE - System.currentTimeMillis();
}
Newer → Older
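The descending key is reversible, so the original timestamp can always be recovered from it. A minimal sketch, assuming a hypothetical `DescendingBidKey` helper; the zero-padding to 19 digits is an added safeguard, not something the slides show:

```java
class DescendingBidKey {
    private static final String PREFIX = "Bid/";

    // Newer timestamps produce smaller numbers, so they sort first.
    static String forTimestamp(long epochMillis) {
        long descending = Long.MAX_VALUE - epochMillis;
        // Pad to 19 digits (the width of Long.MAX_VALUE) so lexical order matches numeric order.
        return String.format("%s%019d", PREFIX, descending);
    }

    // Recovers the original epoch milliseconds from a key.
    static long toEpochMillis(String key) {
        long descending = Long.parseLong(key.substring(PREFIX.length()));
        return Long.MAX_VALUE - descending;
    }
}
```

For example, a bid at epoch millisecond 1414212575376 (late October 2014) maps to the first key in the listing above, `Bid/9223370622642200431`.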
Descending Timestamp
Bid/9223370622642200431 {… auction_id:”12345” …}
Bid/9223370622642200478 {… auction_id:”54321” …}
Bid/9223370622642200512 {… auction_id:”12345” …}
Bid/9223370622642203021 {… auction_id:”22222” …}
Bid/9223370622642203897 {… auction_id:”22233” …}
Bid/9223370622642204112 {… auction_id:”12345” …}
Bid/9223370622642204559 {… auction_id:”22233” …}
Bid/9223370622642207054 {… auction_id:”54321” …}
Bid/9223370622642215431 {… auction_id:”54321” …}
Bid/9223370622642235500 {… auction_id:”12345” …}
Start with "Bid/", stop after 5 rows: the 5 most recent bids
• Known as a “range scan”
• Very easy to start with some prefix and read for N records
• Complexity stays constant for top 5 bids no matter how many bids are in the system
Descending Timestamp
Auction/11222/Bid/9223370622642203021 {… auction_id:”11222” …}
Auction/12233/Bid/9223370622642203897 {… auction_id:”12233” …}
Auction/12233/Bid/9223370622642204559 {… auction_id:”12233” …}
Auction/12345/Bid/9223370622642200431 {… auction_id:”12345” …}
Auction/12345/Bid/9223370622642200512 {… auction_id:”12345” …}
Auction/12345/Bid/9223370622642204112 {… auction_id:”12345” …}
Auction/12345/Bid/9223370622642235500 {… auction_id:”12345” …}
Auction/54321/Bid/9223370622642200478 {… auction_id:”54321” …}
Auction/54321/Bid/9223370622642207054 {… auction_id:”54321” …}
Auction/54321/Bid/9223370622642215431 {… auction_id:”54321” …}
Start with "Auction/12345", stop after 4 rows (or at row "Auction/12346"): the 4 most recent bids
Key structure: "Auction/12345" + "Bid/9223370622642200431"
• Now, all Bids for each Auction are located right next to each other
• This matches our most used access pattern
• We now have information about related data just from the key
• Key-only queries can be used to help speed up apps
• Why 4 Bids instead of 5? My example only had 4 records
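A prefix scan with a row limit can be sketched over a sorted map. This assumes a `TreeMap` as a stand-in for the table, and a hypothetical `PrefixScan.rangeScan` helper that mirrors the slide's `datastore.rangeScan(prefix, limit)` call only in shape:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;

class PrefixScan {
    // Returns up to `limit` keys that start with `prefix`, in key order.
    static List<String> rangeScan(NavigableMap<String, String> table, String prefix, int limit) {
        List<String> keys = new ArrayList<>();
        // tailMap seeks to the first key >= prefix; iteration then follows key order.
        for (String key : table.tailMap(prefix, true).keySet()) {
            if (keys.size() >= limit || !key.startsWith(prefix)) {
                break; // stop at the limit or at the first key outside the prefix
            }
            keys.add(key);
        }
        return keys;
    }
}
```

Scanning with the prefix `Auction/12345/` and a limit of 5 over the ten keys above returns the four bids for that auction, newest first, and never reads the rows for `Auction/54321`.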
Linking Related Data With Intelligent Keys
Bid
Auction/11222/... {…}
Auction/12233/... {…}
Auction/12233/... {…}
Auction/12345/... {…}
Auction/12345/... {…}
Auction/12345/... {…}
Auction/12345/... {…}
Auction/54321/... {…}
Auction/54321/... {…}
Auction/54321/... {…}
Auction
11222 {…}
12233 {…}
12345 {…}
54321 {…}
http://myapp.com/api/auctions/12345
datastore.get("12345");
datastore.rangeScan("Auction/12345/", 5);
Both reads can be done in parallel
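Issuing the two reads in parallel might look like the following. A minimal sketch, assuming the reads are passed in as `Supplier`s (stand-ins for the `get` and `rangeScan` calls) and run on the JDK's common pool; `ParallelReads` is a hypothetical name:

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

class ParallelReads {
    // Starts both reads at once and combines the results when both have finished.
    static String loadAuctionPage(Supplier<String> readAuction, Supplier<String> readRecentBids) {
        CompletableFuture<String> auction = CompletableFuture.supplyAsync(readAuction);
        CompletableFuture<String> bids = CompletableFuture.supplyAsync(readRecentBids);
        return auction.thenCombine(bids, (a, b) -> a + " | " + b).join();
    }
}
```

Total latency is then close to the slower of the two reads rather than their sum.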
Linking Related Data With Intelligent Keys
AuctionData
Auction/11222/Bid/987321... {…}
Auction/12233/Bid/987534... {…}
Auction/12233/Bid/987635... {…}
Auction/12345 {…, ..., ...}
Auction/12345/Bid/977534... {…}
Auction/12345/Bid/987501... {…}
Auction/12345/Bid/987687... {…}
Auction/12345/Bid/988012... {…}
Auction/54321 {…, ..., ...}
Auction/54321/... {…}
Auction/54321/... {…}
datastore.rangeScan("Auction/12345", 6);
Data of completely different schemas / types can be written to the same table, co-located on disk
http://myapp.com/api/auctions/12345
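When one scan returns rows of different types, the key shape can tell the application how to unmarshal each row. A minimal sketch, assuming the `Auction/<id>` and `Auction/<id>/Bid/<timestamp>` key layouts shown above; `AuctionDataKey` and its `Kind` enum are hypothetical:

```java
class AuctionDataKey {
    enum Kind { AUCTION, BID, OTHER }

    // Classifies a row by the structure of its key alone.
    static Kind kindOf(String key) {
        String[] parts = key.split("/");
        if (parts.length == 2 && parts[0].equals("Auction")) {
            return Kind.AUCTION; // Auction/<auctionId>
        }
        if (parts.length == 4 && parts[0].equals("Auction") && parts[2].equals("Bid")) {
            return Kind.BID; // Auction/<auctionId>/Bid/<descendingTimestamp>
        }
        return Kind.OTHER;
    }
}
```

A single range scan starting at `Auction/12345` can then be split into one Auction record followed by its Bid records without inspecting any values.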
Counters
Counters
public void placeBid(String userId, String auctionId) {
    // Many NoSQL stores support a native counter via some increment-and-get
    // After the counter has been incremented, we don't need to worry about contention
    long bidCount = datastore.incrementAndGet(auctionId + "_counter");
    // BID_INCREMENT is assumed to be a BigDecimal constant
    BigDecimal amount = BID_INCREMENT.multiply(BigDecimal.valueOf(bidCount));
    long descendingTimestamp = Long.MAX_VALUE - System.currentTimeMillis();
    String bidId = "Auction/" + auctionId + "/Bid/" + descendingTimestamp + "/" + amount;
    // Increment some helper counters...
    datastore.incrementAndGet("global_bidCounter");
    datastore.incrementAndGet(auctionId + "_bidCounter");
    datastore.incrementAndGet(userId + "_bidCounter");
    // ... other logic like creating the Bid object ...
    bidDao.write(bidId, bid);
}
// Some datastores may have a first-class Counter object:
Counter bidCounter = datastore.getCounter(auctionId + "_counter");
long bidCount = bidCounter.incrementAndGet();
UI Presentation
datastore.incrementAndGet(userId + "_bidCounter");
UI Presentation
datastore.incrementAndGet("global_bidCounter");
• Global counters are a major bottleneck
Sharding
Data Model Sharding
public class Auction {
private String id;
private String title;
private String imageUrl;
private String description;
private BigDecimal currentPrice;
private User highBidder;
private Date endTime;
...
}
public class AuctionState {
private String id;
private BigDecimal currentPrice;
private User highBidder;
private Date endTime;
...
}
• Separate frequently changing data from static data
• Allows caching of static data
• Makes reads/writes of changing data faster
• Separate values that are expensive to serialize but infrequently read
http://myapp.com/api/auctions/12345
More Parallel Reads
AuctionState
Auction
11222 {…}
12233 {…}
12345 {…}
54321 {…}
datastore.get("12345");
datastore.get("12345");
Both records can share the same key
11222 {…}
12233 {…}
12345 {…}
54321 {…}
Memcache Check
Cache
Both reads can be done in parallel
AuctionData
Auction/11222/Bid/987321... {…}
Auction/12233/Bid/987534... {…}
Auction/12233/Bid/987635... {…}
Auction/12345 {…, ..., ...}
Auction/12345/AuctionState {…}
Auction/12345/Bid/977534... {…}
Auction/12345/Bid/987501... {…}
Auction/54321 {…, ..., ...}
Auction/54321/... {…}
More Parallel Reads
http://myapp.com/api/auctions/12345
datastore.get("Auction/12345/AuctionState");
datastore.get("Auction/12345");
Again, records can be in the same table
Memcache Check
Cache
Sharding a 64 bit Integer
long count = datastore.incrementAndGet("global_bidCounter");
global_bidCounter = 176
52 + 84 + 40 = 176
52 + 84 + 41 = 177
53 + 84 + 40 = 177
52 + 85 + 40 = 177
• Decompose the counter
• Pick any part of the count and increment it
Implementing a Sharded Counter
import java.util.concurrent.ThreadLocalRandom;

public class ShardedCounter {
    private String name;
    private int shards;

    public void increment() {
        // Pick a random shard so concurrent writers rarely hit the same record
        int index = ThreadLocalRandom.current().nextInt(shards);
        datastore.incrementAndGet(name + "-" + index);
    }

    public long get() {
        long count = 0;
        // All the shards of the counter are located next to each other:
        Result scan = datastore.rangeScan(name + "-", shards);
        while (scan.hasNext()) {
            Counter next = scan.next();
            count += next.get();
        }
        return count;
    }
}
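The same decomposition can be tried without a datastore. A minimal in-memory sketch, assuming an `AtomicLongArray` in place of the per-shard datastore counters (`InMemoryShardedCounter` is a hypothetical name; within a single JVM, the JDK's `LongAdder` applies a comparable idea):

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLongArray;

class InMemoryShardedCounter {
    private final AtomicLongArray shards;

    InMemoryShardedCounter(int shardCount) {
        this.shards = new AtomicLongArray(shardCount);
    }

    // Increments one randomly chosen shard, spreading contention across shards.
    void increment() {
        int index = ThreadLocalRandom.current().nextInt(shards.length());
        shards.incrementAndGet(index);
    }

    // The logical count is the sum of all shards.
    long get() {
        long total = 0;
        for (int i = 0; i < shards.length(); i++) {
            total += shards.get(i);
        }
        return total;
    }
}
```

Writes become cheaper because they spread over several records, while reads become more expensive because they must sum every shard, so the shard count is a tuning choice.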
We Love Feedback
Questions/Comments
Email: bryce.cottam@thinkbiganalytics.com
Rate This Session
with the PARTNERS Mobile App
Remember To Share Your Virtual Passes
Follow Teradata 2015 PARTNERS
www.teradata-partners.com/social

 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population Mean
 
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
 

Data Modeling on NoSQL

  • 1. Data Modeling on NoSQL Bryce Cottam Principal Architect, Think Big a Teradata Company
  • 2. • Where we came from (RDBMS Modeling) • Migrate Existing Data Model to NoSQL • Questions Agenda
  • 3. • Migrate a SQL based solution to NoSQL • NoSQL Smack-Down (Battle of the NoSQL Bands) Anti-Agenda What we are NOT going to cover:
  • 4. Where We Came From (RDBMS Modeling)
  • 5. SQL Backdrop 123 Tony Soprano true 1963-04-15 124 Carmella Soprano false 1968-12-02 125 Johnny Sacrimoni true 1959-01-11 158 Paulie Gualtieri false 1960-08-04 159 Silvio Dante false 1965-10-11 162 Ralph Cifaretto false 1969-03-28 164 Christopher Moltisanti false 1974-01-11 165 Adriana La Cerva false 1976-11-02 • Column Order • Column Names • Column Width • Data Types Metadata Raw Data • Save space • Consistent format • Familiar syntax (ANSI SQL Standard)
  • 10. Where We Came From Auction User Bid Payment id email name profile_image_url access_level created_date id user_id auction_id amount timestamp id title image_url current_price high_bidder end_time id auction_id timestamp card_type confirmation_number
  • 11. Data Models public class User { private long id; private String email; private String name; private String profileImageUrl; // AccessLevel is an enum private AccessLevel accessLevel; private Date createdDate; private List<Auction> auctions; private List<Bid> bids; ... } public class Auction { private long id; private String title; private String imageUrl; private BigDecimal currentPrice; private User highBidder; private Date endTime; private List<Bid> bids; private Payment payment; ... } public class Bid { private long id; private User user; private Auction auction; private BigDecimal amount; private Date timestamp; ... } public class Payment { private long id; private Auction auction; private Date timestamp; // Visa, MasterCard, AmEx etc. private String cardType; private String confirmationNumber; ... }
  • 12. Support Queries Get all Bids for a given Auction: select a.*, b.* from auction a join bid b on a.id = b.auction_id where a.id = 12345 order by b.timestamp desc • Either manual SQL or ORM-generated SQL will wind up joining a few tables to get the desired results • Joins are not supported by most NoSQL solutions
  • 13. Support Queries Count all Bids for a User: select count(*) from bid where user_id = 554422 Get average final price of all Auctions: select avg(current_price) from auction Get the User with the most Bids: select u.name, s.bid_count as bids from (select user_id, count(*) as bid_count from bid group by user_id) as s join user u on u.id = s.user_id order by s.bid_count desc limit 1 • Aggregates in NoSQL are usually not supported • If they are supported, they often have performance or memory issues
  • 14. Adapt to your Data Store Model • Most web app developers think in terms of tables, columns, and queries • Many times the schema is simply mirrored in the application-layer model objects • (Not a bad thing, but hard to change) • The most successful/scalable applications embrace the features and limitations of their chosen datastore Schema DAO Application Patterns defined here affect application behavior for data interaction Model Access Pattern Storage Details Model
  • 15. Encouraging Scalable Access Patterns Common: public class BidDao { // Common API structure, loads all in memory // Also requires that the full User object is available public List<Bid> getBids(User user) {…} ... } Alternative: public class BidDao { // Paging is a good option to avoid memory issues public List<Bid> getBids(String userId, int offset, int limit) {…} // A streaming API encourages streaming processing public Iterator<Bid> getBids(String userId) {…} ... }
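The paging and streaming alternatives on this slide can be sketched as follows. This is a minimal illustration, not the presenter's implementation: `PagedBidSource` and its in-memory list are hypothetical stand-ins for a real datastore-backed DAO, and bids are plain strings to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical in-memory stand-in for a bid table; a real DAO would query the datastore.
final class PagedBidSource {
    private final List<String> bids = new ArrayList<>();

    PagedBidSource(int count) {
        for (int i = 0; i < count; i++) {
            bids.add("bid-" + i);
        }
    }

    // Paging: the caller never holds more than `limit` records at once.
    List<String> getBids(int offset, int limit) {
        int from = Math.min(offset, bids.size());
        int to = Math.min(from + limit, bids.size());
        return new ArrayList<>(bids.subList(from, to));
    }

    // Streaming: an Iterator that pulls one small page at a time behind the scenes.
    // pageSize must be positive.
    Iterator<String> streamBids(final int pageSize) {
        return new Iterator<String>() {
            private int offset = 0;
            private Iterator<String> page = getBids(0, pageSize).iterator();

            @Override
            public boolean hasNext() {
                if (!page.hasNext()) {
                    // Current buffer is exhausted: fetch the next page.
                    offset += pageSize;
                    page = getBids(offset, pageSize).iterator();
                }
                return page.hasNext();
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return page.next();
            }
        };
    }
}
```

With the iterator, the memory held at any moment is bounded by the page size rather than by the number of bids, which is the point of the "small buffer" picture on the next slide.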
  • 16. Encouraging Scalable Access Patterns DAO DAO Common: Streaming: Small buffer Memory Required DAO Paging: Memory Required … Garbage Collected … Memory Required
  • 17. Adapt to your Data Store Application SQL-NoSQL Adapter DAO DAO DAO Danger!! If you mask your true datastore semantics, you risk your scalability • DataNucleus is a good option if used with discipline • Provides JDO/JPA support NoSQL Store
  • 18. Top level concepts to embrace • Denormalization • Intelligent Key Design • Counters • Sharding
  • 20. Identify Conceptually Immutable Fields public class User { private long id; private String email; private String name; private String profileImageUrl; // AccessLevel is an enum private AccessLevel accessLevel; private Date createdDate; private List<Auction> auctions; private List<Bid> bids; ... } public class Auction { private long id; private String title; private String imageUrl; private BigDecimal currentPrice; private User highBidder; private Date endTime; private List<Bid> bids; private Payment payment; ... } public class UserReference { private long id; private String name; private String profileImageUrl; ... } public class AuctionReference { private long id; private String title; private String imageUrl; ... }
  • 21. Modified Data Structures public class User { // Changed ids to Strings // (more on that soon) private String id; private String email; private String name; private String profileImageUrl; private AccessLevel accessLevel; private Date createdDate; private List<Auction> auctions; private List<Bid> bids; ... } public class Auction { private String id; private String title; private String imageUrl; private BigDecimal currentPrice; private UserReference highBidder; private Date endTime; private List<Bid> bids; private Payment payment; ... } public class Bid { private String id; private UserReference user; private AuctionReference auction; private BigDecimal amount; private Date timestamp; ... } public class Payment { private String id; private AuctionReference auction; private Date timestamp; // Visa, MasterCard, AmEx etc. private String cardType; private String confirmationNumber; ... }
  • 22. Modified Data Models public class Bid { // the @Embedded annotation (both JDO and JPA) // indicates that this is not an FK relationship: @Embedded private UserReference user; @Embedded private AuctionReference auction; ... } …/d288-4af3-8821-27a37269ec0c {amount:”14.00”, user_id:”abc123”, user_name:”Ralph Cifaretto”, user_profile_image:”http://…”, …} …/d288-4af3-8821-27a37283af10 {amount:”240.00”, user_id:”abc123”, user_name:”Ralph Cifaretto”, user_profile_image:”http://…”, …} Bid id user_id user_name user_profile_image amount timestamp auction_title … Under the hood in the data store: • JDO/JPA configuration is certainly not required • We’re making a copy of the conceptually immutable properties of the user • When we read a Bid record now, we don’t need to go fetch the User record • Nor do we need a join
  • 23. Manual Marshaling public class BidDao { public Bid read(String id) { // This is an HBase-like API, but the idea is the same for almost all // NoSQL datastore native APIs: Result result = openConnection().get("bid", id); Bid bid = new Bid(); bid.setId(result.getValue("id")); ... String userId = result.getValue("user_id"); String userName = result.getValue("user_name"); String profileUrl = result.getValue("user_profile_image"); UserReference user = new UserReference(userId, userName, profileUrl); bid.setUser(user); ... return bid; } ... } // To access user information: UserReference user = bid.getUser(); String userName = user.getName();
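The marshaling step above can be tried without a datastore. The sketch below assumes a row is just a `Map<String, String>` of column names to values; `EmbeddedUser` and `FlatBid` are hypothetical names standing in for the slide's `UserReference` and `Bid`.

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;

// Copy of the conceptually immutable User fields, embedded in every Bid row.
final class EmbeddedUser {
    final String id;
    final String name;
    final String profileImageUrl;

    EmbeddedUser(String id, String name, String profileImageUrl) {
        this.id = id;
        this.name = name;
        this.profileImageUrl = profileImageUrl;
    }
}

final class FlatBid {
    final String id;
    final BigDecimal amount;
    final EmbeddedUser user;

    FlatBid(String id, BigDecimal amount, EmbeddedUser user) {
        this.id = id;
        this.amount = amount;
        this.user = user;
    }

    // Flatten the bid and its embedded user into a single denormalized row.
    Map<String, String> toRow() {
        Map<String, String> row = new HashMap<>();
        row.put("id", id);
        row.put("amount", amount.toPlainString());
        row.put("user_id", user.id);
        row.put("user_name", user.name);
        row.put("user_profile_image", user.profileImageUrl);
        return row;
    }

    // Rebuild the bid from one row: no second read of the User table, no join.
    static FlatBid fromRow(Map<String, String> row) {
        EmbeddedUser user = new EmbeddedUser(
                row.get("user_id"), row.get("user_name"), row.get("user_profile_image"));
        return new FlatBid(row.get("id"), new BigDecimal(row.get("amount")), user);
    }
}
```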
  • 24. We support the access pattern without joins Bid id user_id user_name user_profile_image amount timestamp auction_id auction_title auction_image_url Click on Auction image or name and go to details for Auction
  • 25. Data is duplicated many (many) times Bid id amount user_id user_name user_profile_image auction_id auction_title . . . 124 14.00 5432 Gustavo ‘Gus’ Fring http://nj.boss.com… 555111222 Barrel Methylamine . . . 125 13.00 1234 Walter White http://dead.users… 555111222 Barrel Methylamine . . . 126 12.00 2223 Hank Schrader http://dea.bro.com… 555111222 Barrel Methylamine . . . 127 11.00 1234 Walter White http://dead.users… 555111222 Barrel Methylamine . . . 128 10.00 1112 Jesse Pinkman http://facebook.com… 555111222 Barrel Methylamine . . . 129 9.00 2223 Hank Schrader http://dea.bro.com… 555111222 Barrel Methylamine . . . 130 8.00 1234 Walter White http://dead.users… 555111222 Barrel Methylamine . . . 131 7.00 1112 Jesse Pinkman http://facebook.com… 555111222 Barrel Methylamine . . . 132 6.00 1234 Walter White http://dead.users… 555111222 Barrel Methylamine . . . User id name profile_image email created_date . . . 5432 Gustavo ‘Gus’ Fring http://nj.boss.com… tony@breakingbad.com 2008-01-01 . . . 1234 Walter White http://chem.users… walter@breakingbad.com 2008-02-02 . . . 2223 Hank Schrader http://dea.bro.com… hank@breakingbad.com 2009-01-12 . . . 1112 Jesse Pinkman http://facebook.com… jessie@breakingbad.com 2008-11-16 . . .
  • 26. What about updates? Backend Node(s) Async Request to change all Bid records related to this user Name Change Request Edge Node Time Line NoSQL Response sent to user Use workers to modify affected records Possibly minutes
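A rough sketch of that update flow, under stated assumptions: two in-memory maps play the role of the User and Bid tables, `CompletableFuture.runAsync` stands in for the backend workers, and the class and field names are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// The user fields that were copied into a Bid record.
final class CopiedUser {
    final String userId;
    final String userName;

    CopiedUser(String userId, String userName) {
        this.userId = userId;
        this.userName = userName;
    }
}

final class NameChangeFanOut {
    final Map<String, String> userNames = new ConcurrentHashMap<>(); // userId -> name
    final Map<String, CopiedUser> bids = new ConcurrentHashMap<>();  // bidId -> copied user fields

    // Updates the canonical User record right away, then rewrites the copies in the background.
    // The caller can respond to the user as soon as this method returns.
    CompletableFuture<Void> changeName(final String userId, final String newName) {
        userNames.put(userId, newName);
        return CompletableFuture.runAsync(() ->
                bids.replaceAll((bidId, copy) ->
                        copy.userId.equals(userId) ? new CopiedUser(userId, newName) : copy));
    }
}
```

Until the returned future completes, readers of Bid records may still see the old name. That window is the change latency the slide refers to.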
  • 27. Denormalization Observations • We don’t always need ACID compliance • Strict FK enforcement is not always required • MySQL’s MyISAM storage works fine for many situations • Users are getting used to change latency • There is a trade-off between horizontal scalability in your app and patterns we’ve been trained to rely on
  • 29. Sample NoSQL Storage Layout Server 1 key001 ...data... key002 ...data... key003 ...data... key004 ...data... key005 ...data... key006 ...data... key007 ...data... key008 ...data... key009 ...data... key010 ...data... Server 2 key011 ...data... key012 ...data... key013 ...data... key014 ...data... key015 ...data... key016 ...data... key017 ...data... key018 ...data... key019 ...data... key020 ...data... Server 3 key021 ...data... key022 ...data... key023 ...data... key024 ...data... key025 ...data... key026 ...data... key027 ...data... key028 ...data... key029 ...data... key030 ...data... … Server n key091 ...data... key092 ...data... key093 ...data... key094 ...data... key095 ...data... key096 ...data... key097 ...data... key098 ...data... key099 ...data... key100 ...data... • This scan is “get everything from key016 through key022” • A key-range scan returns N rows in linear time O(N) regardless of the number of rows in the table • This is not true for relational databases
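A key-range scan can be imitated locally with a sorted map. The sketch below uses `java.util.TreeMap` as a single-node stand-in for the sorted, partitioned table on the slide; `KeyRangeScan` is a hypothetical helper, and a real store would also have to route the scan across the servers that hold the range.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

final class KeyRangeScan {
    // Returns every key from startKey through stopKey (both inclusive), in key order.
    // Locating the first key is a tree lookup; after that the cost grows with the number
    // of rows returned, not with the size of the table.
    static List<String> scan(TreeMap<String, String> table, String startKey, String stopKey) {
        return new ArrayList<>(table.subMap(startKey, true, stopKey, true).keySet());
    }
}
```

For a table holding `key001` through `key100`, `KeyRangeScan.scan(table, "key016", "key022")` touches only the seven rows in that range.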
  • 30. Intelligent Key Design abc123 {…} abc124 {name:"Tony Soprano", createdDate:"2011-01-12", email:"tony@sopranos.com", role:"BOSS"} abc125 {name:"Salvator Bonpensiero", createdDate:"2014-10-02", email:"bonpensiero@sopranos.com", role:"CAPO"} abc126 {name:"Christopher Moltisanti", createdDate:"2012-10-02", email:"christopher@sopranos.com", role:"SOLDIER"} abc2 {name:"Carmella Soprano", createdDate:"2011-10-02", email:"carmella@sopranos.com", favoriteCar:"BMW"} abc20 {name:"Meadow Soprano", createdDate:"2012-01-02", email:"meadow@sopranos.com", favoriteCar:12.25} abc21 {someField:"some value", averageScore:5.75, someOtherDate:"2011-10-02"} abc22 {…} bcd1 {…} bcd12 {…} • Key ordering is lexical • Records can have different schemas
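Lexical ordering is why `abc126` sorts before `abc2` in the listing above. The sketch below shows that ordering with plain Java string comparison, plus zero-padding as one common way to make numeric suffixes sort numerically. `LexicalKeys` and the padding width are illustrative assumptions, not part of the talk.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class LexicalKeys {
    // Zero-pads the numeric part so that lexical order matches numeric order.
    static String padded(String prefix, long n, int width) {
        return prefix + String.format("%0" + width + "d", n);
    }

    // Sorts keys the way a lexically ordered key space would: character by character.
    static List<String> sorted(List<String> keys) {
        List<String> copy = new ArrayList<>(keys);
        Collections.sort(copy);
        return copy;
    }
}
```

Sorting `abc2`, `abc126`, `abc20` yields `abc126`, `abc2`, `abc20`, whereas the padded forms `abc002`, `abc020`, `abc126` come out in numeric order.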
  • 31. Ascending Timestamp Bid/2014-10-26T09:00:00.000 {…} Bid/2014-10-26T09:00:12.975 {…} Bid/2014-10-26T09:00:14.221 {…} Bid/2014-10-26T09:00:18.005 {…} Bid/2014-10-26T09:00:35.572 {…} Bid/2014-10-26T09:00:40.003 {…} Bid/2014-10-26T09:00:41.123 {…} Bid/2014-10-26T09:00:41.124 {…} Bid/2014-10-26T09:00:41.150 {…} Bid/2014-10-26T09:00:41.218 {…} (rows run from older to newer) yyyy-MM-ddTHH:mm:ss.SSS is a pretty standard timestamp format, and its lexical order is chronological • Great for time-series data • Timeline tracking (viewing data in the order it was processed etc.)
  • 34. Descending Timestamp Bid/9223370622642200431 {…} Bid/9223370622642200478 {…} Bid/9223370622642200512 {…} Bid/9223370622642203021 {…} Bid/9223370622642203897 {…} Bid/9223370622642204112 {…} Bid/9223370622642204559 {…} Bid/9223370622642207054 {…} Bid/9223370622642215431 {…} Bid/9223370622642235500 {…} (rows run from newer to older) public class User { // This will yield some ridiculous value like: 9223370622642200431 // Number of milliseconds in a 365-day year: 31536000000 // This computation will not reach 0 for roughly 292 million years long descendingTimestamp = Long.MAX_VALUE - System.currentTimeMillis(); }
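The key construction on this slide can be sketched as a small helper. `DescendingTimestampKey` is a hypothetical name. The zero-padding to 19 digits is an added assumption that keeps the keys the same width, so lexical order and numeric order agree even for very large timestamps.

```java
final class DescendingTimestampKey {
    // Builds a key whose lexical order is newest-first.
    // Long.MAX_VALUE has 19 decimal digits, so the value is padded to that width.
    static String bidKey(long epochMillis) {
        long descending = Long.MAX_VALUE - epochMillis;
        return "Bid/" + String.format("%019d", descending);
    }
}
```

For example, `DescendingTimestampKey.bidKey(1414212575376L)` (a timestamp from late October 2014) gives `Bid/9223370622642200431`, the first key in the listing above, and any later timestamp produces a key that sorts before it.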
  • 35. Descending Timestamp Bid/9223370622642200431 {… auction_id:"12345" …} Bid/9223370622642200478 {… auction_id:"54321" …} Bid/9223370622642200512 {… auction_id:"12345" …} Bid/9223370622642203021 {… auction_id:"22222" …} Bid/9223370622642203897 {… auction_id:"22233" …} Bid/9223370622642204112 {… auction_id:"12345" …} Bid/9223370622642204559 {… auction_id:"22233" …} Bid/9223370622642207054 {… auction_id:"54321" …} Bid/9223370622642215431 {… auction_id:"54321" …} Bid/9223370622642235500 {… auction_id:"12345" …} Start with "Bid/" and stop after 5 rows: the 5 most recent bids • Known as a “range scan” • Very easy to start with some prefix and read for N records • Complexity stays constant for top 5 bids no matter how many bids are in the system
  • 36. Descending Timestamp Auction/11222/Bid/9223370622642203021 {… auction_id:"11222" …} Auction/12233/Bid/9223370622642203897 {… auction_id:"12233" …} Auction/12233/Bid/9223370622642204559 {… auction_id:"12233" …} Auction/12345/Bid/9223370622642200431 {… auction_id:"12345" …} Auction/12345/Bid/9223370622642200512 {… auction_id:"12345" …} Auction/12345/Bid/9223370622642204112 {… auction_id:"12345" …} Auction/12345/Bid/9223370622642235500 {… auction_id:"12345" …} Auction/54321/Bid/9223370622642200478 {… auction_id:"54321" …} Auction/54321/Bid/9223370622642207054 {… auction_id:"54321" …} Auction/54321/Bid/9223370622642215431 {… auction_id:"54321" …} Start with "Auction/12345" and stop after 4 rows (or at row "Auction/12346"): the 4 most recent bids Key parts: "Auction/12345" and "Bid/9223370622642200431" • Now, all Bids for each Auction are located right next to each other • This matches our most used access pattern • We now have information about related data just from the key • Key-only queries can be used to help speed up apps • Why 4 Bids instead of 5? My example only had 4 records
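A prefix-bounded range scan over composite keys can be sketched with a sorted map. `PrefixScan.rangeScan` is a hypothetical stand-in for the datastore call; it stops at the row limit or at the first key that no longer starts with the prefix, whichever comes first.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;

final class PrefixScan {
    // Reads up to `limit` keys that start with `prefix`, in key order.
    static List<String> rangeScan(NavigableMap<String, String> table, String prefix, int limit) {
        List<String> keys = new ArrayList<>();
        // tailMap jumps straight to the first key >= prefix; earlier rows are never visited.
        for (String key : table.tailMap(prefix, true).keySet()) {
            if (keys.size() >= limit || !key.startsWith(prefix)) {
                break;
            }
            keys.add(key);
        }
        return keys;
    }
}
```

With the ten rows listed above loaded into a `TreeMap`, scanning the prefix `Auction/12345/Bid/` with a limit of 5 returns only the four bids of auction 12345, newest first, because the fifth row in key order belongs to auction 54321.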
  • 37. Linking Related Data With Intelligent Keys http://myapp.com/api/auctions/12345 Auction 11222 {…} 12233 {…} 12345 {…} 54321 {…} Bid Auction/11222/... {…} Auction/12233/... {…} Auction/12233/... {…} Auction/12345/... {…} Auction/12345/... {…} Auction/12345/... {…} Auction/12345/... {…} Auction/54321/... {…} Auction/54321/... {…} Auction/54321/... {…} datastore.get("12345"); datastore.rangeScan("Auction/12345/", 5); Both reads can be done in parallel
  • 38. Linking Related Data With Intelligent Keys http://myapp.com/api/auctions/12345 AuctionData Auction/11222/Bid/987321... {…} Auction/12233/Bid/987534... {…} Auction/12233/Bid/987635... {…} Auction/12345 {…, ..., ...} Auction/12345/Bid/977534... {…} Auction/12345/Bid/987501... {…} Auction/12345/Bid/987687... {…} Auction/12345/Bid/988012... {…} Auction/54321 {…, ..., ...} Auction/54321/... {…} Auction/54321/... {…} datastore.rangeScan("Auction/12345", 6); Data of completely different schemas / types can be written to the same table, co-located on disk
  • 40. Counters public void placeBid(String userId, String auctionId) { // Many NoSQL stores support a native counter via some increment-and-get // After the counter has been incremented, we don't need to worry about contention long bidCount = datastore.incrementAndGet(auctionId + "_counter"); BigDecimal amount = BID_INCREMENT.multiply(BigDecimal.valueOf(bidCount)); long descendingTimestamp = Long.MAX_VALUE - System.currentTimeMillis(); String bidId = "Auction/" + auctionId + "/Bid/" + descendingTimestamp + "/" + amount; // Increment some helper counters... datastore.incrementAndGet("global_bidCounter"); datastore.incrementAndGet(auctionId + "_bidCounter"); datastore.incrementAndGet(userId + "_bidCounter"); // ... other logic like creating the Bid object ... bidDao.write(bidId, bid); } // Some datastores may have a first-order Counter object: Counter bidCounter = datastore.getCounter(auctionId + "_counter"); long bidCount = bidCounter.incrementAndGet();
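The counter-driven key construction can be exercised locally with atomic counters. In this sketch `InMemoryCounters` is a hypothetical in-process stand-in for the datastore's native increment-and-get, and the `BID_INCREMENT` value of 0.25 is an assumption made for illustration only.

```java
import java.math.BigDecimal;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// In-process stand-in for a datastore's named atomic counters.
final class InMemoryCounters {
    private final ConcurrentMap<String, AtomicLong> counters = new ConcurrentHashMap<>();

    long incrementAndGet(String name) {
        return counters.computeIfAbsent(name, k -> new AtomicLong()).incrementAndGet();
    }
}

final class BidKeyBuilder {
    // Assumed fixed increment per bid; the slide does not give a value.
    static final BigDecimal BID_INCREMENT = new BigDecimal("0.25");

    // Each caller gets a distinct count, so the derived amount needs no further locking.
    static String nextBidKey(InMemoryCounters counters, String auctionId, long nowMillis) {
        long bidCount = counters.incrementAndGet(auctionId + "_counter");
        BigDecimal amount = BID_INCREMENT.multiply(BigDecimal.valueOf(bidCount));
        long descendingTimestamp = Long.MAX_VALUE - nowMillis;
        return "Auction/" + auctionId + "/Bid/" + descendingTimestamp + "/" + amount;
    }
}
```

Because the increment is atomic, two concurrent bidders can never be handed the same count, and therefore never the same amount.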
  • 44. Data Model Sharding public class Auction { private String id; private String title; private String imageUrl; private String description; private BigDecimal currentPrice; private User highBidder; private Date endTime; ... } public class AuctionState { private String id; private BigDecimal currentPrice; private User highBidder; private Date endTime; ... } • Separate frequently changing data from static data • Allows caching of static data • Makes reads/writes of changing data faster • Separate out values that are expensive to serialize but infrequently read
  • 45. More Parallel Reads http://myapp.com/api/auctions/12345 Auction 11222 {…} 12233 {…} 12345 {…} 54321 {…} AuctionState 11222 {…} 12233 {…} 12345 {…} 54321 {…} datastore.get("12345"); datastore.get("12345"); Both records can share the same key Memcache Check Cache Both reads can be done in parallel
  • 46. More Parallel Reads http://myapp.com/api/auctions/12345 AuctionData Auction/11222/Bid/987321... {…} Auction/12233/Bid/987534... {…} Auction/12233/Bid/987635... {…} Auction/12345 {…, ..., ...} Auction/12345/AuctionState {…} Auction/12345/Bid/977534... {…} Auction/12345/Bid/987501... {…} Auction/54321 {…, ..., ...} Auction/54321/... {…} datastore.get("Auction/12345/AuctionState"); datastore.get("Auction/12345"); Again, records can be in the same table Memcache Check Cache
  • 47. Sharding a 64 bit Integer long count = datastore.incrementAndGet("global_bidCounter"); global_bidCounter: 176 = 52 + 84 + 40 • Decompose the counter • Pick any part of the count and increment it: 53 + 84 + 40 = 177, 52 + 85 + 40 = 177, 52 + 84 + 41 = 177
  • 48. Implementing a Sharded Counter public class ShardedCounter { private String name; private int shards; public void increment() { // random(n) is assumed to return an int in [0, n) int index = random(shards); datastore.incrementAndGet(name + "-" + index); } public long get() { long count = 0; // All the shards of the counter are located next to each other: Result scan = datastore.rangeScan(name + "-", shards); while (scan.hasNext()) { Counter next = scan.next(); count += next.get(); } return count; } }
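The same idea can be run in-process. The sketch below swaps the datastore for an `AtomicLongArray`, so it illustrates the decomposition rather than the presenter's datastore-backed class; `InMemoryShardedCounter` is a hypothetical name.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLongArray;

final class InMemoryShardedCounter {
    private final AtomicLongArray shards;

    InMemoryShardedCounter(int shardCount) {
        this.shards = new AtomicLongArray(shardCount);
    }

    // Pick any shard at random and increment it; writers rarely collide on the same slot.
    void increment() {
        int index = ThreadLocalRandom.current().nextInt(shards.length());
        shards.incrementAndGet(index);
    }

    // The total is the sum of all shards (the "range scan" of the datastore version).
    long get() {
        long total = 0;
        for (int i = 0; i < shards.length(); i++) {
            total += shards.get(i);
        }
        return total;
    }
}
```

Reads cost one pass over the shards, so the shard count trades write contention against read cost. A total read while writers are active is a best-effort snapshot.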
  • 49. We Love Feedback Questions/Comments Email: bryce.cottam@thinkbiganalytics.com Rate This Session with the PARTNERS Mobile App Remember To Share Your Virtual Passes Follow Teradata 2015 PARTNERS www.teradata-partners.com/social