Enterprise applications are complex, making it difficult to fit everything into one model. NoSQL is taking a leading role in next-generation database technologies, and polyglot persistence is a good option for leveraging the strengths of multiple data stores. This talk introduces the Spring Data project, an umbrella project that provides a familiar and consistent Spring-based programming model for a wide range of data access technologies such as Redis, MongoDB, HBase and Neo4j, while retaining store-specific features and capabilities.
2. About Sergi Almar
‣CTO @ PayTouch
‣VMware / SpringSource Certified Instructor
‣javaHispano JUG core member
‣Spring I/O organizer (hope to see you next
year in Spain)
8. CAP Theorem
Hypertable HBase BigTable MongoDB
RDBMS
Redis Memcache Couchbase Terrastore
Dynamo Voldemort Cassandra
Riak SimpleDB CouchDB
9. CAP Theorem
Dynamo Voldemort Cassandra
RDBMS
Riak SimpleDB CouchDB
Which one should I choose?
Hypertable HBase BigTable MongoDB
Redis Memcache Couchbase Terrastore
13. Key-value stores
‣Based on Amazon’s Dynamo paper
‣Data stored as key / value pairs
‣Hard to query
‣Mostly in memory
K1 V1
K2 V2
K3 V2
14. ‣Redis is an advanced key-value store
‣Similar to Memcached but the dataset is not
volatile.
‣Data types: string, lists, sets, hashes, sorted
sets
‣Data expiration
‣Master-slave replication
‣Has “transactions” (batch operations)
‣Libraries - Many languages (Java: Jedis,
JRedis...)
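The idea of a key-value store with data expiration can be sketched in plain Java. This is only an in-memory illustration of the concept, not Redis and not the Jedis API; the class `ExpiringStore` and its methods are hypothetical names chosen for this example, and the clock is injected so expiry can be shown without sleeping.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Hypothetical in-memory key-value store with per-key expiration (a sketch, not Redis). */
final class ExpiringStore {

    private static final class Entry {
        final String value;
        final long expiresAtMillis; // Long.MAX_VALUE means "never expires"

        Entry(String value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> data = new HashMap<>();
    private final LongSupplier clock; // assumption: caller supplies the current time in millis

    ExpiringStore(LongSupplier clock) {
        this.clock = clock;
    }

    /** Stores a value that never expires. */
    void set(String key, String value) {
        data.put(key, new Entry(value, Long.MAX_VALUE));
    }

    /** Stores a value that disappears once ttlMillis have elapsed. */
    void setWithTtl(String key, String value, long ttlMillis) {
        data.put(key, new Entry(value, clock.getAsLong() + ttlMillis));
    }

    /** Returns the value, or null if the key is missing or has expired. */
    String get(String key) {
        Entry e = data.get(key);
        if (e == null) {
            return null;
        }
        if (clock.getAsLong() >= e.expiresAtMillis) {
            data.remove(key); // lazily evict on read
            return null;
        }
        return e.value;
    }

    public static void main(String[] args) {
        long[] now = {0L};
        ExpiringStore store = new ExpiringStore(() -> now[0]);
        store.setWithTtl("session:42", "sergi", 1000);
        System.out.println(store.get("session:42"));
        now[0] = 1000L; // pretend one second has passed
        System.out.println(store.get("session:42"));
    }
}
```

Values can only be looked up by key here, which is the "hard to query" trade-off mentioned for key-value stores.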
19. Table / Documents
{ title: "Taming NoSQL with Spring
Data",
abstract: "NoSQL is taking a
leading ...",
speaker: "Sergi Almar",
topics: ["nosql", "spring"]}
21. ‣JSON-style documents
‣Full or partial document updates
‣GridFS for efficiently storing large files
‣Index support - secondary and
compound
‣Rich query language for dynamic queries
‣Map / Reduce
‣Replication and auto sharding
26. ‣DB is a collection of graph nodes,
relationships
‣Nodes and relationships have properties
‣Query is done via a traversal API
‣Indexes on node / relationship
properties
‣Written in Java, can be embedded
‣Transactions (ACID)
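The graph model above (nodes and relationships, both carrying properties, queried by traversal) can be illustrated with a small plain-Java sketch. This is not the Neo4j API; `PropertyGraphSketch`, `Node`, `Relationship` and `traverse` are hypothetical names used only for this example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical property-graph sketch: nodes and relationships both carry properties. */
final class PropertyGraphSketch {

    static final class Node {
        final Map<String, Object> properties = new HashMap<>();
        final List<Relationship> outgoing = new ArrayList<>();

        Node(String name) {
            properties.put("name", name);
        }

        /** Creates a directed, typed relationship from this node to the target. */
        Relationship relateTo(Node target, String type) {
            Relationship r = new Relationship(type, this, target);
            outgoing.add(r);
            return r;
        }
    }

    static final class Relationship {
        final String type;
        final Node start;
        final Node end;
        final Map<String, Object> properties = new HashMap<>();

        Relationship(String type, Node start, Node end) {
            this.type = type;
            this.start = start;
            this.end = end;
        }
    }

    /** Breadth-first traversal that follows only relationships of the given type. */
    static List<String> traverse(Node start, String type) {
        List<String> names = new ArrayList<>();
        Set<Node> visited = new HashSet<>();
        Deque<Node> queue = new ArrayDeque<>();
        visited.add(start);
        queue.add(start);
        while (!queue.isEmpty()) {
            Node current = queue.poll();
            for (Relationship r : current.outgoing) {
                // visited.add returns false for nodes already seen, so cycles terminate
                if (r.type.equals(type) && visited.add(r.end)) {
                    names.add((String) r.end.properties.get("name"));
                    queue.add(r.end);
                }
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Node sergi = new Node("Sergi");
        Node ana = new Node("Ana");
        sergi.relateTo(ana, "KNOWS").properties.put("since", 2010);
        System.out.println(traverse(sergi, "KNOWS"));
    }
}
```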
28. Spring Data http://www.springsource.com/spring-data
‣An umbrella project for:
‣JPA - Repositories
‣JDBC Extensions
‣MongoDB - Document Database
‣Neo4J - Graph Database
‣Redis, Riak - Key Value Database
‣Gemfire - Distributed Data Grid
‣Hadoop / HBase - Big Data Storage and
Analysis platform
29. Spring Data Building Blocks
‣Mapping of POJOs to underlying data
model
‣Familiar Spring ‘Template’
‣MongoTemplate,
RedisTemplate,
Neo4JTemplate...
‣Generic Repository support
30. Spring Data Repositories I
public interface Repository<T, ID extends Serializable> {

}
public interface CrudRepository<T, ID extends Serializable> extends Repository<T, ID> {

  T save(T entity);

  Iterable<T> save(Iterable<? extends T> entities);

  T findOne(ID id);

  boolean exists(ID id);

  Iterable<T> findAll();

  long count();

  void delete(ID id);

  void delete(T entity);

  void delete(Iterable<? extends T> entities);

  void deleteAll();
}
31. Spring Data Repositories II
public interface PagingAndSortingRepository<T, ID extends Serializable> extends
        CrudRepository<T, ID> {
  Iterable<T> findAll(Sort sort);

  Page<T> findAll(Pageable pageable);
}
public interface PersonRepository extends CrudRepository<Person, BigInteger> {

  // Finder for a single entity
  Person findByEmailAddress(String emailAddress);

  // Finder for multiple entities
  List<Person> findByLastnameLike(String lastName);

  // Finder with pagination
  Page<Person> findByFirstnameLike(String firstName, Pageable page);

}
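To give a feel for how a finder name such as `findByLastnameLike` can be turned into a query, here is a deliberately tiny sketch. It is not Spring Data's actual parser, which supports many more keywords and combinations; `DerivedQuerySketch` and `toJpql` are hypothetical names, and the sketch assumes a single property with an optional `Like` suffix.

```java
/** Hypothetical illustration of deriving a JPQL-style query from a finder method name. */
final class DerivedQuerySketch {

    private static final String PREFIX = "findBy";
    private static final String LIKE = "Like";

    /**
     * Handles only "findBy<Property>" and "findBy<Property>Like".
     * Anything else is rejected; a real implementation covers far more cases.
     */
    static String toJpql(String entityName, String methodName) {
        if (!methodName.startsWith(PREFIX) || methodName.length() == PREFIX.length()) {
            throw new IllegalArgumentException("Unsupported method name: " + methodName);
        }
        String criteria = methodName.substring(PREFIX.length());
        String operator = "=";
        if (criteria.endsWith(LIKE) && criteria.length() > LIKE.length()) {
            operator = "like";
            criteria = criteria.substring(0, criteria.length() - LIKE.length());
        }
        // "EmailAddress" -> "emailAddress"
        String property = Character.toLowerCase(criteria.charAt(0)) + criteria.substring(1);
        return "select x from " + entityName + " x where x." + property + " " + operator + " ?1";
    }

    public static void main(String[] args) {
        System.out.println(toJpql("Person", "findByEmailAddress"));
        System.out.println(toJpql("Person", "findByLastnameLike"));
    }
}
```

The point of the convention is that the method name alone carries enough information to build the query, so no implementation class has to be written by hand.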
32. Spring Data JPA I
@Entity
public class Person {

  @Id
  @GeneratedValue(strategy = GenerationType.AUTO)
  private BigInteger id;
  private String firstname, lastname;

  @Column(name = "email")
  private String emailAddress;

  @OneToMany
  private Set<Person> colleagues;

}
By just defining the interface, Spring provides the implementation
<jpa:repositories base-package="com.java2days.repository" />
33. Spring Data JPA II
‣Query methods use method naming conventions
‣ Can override with Query annotation
‣ Or method name references a JPA named query
public interface PersonRepository extends CrudRepository<Person, BigInteger> {

  // previous methods omitted…

  @Query("select p from Person p where p.emailAddress = ?1")
  Person findByEmailAddress(String emailAddress);

  @Query("select p from Person p where p.firstname = :firstname or p.lastname = :lastname")
  Person findByLastnameOrFirstname(@Param("lastname") String lastname,
                                   @Param("firstname") String firstname);

}
44. Spring Data Redis
‣Portable API across several Redis connectors
‣RedisTemplate
‣Access all Redis functionality, dedicated interfaces
for each data type
‣Value / Hash / Set / ZSet / List Operations
‣Handles serialization and type conversion automatically
‣String specific class through StringRedisTemplate (JSON,
XML...)
‣Fluent Query API
‣Async Pub / Sub support with MLC (message listener container)
‣Spring 3.1 Cache Abstraction provider
Most popular persistence choice today
Relations, ACID guarantees, SQL, strict schema, difficult to scale, mismatch with OO languages

Online analytical processing that enables users to interactively analyze multidimensional data from multiple perspectives

C – for “Consistency”: ability of a system to remain in a consistent state after an update or an operation
A – for “Availability”: availability of a system even in the event of adversity or system issues
P – for “Partition Tolerance”: ability of a system to function in the presence of network partitions, even if partitions are added/deleted
You can't have the three at the same time and get an acceptable latency.
Fast, good and cheap
You cannot scale without partition tolerance, so to scale you have to drop consistency

Most of the systems compromise between consistency and availability
BASE - Basically Available, Soft state, Eventual consistency
You drop consistency for eventual consistency

First were web frameworks (Struts, Spring MVC, Tapestry, Wicket, Stripes...)
Then Ajax and JavaScript frameworks (jQuery, Prototype, Dojo...) (Backbone.js, Knockout, batman.js...)
Now it’s time for persistence!!!

Which one should I use for my use case?

Key-Value: like a globally distributed hashmap
Column:

Dynamo: Amazon’s highly available key-value store (2007)
Extremely fast
Use cases: session data, shopping carts, user preferences

When to avoid?
You have relations
You have multi-operational transactions
You want to query the values
You want to operate on sets of entries

Atomic
Use cases: counting views, who is online, social activity feeds, caching
Contentious benchmarks (memcached vs redis)

Based on Bigtable from Google: A Distributed Storage System for Structured Data (2006)
Like a big table where every row can have its own schema (one row may have 3 columns and another one 50 columns)
Big Data problems, large scale data processing

Easy to get started with
SQL-like query capabilities
Schemaless: no schema migrations, but the database cannot enforce data integrity

Rich document: closer to the data model that we have in our code
Arrays of values are much more convenient than many-to-many relationships
Embedded documents

_id -> PK, globally unique identifier; you can override that value

Eventual consistency
GridFS - supports native storage of binary data
Objects in MongoDB are limited in size; the GridFS spec provides a mechanism for transparently dividing a large file among multiple documents.
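
The idea of dividing a large file among several size-limited documents can be sketched as fixed-size chunking. This is a plain-Java illustration of the principle only, not the MongoDB driver or the GridFS API; `ChunkingSketch`, `split` and `join` are hypothetical names, and the chunk size in the example is arbitrary.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: split a large byte array into ordered chunks and put it back together. */
final class ChunkingSketch {

    /** Splits the file into chunks of at most chunkSize bytes, in order. */
    static List<byte[]> split(byte[] file, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        for (int from = 0; from < file.length; from += chunkSize) {
            int to = Math.min(from + chunkSize, file.length);
            chunks.add(Arrays.copyOfRange(file, from, to));
        }
        return chunks;
    }

    /** Concatenates the chunks in list order to rebuild the original file. */
    static byte[] join(List<byte[]> chunks) {
        int total = 0;
        for (byte[] chunk : chunks) {
            total += chunk.length;
        }
        byte[] file = new byte[total];
        int position = 0;
        for (byte[] chunk : chunks) {
            System.arraycopy(chunk, 0, file, position, chunk.length);
            position += chunk.length;
        }
        return file;
    }

    public static void main(String[] args) {
        byte[] file = new byte[10];
        List<byte[]> chunks = split(file, 4);
        System.out.println(chunks.size() + " chunks");
    }
}
```

In a document store each chunk would be saved as its own document together with its position, so that the file can be reassembled in order on read.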

10gen nothing is gonna be more
10gen education
most of the querying capabilities that you get with RDBMS,

3 core abstractions in the graph model: Node, Relationship, Properties (key-value pairs)
Schema free
Other databases can model graphs; graph databases make the graph the primary data structure.

Spring Data makes it easier to build Spring-powered applications that use new data access technologies
It is worth taking a look at Spring Data even if you are not using NoSQL
Promotes classic Spring value propositions:
Productivity (make the easy stuff a one-liner)
Consistency: across a broad range of APIs
Portability: repository support
Commons: Repositories, Object Mapping

QueryDSL project, type-safe query API
Fields managed by different stores

Pageable -> offset, page number, page size, sort (accepts multiple properties)
Query methods use method naming conventions to define the query
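
How page number, page size and offset relate can be shown with a small sketch. It assumes zero-based page numbers and is not Spring Data's `Pageable` or `PageRequest`; `PageRequestSketch` and its methods are hypothetical names for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical paging sketch: offset = pageNumber * pageSize, with zero-based page numbers. */
final class PageRequestSketch {

    final int pageNumber; // zero-based: 0 is the first page
    final int pageSize;

    PageRequestSketch(int pageNumber, int pageSize) {
        if (pageNumber < 0 || pageSize < 1) {
            throw new IllegalArgumentException("pageNumber must be >= 0 and pageSize >= 1");
        }
        this.pageNumber = pageNumber;
        this.pageSize = pageSize;
    }

    /** Number of elements to skip before the requested page starts. */
    long offset() {
        return (long) pageNumber * pageSize;
    }

    /** Returns the elements of the requested page, or an empty list past the end. */
    <T> List<T> slice(List<T> all) {
        int from = (int) Math.min(offset(), all.size());
        int to = (int) Math.min((long) from + pageSize, all.size());
        return new ArrayList<>(all.subList(from, to));
    }

    public static void main(String[] args) {
        List<Integer> all = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            all.add(i);
        }
        PageRequestSketch thirdPage = new PageRequestSketch(2, 10);
        System.out.println("offset=" + thirdPage.offset() + " content=" + thirdPage.slice(all));
    }
}
```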