*




     CouchDB
Sectional Sofa Edition
        Angel Pizarro
      angel@upenn.edu

              * www.bauwel-movement.co.uk/sculpture.php
About Me
Me: Bioinformatics, help scientists with big data
Lots of data in lots of formats
Ruby and Ruby on Rails
 But that doesn’t matter for CouchDB!
Initially interested in CouchDB for AWS deployment


                    2
Overview

CouchDB run through
A short example
Big Deployment Issues
Questions - ask away at any time



                  3
A sober moment




2,402 killed and 1,282 wounded
              4
Key-Value Databases
Datastore of values indexed by keys (duh!)
Must provide the ID for all operations
Hash or B-Tree
  Hash is FAST, but only allows single-value lookups
  B-Tree is slower, but allows range queries
Horizontally scalable
                           5
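The hash versus B-Tree trade-off can be sketched in a few lines. This is only an illustration: a `Map` stands in for the hash index and a sorted array with binary search stands in for a B-Tree; the keys, values and the helper names `lowerBound` and `rangeQuery` are made up for the example.

```javascript
// Hash index: exact-key lookups only.
const hashIndex = new Map([
  ["apple", 1],
  ["banana", 2],
  ["cherry", 3],
  ["date", 4],
]);

// Sorted index (stand-in for a B-Tree): keys are kept in order.
const sortedKeys = [...hashIndex.keys()].sort();

// Binary search for the first position whose key is >= target.
function lowerBound(keys, target) {
  let lo = 0;
  let hi = keys.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (keys[mid] < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Range query: every key k with start <= k <= end, in key order.
function rangeQuery(keys, start, end) {
  const result = [];
  for (let i = lowerBound(keys, start); i < keys.length && keys[i] <= end; i++) {
    result.push(keys[i]);
  }
  return result;
}

console.log(hashIndex.get("banana"));          // single-value lookup
console.log(rangeQuery(sortedKeys, "b", "d")); // range query over ordered keys
```

The hash lookup needs the exact key; the ordered index can answer "everything between b and d" because neighbouring keys sit next to each other.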
CouchDB
Schema free, document oriented database
 JavaScript Object Notation (JSON)
HTTP protocol using REST operations
 No direct native language drivers *
 JavaScript is the lingua franca
ACID & MVCC guarantees on a per-document basis
Map-Reduce model for indexing and views
Back-ups and replication are simple
                         * Hovercraft: http://github.com/jchris/hovercraft/
                     6
REST
Representational State Transfer
Client-server separation with a uniform interface
  Load-balancing, caching, authorization & authentication, proxies
Stateless - client is responsible for creating a self-sufficient request
Resources are cacheable - servers must mark non-cacheable resources as such
Only 5 HTTP verbs needed
 GET, PUT, POST, DELETE, HEAD
                          7
CouchDB
 REST/CRUD
 GET      read
 PUT      create or update
 DELETE   delete something
 POST     bulk operations
         8
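A rough sketch of the verb table as code. The helper `couchRequest` and its operation names are hypothetical; it only builds the method and URL and sends nothing over the network. The host `localhost:5984` is assumed, and `_bulk_docs` is CouchDB's bulk endpoint.

```javascript
// Assumed server address for the sketch.
const BASE = "http://localhost:5984";

// Hypothetical helper: maps a CRUD-style operation to an HTTP verb and URL.
function couchRequest(op, db, docId, rev) {
  const dbUrl = `${BASE}/${db}`;
  const docUrl = () => `${dbUrl}/${encodeURIComponent(docId)}`;
  switch (op) {
    case "read":
      return { method: "GET", url: docUrl() };
    case "save": // create or update
      return { method: "PUT", url: docUrl() };
    case "delete": // the current revision must be supplied
      return { method: "DELETE", url: `${docUrl()}?rev=${encodeURIComponent(rev)}` };
    case "bulk":
      return { method: "POST", url: `${dbUrl}/_bulk_docs` };
    default:
      throw new Error(`unknown operation: ${op}`);
  }
}

console.log(couchRequest("save", "friendbook", "j_doe"));
console.log(couchRequest("delete", "friendbook", "j_doe", "2-abc"));
```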
CouchDB assumes:
Each document is completely independent
and should be self-sufficient
An operation on a document is ACID
compliant
Operations across documents are not ACID
Built for distributed applications
You can live with slightly stale data being
served to clients

                   9
MVCC
[Diagram: Row/Table Lock vs. CouchDB]
Multi-Version Concurrency Control
RDBMS enforces consistency using read/write locks
Instead of locks, CouchDB just serves up old data
Multi-document (multi-row) transactional semantics must be handled by the application



                         10
Database API

$ curl -X PUT http://127.0.0.1:5984/friendbook
{"ok":true}
Try it again: {"error":"db_exists"}

  http://          Protocol
  127.0.0.1:5984   CouchDB server
  friendbook       DB name

$ curl -X DELETE http://localhost:5984/friendbook
{"ok":true}

                             11
Backups & Replication
     Backup: simply copy the database files
     Replicate: send a POST request with a source and target database
        Source and target DBs can either be local (just the db name) or remote (full URL)
        The "continuous": true option will register the target to the source’s _changes notification API
        Continuous replication takes up a port
$ curl -X POST http://localhost:5984/_replicate \
 -d '{"source":"db", "target":"db-replica",
    "continuous":true}'

                               12
Inserting a document
$ curl -X PUT http://localhost:5984/friendbook/j_doe \
   -d @j_doe.json

{"ok":true,
"id":"j_doe",
"rev":"1-062af1c4ac73287b7e07396c86243432"}
CouchDB can provide you with unique IDs:

$ curl -X GET http://localhost:5984/_uuids
{"uuids":["d1dde0996a4db7c1ebc78fb89c01b9e6"]}
$ curl -X GET 'http://localhost:5984/_uuids?count=10'

*POSTing a new document to the database URL will auto-generate a UUID for the ID
                                          13
JSON document

{ "name": "J. Doe",
 "friends": 0 }


After insert (_rev = operation counter - MD5 hash of document):

{   "_id":       "j_doe",
    "_rev":      "1-062af1c4ac73287b7e07396c86243432",
    "name":      "J. Doe",
    "friends":   0 }

                             14
Updating a document
revised.json = { "name":"J. Doe", "friends": 1 }

$ curl -X PUT http://localhost:5984/friendbook/j_doe \
      -d @revised.json
{ "error":"conflict",
  "reason":"Document update conflict."}

Include the current revision in revised.json and PUT again:
{ "_rev":"1-062af1c4ac73287b7e07396c86243432",
  "name":"J. Doe", "friends": 1 }

{ "ok":true,
  "id":"j_doe",
  "rev":"2-0629239b53a8d146a3a3c4c63e2dbfd0"}

                                15
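The conflict-then-retry sequence above can be imitated without a server. The sketch below is an in-memory stand-in, not CouchDB itself: the class `TinyStore` is hypothetical, and its revisions are a counter plus the fixed suffix `sketch` where the real server appends a hash of the document.

```javascript
// In-memory stand-in for one CouchDB database (illustration only).
class TinyStore {
  constructor() {
    this.docs = new Map(); // id -> { rev, body }
  }

  // Returns a copy of the current revision, or null if the id is unknown.
  get(id) {
    const entry = this.docs.get(id);
    return entry ? { _id: id, _rev: entry.rev, ...entry.body } : null;
  }

  // Mimics PUT /db/id: accepted only if doc._rev matches the stored revision.
  put(id, doc) {
    const { _rev, _id, ...body } = doc;
    const current = this.docs.get(id);
    const currentRev = current ? current.rev : undefined;
    if (_rev !== currentRev) {
      return { error: "conflict", reason: "Document update conflict." };
    }
    // Assumption: a bare counter plus a fixed suffix replaces the real hash.
    const counter = current ? parseInt(current.rev, 10) + 1 : 1;
    const rev = `${counter}-sketch`;
    this.docs.set(id, { rev, body });
    return { ok: true, id, rev };
  }
}

const store = new TinyStore();
const created = store.put("j_doe", { name: "J. Doe", friends: 0 });
const stale = store.put("j_doe", { name: "J. Doe", friends: 1 }); // no _rev: rejected
const updated = store.put("j_doe", { _rev: created.rev, name: "J. Doe", friends: 1 });
console.log(created, stale, updated);
```

No lock is ever taken: a writer that lost the race simply gets a conflict and has to re-read and retry.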
Update is a full write




               http://horicky.blogspot.com/2009/11/nosql-patterns.html
          16
Deleting a document
DELETE requires the revision as a URL parameter or in the If-Match
HTTP header.
$ curl -X DELETE 'http://localhost:5984/friendbook/j_doe?rev=2-0629239b53a8d146a3a3c4c63e2dbfd0'

{"ok":true,"id":"j_doe",
 "rev":"3-57673a4b7b662bb916cc374a92318c6b"}

Returns a revision number for the delete, used
for synchronization and the changes API
$ curl -X GET http://localhost:5984/friendbook/j_doe
{"error":"not_found","reason":"deleted"}
                                17
Notables
MVCC != version control system
 POST to /db/_compact deletes all older versions
 Deletes only keep metadata around for synchronization and merge conflict resolution
To “roll back a transaction” you must:
 Retrieve all related records, cache these
 Insert any updates to records
 On failure, use the returned revision numbers to re-insert the older record as a new one

                     18
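A minimal sketch of the three roll-back steps, run against an in-memory map instead of a real database. The functions `put` and `updateAll`, the document ids and the `N-sketch` revision strings are all invented for the illustration; a single global counter stands in for real revision ids.

```javascript
// In-memory store with CouchDB-style revision checks (illustration only).
const db = new Map(); // id -> { _rev, ...fields }
let revCounter = 0;

function put(id, doc) {
  const current = db.get(id);
  if ((current ? current._rev : undefined) !== doc._rev) {
    throw new Error(`conflict on ${id}`);
  }
  const saved = { ...doc, _rev: `${++revCounter}-sketch` };
  db.set(id, saved);
  return saved._rev;
}

// Applies every update; on failure re-inserts the cached originals as NEW revisions.
function updateAll(updates) {
  // Step 1: retrieve all related records and cache them.
  const originals = new Map();
  for (const id of Object.keys(updates)) originals.set(id, { ...db.get(id) });

  const written = new Map(); // id -> revision returned by a successful write
  try {
    // Step 2: insert the updates.
    for (const [id, doc] of Object.entries(updates)) {
      written.set(id, put(id, doc));
    }
    return true;
  } catch (err) {
    // Step 3: old content, carrying the returned revision so the write is accepted.
    for (const [id, rev] of written) {
      put(id, { ...originals.get(id), _rev: rev });
    }
    return false;
  }
}

const revA = put("a", { balance: 10 });
put("b", { balance: 0 });
const ok = updateAll({
  a: { _rev: revA, balance: 5 },
  b: { _rev: "stale", balance: 5 }, // wrong revision forces a failure
});
console.log(ok, db.get("a"), db.get("b"));
```

Document "a" ends with its old content but a newer revision, which is the point of the slide: the history moves forward, nothing is undone in place.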
Our Example Problem
Hello world? Blog? Twitter clone?
Let’s store all human proteins instead

LOCUS       YP_003024029   227 aa   linear   PRI 09-JUL-2009
DEFINITION  cytochrome c oxidase subunit II [Homo sapiens].
ACCESSION   YP_003024029
VERSION     YP_003024029.1 GI:251831110
DBLINK      Project:30353
DBSOURCE    REFSEQ: accession NC_012920.1
KEYWORDS    .
SOURCE      mitochondrion Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
            Catarrhini; Hominidae; Homo.
FEATURES             Location/Qualifiers
     source          1..227
                     /organism="Homo sapiens"
                     /organelle="mitochondrion"
                     /isolation_source="caucasian"
                     /db_xref="taxon:9606"
                     /tissue_type="placenta"
                     /country="United Kingdom: Great Britain"
                     /note="this is the rCRS"
     Protein         1..227
                     /product="cytochrome c oxidase subunit II"
                     /calculated_mol_wt=25434
                                                   http://www.ncbi.nlm.nih.gov/
                                                   19
Futon : A Couchapp


              This one is going to be a bit tougher




        20
Design Documents
The key to using CouchDB as more than a key-value store
Just another JSON document
 Contains JavaScript functions
 Functions are executed within CouchDB
Map-reduce views, data validation, alternate formatting, ...
JS libraries & data (PNG images)


                   21
{
    "_id" : "_design/gb",
    "language" : "javascript",
    "views" : {
      "gi" : {
         "map" : "function(doc) { emit(doc.gi, doc._id); }"
      },
      "dbXref" : {
         "map" : "function(doc) {
                    var ftLen = doc.features.length;
                    for (var i = 0; i < ftLen; i++) {
                      var ft = doc.features[i];
                      var qLen = ft.qualifiers.length;
                      for (var j = 0; j < qLen; j++) {
                        var ql = ft.qualifiers[j];
                        if (ql.qualifier.match('db_xref')) {
                          emit(ql.value, doc._id);
                        }
                      }
                    }
                  }"
      }
    },
    "shows" : {
       "fasta" : "function(doc, req) {
                     if (!doc) { return 'Not found'; }

                     // FASTA header line, then the sequence in 80-character lines
                     var tmp = '>' + doc._id + '\n';
                     var tmpseq = doc.seq;
                     while (tmpseq.length > 0) {
                       tmp = tmp + tmpseq.substring(0, 80) + '\n';
                       tmpseq = tmpseq.substring(80);
                     }
                     return {
                       body: tmp,
                       headers: {'Content-Type' : 'text/plain'}
                     };
                   }"
    }
}
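The line-wrapping logic of the "fasta" show function is easy to get wrong by one character, so here is a standalone sketch that can be run outside CouchDB. The function name `toFasta`, the `width` parameter and the dummy sequence are assumptions made for the example, not part of the design document.

```javascript
// Wraps a sequence at `width` characters per line under a FASTA header line.
function toFasta(id, seq, width = 80) {
  const lines = [`>${id}`];
  for (let i = 0; i < seq.length; i += width) {
    // substring(i, i + width) takes exactly `width` characters, none skipped.
    lines.push(seq.substring(i, i + width));
  }
  return lines.join("\n") + "\n";
}

// Dummy 100-residue sequence: expect one line of 80 and one line of 20.
const record = toFasta("YP_003024029", "M".repeat(100));
console.log(record);
```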
Map function example




         23
Complex Map




     24
View Result




     25
GET by the indexed key
GET /refseq_human/_design/gb/_view/dbXref?key="GeneID:10"


{ "total_rows":7,"offset":2,
  "rows":[
        { "id":"NP_000006",
          "key":"GeneID:10",
          "value":"NP_000006"
        }
       ]
}

                           26
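View parameters such as key are JSON values, so the quotes around "GeneID:10" are part of the value and the whole thing needs URL-encoding when it is not typed by hand. A small sketch, with `viewUrl` as a hypothetical helper and `localhost:5984` as an assumed host:

```javascript
// Builds a view URL; each parameter is JSON-encoded, then URL-encoded.
function viewUrl(base, db, designDoc, view, params = {}) {
  const query = Object.entries(params)
    .map(([name, value]) => `${name}=${encodeURIComponent(JSON.stringify(value))}`)
    .join("&");
  const path = `${base}/${db}/_design/${designDoc}/_view/${view}`;
  return query ? `${path}?${query}` : path;
}

const url = viewUrl("http://localhost:5984", "refseq_human", "gb", "dbXref", {
  key: "GeneID:10",
});
console.log(url);
```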
Reduce function: Keyword count

Map function:
function(doc) {
  if (doc.foodz) {
    doc.foodz.forEach(function(food) {
      emit(food, 1);
    });
  }
}

Reduce function:
function(keys, values) {
  return sum(values);
}

Output (with group=true):
"rows": [ {"key": "chinese",   "value": 4},
          {"key": "ethiopian", "value": 3},
          {"key": "indian",    "value": 17} ]

                           27
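To see the keyword count work end to end, the view can be simulated in plain JavaScript. This is a sketch under assumptions: the sample documents are invented, `emit` is passed in as a second argument (inside CouchDB it is a global), `sum` is re-implemented locally, and `groupReduce` is a hypothetical stand-in for what a grouped view query does.

```javascript
// Invented sample documents.
const docs = [
  { _id: "a", foodz: ["chinese", "indian"] },
  { _id: "b", foodz: ["indian"] },
  { _id: "c", name: "no foodz here" },
  { _id: "d", foodz: ["chinese", "ethiopian", "indian"] },
];

// Map function from the slide; emit is injected to keep the sketch self-contained.
function map(doc, emit) {
  if (doc.foodz) {
    doc.foodz.forEach(function (food) {
      emit(food, 1);
    });
  }
}

// Local stand-in for the sum() helper available to reduce functions.
const sum = (values) => values.reduce((total, v) => total + v, 0);

function reduce(keys, values) {
  return sum(values);
}

// Build the index: collect emitted rows, then order them by key.
const rows = [];
for (const doc of docs) {
  map(doc, (key, value) => rows.push({ id: doc._id, key, value }));
}
rows.sort((x, y) => (x.key < y.key ? -1 : x.key > y.key ? 1 : 0));

// Group adjacent rows with the same key and reduce each group.
function groupReduce(sortedRows) {
  const groups = [];
  for (const row of sortedRows) {
    const last = groups[groups.length - 1];
    if (last && last.key === row.key) last.rows.push(row);
    else groups.push({ key: row.key, rows: [row] });
  }
  return groups.map((g) => ({
    key: g.key,
    value: reduce(g.rows.map((r) => [r.key, r.id]), g.rows.map((r) => r.value)),
  }));
}

const result = groupReduce(rows);
console.log(result);
```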
ReReduce

Map function:
function(doc) {
  if (doc.foodz) {
    doc.foodz.forEach(function(food) {
      emit(food, 1);
    });
  }
}

Reduce function (rereduce is true / false):
function(keys, values, rereduce) {
  if (rereduce) {
    return sum(values);
  } else {
    return values.length;
  }
}

 Same result, but this lets us put some useful value in the map output, as opposed to 1 repeated ad nauseam
 Could also emit null to save space, since indexes store the emitted values

                                             28
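The two branches of the reduce function make more sense when the rows arrive in pieces. The sketch below assumes the index hands rows to reduce in chunks and then combines the partial results; the chunk size of 5, the helper `chunk` and the 17 emitted rows are arbitrary choices for the illustration.

```javascript
// Local stand-in for the sum() helper available to reduce functions.
function sum(values) {
  return values.reduce((total, v) => total + v, 0);
}

// Reduce function from the slide.
function reduce(keys, values, rereduce) {
  if (rereduce) {
    return sum(values);   // values are partial counts from earlier reduce calls
  } else {
    return values.length; // values are raw emitted values, one per row
  }
}

// Splits a list into consecutive pieces of at most `size` items.
function chunk(list, size) {
  const chunks = [];
  for (let i = 0; i < list.length; i += size) chunks.push(list.slice(i, i + size));
  return chunks;
}

// 17 rows emitted for one key; the value is null because only the row count matters.
const emitted = new Array(17).fill(null);

// First pass: reduce each chunk of raw rows (keys are unused by this reduce).
const partials = chunk(emitted, 5).map((values) => reduce(null, values, false));

// Second pass: rereduce the partial counts into the final value.
const total = reduce(null, partials, true);
console.log(partials, total);
```

Calling `values.length` in the rereduce pass would count chunks instead of rows, which is why the flag has to be checked.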
“Joins”
A reduce function could create a virtual doc by collating
different doc types, but I don’t recommend it

       Map function:

      function(doc) {
        if (doc.type == "post") {
          emit([doc._id, 0], doc);
        } else if (doc.type == "comment") {
          emit([doc.post, doc.created_at], doc);
        }
      }


                          29
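The "join" comes from key ordering alone: a post's row sorts directly before the rows of its comments. A sketch of that ordering, with invented documents, `emit` passed in as an argument, and `compareKeys` as a deliberately simplified stand-in for view collation (element-by-element array comparison, numbers before strings; the real rules cover more types):

```javascript
// Invented sample documents, deliberately out of order.
const docs = [
  { _id: "c2", type: "comment", post: "p1", created_at: "2010-01-01T10:05:00Z" },
  { _id: "p2", type: "post", title: "Another post" },
  { _id: "c1", type: "comment", post: "p1", created_at: "2010-01-01T10:01:00Z" },
  { _id: "p1", type: "post", title: "More Couch" },
];

// Map function from the slide; emit is injected to keep the sketch self-contained.
function map(doc, emit) {
  if (doc.type == "post") {
    emit([doc._id, 0], doc);
  } else if (doc.type == "comment") {
    emit([doc.post, doc.created_at], doc);
  }
}

// Simplified collation: compare arrays element by element, numbers before strings.
function compareKeys(a, b) {
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    const x = a[i];
    const y = b[i];
    if (typeof x !== typeof y) return typeof x === "number" ? -1 : 1;
    if (x < y) return -1;
    if (x > y) return 1;
  }
  return a.length - b.length;
}

const rows = [];
for (const doc of docs) {
  map(doc, (key, value) => rows.push({ id: doc._id, key, value }));
}
rows.sort((r1, r2) => compareKeys(r1.key, r2.key));

// Each post is immediately followed by its comments in creation order.
const order = rows.map((r) => r.id);
console.log(order);
```

A range query from ["p1"] to ["p1", {}] would then fetch one post with all of its comments in a single request.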
Distributed CouchDB
      Datastores


         30
The CAP theorem: applies when business logic is separate from storage

Consistency vs. Availability vs. Partition tolerance
RDBMS = enforced consistency
Paxos = quorum consistency
CouchDB (and others) = eventual consistency and horizontally scalable


  http://www.julianbrowne.com/article/viewer/brewers-cap-theorem
                                 31
Considerations
Server & Data replication
 Load balancing and fail-over
Data partitioning and distribution
Query distribution and results collation




                     32
Consistent Hashing
[Diagram: hash ring with nodes A, B, C and keys Q1-Q5 placed around it]

            33
Node Failure
[Diagram: the same hash ring after node C fails; nodes A and B remain, keys Q1-Q5 stay on the ring]

          34
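A small ring makes the node-failure behaviour concrete. This sketch assumes one position per node and a 32-bit FNV-1a hash over character codes (any well-mixed hash would do); the class `HashRing` and its methods are hypothetical, and where each key lands depends entirely on the hash, so no particular placement is claimed.

```javascript
// 32-bit FNV-1a hash over character codes.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

class HashRing {
  constructor(nodes) {
    // Each node sits on the ring at the position given by its hashed name.
    this.points = nodes
      .map((node) => ({ node, position: fnv1a(node) }))
      .sort((a, b) => a.position - b.position);
  }

  // A key belongs to the first node at or after its position, wrapping around.
  nodeFor(key) {
    const position = fnv1a(key);
    const owner = this.points.find((p) => p.position >= position);
    return (owner || this.points[0]).node;
  }

  // Returns a new ring with one node removed (simulated failure).
  without(node) {
    return new HashRing(this.points.map((p) => p.node).filter((n) => n !== node));
  }
}

const keys = ["Q1", "Q2", "Q3", "Q4", "Q5"];
const ring = new HashRing(["A", "B", "C"]);
const afterFailure = ring.without("C");

// Only keys that lived on C are reassigned; everything else stays put.
const moved = keys.filter((k) => ring.nodeFor(k) !== afterFailure.nodeFor(k));
console.log(keys.map((k) => [k, ring.nodeFor(k), afterFailure.nodeFor(k)]), moved);
```

With plain `hash(key) % nodeCount`, losing one node would reshuffle most keys; on the ring the damage is limited to the failed node's share.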
Data Replication
[Diagram: nodes A, B, C arranged in a ring]

       35
Data Partitioning
     Partition data using URI components
     CouchDB-Lounge’s dumbproxy module
         nginx module
     HAProxy URI
     [Diagram: nodes A, B, C]

http://tv.com/shows/1234    -> A
http://tv.com/shows/34671   -> B

           But wait, they weren’t synchronizing?!?
                                36
The Full Picture
Data Replication = same color

      Data Partitioning

       Load Balancing




                           37
CouchDB Replication




              http://horicky.blogspot.com/2009/11/nosql-patterns.html

         38
Conflicts
Conflicting documents are tagged with a _conflicts attribute
Conflicts are resolved deterministically from the revision history (CouchDB keeps a revision tree, not a vector clock)
The “winning” document becomes the most current version
The loser becomes the version previous to the winner

                 39
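What matters for replication is that every node picks the same winner without talking to the others. The sketch below shows one deterministic rule, and the exact ordering is an assumption made for the illustration: higher revision counter first, then the higher hash string. The helpers `parseRev` and `pickWinner` are hypothetical, and the second revision id is made up.

```javascript
// Splits "2-0629..." into its counter and hash parts.
function parseRev(rev) {
  const dash = rev.indexOf("-");
  return {
    generation: parseInt(rev.slice(0, dash), 10),
    hash: rev.slice(dash + 1),
  };
}

// Assumed rule: higher generation wins; ties go to the higher hash string.
function pickWinner(revs) {
  return [...revs].sort((a, b) => {
    const x = parseRev(a);
    const y = parseRev(b);
    if (x.generation !== y.generation) return y.generation - x.generation;
    return x.hash < y.hash ? 1 : x.hash > y.hash ? -1 : 0;
  })[0];
}

// Two nodes see the same conflicting revisions in a different order.
const onNodeA = [
  "2-0629239b53a8d146a3a3c4c63e2dbfd0",
  "2-f00dfeedf00dfeedf00dfeedf00dfeed", // made-up revision id
];
const onNodeB = [...onNodeA].reverse();

console.log(pickWinner(onNodeA), pickWinner(onNodeB));
```

Because the rule depends only on the revisions themselves, arrival order does not change the outcome.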
Thank You!
Learn
http://couchdb.apache.org/
http://books.couchdb.org/relax
http://wiki.apache.org/couchdb/

Awesome posts by community
http://planet.couchdb.org
 (especially Ricky Ho)

Development Libraries
http://github.com/jchris/couchrest
http://github.com/couchapp/couchapp
                 40

Everything comes in 3'sdelagoya
 

More from delagoya (6)

Nyc big datagenomics-pizarroa-sept2017
Nyc big datagenomics-pizarroa-sept2017Nyc big datagenomics-pizarroa-sept2017
Nyc big datagenomics-pizarroa-sept2017
 
Machine Learning on the Cloud with Apache MXNet
Machine Learning on the Cloud with Apache MXNetMachine Learning on the Cloud with Apache MXNet
Machine Learning on the Cloud with Apache MXNet
 
Ruby FFI
Ruby FFIRuby FFI
Ruby FFI
 
padrino_and_sequel
padrino_and_sequelpadrino_and_sequel
padrino_and_sequel
 
Itmat pcbi-r-course-1
Itmat pcbi-r-course-1Itmat pcbi-r-course-1
Itmat pcbi-r-course-1
 
Everything comes in 3's
Everything comes in 3'sEverything comes in 3's
Everything comes in 3's
 

Recently uploaded

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 

Recently uploaded (20)

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 

CouchDB : More Couch

  • 1. * CouchDB Sectional Sofa Edition Angel Pizarro angel@upenn.edu * www.bauwel-movement.co.uk/sculpture.php
  • 2. About Me Me: Bioinformatics, help scientists with big data Lots of data in lots of formats Ruby and Ruby on Rails But that doesn’t matter for CouchDB! Initially interested in CouchDB for AWS deployment
  • 3. Overview CouchDB run through A short example Big Deployment Issues Questions - ask away at any time
  • 4. A sober moment 2,402 killed and 1,282 wounded
  • 6. Key-Value Databases Datastore of values indexed by keys (duh!) Must provide the ID for all operations Hash or B-Tree Hash is FAST, but only allows single-value lookups B-Tree is slower, but allows range queries Horizontally scalable
  • 8. CouchDB Schema free, document oriented database Javascript Object Notation (JSON) HTTP protocol using REST operations No direct native language drivers * Javascript is the lingua franca ACID & MVCC guarantees on a per-document basis Map-Reduce model for indexing and views Back-ups and replication are simple * Hovercraft: http://github.com/jchris/hovercraft/
  • 16. REST Representational State Transfer Client-server separation with uniform interface Load-balancing, caching, authorization & authentication, proxies Stateless - client is responsible for creating a self-sufficient request Resources are cacheable - servers must mark non-cacheable resources as such Only 5 HTTP verbs: GET, PUT, POST, DELETE, HEAD
  • 17. CouchDB REST/CRUD GET read PUT create or update DELETE delete something POST bulk operations
  • 18. CouchDB assumes: Each document is completely independent and should be self-sufficient An operation on a document is ACID compliant Operations across documents are not ACID Built for distributed applications You can live with slightly stale data being served to clients
  • 19. MVCC (Multi-Version Concurrency Control) [diagram: row/table lock vs. CouchDB] An RDBMS enforces consistency using read/write locks Instead of locks, CouchDB just serves up old data Multi-document (multi-row) transactional semantics must be handled by the application
  • 24. Database API [callouts on the URL: Protocol, CouchDB server, DB name] $ curl -X PUT http://127.0.0.1:5984/friendbook {"ok":true} Try it again: {"error":"db_exists"} $ curl -X DELETE http://localhost:5984/friendbook {"ok":true}
  • 26. Backups & Replication Backup: simply copy the database files Replicate: send a POST request with a source and target database Source and target DBs can either be local (just the db name) or remote (full URL) The “continuous”: true option will register the target to the source’s _changes notification API $ curl -X POST http://localhost:5984/_replicate -d '{"source":"db", "target":"db-replica", "continuous":true}' [callout: Takes up a port]
  • 28. Inserting a document $ curl -X PUT http://localhost:5984/friendbook/j_doe -d @j_doe.json {"ok":true, "id":"j_doe", "rev":"1-062af1c4ac73287b7e07396c86243432"} CouchDB can provide you with unique IDs: $ curl -X GET http://localhost:5984/_uuids {"uuids":["d1dde0996a4db7c1ebc78fb89c01b9e6"]} $ curl -X GET http://localhost:5984/_uuids?count=10 *POSTing a new document to the database URL will auto-generate a UUID for the ID
  • 30. JSON document { "name": "J. Doe", "friends": 0 } After insert: { "_id": "j_doe", "_rev": "1-062af1c4ac73287b7e07396c86243432", "name": "J. Doe", "friends": 0 } [callout on _rev: operation counter - CRC32 of document]
  • 33. Updating a document revised.json = { "name":"J. Doe", "friends": 1 } $ curl -X PUT http://localhost:5984/friendbook/j_doe -d @revised.json { "error":"conflict", "reason":"Document update conflict."} The update is rejected because revised.json lacks the current revision. Add it: revised.json = { "_rev":"1-062af1c4ac73287b7e07396c86243432", "name":"J. Doe", "friends": 1 } and repeat the PUT: { "ok":true, "id":"j_doe", "rev":"2-0629239b53a8d146a3a3c4c63e2dbfd0"}
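The revision check behind that conflict can be tried without a server. What follows is a minimal in-memory sketch, not CouchDB's implementation: the class name `TinyDocStore` is made up, and building the revision from a counter plus an MD5 digest of the content is an assumption chosen only to produce revision strings of the same shape.

```javascript
const crypto = require('crypto');

// In-memory stand-in for one database; NOT the real server logic.
class TinyDocStore {
  constructor() {
    this.docs = new Map(); // id -> current document (with _id and _rev)
  }

  // Mimics PUT /db/id: an update must carry the current _rev.
  put(id, body) {
    const current = this.docs.get(id);
    const currentRev = current ? current._rev : undefined;
    if (body._rev !== currentRev) {
      return { error: 'conflict', reason: 'Document update conflict.' };
    }
    // Next revision: counter + 1, then a digest of the content (assumed scheme).
    const counter = current ? parseInt(current._rev.split('-')[0], 10) + 1 : 1;
    const { _rev, _id, ...content } = body;
    const digest = crypto.createHash('md5').update(JSON.stringify(content)).digest('hex');
    const rev = `${counter}-${digest}`;
    this.docs.set(id, { _id: id, _rev: rev, ...content });
    return { ok: true, id, rev };
  }
}

const store = new TinyDocStore();
const first = store.put('j_doe', { name: 'J. Doe', friends: 0 });
const stale = store.put('j_doe', { name: 'J. Doe', friends: 1 }); // no _rev sent
const fresh = store.put('j_doe', { _rev: first.rev, name: 'J. Doe', friends: 1 });
console.log(first, stale, fresh);
```

The second call fails because it carries no revision, and the third succeeds because it echoes back the revision returned by the first. No lock is ever taken; the writer with outdated information simply loses and has to re-read.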
  • 34. Update is a full write http://horicky.blogspot.com/2009/11/nosql-patterns.html
  • 35. Deleting a document DELETE requires the revision as a URL parameter or in the If-Match HTTP header (the value CouchDB returns as the document's ETag). $ curl -X DELETE http://localhost:5984/friendbook/j_doe?rev=2-0629239b53a8d146a3a3c4c63e2dbfd0 {"ok":true,"id":"j_doe", "rev":"3-57673a4b7b662bb916cc374a92318c6b"} Returns a revision number for the delete, used for synchronization and the changes API $ curl -X GET http://localhost:5984/friendbook/j_doe {"error":"not_found","reason":"deleted"}
  • 36. Notables MVCC != version control system POST to /db/_compact deletes all older versions Deletes only keep metadata around for synchronization and merge conflict resolution To “roll back a transaction” you must: Retrieve all related records, cache these Insert any updates to records. On failure, use the returned revision numbers to re-insert the older record as a new one
  • 42. Our Example Problem Hello world? Blog? Twitter clone? Let’s store all human proteins instead LOCUS YP_003024029 227 aa linear PRI 09-JUL-2009 DEFINITION cytochrome c oxidase subunit II [Homo sapiens]. ACCESSION YP_003024029 VERSION YP_003024029.1 GI:251831110 DBLINK Project:30353 DBSOURCE REFSEQ: accession NC_012920.1 KEYWORDS . SOURCE mitochondrion Homo sapiens (human) ORGANISM Homo sapiens Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini; Catarrhini; Hominidae; Homo. FEATURES Location/Qualifiers source 1..227 /organism="Homo sapiens" /organelle="mitochondrion" /isolation_source="caucasian" /db_xref="taxon:9606" /tissue_type="placenta" /country="United Kingdom: Great Britain" /note="this is the rCRS" Protein 1..227 /product="cytochrome c oxidase subunit II" /calculated_mol_wt=25434 (source: http://www.ncbi.nlm.nih.gov/)
  • 46. Futon : A Couchapp This one is going to be a bit tougher
  • 47. Design Documents The key to using CouchDB as more than a key-value store Just another JSON document; contains JavaScript functions Functions are executed within CouchDB Map-reduce views, data validation, alternate formatting, ... JS libraries & data (PNG images)
  • 48. { "_id" : "_design/gb", "language" : "javascript", "views" : { "gi" : { "map" : "function(doc) { emit(doc.gi, doc._id) }" }, "dbXref" : { "map" : "function(doc) { var ftLen= doc.features.length; for ( var i=0; i < ftLen; i++ ) { var ft = doc.features[i]; var qLen = ft.qualifiers.length; for (var j = 0; j < qLen; j++) { var ql = ft.qualifiers[j]; if (ql.qualifier.match('db_xref') ) { emit(ql.value, doc._id); } } } }" }
  • 49. } } } }" } }, "shows" : { "fasta" : "function(doc, req) { if (!doc) { return 'Not found' ; } var tmp = '>' + doc._id + '\\n'; var tmpseq = doc.seq; while(tmpseq.length > 0) { tmp = tmp + tmpseq.substring(0,80) + '\\n'; tmpseq = tmpseq.substring(80); } return { body: tmp, headers: {'Content-Type' : 'text/plain'} } }" } }
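The line-wrapping step of the fasta show function can be tried outside CouchDB. This is a sketch under assumptions: the helper name `toFasta` and the 170-residue sample sequence are made up for illustration, and only the wrapping logic is shown, not the show-function response object.

```javascript
// Wrap a sequence into FASTA text: a '>' header line, then fixed-width lines.
function toFasta(id, seq, width = 80) {
  const lines = ['>' + id];
  for (let i = 0; i < seq.length; i += width) {
    // Each slice starts exactly where the previous one ended.
    lines.push(seq.substring(i, i + width));
  }
  return lines.join('\n') + '\n';
}

const demoSeq = 'M'.repeat(170); // made-up sequence, not a real protein
const fasta = toFasta('YP_003024029', demoSeq);
console.log(fasta);
```

Stepping the start index by the same width used for the slice is what guarantees no residue is dropped or repeated between lines.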
  • 55. GET by the indexed key GET /refseq_human/_design/gb/_view/dbXref?key="GeneID:10" { "total_rows":7,"offset":2, "rows":[ { "id":"NP_000006", "key":"GeneID:10", "value":"NP_000006" } ] }
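View keys are JSON values, so the quotes around "GeneID:10" are part of the key and have to be URL-encoded when the request is built in code. A small sketch, assuming a local server at http://localhost:5984; the helper name `viewUrl` is hypothetical, and it only handles parameters whose values are JSON (key, startkey, endkey, limit and the like).

```javascript
// Build a view query URL, JSON-encoding and then URL-encoding each parameter.
function viewUrl(base, db, design, view, params = {}) {
  const query = Object.entries(params)
    .map(([name, value]) => `${name}=${encodeURIComponent(JSON.stringify(value))}`)
    .join('&');
  const path = `${base}/${db}/_design/${design}/_view/${view}`;
  return query ? `${path}?${query}` : path;
}

const url = viewUrl('http://localhost:5984', 'refseq_human', 'gb', 'dbXref', {
  key: 'GeneID:10',
});
console.log(url);
```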
  • 56. Reduce function: Keyword count Map function: function(doc) { if(doc.foodz){ doc.foodz.forEach( function(food) { emit(food,1); })}} Reduce function: function(keys,values) { return sum(values); } Output (with group=true): "rows": [ {"key": "chinese", "value": 4}, {"key": "ethiopian", "value": 3}, {"key": "indian", "value": 17} ]
  • 58. ReReduce Map function: function(doc) { if(doc.foodz){ doc.foodz.forEach( function(food) { emit(food,1); })}} Reduce function: function(keys,values,rereduce) { if (rereduce){ return sum(values) } else { return values.length } } [callout on rereduce: true / false] Same result, but this lets us put some useful value in the map, as opposed to 1 repeated ad nauseam Could also output null to save space since indexes store the emitted values
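The two-pass behaviour of rereduce is easier to see in a toy run. The sketch below is not CouchDB's view engine: `runView`, the chunk size of 2 and the sample documents are all made up, and the keys passed to the reduce function are simplified (the real server passes key and document id pairs).

```javascript
// Toy view engine: map all docs, reduce per key in chunks, then rereduce.
function runView(docs, mapFn, reduceFn, chunkSize = 2) {
  const rows = [];
  const emit = (key, value) => rows.push({ key, value });
  docs.forEach((doc) => mapFn(doc, emit));

  // Group emitted values by key, as a grouped view query would.
  const byKey = new Map();
  for (const { key, value } of rows) {
    if (!byKey.has(key)) byKey.set(key, []);
    byKey.get(key).push(value);
  }

  const result = {};
  for (const [key, values] of byKey) {
    // First pass: reduce small chunks, as if they sat in separate index nodes.
    const partials = [];
    for (let i = 0; i < values.length; i += chunkSize) {
      const chunk = values.slice(i, i + chunkSize);
      partials.push(reduceFn(chunk.map(() => key), chunk, false));
    }
    // Second pass: combine the partial results with rereduce = true.
    result[key] = reduceFn(null, partials, true);
  }
  return result;
}

const sum = (values) => values.reduce((a, b) => a + b, 0);

// Same logic as the slide; emit is passed in because there is no server here.
const mapFoodz = (doc, emit) => {
  if (doc.foodz) doc.foodz.forEach((food) => emit(food, 1));
};
const countReduce = (keys, values, rereduce) =>
  rereduce ? sum(values) : values.length;

const sampleDocs = [
  { foodz: ['indian', 'chinese'] },
  { foodz: ['indian'] },
  { foodz: ['indian', 'ethiopian'] },
  { name: 'no foodz here' },
];
const counts = runView(sampleDocs, mapFoodz, countReduce);
console.log(counts);
```

On the first pass the reduce function sees emitted values and counts them; on the second it sees its own earlier counts and must add them. Returning `values.length` in both branches would undercount as soon as a key spans more than one chunk.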
  • 59. “Joins” A reduce function could create a virtual doc by collating different doc types, but I don’t recommend it Map function: function(doc) { if (doc.type == "post") { emit([doc._id, 0], doc); } else if (doc.type == "comment") { emit([doc.post, doc.created_at], doc); } }
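The join works because of key ordering: each post gets the key [post id, 0] and each comment gets [post id, timestamp], so a post sorts immediately ahead of its own comments. The sketch below imitates that ordering with a deliberately simplified comparator (`compareKeys` is made up and handles only numbers and strings; CouchDB's real collation has more rules). It emits ids rather than whole documents to keep the output short.

```javascript
// Simplified stand-in for view key collation: compares array keys element
// by element and sorts numbers before strings.
function compareKeys(a, b) {
  const shared = Math.min(a.length, b.length);
  for (let i = 0; i < shared; i++) {
    const x = a[i];
    const y = b[i];
    if (typeof x !== typeof y) return typeof x === 'number' ? -1 : 1;
    if (x < y) return -1;
    if (x > y) return 1;
  }
  return a.length - b.length;
}

// The slide's map logic, with emit passed in as a parameter.
function mapPostsAndComments(doc, emit) {
  if (doc.type === 'post') {
    emit([doc._id, 0], doc._id);
  } else if (doc.type === 'comment') {
    emit([doc.post, doc.created_at], doc._id);
  }
}

// Made-up documents, deliberately out of order.
const blogDocs = [
  { _id: 'c2', type: 'comment', post: 'p1', created_at: '2009-12-02' },
  { _id: 'p2', type: 'post' },
  { _id: 'c1', type: 'comment', post: 'p1', created_at: '2009-12-01' },
  { _id: 'p1', type: 'post' },
  { _id: 'c3', type: 'comment', post: 'p2', created_at: '2009-12-03' },
];

const joinRows = [];
blogDocs.forEach((doc) =>
  mapPostsAndComments(doc, (key, value) => joinRows.push({ key, value }))
);
joinRows.sort((a, b) => compareKeys(a.key, b.key));
const orderedIds = joinRows.map((row) => row.value);
console.log(orderedIds);
```

A client can then read the rows in order and start a new group each time the second key element is 0.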
  • 60. Distributed CouchDB Datastores
  • 61. The CAP theorem: applies when business logic is separate from storage Consistency vs. Availability vs. Partition tolerance RDBMS = enforced consistency PAXOS = quorum consistency CouchDB (and others) = eventual consistency and horizontally scalable http://www.julianbrowne.com/article/viewer/brewers-cap-theorem
  • 62. Considerations Server & Data replication Load balancing and fail-over Data partitioning and distribution Query distribution and results collation
  • 68. Consistent Hashing [diagram: nodes A, B, C and queries Q1 to Q5]
  • 70. Node Failure [diagram: queries Q1 to Q5 with nodes A, B, C, then the same picture without node C]
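Since the diagrams do not survive in the transcript, here is a rough sketch of the idea they illustrate: nodes and keys are hashed onto the same ring, and each key belongs to the first node point found clockwise from it. Everything here is an assumption for illustration (the `HashRing` class, MD5 as the hash, 50 virtual points per node); it is not how HAProxy or CouchDB-Lounge implement it.

```javascript
const crypto = require('crypto');

// Position of a string on the ring: first 32 bits of its MD5 digest.
function ringPosition(text) {
  return crypto.createHash('md5').update(text).digest().readUInt32BE(0);
}

class HashRing {
  constructor(nodes, replicas = 50) {
    this.replicas = replicas; // virtual points per node, to even out the spread
    this.points = [];         // sorted list of { position, node }
    nodes.forEach((node) => this.add(node));
  }

  add(node) {
    for (let i = 0; i < this.replicas; i++) {
      this.points.push({ position: ringPosition(`${node}#${i}`), node });
    }
    this.points.sort((a, b) => a.position - b.position);
  }

  remove(node) {
    this.points = this.points.filter((point) => point.node !== node);
  }

  // Walk clockwise from the key's position to the first node point,
  // wrapping around to the start of the ring if none is found.
  lookup(key) {
    const position = ringPosition(key);
    const hit = this.points.find((point) => point.position >= position);
    return (hit || this.points[0]).node;
  }
}

const ring = new HashRing(['A', 'B', 'C']);
const keys = ['Q1', 'Q2', 'Q3', 'Q4', 'Q5'];
const before = keys.map((key) => ring.lookup(key));
ring.remove('C'); // simulate the failure of node C
const after = keys.map((key) => ring.lookup(key));
console.log({ before, after });
```

The property worth noticing is that only keys that lived on C move when C fails; keys already on A or B stay where they were, which a plain `hash(key) % nodeCount` scheme would not guarantee.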
  • 72. Data Replication [diagram: nodes A, B, C]
  • 75. Data Partitioning Partition data using URI components CouchDB-Lounge’s dumbproxy module nginx module HAProxy URI [diagram: nodes A, B, C] http://tv.com/shows/1234 -> A http://tv.com/shows/34671 -> B But wait, they weren’t synchronizing?!?
  • 76. The Full Picture Data Replication = same color Data Partitioning Load Balancing
  • 77. CouchDB Replication http://horicky.blogspot.com/2009/11/nosql-patterns.html
  • 78. Conflicts Conflicting documents are tagged with _conflict: true Conflicts are resolved using the vector clock The “winning” document becomes the most current version The loser becomes the version previous to the winner
  • 79. Thank You! Learn http://couchdb.apache.org/ http://books.couchdb.org/relax http://wiki.apache.org/couchdb/ Awesome posts by community http://planet.couchdb.org (especially Ricky Ho) Development Libraries http://github.com/jchris/couchrest http://github.com/couchapp/couchapp

Editor's Notes

  1. This talk was given Dec 7, 2009, Pearl Harbor day.
  4. - If the key is a DateTime, then B-tree is a much better choice
  5. Highlighted words covered later in order that they appear
  6. Other stuff, but this is the most relevant for the discussion Most users’ browsers only support GET and POST, but that is changing
  13. CRUD = Create Read Update Delete
  14. In a perfect world, the documents should be self-sufficient, but sometimes reality gets in the way and documents will have to relate to each other. See GAE foreign key references
  15. Keep in mind we are talking theory here. Most RDBMS today use MVCC as well for row level read while a write is happening. Optimistic locking is another technique to enable concurrent data access and writes. MySQL MyISAM is a notable exception in that it does table level locks on write. Use InnoDB. Next is the API discussions
  16. Append-only file structure ensures that your DB is always valid, even during mid-write server failures.
  17. You must provide an ID for the insert. This is in contrast to RDBMS auto-generated primary keys. UUIDs are good for distributed systems, since duplicate ID likelihood is small
  18. Typically you GET the full document, revise it within the application, then submit the entire JSON document back as a PUT operation
  19. You cannot delete a specific revision! The revision number is only there so that the server can definitively say you are talking about the most recent record. You need delete rev for replication of delete operations on other servers that are being synced to this one.
  20. Might also be able to delete a particular version. Will have to check that.
  21. Note: I could’ve made GI a number, but did not in this case Zipcodes would be a bad thing to turn into numbers, b/c of possible leading zeros
  24. Best practice = One design document per application or set of requirements Next: Map-Reduce Views
  25. Edit this slide. Maybe just show a full design document.
  26. See the CouchDB book for more information on rereduce and how it takes advantage of the B-tree index
  27. Reduce functions create an index with the emitted values. You would be duplicating all of your data (Not sure about map indexes) Instead emit a collection of docs and collate them on the client.
  28. Brewer’s CAP Theorem http://www.julianbrowne.com/article/viewer/brewers-cap-theorem Partition tolerance encompasses both business logic and data partitioning. PAXOS will override more recent updates to a disconnected resource if it did not vote on a previous transaction.
  29. Load balancing and failover are separate concerns, you don’t want your failover to be dependent on servers that are part of your load balance infrastructure. We’ll handle the easy stuff first, data replication and load balancing
  30. HAProxy added consistent hashing in version 1.3.21 but use 1.3.22