3. Binary JSON
JSON documents are human-readable. Like this:
{ "spoiled" : true }
4. Speed
But human-readable text cannot be parsed and
traversed as quickly as a binary format can be.
Speed is important in a database containing
millions of records.
5. BSON is binary JSON
BSON is a binary serialization of
JSON-like documents.
6. Parts of a BSON document
A BSON document consists of three parts:
int32 e_list "\x00"
7. int32
The int32 is a 32-bit integer giving the total size of
the document in bytes, stored little-endian. Our example
document is 15 bytes long. In BSON encoding this becomes:
\x0f\x00\x00\x00
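As a rough sketch of the byte order involved, Python's standard struct module can pack a size as a little-endian int32. The helper name encode_size is hypothetical, and the sizes passed in are only illustrative:

```python
import struct


def encode_size(size: int) -> bytes:
    """Pack a document size as a little-endian signed 32-bit integer."""
    # "<" selects little-endian byte order, "i" a 4-byte signed int.
    return struct.pack("<i", size)


print(encode_size(15).hex(" "))  # 0f 00 00 00
print(encode_size(19).hex(" "))  # 13 00 00 00
```

The least significant byte comes first, which is why a small size shows up in the first byte and the remaining three are zero.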
8. e_list
The e_list is a sequence of elements. In our
example, there is only one element. The element
type is declared first, followed by an e_name and a
one-byte value. A Boolean false, for instance, is:
"\x08" e_name "\x00"
For true, the final byte is "\x01" instead.
10. "\x08" e_name "\x00"
The e_name is the key name in a key-value pair. The
key "spoiled" (s-p-o-i-l-e-d) would be
represented in a byte array as a series of UTF-8
code units:
{ 115, 112, 111, 105, 108, 101, 100 }
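The byte values above can be reproduced with a short sketch in Python; the variable names are illustrative only:

```python
key = "spoiled"

# Encode the key as UTF-8 and list the resulting byte values.
key_bytes = list(key.encode("utf-8"))

print(key_bytes)  # [115, 112, 111, 105, 108, 101, 100]
```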
11. "\x08" e_name "\x00"
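But in BSON an e_name is a cstring: the UTF-8
bytes of the key plus a trailing "\x00". Preceded by the
Boolean type byte "\x08", our key becomes:
\x08spoiled\x00
(The type byte "\x02" marks a string value, which is also length-prefixed.)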
12. "\x08" e_name "\x00"
This is followed, in our example, by the Boolean
value, "\x01" for true or "\x00" for false:
\x01
The BSON document is closed out with a trailing
null byte:
\x00
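Putting the parts together, here is a minimal sketch of a hand-rolled encoder for a document holding a single Boolean, following the BSON specification's grammar for documents, Boolean elements, and cstrings. The function name encode_bool_document is hypothetical, and the sketch assumes the key contains no null bytes; a real application would normally use a BSON library instead.

```python
import struct


def encode_bool_document(key: str, value: bool) -> bytes:
    """Encode a one-element BSON document whose only value is a Boolean."""
    # e_name is a cstring: UTF-8 bytes followed by a null terminator.
    e_name = key.encode("utf-8") + b"\x00"
    # Element: type byte 0x08 (Boolean), the e_name, then one value byte.
    element = b"\x08" + e_name + (b"\x01" if value else b"\x00")
    # The e_list (here a single element) is followed by the closing null byte.
    body = element + b"\x00"
    # The leading int32 counts the whole document, including its own 4 bytes.
    size = 4 + len(body)
    return struct.pack("<i", size) + body


doc = encode_bool_document("spoiled", True)
print(len(doc))      # 15
print(doc.hex(" "))  # 0f 00 00 00 08 73 70 6f 69 6c 65 64 00 01 00
```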
14. in binary, \x02 is 00000010
We said that BSON is a binary correlate of JSON. The
whole point of serialization with BSON is to “deflate”
code into a generic, low-level state.
15. 1’s & 0’s
A ‘serializer’ program would take the BSON document we
assembled and write it out as raw bytes. For
{ "spoiled" : true }, those bytes are, in binary:
00001111 00000000 00000000 00000000
00001000 01110011 01110000 01101111
01101001 01101100 01100101 01100100
00000000 00000001 00000000
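To see how escaped bytes map to 1's and 0's, a small sketch can print each byte as eight bits. The helper name to_bits is hypothetical:

```python
def to_bits(data: bytes) -> str:
    """Render each byte as an 8-digit binary group, separated by spaces."""
    return " ".join(f"{byte:08b}" for byte in data)


print(to_bits(b"\x02"))  # 00000010
print(to_bits(b"\x08s"))  # 00001000 01110011
```

Every group is exactly eight digits wide, since each byte holds eight bits.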