This session, part of a series of Neo4j learning opportunities for administrators, walks through best practices from installation and initial setup, through maintenance and performance tuning, all the way to production use.
3. Dev Ops @ FiftyThree
MySQL user & admin since 1998
Multiple tiers of masters & slaves
Bare metal & AWS - EC2/RDS
MySQL & Percona
neo4j user & admin since 2012
neo4j 1.8, 1.9
AWS: Multiple 3-instance enterprise clusters
12. Physical Partitioning & Sharding
Improves write performance, usually disk I/O
MySQL
innodb_file_per_table
Databases on separate partitions or devices
Shard horizontally (e.g. by time range)
Shard vertically (e.g. by table or function)
Logs can be on separate partitions for I/O gain
neo4j
No logical partitioning by DB or table
Highly connected data: no clear separation
Logs can be on separate partitions for I/O gain
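On the MySQL side, horizontal sharding by time range usually comes down to a routing rule that maps a row's date to a shard. A minimal sketch, assuming one shard per calendar month; the `events_YYYY_MM` naming scheme and the `shard_for` helper are hypothetical and not from the talk:

```python
from datetime import date


def shard_for(day: date) -> str:
    """Return the name of the (hypothetical) monthly shard that holds this date."""
    return f"events_{day.year:04d}_{day.month:02d}"


# Rows dated in June 2013 are routed to the same shard.
print(shard_for(date(2013, 6, 15)))
```

No equivalent rule is sketched for neo4j because, as the slide notes, highly connected data has no clear line along which to split.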
20. Replication vs. HA
MySQL
Free
Slaves pull updates
Eventual consistency
One-way, asynchronous
neo4j
Enterprise edition: can cost $ depending on use
Slaves can pull asynchronous updates
Eventual consistency, optimistic pushes to slaves are the default
Writes to any cluster member
21. JVM
Buffers & Memory management =~ JVM settings
The database itself is extendable via Java
... if you're into that sort of thing
32. MySQL Configuration
Other
And these, depending on version & hardware...
sort_buffer_size=2M
tmp_table_size=32M
join_buffer_size=128k
query_cache_type=1
query_cache_size=64M
open_files_limit=8192
...
33. neo4j Configuration Tuning
Simple Questions
How many nodes do you expect?
How many relationships do you expect?
Average number of properties per node and relationship?
Optional: How do you expect to traverse the graph?
Long paths and/or large result sets?
Short paths and/or small result sets?
3 things to calculate:
File Cache Mapped Memory & Object Caches
Heap Size
RAM for OS
34. neo4j Configuration
Store file                          Record size   Contents
neostore.nodestore.db               9 B           Nodes
neostore.relationshipstore.db       33 B          Relationships
neostore.propertystore.db           41 B          Properties for nodes and relationships
neostore.propertystore.db.strings   128 B         Values of string properties
neostore.propertystore.db.arrays    128 B         Values of array properties
Capacity Planning Estimates:
Node size (9 B; 14 B in 2.0) x expected nodes
Relationship size (33 B) x expected relationships
Property size (41 B) x expected properties
Strings & Arrays
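The arithmetic behind these estimates can be sketched in a few lines of Python. This is a rough illustration only: the record sizes come from the estimates above, while the function name and the workload figures (1M nodes, 5M relationships, 10M properties) are made up for the example. String and array stores are left out because their size depends on the actual values stored.

```python
# Record sizes in bytes, from the capacity planning estimates above.
NODE_RECORD_BYTES = 9  # 14 in neo4j 2.0
RELATIONSHIP_RECORD_BYTES = 33
PROPERTY_RECORD_BYTES = 41


def estimate_store_bytes(nodes, relationships, properties,
                         node_record_bytes=NODE_RECORD_BYTES):
    """Estimate the fixed-record store file sizes; strings/arrays are excluded."""
    return {
        "neostore.nodestore.db": nodes * node_record_bytes,
        "neostore.relationshipstore.db": relationships * RELATIONSHIP_RECORD_BYTES,
        "neostore.propertystore.db": properties * PROPERTY_RECORD_BYTES,
    }


# Hypothetical workload, chosen only to show the calculation.
sizes = estimate_store_bytes(1_000_000, 5_000_000, 10_000_000)
for name, size in sizes.items():
    print(f"{name}: {size / 1024**2:.1f} MiB")
```

Passing `node_record_bytes=14` gives the corresponding estimate for 2.0.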
41. neo4j: Buffers, Caching & I/O
neo4j.properties
Two types of caches: file buffer and object cache
File Buffer Cache:
# Default values for the low-level graph engine
neostore.nodestore.db.mapped_memory=25M
neostore.relationshipstore.db.mapped_memory=50M
neostore.propertystore.db.mapped_memory=90M
neostore.propertystore.db.strings.mapped_memory=130M
neostore.propertystore.db.arrays.mapped_memory=130M
Object Cache:
node_cache_size=256M
relationship_cache_size=256M
# optional
node_cache_array_fraction=5
relationship_cache_array_fraction=5
# The GC resistant cache described below is only available in the
# Neo4j Enterprise Edition.
# cache_type values: soft (default), weak, strong
cache_type=gcr
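To connect these settings back to the "3 things to calculate" (mapped memory, heap, RAM for the OS), the sketch below sums the mapped-memory values and checks what is left on a machine. It is an illustration under assumptions: the 8 GiB machine and 4 GiB heap are hypothetical, and the parser only handles the plain `k`, `M` and `G` suffixes.

```python
import re

# Mapped-memory settings as they would appear in neo4j.properties.
SETTINGS = """
# Default values for the low-level graph engine
neostore.nodestore.db.mapped_memory=25M
neostore.relationshipstore.db.mapped_memory=50M
neostore.propertystore.db.mapped_memory=90M
neostore.propertystore.db.strings.mapped_memory=130M
neostore.propertystore.db.arrays.mapped_memory=130M
"""

UNITS = {"k": 1024, "M": 1024**2, "G": 1024**3}


def parse_size(text):
    """Convert a size such as '25M' into bytes."""
    match = re.fullmatch(r"(\d+)([kMG]?)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    return int(number) * UNITS.get(unit, 1)


def total_mapped_memory(properties_text):
    """Sum every *.mapped_memory value, skipping comments and blank lines."""
    total = 0
    for line in properties_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        if key.strip().endswith(".mapped_memory"):
            total += parse_size(value)
    return total


mapped = total_mapped_memory(SETTINGS)
# Hypothetical machine: 8 GiB of RAM with a 4 GiB JVM heap.
ram, heap = parse_size("8G"), parse_size("4G")
left_for_os = ram - heap - mapped
print(f"mapped: {mapped // 1024**2} MiB, left for OS: {left_for_os // 1024**2} MiB")
```

If the remainder comes out small or negative, either the heap or the mapped-memory settings need to shrink, or the machine needs more RAM.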
46. Use
File System
$PATH_TO_NEO4J = /opt/neo4j
/opt/neo4j/bin (/usr/bin/mysql)
    neo4j
    neo4j-backup
/opt/neo4j/conf (/etc/mysql)
    neo4j.properties
    neo4j-server.properties
    neo4j-wrapper.conf
/opt/neo4j/data (/var/lib/mysql)
/opt/neo4j/data/graph.db (/var/lib/mysql/data)
    The actual graph data
/opt/neo4j/data/log (/var/log/mysql)
    All logs
47. Use
Indexes
The database itself is a natural index
Lucene for searches
neo4j 2.0:
Nodes have labels: Person, Location, etc. that group them into sets
CREATE INDEX ON :Person(name)
Look familiar?
CREATE INDEX id_index ON Person (id);
48. Use
Indexes
neo4j 2.0:
Properties can have unique constraints
CREATE CONSTRAINT ON (book:Book) ASSERT book.isbn IS UNIQUE
Look familiar?
CREATE UNIQUE INDEX email_index ON Person (email);
53. Use
Querying via REST
POST http://localhost:7474/db/data/cypher
Accept: application/json; charset=UTF-8
Content-Type: application/json
{
  "query" : "start x = node:node_auto_index(name={startName}) match path = (x-[r]-friend) where friend.name = {name} return TYPE(r)",
  "params" : {
    "startName" : "I",
    "name" : "you"
  }
}
Example response:
200: OK
Content-Type: application/json; charset=UTF-8
{
  "columns" : [ "TYPE(r)" ],
  "data" : [ [ "know" ] ]
}
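The same request can be assembled from a script. The sketch below uses only the Python standard library and mirrors the endpoint, headers, query and parameters from the slide; the `build_cypher_request` helper is a name introduced here for illustration. It builds the request without sending it, since sending needs a running server.

```python
import json
import urllib.request


def build_cypher_request(query, params,
                         url="http://localhost:7474/db/data/cypher"):
    """Build (but do not send) a POST request for the Cypher REST endpoint."""
    body = json.dumps({"query": query, "params": params}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Accept": "application/json; charset=UTF-8",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_cypher_request(
    "start x = node:node_auto_index(name={startName}) "
    "match path = (x-[r]-friend) "
    "where friend.name = {name} return TYPE(r)",
    {"startName": "I", "name": "you"},
)

# Sending requires a running neo4j server, so it is left commented out:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Keeping the values in `params` rather than splicing them into the query string matches the slide's use of `{startName}` and `{name}` placeholders.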
54. DBA Perspective
Use the best database for the job, or both
neo4j ships with great tools
neo4j is easier to configure: fewer options, less complex, still flexible for optimization
HA is more robust and more opaque than basic replication
For better or worse, the JVM handles a lot for you
Authorization - it's up to you
Scaling up is easier than changing your data model