2. CONTENTS
1. What is a distributed-computing system?
2. Principle of distributed database/storage
system
3. Distributed storage system paradigm
4. UniversalDistributedStorage
3. 1. WHAT IS A DISTRIBUTED-COMPUTING
SYSTEM?
Distributed computing is the process of solving a
computational problem using a distributed
system.
A distributed system is a computing system in
which a number of components on multiple
computers cooperate by communicating over a
network to achieve a common goal.
4. DISTRIBUTED DATABASE/STORAGE
SYSTEM
In a distributed database system, the database is
stored on several computers.
A distributed database is a collection of multiple,
logically interrelated databases distributed over a
computer network.
5. DISTRIBUTED SYSTEM ADVANTAGES
Advantages
Avoids bottlenecks and single points of failure
Better scalability
Higher availability
Routing model
Client routing: the client sends each request directly to
the appropriate server to read/write data
Server routing: a server forwards the client's request to
the appropriate server and returns the result to the client
* The two models above can be combined in one system
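The two routing models can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a real network service: the server names, the `owner_of` placement rule (hash modulo server count), and the `Server` and `RoutingClient` classes are all assumptions made for the example.

```python
import hashlib

SERVERS = ["server1", "server2", "server3"]  # hypothetical server names


def owner_of(key: str) -> str:
    """Pick the server responsible for a key (simple hash-mod placement, assumed)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return SERVERS[int.from_bytes(digest[:8], "big") % len(SERVERS)]


class Server:
    def __init__(self, name: str, cluster: dict):
        self.name = name
        self.cluster = cluster  # name -> Server, shared by all nodes
        self.data = {}

    def put_local(self, key, value):
        self.data[key] = value

    def get_local(self, key):
        return self.data.get(key)

    # Server routing: any node accepts the request and forwards it to the owner.
    def put(self, key, value):
        self.cluster[owner_of(key)].put_local(key, value)

    def get(self, key):
        return self.cluster[owner_of(key)].get_local(key)


class RoutingClient:
    """Client routing: the client computes the owner and contacts it directly."""

    def __init__(self, cluster: dict):
        self.cluster = cluster

    def put(self, key, value):
        self.cluster[owner_of(key)].put_local(key, value)

    def get(self, key):
        return self.cluster[owner_of(key)].get_local(key)


cluster = {}
for name in SERVERS:
    cluster[name] = Server(name, cluster)

client = RoutingClient(cluster)
client.put("user:42", "alice")                          # client routing
value_via_any_node = cluster["server1"].get("user:42")  # server routing
```

Client routing saves a network hop but requires every client to know the placement rule; server routing keeps clients simple at the cost of an extra forward.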
6. DISTRIBUTED STORAGE SYSTEM
Store some data {1,2,3,4,6,7,8} on 1 server:
server: 1,2,3,4,6,7,8
Or store the same data across 3 distributed servers:
server 1: 1,2,3
server 2: 4,6
server 3: 7,8
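The split shown on this slide can be reproduced with a small sketch. The slide does not say how the data is divided, so contiguous chunks of nearly equal size are an assumption here, and `split_evenly` is a hypothetical helper name.

```python
def split_evenly(items, n_servers):
    """Split a list into n contiguous chunks whose sizes differ by at most 1."""
    base, extra = divmod(len(items), n_servers)
    chunks, start = [], 0
    for i in range(n_servers):
        # The first `extra` servers take one additional item.
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


data = [1, 2, 3, 4, 6, 7, 8]
shards = split_evenly(data, 3)
print(shards)  # [[1, 2, 3], [4, 6], [7, 8]]
```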
7. 2. PRINCIPLE OF DISTRIBUTED
DATABASE/STORAGE SYSTEM
Shard data by key and store each item on the appropriate
server using a Distributed Hash Table (DHT)
The DHT must use consistent hashing:
Uniform distribution of generated hash values
Consistent
Jenkins and Murmur hashes are good choices; MD5 and SHA
are slower
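A short experiment shows why a uniform hash alone is not enough and consistency matters. The sketch below uses naive "hash mod N" placement; MD5 stands in for Jenkins or Murmur only because it ships with Python's standard library, and the key names and the helpers `h` and `shard_mod` are assumptions for illustration.

```python
import hashlib


def h(key: str) -> int:
    """64-bit integer hash of a key (MD5 as a stand-in hash function)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


def shard_mod(key: str, n_servers: int) -> int:
    """Naive placement: server index = hash mod number of servers."""
    return h(key) % n_servers


keys = [f"key{i}" for i in range(10_000)]

# Uniformity: each of 3 servers should receive roughly a third of the keys.
counts = [0, 0, 0]
for k in keys:
    counts[shard_mod(k, 3)] += 1

# Stability: how many keys change server when a 4th server is added?
moved = sum(shard_mod(k, 3) != shard_mod(k, 4) for k in keys)
moved_fraction = moved / len(keys)

print(counts, f"{moved_fraction:.0%} of keys moved")
```

With a uniform hash, about three quarters of the keys change server when going from 3 to 4 servers under modulo placement, which is the kind of reshuffling that consistent hashing is meant to avoid.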
8. CANONICAL PROBLEMS IN DISTRIBUTED
SYSTEMS
Distributed data independence
Distributed transactions: ACID (Atomicity,
Consistency, Isolation, Durability) requirements
Fault tolerance
Transparency
9. 3. DISTRIBUTED STORAGE SYSTEM
PARADIGM
Data Hashing/Addressing
Determine which server each data item is stored on
Data Replication
Store data on multiple server nodes for higher availability
and fault tolerance
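The two ideas on this slide, addressing and replication, can be combined in one small sketch. The slides do not specify a replica placement rule, so putting copies on the next servers in list order is an assumption, as are the server names and the helpers `replicas_for`, `put`, and `get`.

```python
import hashlib

SERVERS = ["server1", "server2", "server3", "server4"]  # hypothetical names


def replicas_for(key: str, n_replicas: int = 2) -> list:
    """Addressing: hash the key to a primary, then take the next servers as replicas."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    primary = int.from_bytes(digest[:8], "big") % len(SERVERS)
    return [SERVERS[(primary + i) % len(SERVERS)] for i in range(n_replicas)]


stores = {name: {} for name in SERVERS}  # in-memory stand-in for each server


def put(key, value, n_replicas=2):
    """Replication: write the value to every replica of the key."""
    for name in replicas_for(key, n_replicas):
        stores[name][key] = value


def get(key, down=frozenset(), n_replicas=2):
    """Read from the first replica that is not marked as down."""
    for name in replicas_for(key, n_replicas):
        if name not in down:
            return stores[name].get(key)
    raise RuntimeError("all replicas unavailable")


put("photo:7", b"jpeg-bytes")
primary = replicas_for("photo:7")[0]
still_readable = get("photo:7", down={primary})  # survives loss of the primary
```

This sketch ignores how replicas are kept in sync after a failure; it only shows that a second copy keeps the data readable when one server is down.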
10. DISTRIBUTED STORAGE SYSTEM
ARCHITECTURE
Data Hashing/Addressing
Use the DHT to map each server (by its server name) to a
number, and place it on a circle called the key
space
Use the DHT to address data and find the server that stores it:
successor(k) = ceiling(addressing(k))
successor(k): the server that stores k
[Figure: key-space circle starting at 0, with server1, server2, and server3 placed on it]
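The successor lookup described on this slide can be sketched as a sorted list searched with binary search. This is a minimal illustration under stated assumptions: a 32-bit key space, MD5 as the `addressing` function (chosen only because it is in the standard library), and a hypothetical `HashRing` class.

```python
import bisect
import hashlib


def addressing(name: str) -> int:
    """Map a server name or data key to a position in the key space 0 .. 2**32 - 1."""
    digest = hashlib.md5(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class HashRing:
    def __init__(self, servers):
        # Servers sorted by their position on the circle.
        self._points = sorted((addressing(s), s) for s in servers)
        self._positions = [p for p, _ in self._points]

    def successor(self, key: str) -> str:
        """First server at or after the key's position, wrapping around past 0."""
        i = bisect.bisect_left(self._positions, addressing(key))
        return self._points[i % len(self._points)][1]


ring = HashRing(["server1", "server2", "server3"])
bigger = HashRing(["server1", "server2", "server3", "server4"])

keys = [f"key{i}" for i in range(10_000)]
moved = [k for k in keys if ring.successor(k) != bigger.successor(k)]
print(f"{len(moved)} of {len(keys)} keys moved after adding server4")
```

Adding a server only moves the keys that fall between the new server and its predecessor on the circle; every moved key lands on the new server and all other keys stay where they were.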
11. DISTRIBUTED STORAGE SYSTEM
ARCHITECTURE
Addressing – Virtual node
Each server node is assigned several node IDs (virtual nodes)
so that keys are distributed evenly and load is balanced
Server1: n1, n4, n6
Server2: n2, n7
Server3: n3, n5
[Figure: key-space circle starting at 0, with the virtual nodes n1 to n7 of server1, server2, and server3 interleaved around it]
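Virtual nodes can be added to a hash ring by hashing several labels per server. The sketch below is an illustration under assumptions: the `server#i` label scheme, an equal number of virtual nodes per server (the slide assigns 3, 2, and 2), MD5 as the `addressing` function, and the `VirtualNodeRing` class name are all hypothetical.

```python
import bisect
import hashlib
from collections import Counter


def addressing(name: str) -> int:
    """Map a label to a position in the key space 0 .. 2**32 - 1."""
    digest = hashlib.md5(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class VirtualNodeRing:
    def __init__(self, servers, vnodes_per_server=100):
        # Each server appears on the circle once per virtual node.
        self._points = sorted(
            (addressing(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes_per_server)
        )
        self._positions = [p for p, _ in self._points]

    def successor(self, key: str) -> str:
        """Server owning the first virtual node at or after the key's position."""
        i = bisect.bisect_left(self._positions, addressing(key))
        return self._points[i % len(self._points)][1]


servers = ["server1", "server2", "server3"]
keys = [f"key{i}" for i in range(10_000)]

# Compare the per-server load with 1 and with 200 virtual nodes per server.
load_plain = Counter(VirtualNodeRing(servers, 1).successor(k) for k in keys)
load_vnodes = Counter(VirtualNodeRing(servers, 200).successor(k) for k in keys)
print("1 node per server:   ", dict(load_plain))
print("200 vnodes per server:", dict(load_vnodes))
```

With only one position per server the arc lengths, and therefore the loads, depend heavily on where the three hashes happen to fall; many virtual nodes average this out.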