Optimal Network Locality in Distributed Services

Gwendal Simon
Department of Computer Science
Institut Telecom - Telecom Bretagne
2010

Telecom Bretagne

Institut Telecom: graduate engineering schools
Telecom Bretagne: 1200 students (200 PhD)
Computer Science: 20 full-time research lecturers

Credits
Funding:
Thomson R&D (now Technicolor)
French grant with Orange, NDS Tech. and INRIA

Co-authors:
Jimmy Leblet (post-doc)
Yiping Chen (PhD student)
Zhe Li (PhD student)
Gilles Straub (senior researcher Thomson)
Di Yuan (Ass. Professor, Linköping Univ., Sweden)

Service Delivery Network

[Figure: service delivery network between the cloud and the end user, built up in four steps: data-center, then CDN, then in-network servers, then set-top-boxes.]


Toward a Decentralized Architecture
servers' capacities scale down [1]
services scale up [2]
⇒ multi-server, multi-component architectures [3]

[1] J. He, A. Chaintreau and C. Diot, "A performance evaluation of scalable live video streaming with nano data centers", Computer Networks, 2009.
[2] J. Pujol, V. Erramilli and P. Rodriguez, "Divide and Conquer: Partitioning Online Social Networks", arXiv preprint arXiv:0905.4918, 2009.
[3] R. Baeza-Yates, A. Gionis, F. Junqueira, V. Plachouras and L. Telloli, "On the feasibility of multi-site web search engines", in ACM CIKM, 2009.

Problem Modelling and Analysis


Problem Formulation (Assumptions)
About the n servers and the k components:
only one component per server
no capacity bounds
components are uniformly accessed

About the global service architecture:
client requests are routed toward the closest server
characteristics of links between servers are known
a generic distance (cost) function $d_{ij}$


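Problem Formulation (Definition)
Rainbow distance for a server i:
total cost to fetch all missing components

[Figure: server i and four neighbors j1 to j4, with $d_{ij_1} = 2$, $d_{ij_2} = 3$, $d_{ij_3} = 4$, $d_{ij_4} = 5$. Server i fetches its missing components from j1 and j4.]

$d(i) = d_{ij_1} + d_{ij_4} = 7$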
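The rainbow distance can be sketched in a few lines of Python. This is only an illustration: the function name `rainbow_distance`, the dictionary layout, and above all the placement of components on j1 to j4 are assumptions (the slide gives the four distances but not which component each neighbor hosts). The placement below is one choice that reproduces d(i) = 7.

```python
def rainbow_distance(server, hosted, dist, components):
    """Sum, over each component the server lacks, of the distance to the
    nearest other server hosting that component."""
    total = 0
    for c in components:
        if hosted[server] == c:
            continue  # held locally, nothing to fetch
        candidates = [dist[server][j] for j in hosted
                      if j != server and hosted[j] == c]
        if not candidates:
            raise ValueError(f"component {c!r} is hosted nowhere")
        total += min(candidates)
    return total


# Distances from the example; the component placement is ASSUMED.
dist = {"i": {"j1": 2, "j2": 3, "j3": 4, "j4": 5}}
hosted = {"i": "A", "j1": "B", "j2": "A", "j3": "B", "j4": "C"}
example = rainbow_distance("i", hosted, dist, ["A", "B", "C"])
print(example)  # 2 (B from j1) + 5 (C from j4) = 7
```

Closer servers that host an already available component (j2 and j3 here) do not reduce the cost, which is why the sum skips them.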


Problem Formulation (Objective)
Global goal: assign components to servers

Optimization: minimize the sum of rainbow distances

$\min \sum_{0 < i \le n} d(i)$

Motivations:
network operator: reduce cross-domain traffic
service provider: reduce overall latency
academic: a fun, previously unstudied problem

Problem Complexity
The problem is NP-complete:
closely related to domatic partition

[Figure: example graph with ten nodes, labeled 0 to 9.]


Integer Programming
$x_{ic} = 1$ if component c is allocated at server i, 0 otherwise
$y_{ij}^{c} = 1$ if i obtains component c from j, 0 otherwise

Minimize $\sum_{i=1}^{n} \sum_{c=1}^{k} \sum_{j=1}^{n} d(i, j)\, y_{ij}^{c}$

Subject to:
$\sum_{c=1}^{k} x_{ic} = 1$ for every server i: only one component per server
$\sum_{j \neq i} y_{ij}^{c} = 1 - x_{ic}$ for every i and c: a server has c or has exactly one pointer to c
$y_{ij}^{c} \le x_{jc}$ for every i, j and c: a server gets c from another server only if the latter has c
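For very small instances, the same objective can be checked by plain enumeration instead of an integer programming solver. The sketch below is an assumption-laden illustration, not the method used in the talk: `total_rainbow_cost` and `brute_force` are hypothetical names, `dist` is a full n × n matrix, and the toy instance (four servers on a line, k = 2) is invented. An assignment that leaves some component unhosted is treated as infeasible, which mirrors the "exactly one pointer" constraint.

```python
from itertools import product


def total_rainbow_cost(assignment, dist, k):
    """Sum over servers of the cost to fetch every component they lack."""
    n = len(assignment)
    total = 0
    for i in range(n):
        for c in range(k):
            if assignment[i] == c:
                continue  # x_ic = 1: no pointer needed
            # y_ij^c = 1 for the closest j hosting c
            total += min(dist[i][j] for j in range(n) if assignment[j] == c)
    return total


def brute_force(dist, k):
    """Try all k^n assignments; only usable for tiny n."""
    n = len(dist)
    best_cost, best_assignment = None, None
    for assignment in product(range(k), repeat=n):
        if len(set(assignment)) < k:
            continue  # some component is hosted nowhere: infeasible
        cost = total_rainbow_cost(assignment, dist, k)
        if best_cost is None or cost < best_cost:
            best_cost, best_assignment = cost, assignment
    return best_cost, best_assignment


# Toy instance (ASSUMED): servers at positions 0, 1, 2, 3 on a line.
dist = [[abs(a - b) for b in range(4)] for a in range(4)]
best_cost, best_assignment = brute_force(dist, k=2)
print(best_cost, best_assignment)
```

The enumeration grows as k^n, so it only serves as a reference point for comparing heuristics on small inputs.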


Related Works


Facility Location Problem
⇒ open a subset of facilities with minimal overall cost

[Figure: facility location example. Three candidate facilities u1, u2, u3 with opening costs c1 = 9, c2 = 3, c3 = 5, and two clients i1 and i2. The client-facility links carry connection costs 6, 8, 3, 12, 11 and 7.]
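As a rough illustration of the uncapacitated facility location problem, the sketch below enumerates every non-empty subset of facilities and keeps the cheapest. The opening costs 9, 3 and 5 are the ones on the slide; the assignment of connection costs to individual links is an assumption made for the example, and `solve_ufl` is a hypothetical helper name.

```python
from itertools import combinations


def solve_ufl(open_cost, conn_cost):
    """Return (cost, opened facilities) minimizing opening + connection cost.

    Each client connects to its cheapest opened facility.
    """
    facilities = list(open_cost)
    best = None
    for size in range(1, len(facilities) + 1):
        for subset in combinations(facilities, size):
            cost = sum(open_cost[f] for f in subset)
            cost += sum(min(links[f] for f in subset)
                        for links in conn_cost.values())
            if best is None or cost < best[0]:
                best = (cost, subset)
    return best


open_cost = {"u1": 9, "u2": 3, "u3": 5}
# ASSUMED mapping of the six connection costs to links.
conn_cost = {
    "i1": {"u1": 6, "u2": 3, "u3": 8},
    "i2": {"u1": 12, "u2": 11, "u3": 7},
}
ufl_cost, ufl_opened = solve_ufl(open_cost, conn_cost)
print(ufl_cost, ufl_opened)  # 17 ('u2',)
```

With these assumed link costs, opening only u2 is cheapest: 3 to open, plus 3 and 11 to connect the two clients.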


Related Works
most variants are NP-complete
close variant is k-PUFLP: a $(\frac{3}{2}k - 1)$-approx. algorithm [4]
possible transformation from our problem to k-PUFLP

[4] H. C. Huang and R. Li, "A k-product uncapacitated facility location problem", European Journal of Op. Res., vol. 185, no. 2, 2008.

Related Works
Content Delivery Networks
k-median problem: no multiple servers

Nano data centers powered by set-top-boxes
uniform random allocation of components to servers
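The uniform random allocation used as a baseline can be sketched as follows. The function name `random_allocation` and the `seed` parameter are illustrative assumptions. Note that with few servers a random draw may leave some component unhosted.

```python
import random


def random_allocation(n, k, seed=None):
    """Give each of the n servers one of the k components, uniformly at random."""
    rng = random.Random(seed)
    return [rng.randrange(k) for _ in range(n)]


allocation = random_allocation(n=10, k=3, seed=1)
print(allocation)
```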


Our Algorithms


Approximate Algorithm
For a server i:
1. compute the distance $\bar{d}(i)$ to its k − 1 closest servers
2. wait until every server j with a smaller $\bar{d}(j)$ is OK
3. try to optimize locally → optimized state
4. if impossible → saved state
5. uncolored saved nodes get the farthest components
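Steps 1 and 2 can be illustrated with a short sketch. It only computes the cost to the k − 1 closest servers and the resulting processing order; the local optimization and the saved state are not specified in enough detail on the slide to be reproduced here. The names `nearest_cost` and `processing_order`, the index tie-break, and the toy positions are assumptions. Since each server hosts a single component, a server must fetch its k − 1 missing components from k − 1 distinct servers, so this cost is a lower bound on its rainbow distance.

```python
def nearest_cost(dist, i, k):
    """Total distance from server i to its k - 1 closest servers."""
    others = sorted(dist[i][j] for j in range(len(dist)) if j != i)
    return sum(others[:k - 1])


def processing_order(dist, k):
    """Servers sorted by increasing nearest cost (ties broken by index)."""
    return sorted(range(len(dist)), key=lambda i: (nearest_cost(dist, i, k), i))


# Toy instance (ASSUMED): four servers on a line, k = 3 components.
positions = [0, 1, 3, 7]
dist = [[abs(a - b) for b in positions] for a in positions]
costs = [nearest_cost(dist, i, 3) for i in range(4)]
order = processing_order(dist, 3)
print(costs, order)  # [4, 3, 5, 10] [1, 0, 2, 3]
```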



[Figure: run of the algorithm on an 18-node example graph. Successive steps are captioned: optimized, conflict, saved but colored, saved and uncolored, colored by node 10, only node uncolored, choose farthest component.]

Sorted list of nodes, each with its three nearest neighbors:

node   nearest neighbors
2      1, 4, 5
3      1, 8, 16
1      2, 3, 16
8      3, 11, 12
5      1, 2, 4
11     8, 12, 13
4      2, 5, 7
16     1, 3, 5
12     8, 9, 11
15     1, 10, 11
10     2, 6, 15
14     3, 16, 17
17     5, 14, 16
13     11, 12, 15
7      2, 4, 6
6      2, 7, 10
9      8, 12, 14
18     4, 5, 17


Proof
$\bar{d}(i)$: cost to i's k − 1 nearest neighbors.
$d(i)$: rainbow cost of i.

optimized node: $d(i) = \bar{d}(i)$

node conflicting with its nearest neighbor:
optimized node at one hop
$d(i) \le (k - 2)\,\bar{d}(i)$

node with two conflicting nearest neighbors:
optimized node at two hops
$d(i) \le (\frac{3}{2}k - \frac{5}{2})\,\bar{d}(i)$

[Figure: the 18-node example graph, with the node i under consideration highlighted in each case.]


A Heuristic Algorithm
Idea: use the similarity with domatic partition
domatic coloring of a proximity graph

1. build a k-nearest neighbor graph O(n · log n)
2. augment it into an interval graph O(n)
3. build the domatic partition O(n)
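The first step and the target property of the last step can be sketched as below. This is a simplified illustration under several assumptions: `knn_graph`, `is_domatic` and the neighbor count `m` are hypothetical names, the sorting-based construction is quadratic rather than the O(n · log n) quoted above, the graph is treated as directed (each node only looks at its own nearest neighbors), and neither the interval-graph augmentation nor the construction of the partition is shown. The check encodes the defining property of a domatic partition: every node finds all k colors among itself and its neighbors.

```python
def knn_graph(dist, m):
    """Map each node to the set of its m nearest other nodes."""
    n = len(dist)
    graph = {}
    for i in range(n):
        ranked = sorted((j for j in range(n) if j != i),
                        key=lambda j: (dist[i][j], j))  # index breaks ties
        graph[i] = set(ranked[:m])
    return graph


def is_domatic(graph, color, k):
    """True if every node sees all k colors in its closed neighborhood."""
    for node, neighbors in graph.items():
        seen = {color[node]} | {color[u] for u in neighbors}
        if len(seen) < k:
            return False
    return True


# Toy instance (ASSUMED): four nodes on a line, one nearest neighbor each.
dist = [[abs(a - b) for b in range(4)] for a in range(4)]
graph = knn_graph(dist, m=1)
print(graph)                               # {0: {1}, 1: {0}, 2: {1}, 3: {2}}
print(is_domatic(graph, [0, 1, 0, 1], 2))  # True
print(is_domatic(graph, [0, 1, 1, 0], 2))  # False
```

When the check passes, every server finds each missing component on one of its nearest neighbors, which is the link between domatic coloring and low rainbow distances.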


Another Heuristic Algorithm
Based on a k-nearest neighbor graph, two rounds
1. explore surroundings: do not pick a component
hosted by a direct neighbor
hosted by a peer that considers you as a direct neighbor
hosted by one of its direct neighbors

2. try to maximize the benefits
pick the component that, on average, best satisfies the direct neighbors
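The first round can be sketched as a greedy pass over a directed nearest-neighbor graph. Several points are assumptions rather than statements from the slide: the names `forbidden_components` and `first_round`, the processing order (increasing node id), the choice of the lowest free component, and the reading of the third exclusion as "hosted by a direct neighbor of a peer that considers you a direct neighbor". The second round is not shown, and nodes with no free component are simply left unassigned.

```python
def forbidden_components(node, graph, hosted):
    """Components already hosted in the node's surroundings."""
    out_neighbors = set(graph[node])
    in_neighbors = {p for p, nbrs in graph.items() if node in nbrs}
    peers = out_neighbors | in_neighbors
    for p in in_neighbors:
        peers |= set(graph[p])  # ASSUMED reading of the third exclusion
    peers.discard(node)
    return {hosted[p] for p in peers if p in hosted}


def first_round(graph, k):
    """Greedy first round: each node takes the lowest non-forbidden component."""
    hosted = {}
    for node in sorted(graph):
        forbidden = forbidden_components(node, graph, hosted)
        free = [c for c in range(k) if c not in forbidden]
        if free:
            hosted[node] = free[0]
    return hosted


# Toy directed nearest-neighbor graph (ASSUMED).
graph = {0: {1}, 1: {0}, 2: {1}, 3: {2}}
hosted = first_round(graph, k=3)
print(hosted)  # {0: 0, 1: 1, 2: 0, 3: 1}
```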


Simulations


Configurations
Several contexts have been considered:
network of latencies
randomly select n peers among 20,000 entries
⇒ minimize the global latency

network of Autonomous Systems
put µ peers into every AS
inter-AS routing
⇒ minimize the cross-domain traffic


Comparing to Exact Solutions


Going Further


Cross-Domain Gain

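[Figure: stacked bar chart, one bar per allocation strategy. Y-axis: Average Hops, from 0 to 6. Legend: Peering, Transit.]

Segment values as labeled on the bars:

strategy          lower segment   upper segment
Random            3.778           1.346
k-nearest Topo    2.695           1.072
k-nearest Rela    2.764           1.074
k-PUFLP           2.746           1.085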


Conclusion


Only Preliminary Works
Many theoretical results can be obtained:
relax assumptions (esp. capacity, number of components)
study families of instances
better approximation

Many realistic variants can be formulated:
take into account network architecture
objective of fairness


Any questions?