In the age of cloud computing, any piece of equipment can become a server, e.g. a set-top box or an access router. For service providers, the challenge is to make effective use of these servers. We address the problem of placing a large service (or content) on these Internet edge devices so that delivery to clients is efficient from a networking point of view.
Optimal Network Locality in Distributed Services
1. Optimal Network Locality in Distributed Services
Gwendal Simon
Department of Computer Science
Institut Telecom - Telecom Bretagne
2010
2. Telecom Bretagne
Institut Telecom: graduate engineering schools
Telecom Bretagne: 1200 students (200 PhD)
Computer Science: 20 full-time research lecturers
2 / 27 Gwendal Simon Network Locality in Distributed Services
4. Credits
Funding:
Thomson R&D (now Technicolor)
French grant with Orange, NDS Tech. and INRIA
Co-authors:
Jimmy Leblet (post-doc)
Yiping Chen (PhD student)
Zhe Li (PhD student)
Gilles Straub (senior researcher Thomson)
Di Yuan (Ass. Professor, Linköping Univ., Sweden)
9. Service Delivery Network
[Figure: service delivery network around the cloud, showing a data-center, a CDN, in-network servers, several set-top-boxes and the end user.]
10. Toward a Decentralized Architecture
servers’ capacities scale down [1]
services scale up [2]
⇒ multi-server, multi-component architectures [3]
[1] J. He, A. Chaintreau and C. Diot, “A performance evaluation of scalable live video streaming with nano data centers”, Computer Networks, 2009.
[2] J. Pujol, V. Erramilli and P. Rodriguez, “Divide and Conquer: Partitioning Online Social Networks”, arXiv preprint arXiv:0905.4918, 2009.
[3] R. Baeza-Yates, A. Gionis, F. Junqueira, V. Plachouras and L. Telloli, “On the feasibility of multi-site web search engines”, in ACM CIKM 2009.
11. Problem Modelling and Analysis
13. Problem Formulation (Assumptions)
About the n servers and the k components:
only one component per server
no capacity bounds
components are uniformly accessed
About the global service architecture:
client requests are routed toward the closest server
characteristics of links between servers are known
a generic distance (cost) function $d_{ij}$
15. Problem Formulation (Definition)
Rainbow distance for a server i:
total cost to fetch all missing components
[Figure: server i and four other servers j1, j2, j3, j4 at distances $d_{ij_1} = 2$, $d_{ij_2} = 3$, $d_{ij_3} = 4$, $d_{ij_4} = 5$.]
$d(i) = d_{ij_1} + d_{ij_4} = 7$
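The definition can be made concrete with a minimal Python sketch. The distances are those of the slide example; the component held by each server (labels "A", "B", "C") and the function name `rainbow_distance` are assumptions made for illustration, chosen so that the missing components are fetched from j1 and j4 as on the slide.

```python
def rainbow_distance(i, hosted, dist, components):
    """Cost for server i to fetch each component it lacks from the closest host."""
    total = 0
    for c in components:
        if hosted[i] == c:
            continue  # i already holds c locally
        # Distances from i to every other server hosting component c.
        candidates = [dist[i][j] for j in hosted if j != i and hosted[j] == c]
        total += min(candidates)
    return total

# Distances from the slide; the component held by each server is assumed.
dist = {"i": {"j1": 2, "j2": 3, "j3": 4, "j4": 5}}
hosted = {"i": "A", "j1": "B", "j2": "B", "j3": "A", "j4": "C"}
print(rainbow_distance("i", hosted, dist, ["A", "B", "C"]))  # 7
```

With this assumed allocation, j2 and j3 duplicate components that are available closer, so only j1 (cost 2) and j4 (cost 5) contribute.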
17. Problem Formulation (Objective)
Global goal: assign components to servers
Optimization: minimize the sum of rainbow distances $\sum_{0 < i \le n} d(i)$
Motivations:
network operator: reduce cross-domain traffic
service provider: reduce overall latency
academic: a fun, previously unstudied problem
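As an illustration of the objective, the sketch below enumerates every assignment of k components to n servers and keeps the cheapest one. This exhaustive search is only usable on tiny instances and is not one of the algorithms of the talk; the function names and the four-server line topology are hypothetical.

```python
from itertools import product

def total_cost(assignment, dist, k):
    """Sum of rainbow distances over all servers for one assignment."""
    n = len(assignment)
    cost = 0
    for i in range(n):
        for c in range(k):
            if assignment[i] == c:
                continue  # server i hosts c itself
            # Closest other server hosting component c.
            cost += min(dist[i][j] for j in range(n)
                        if j != i and assignment[j] == c)
    return cost

def brute_force(dist, k):
    """Exhaustive search over the k**n assignments (tiny instances only)."""
    best = None
    for assignment in product(range(k), repeat=len(dist)):
        if len(set(assignment)) < k:
            continue  # every component must be hosted somewhere
        cost = total_cost(assignment, dist, k)
        if best is None or cost < best[0]:
            best = (cost, assignment)
    return best

# Hypothetical instance: four servers on a line at positions 0, 1, 2, 3.
pos = [0, 1, 2, 3]
dist = [[abs(a - b) for b in pos] for a in pos]
best_cost, best_assignment = brute_force(dist, k=2)
print(best_cost)  # 4
```

On this instance every server has another server at distance 1, so a total of 4 is the best possible with two components.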
18. Problem Complexity
The problem is NP-complete:
closely related to the domatic partition problem
[Figure: example graph with ten nodes numbered 0 to 9.]
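A domatic partition splits the vertices of a graph into disjoint dominating sets: every vertex either belongs to a given class or has a neighbor in it. Read as a component allocation, this means every server finds each missing component at one hop. The check below is a minimal sketch of that property; the 6-cycle is a hypothetical example, not the graph of the slide.

```python
def is_domatic_partition(adj, colour, k):
    """True if every vertex sees all k colours in its closed neighbourhood."""
    for v, neighbours in adj.items():
        seen = {colour[v]} | {colour[u] for u in neighbours}
        if not set(range(k)) <= seen:
            return False
    return True

# Hypothetical example: a cycle on six vertices.
adj = {v: [(v - 1) % 6, (v + 1) % 6] for v in range(6)}
good = {0: 0, 1: 1, 2: 2, 3: 0, 4: 1, 5: 2}
bad = {0: 0, 1: 1, 2: 2, 3: 0, 4: 1, 5: 0}
print(is_domatic_partition(adj, good, 3))  # True
print(is_domatic_partition(adj, bad, 3))   # False
```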
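23. Integer Programming
\[
x_{ic} = \begin{cases} 1 & \text{if component } c \text{ is allocated at server } i \\ 0 & \text{otherwise} \end{cases}
\qquad
y_{ij}^{c} = \begin{cases} 1 & \text{if } i \text{ obtains component } c \text{ from } j \\ 0 & \text{otherwise} \end{cases}
\]
\[
\text{Minimize} \quad \sum_{i=1}^{n} \sum_{c=1}^{k} \sum_{j=1}^{n} d_{ij}\, y_{ij}^{c}
\]
\[
\begin{aligned}
\text{Subject to} \quad & \sum_{c=1}^{k} x_{ic} = 1 \quad \forall i && \text{only one component per server} \\
& \sum_{j \neq i} y_{ij}^{c} = 1 - x_{ic} \quad \forall i, c && \text{a server has } c \text{ or has exactly one pointer to } c \\
& y_{ij}^{c} \le x_{jc} \quad \forall i, j, c && \text{a server obtains } c \text{ from another server only if the latter has } c
\end{aligned}
\]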
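To see how the variables fit together, the following sketch builds x and y from a given assignment (each server points to the closest host of every missing component) and checks the three constraints before evaluating the objective. It is a plain feasibility check, not a solver; the function names and the line topology are hypothetical.

```python
def solution_from_assignment(assignment, dist, k):
    """Build x[i][c] and y[i][j][c] from a one-component-per-server assignment."""
    n = len(assignment)
    x = [[int(assignment[i] == c) for c in range(k)] for i in range(n)]
    y = [[[0] * k for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for c in range(k):
            if x[i][c]:
                continue  # i hosts c, no pointer needed
            hosts = [j for j in range(n) if j != i and x[j][c]]
            j = min(hosts, key=lambda h: dist[i][h])  # closest host of c
            y[i][j][c] = 1
    return x, y

def objective_if_feasible(x, y, dist):
    """Return the objective value, or None if a constraint is violated."""
    n, k = len(x), len(x[0])
    for i in range(n):
        if sum(x[i]) != 1:  # only one component per server
            return None
        for c in range(k):
            # has c, or exactly one pointer to c
            if sum(y[i][j][c] for j in range(n) if j != i) != 1 - x[i][c]:
                return None
            # a pointer may only target a server that hosts c
            if any(y[i][j][c] > x[j][c] for j in range(n)):
                return None
    return sum(dist[i][j] * y[i][j][c]
               for i in range(n) for j in range(n) for c in range(k))

# Hypothetical instance: four servers on a line, two components.
pos = [0, 1, 2, 3]
dist = [[abs(a - b) for b in pos] for a in pos]
x, y = solution_from_assignment([0, 1, 0, 1], dist, k=2)
print(objective_if_feasible(x, y, dist))  # 4
```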
24. Related Works
28. Related Works
Facility Location Problem
⇒ open a subset of facilities with minimal overall cost
[Figure: example instance with nodes u1, u2, u3 (costs c1 = 9, c2 = 3, c3 = 5) linked to nodes i1 and i2 by edges weighted 6, 8, 12, 11, 3 and 7.]
29. Related Works
Facility Location Problem
most variants are NP-complete
close variant is k-PUFLP: a $(\frac{3}{2}k - 1)$-approx. algo [4]
possible transformation from our problem to k-PUFLP
[4] H. C. Huang and R. Li, “A k-product uncapacitated facility location problem”, European Journal of Op. Res., vol. 185, no. 2, 2008.
32. Related Works
Facility Location Problem
a $(\frac{3}{2}k - 1)$-approx. algo
Content Delivery Networks
k-median problem: no multiple servers
Nano data centers powered by set-top-boxes
uniform random allocation of components to servers
33. Our Algorithms
34. Approximate Algorithm
For a server i:
1. compute distance $\bar{d}(i)$ to k − 1 closest servers
2. wait until every server j with smaller $\bar{d}(j)$ is OK
3. try to optimize locally → optimized state
4. if impossible → saved state
5. uncolored saved nodes get furthest components
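Steps 1 and 2 can be sketched as follows: compute the cost to the k − 1 closest servers for every server, then process servers by increasing value. The function name, the tie-breaking by server index and the line topology are assumptions; the local optimization of steps 3 to 5 is not covered here.

```python
def nearest_cost(i, dist, k):
    """Cost from server i to its k - 1 closest other servers."""
    others = sorted(dist[i][j] for j in range(len(dist)) if j != i)
    return sum(others[:k - 1])

# Hypothetical instance: four servers on a line, three components.
pos = [0, 1, 2, 3]
dist = [[abs(a - b) for b in pos] for a in pos]
k = 3

dbar = [nearest_cost(i, dist, k) for i in range(len(dist))]
# Servers with a smaller value decide first; ties are broken by index (assumption).
order = sorted(range(len(dist)), key=lambda i: dbar[i])
print(dbar)   # [3, 2, 2, 3]
print(order)  # [1, 2, 0, 3]
```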
46. Proof
$\bar{d}(i)$: cost to i’s k − 1 nearest neighbors.
$d(i)$: rainbow cost of i.
optimized node: $d(i) = \bar{d}(i)$
node conflicting with its nearest neighbor (optimized node at one hop): $d(i) \le (k - 2)\,\bar{d}(i)$
node with two conflicting nearest neighbors (optimized node at two hops): $d(i) \le \left(\frac{3}{2}k - \frac{5}{2}\right)\bar{d}(i)$
[Figure: example graph with 18 numbered nodes, with the nodes i and j1 of each case marked.]
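A worked instance of the two quantities may help. Take the server i of the rainbow-distance example, with other servers at distances 2, 3, 4 and 5 and k = 3 (two missing components). Because each server hosts a single component, the k − 1 missing components come from k − 1 distinct servers, so the rainbow cost can never be below the cost to the k − 1 nearest neighbors. Summed over all servers, this presumably is the lower bound on the optimum against which the proof compares each node.

```latex
\[
\bar{d}(i) = d_{ij_1} + d_{ij_2} = 2 + 3 = 5
\;\le\;
d(i) = d_{ij_1} + d_{ij_4} = 2 + 5 = 7
\]
\[
\sum_{0 < i \le n} \bar{d}(i) \;\le\; \sum_{0 < i \le n} d(i)
\quad \text{for every assignment of components to servers}
\]
```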
48. A Heuristic Algorithm
Idea: use the similarity with domatic partition
domatic coloring of a proximity graph
1. build a k-nearest neighbor graph: O(n · log n)
2. augment it into an interval graph: O(n)
3. build the domatic partition: O(n)
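Step 1 can be illustrated with a brute-force construction from a full distance matrix. This simple version sorts every row and therefore costs O(n² log n), not the O(n · log n) quoted on the slide, which presumably relies on a more structured construction. The function name and the instance are hypothetical; steps 2 and 3 are not covered.

```python
def knn_graph(dist, k):
    """Map each server to its k closest other servers (closest first)."""
    n = len(dist)
    return {
        i: sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
        for i in range(n)
    }

# Hypothetical instance: four servers on a line at positions 0, 1, 3, 7.
pos = [0, 1, 3, 7]
dist = [[abs(a - b) for b in pos] for a in pos]
print(knn_graph(dist, 2))  # {0: [1, 2], 1: [0, 2], 2: [1, 0], 3: [2, 1]}
```

The resulting graph is directed: server 3 lists 2 and 1 as neighbors, but no server lists 3.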
50. Another Heuristic Algorithm
Based on a k-nearest neighbor graph, two rounds
1. explore surroundings: do not pick a component
hosted by a direct neighbor
hosted by a peer that considers you as a direct neighbor
hosted by one of its direct neighbors
2. try to maximize the benefits
pick the component that, on average, best satisfies the direct neighbors
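One possible reading of the first round is sketched below. Each server, taken in index order, avoids components already hosted by its direct neighbors, by peers that list it as a direct neighbor, and by neighbors of its direct neighbors. The third rule is an interpretation of the slide, and the processing order, the choice of the lowest free component and all names are assumptions. A server with no free component stays undecided for the second round, which is not implemented here.

```python
def first_round(knn, num_components):
    """Round 1 sketch: pick a component not hosted in the surroundings."""
    # Peers that consider i as one of their direct neighbours.
    reverse = {i: [j for j in knn if i in knn[j]] for i in knn}
    colour = {i: None for i in knn}
    for i in sorted(knn):
        near = set(knn[i]) | set(reverse[i])
        near |= {u for j in knn[i] for u in knn[j]}  # assumed reading of rule 3
        near.discard(i)
        forbidden = {colour[j] for j in near if colour[j] is not None}
        free = [c for c in range(num_components) if c not in forbidden]
        colour[i] = free[0] if free else None  # None: left for round 2
    return colour

# Hypothetical 2-nearest-neighbour graph on four servers, three components.
knn = {0: [1, 2], 1: [0, 2], 2: [1, 0], 3: [2, 1]}
print(first_round(knn, 3))  # {0: 0, 1: 1, 2: 2, 3: None}
```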
52. Configurations
Several contexts have been considered:
network of latencies
randomly select n peers among 20,000 entries
⇒ minimize the global latency
network of Autonomous Systems
put µ peers into every AS
inter-AS routing
⇒ minimize the cross-domain traffic
53. Comparing to Exact Solutions
54. Going Further
55. Cross-Domain Gain
[Figure: stacked bar chart of average hops (y-axis from 0 to 6), split into Peering and Transit, for four allocations. Bar segments (lower + upper): Random 3.778 + 1.346; k-nearest Topo 2.695 + 1.072; k-nearest Rela 2.764 + 1.074; k-PUFLP 2.746 + 1.085.]
56. Conclusion
58. Only Preliminary Works
Many theoretical results can be obtained:
relax assumptions (esp. capacity, number of components)
study families of instances
better approximation
Many realistic variants can be formulated:
take into account network architecture
objective of fairness
59. Any questions?