The document discusses testing forest isomorphism in the adjacency list model. It proposes a partitioning oracle that removes small fractions of edges to partition graphs into parts with good properties, like bounded degree trees. It then checks if each corresponding part in the two forests is isomorphic or far. This reduces the problem to poly(log n) queries by testing individual parts. The approach provides a general technique for testing any graph property on forests in poly(log n) queries. A lower bound of Ω(√log n) queries is also shown.
Testing Forest-Isomorphism in the Adjacency List Model
1. Testing Forest-Isomorphism in the Adjacency List Model
Mitsuru Kusumoto†, Yuichi Yoshida†*
† : Preferred Infrastructure, Inc.
* : National Institute of Informatics.
2. Overview
Given two forests G and H, determine if G ≅ H or G and H are
far from being so by looking at very small parts of G and H.
Outline
Introduction
Property testing
Problem setting
Our algorithms
4. Property Testing
We want to solve decision problems as efficiently as possible.
Example : Graph connectivity
Standard setting : BFS is enough. → Θ(n) time.
Property testing : Check if G is connected or G is far from
being connected. → O(1) time!?
5. Property Testing
A property testing algorithm is a (randomized) algorithm that
checks whether the input satisfies property P or is far from P, with high
probability (e.g., ≥ 2/3), using sublinear query or time complexity.
Main Interest
What kinds of properties are testable efficiently?
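[Figure: connected graphs vs. not connected graphs. We want to distinguish connected graphs from graphs that are far from being connected; graphs that are not connected but close to being connected lie in between.]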
6. Graph Property Testing - Review
The efficiency of property testing algorithms depends on the
input models.
Adjacency matrix model
    [01010]
    [10110]
G = [01001]
    [11001]
    [00110]
• Input model for dense graphs. [GGR’98]
• Many properties are testable.
(e.g., connectivity, △-freeness, ... .)
• Necessary and sufficient conditions for constant-time testability are known. [Alon+’09]
Adjacency list model
A vertex v whose 1st, 2nd, and 3rd edges lead to A, B, and C:
O(v, 1) = A
O(v, 2) = B
O(v, 3) = C
• Input model for sparse graphs. [GR’02] [KKR’04]
• Many properties are testable.
(e.g., connectivity, H-minor-freeness.)
• But many results assume the bounded-degree condition: the degrees of vertices
must be bounded by some constant.
7. Graph Property Testing - Review
What happens if we do not assume the bounded-degree condition
in the adjacency list model?
Only a few efficient algorithms are known.
Many hardness results: △-freeness, k-colorability, etc.,
require Ω(√n) queries. [A+08, B+08, K+04]
Question : Is it possible to obtain efficient algorithms for
fundamental problems without the bounded-degree condition?
8. Forest-Isomorphism
We focus on forest-isomorphism in adjacency list model.
Input : Two forests G and H represented by adjacency lists
and proximity parameter ε > 0.
Query Model : We can access G and H via the following queries:
deg(v): returns the degree of vertex v.
adj(v, i): returns the vertex adjacent to v via its i-th edge.
random(): returns a randomly chosen vertex.
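The three queries can be sketched as a small oracle that also counts how many queries an algorithm spends. This is an illustration only: the class name `ForestOracle`, the dictionary-based storage, the 1-based edge index, and the uniform choice in `random()` are assumptions, not details given in the slides.

```python
import random


class ForestOracle:
    """Hypothetical query oracle for a graph stored as adjacency lists."""

    def __init__(self, adjacency, seed=None):
        # adjacency: dict mapping each vertex to the ordered list of its neighbors
        self._adj = adjacency
        self._vertices = list(adjacency)
        self._rng = random.Random(seed)
        self.queries = 0  # total number of queries issued so far

    def deg(self, v):
        """Return the degree of vertex v."""
        self.queries += 1
        return len(self._adj[v])

    def adj(self, v, i):
        """Return the vertex reached from v via its i-th edge (assumed 1-based)."""
        self.queries += 1
        return self._adj[v][i - 1]

    def random(self):
        """Return a randomly chosen vertex (assumed uniform over all vertices)."""
        self.queries += 1
        return self._rng.choice(self._vertices)


# Usage: the vertex v with neighbors A, B, C from the adjacency list picture.
oracle = ForestOracle({"v": ["A", "B", "C"], "A": ["v"], "B": ["v"], "C": ["v"]}, seed=0)
degree_of_v = oracle.deg("v")
first_neighbor = oracle.adj("v", 1)
sampled = oracle.random()
```

Counting queries in the oracle itself makes it easy to compare an algorithm's cost against a poly(log n) budget.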
9. Forest-Isomorphism
ε-Farness : d(G, H) := minimum # of edge additions / deletions needed to
transform G into a graph isomorphic to H. (Graph edit distance)
For ε > 0, (G, H) are ε-far from being isomorphic ⇔ d(G, H) ≥ εn.
Objective : Distinguish G ≅ H from d(G, H) ≥ εn.
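For very small graphs, the distance d(G, H) can be computed by brute force, which helps build intuition for the definition. The sketch below assumes both graphs have vertices 0, …, n-1 and tries every bijection; the function name `edit_distance` is illustrative, and the exponential running time makes it unsuitable for anything but toy inputs.

```python
from itertools import permutations


def edit_distance(n, edges_g, edges_h):
    """Minimum number of edge additions/deletions turning G into a copy of H.

    Both graphs are assumed to have vertex set {0, ..., n-1}.
    Tries all n! bijections, so this is for tiny graphs only.
    """
    eg = {frozenset(e) for e in edges_g}
    eh = {frozenset(e) for e in edges_h}
    best = None
    for perm in permutations(range(n)):
        # Relabel G's edges under the bijection u -> perm[u].
        mapped = {frozenset(perm[u] for u in e) for e in eg}
        # Edges in exactly one of the two graphs must be added or deleted.
        cost = len(mapped ^ eh)
        if best is None or cost < best:
            best = cost
    return best


# Usage: a path and a star on 4 vertices, and a relabeled copy of the path.
path = [(0, 1), (1, 2), (2, 3)]
star = [(0, 1), (0, 2), (0, 3)]
relabeled_path = [(3, 2), (2, 0), (0, 1)]
d_path_star = edit_distance(4, path, star)
d_path_path = edit_distance(4, path, relabeled_path)
```

The path and the star share at most two edges under any relabeling, so one deletion and one addition are needed.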
10. Forest-Isomorphism
Motivation
The problem is fundamental: forests are a simple structure, and
isomorphism is a theoretically important problem.
Isomorphism has been considered in the property testing
literature. [AS’05, AS’08, NS’11]
11. Forest-Isomorphism
Related Work
If there is no restriction on the input, graph isomorphism testing
in the adjacency list model requires Ω(√n) queries. [FM’08]
This motivates our focus on forests.
If the input is a bounded-degree hyperfinite graph, then graph
isomorphism is constant-time testable. [NS’11]
But without a degree bound, testability was unknown.
12. Our Contribution
Furthermore, we obtained a more general result:
If the input is a forest, every graph property is testable in
poly(log n) queries in the adjacency list model.
We use a technique similar to that of [Newman and Sohler’11].
Query complexity
Upper bound poly(log n)
Lower bound Ω(√log n)
14. Overview of Our Method
1. Partitioning oracle:
We define a procedure that removes a
small fraction of the edges to partition
the graph into several parts with
“good” properties.
2. We check whether each pair of corresponding
parts in G and H is isomorphic or far
from being so.
If G and H are far from being isomorphic,
there is at least one pair of corresponding parts
in G and H that is also far from being
isomorphic.
[Figure: G and H are each passed through the partitioning oracle.]
15. Partitioning Oracle
Partitioning Oracle: Given ε > 0 and access to G, there exist an
integer s = s(ε) and a subgraph G’ ⊆ G s.t.
• |E(G) – E(G’)| ≤ εn / 3
• Each connected component of G’ is either an
s-bounded-degree tree or an s-rooted tree.
s-rooted tree:
A tree T in which there exists v ∈ V(T) s.t.
deg(v) ≥ s and (size of each subtree hanging from v) < s.
(We call the vertex v a root.)
s-bounded-degree tree:
A tree in which
(degree of each vertex) < s.
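The two component types can be checked directly from their definitions. The sketch below is a minimal illustration, assuming a tree given as a dictionary of neighbor lists and reading "sub-tree" as the component hanging from a neighbor of the root once the root is removed; the names `classify_tree` and `subtree_size` are hypothetical.

```python
def subtree_size(tree, node, parent):
    """Number of vertices in the subtree of `node` when `parent` is removed."""
    size = 0
    stack = [(node, parent)]
    while stack:
        current, came_from = stack.pop()
        size += 1
        for neighbor in tree[current]:
            if neighbor != came_from:
                stack.append((neighbor, current))
    return size


def classify_tree(tree, s):
    """Return (kind, root) for a tree given as dict: vertex -> list of neighbors.

    kind is "s-bounded-degree", "s-rooted", or "neither"; root is the
    qualifying vertex for an s-rooted tree and None otherwise.
    """
    if all(len(neighbors) < s for neighbors in tree.values()):
        return "s-bounded-degree", None
    for v, neighbors in tree.items():
        if len(neighbors) >= s and all(subtree_size(tree, c, v) < s for c in neighbors):
            return "s-rooted", v
    return "neither", None


# Usage: a star with 5 leaves and a path on 4 vertices, both with s = 3.
star = {0: [1, 2, 3, 4, 5], 1: [0], 2: [0], 3: [0], 4: [0], 5: [0]}
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
star_kind = classify_tree(star, 3)
path_kind = classify_tree(path, 3)
```

A tree that is neither kind is exactly what the oracle has to cut further by removing edges.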
16. Partitioning Oracle
We can provide query access to G’.
Alive Edge Query: Check if edge (v, i) still exists in G’.
Example: for a vertex v whose edges 1, 2, 3 lead to A, B, C:
(v, 1) : not alive
(v, 2) : not alive
(v, 3) : alive
The subgraph G’ is chosen deterministically.
If G ≅ H, then G’ ≅ H’.
17. Partitioning Oracle
So…
d(G, H) = 0 ⇒ d(G’, H’) = 0,
because G’ and H’ are chosen deterministically.
d(G, H) ≥ εn ⇒ d(G’, H’) ≥ εn / 3,
because we remove at most εn / 3 edges from each of G and H.
Thus, it is enough to consider the partitioned graphs G’ and H’.
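The second implication follows from a triangle inequality on the edit distance. A short derivation, assuming (as the oracle's guarantee is stated per graph) that at most εn/3 edges are removed from each of G and H, so that d(G, G') ≤ εn/3 and d(H, H') ≤ εn/3:

```latex
\begin{align*}
\varepsilon n \le d(G, H)
  &\le d(G, G') + d(G', H') + d(H', H) \\
  &\le \tfrac{\varepsilon n}{3} + d(G', H') + \tfrac{\varepsilon n}{3},
\end{align*}
hence $d(G', H') \ge \varepsilon n - \tfrac{2\varepsilon n}{3} = \tfrac{\varepsilon n}{3}$.
```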
18. Graph Partition
Suppose that G is obtained through the partitioning oracle.
We split G into the following parts for some constants α,γ>1.
G[0] := s-bounded-degree trees in G
G[1] := s-rooted trees in G with root degrees in [s, αγ)
G[2] := s-rooted trees in G with root degrees in [αγ, αγ^2)
G[3] := s-rooted trees in G with root degrees in [αγ^2, αγ^3)
...
O(log n) parts
[Figure: G split into the parts G[0], G[1], G[2], ...]
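The geometric bucketing of root degrees can be sketched as a small function. This is an illustration under assumptions: the name `part_index` is hypothetical, integer values of α and γ are used only to keep the arithmetic exact, and the sample constants are not the ones from the paper.

```python
def part_index(root_degree, s, alpha, gamma):
    """Index i >= 1 of the part G[i] holding an s-rooted tree with this root degree.

    G[1] covers degrees in [s, alpha*gamma); for i >= 2, G[i] covers
    [alpha*gamma**(i-1), alpha*gamma**i). Requires gamma > 1 so the loop ends.
    """
    if root_degree < s:
        raise ValueError("the root of an s-rooted tree has degree at least s")
    if gamma <= 1:
        raise ValueError("gamma must be greater than 1")
    index = 1
    upper = alpha * gamma  # exclusive upper end of the current bucket
    while root_degree >= upper:
        upper *= gamma
        index += 1
    return index


# Usage with illustrative constants s = 4, alpha = 4, gamma = 2:
# G[1] = [4, 8), G[2] = [8, 16), G[3] = [16, 32), ...
indices = [part_index(d, s=4, alpha=4, gamma=2) for d in (4, 7, 8, 31, 32)]
```

Because the bucket boundaries grow geometrically and degrees are at most n, only O(log n) indices can occur, matching the count on the slide.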
19. Isomorphism between Each Pair of Parts
The graph partition is useful in the following sense.
Lemma. d(G, H) ≤ Σ_i d(G[i], H[i]).
Proof. Transforming G[i] into H[i] for each i transforms
G into H. □
Corollary. If d(G, H) ≥ εn, then for β_i > 0 with Σ_i β_i = ε,
∃i s.t. d(G[i], H[i]) ≥ β_i n. □
Thus, it suffices to test isomorphism between G[i] and H[i]
for each i = 0, 1, 2, ….
We set β_0 = ε/2, β_1 = β_2 = … = O(ε / log n).
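One way to realize the budget split is to give half of ε to the bounded-degree part and divide the other half evenly among the rooted parts. The sketch below assumes this even split and a known number of rooted parts; the slides only state β_0 = ε/2 and β_i = O(ε / log n), so the exact constant and the name `proximity_budgets` are illustrative.

```python
def proximity_budgets(eps, num_rooted_parts):
    """Return [beta_0, beta_1, ..., beta_k] with beta_0 = eps/2 and sum equal to eps.

    The even split of the remaining eps/2 over the k rooted parts is an
    assumption; with k = O(log n) it gives beta_i proportional to eps / log n.
    """
    if num_rooted_parts < 1:
        raise ValueError("need at least one rooted part")
    rooted_share = eps / (2 * num_rooted_parts)
    return [eps / 2] + [rooted_share] * num_rooted_parts


# Usage: eps = 0.1 and 20 rooted parts.
betas = proximity_budgets(0.1, 20)
total = sum(betas)
```

If every part passed its test with d(G[i], H[i]) < β_i n, the lemma would give d(G, H) < εn, which is the contrapositive of the corollary.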
20. Isomorphism between Each Pair of Parts
Testing G[i] ≅ H[i]
For i=0 : We can use a tester for the bounded-degree model
[NS’11].
For i≥1 : We develop a new algorithm.
Sketch :
1. We randomly sample root vertices.
2. For each sampled root vertex, we randomly sample its subtrees and
create a histogram of subtrees.
3. We compute the minimum matching between
the histograms in G and H.
This minimum matching turns out to be a good
approximation to d(G, H).
[Figure: histogram of subtree types under a root, with counts such as 2, 2, 1, …]
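The histogram step can be illustrated with a canonical encoding of rooted subtrees. The sketch below is a simplification, not the algorithm from the talk: it reads every subtree of a root instead of sampling, and it compares two histograms by a plain L1 difference rather than the minimum matching mentioned above. All function names are hypothetical.

```python
from collections import Counter


def canonical_form(tree, node, parent):
    """Parenthesis encoding of the subtree rooted at `node` (AHU-style).

    Isomorphic rooted subtrees get identical strings because the
    encodings of the children are sorted before concatenation.
    """
    child_codes = sorted(
        canonical_form(tree, child, node) for child in tree[node] if child != parent
    )
    return "(" + "".join(child_codes) + ")"


def subtree_histogram(tree, root):
    """Count how often each subtree shape occurs below `root`."""
    return Counter(canonical_form(tree, child, root) for child in tree[root])


def histogram_l1(hist_a, hist_b):
    """L1 difference of two histograms (a stand-in for the matching cost)."""
    return sum(abs(hist_a[key] - hist_b[key]) for key in set(hist_a) | set(hist_b))


# Usage: one root with subtrees {edge, leaf, leaf}, another with three leaves.
tree_g = {0: [1, 2, 3], 1: [0, 4], 2: [0], 3: [0], 4: [1]}
tree_h = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
hist_g = subtree_histogram(tree_g, 0)
hist_h = subtree_histogram(tree_h, 0)
difference = histogram_l1(hist_g, hist_h)
```

Since every subtree of an s-rooted tree has fewer than s vertices, the number of distinct shapes depends only on s, which is what keeps the histograms small.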
21. Conclusion
If the input is a forest, every graph property is testable in
poly(log n) queries.
Future Work?
Can we obtain similar results for graph classes larger than forests?
Outerplanar graphs, bounded-treewidth graphs,
scale-free graphs, …
Query complexity
Upper bound poly(log n)
Lower bound Ω(√log n)
Actually O(log^2^poly(1/ε)(n))
23. Lower bound - Overview
1. We construct two distributions of inputs, D1 and D2.
∀(G, H) ∈ D1, G ≅ H
∀(G, H) ∈ D2, d(G, H) ≥ n/8
2. We reduce isomorphism testing to checking whether two
probability distributions are the same. This requires
Ω(√N) queries.
24. Lower bound
Let F_k := n / (2^k log n) copies of a star graph with 2^k vertices.
(Remark that |F_k| = n / log n.)
[Figure: the families F_0, F_1, F_2, F_3, …, F_{log n}]
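The remark that every family has n / log n vertices can be checked numerically. The sketch below assumes base-2 logarithms, n a power of two, and values of k for which the copy count is a positive integer; the function name `star_family` is illustrative.

```python
import math


def star_family(n, k):
    """Return (number of stars, vertices per star) for the family F_k.

    Assumes log means log base 2 and that 2**k * log2(n) divides n.
    """
    log_n = int(math.log2(n))
    copies = n // (2 ** k * log_n)
    vertices_per_star = 2 ** k
    return copies, vertices_per_star


# Usage: n = 2**16, so log n = 16 and each family should have 4096 vertices.
n = 2 ** 16
family_sizes = []
for k in range(0, 13):
    copies, vertices_per_star = star_family(n, k)
    family_sizes.append(copies * vertices_per_star)
```

Each family carries the same number of vertices, so a random vertex lands in every family present in the graph with equal probability.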
25. Lower bound
Construct two distributions D1, D2 :
D1 : G = H, with
G = F_0 ∪ F_1 ∪ … ∪ F_{log n}
H = F_0 ∪ F_1 ∪ … ∪ F_{log n}
D2 : randomly assign each F_k to
either G or H so that
|V(G)| = |V(H)|.
[Figure: F_0, F_1, …, F_{log n}, each placed in either G or H]
26. Lower bound
Because we can perform only “random-sampling” and
(degree/neighbor)-queries, checking whether G ≅ H is equivalent to
checking whether two probability distributions are the same.
Lemma. We need Ω(√log n) queries to distinguish D1 and D2.
[Figure: probability of observing each of F_0, F_1, F_2, …, F_{log n} by random sampling, shown for G and for H; panel label: G=H.]
27. Lower bound
Lemma. ∀(G, H) ∈ D2, d(G, H) ≥ n/8
Proof.
Let Φ : V(G) → V(H) be a bijection that achieves the minimum graph edit
distance. It holds that
d(G, H) ≥ Σ_{v∈V(G)} |deg(v) – deg(Φ(v))| / 2.
If we restrict v in the sum to the roots of the stars, we obtain
d(G, H) ≥ Σ_{k=2,3,4,...} (n / (2^k log n)) ∙ 2^{k-1}/2 ≥ n/8. □
Thus, the Ω(√log n) lower bound holds.
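The arithmetic behind the final bound can be spelled out. A short check, assuming the sum runs over k = 2, …, log n (the slide only writes k = 2, 3, 4, ...) and reading the exponents in the slide's expression as 2^k and 2^{k-1}:

```latex
\begin{align*}
\sum_{k=2}^{\log n} \frac{n}{2^{k}\log n}\cdot\frac{2^{k-1}}{2}
  &= \sum_{k=2}^{\log n} \frac{n}{4\log n}
   = \frac{(\log n - 1)\,n}{4\log n}
   \ge \frac{n}{8}
   \qquad \text{whenever } \log n \ge 2 .
\end{align*}
```

Each term is independent of k because the number of stars in F_k shrinks at the same rate as their root degrees grow.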