Finding Euler Tours in the StrSort Model by Kliemann, Lasse et al.
Finding Euler Tours in the StrSort Model
Lasse Kliemann Jan Schiemann Anand Srivastav
Department of Computer Science
Kiel University
Christian-Albrechts-Platz 4
24118 Kiel, Germany
{lki,jasc,asr}@informatik.uni-kiel.de
arXiv:1610.03412v1 [cs.DS] 11 Oct 2016
Abstract: We present a first algorithm for finding Euler tours in undirected
graphs in the StrSort model. This model is a relaxation of the semi-streaming
model. The graph is given as a stream of its edges and can only be read
sequentially, but while doing a pass over the stream we are allowed to write
out another stream, which will be the input for the next pass. In addition,
items in the stream are sorted between passes. This model was introduced by
Aggarwal et al. in 2004. Here we apply this model to the problem of finding
an Euler tour in a graph (or deciding that the graph does not admit an
Euler tour). The algorithm works in two steps. In the first step, a single pass
is conducted, which requires an amount of RAM linear in the number of
vertices n. In the second step, O(log(n)) passes are conducted while only
O(log(n)) RAM is required.
We use a variant of the algorithm of Atallah and Vishkin from 1984
for finding Euler tours in parallel. It finds a partition into edge-disjoint circuits
and arranges them in a tree expressing their connectivity. Then the circuits
are merged according to this tree. In order to minimize the amount of
RAM needed, we avoid storing the entire tree and use techniques suggested
by Aggarwal et al. to exchange information concerning the merging of circuits.
1 Introduction
For the processing of large graphs, the graph streaming or semi-streaming model has been
studied extensively in the last decade. In this model, the graph is given as a stream of its
edges, meaning that only sequential access is possible. Random-access memory (RAM)
is restricted to O(n · polylog(n)) edges at a time. This makes the model inapplicable
to problems where the size of the solution alone can be larger than that. In the Euler
tour problem, we are looking for a closed walk in an undirected graph such that each
edge is visited exactly once (or we wish to determine that the graph does not admit such
a walk). The solution size (in the positive case) can be of order Θ(n^2), since it contains
all edges of the graph. This problem hence calls for a relaxation of the graph streaming
model.
1.1 StrSort and W-Stream
Aggarwal et al. [7, 1] presented a less restrictive streaming model, called the StrSort model.
It consists of alternating streaming and sorting passes. A streaming pass consists of a
Turing machine with local memory of size m and two tapes. On one tape, the Turing
machine reads a sequence S = x_1, ..., x_k of k ∈ N items. On the other tape, an output
stream is written. On both tapes, the Turing machine can move only left-to-right. In a
sorting pass, a Turing machine with a global partial order sorts items on a tape according
to this order and gives the sorted items as output.
Definition 1. StrSort(p_Str, p_Sort, m) is the class of functions computable by the
composition of up to p_Str streaming passes and p_Sort sorting passes, each with memory m,
where:
• the local memory is maintained between streaming passes
• streams produced at intermediate stages are of length O(n), where n is the length
of the input stream.
Using only O(polylog(n)) memory space and O(polylog(n)) passes is sufficient for solving
many graph problems in this streaming model, such as minimum spanning tree, maximal
independent set and mincut [7], hence the following definition of Aggarwal et al.:
Definition 2. PL-StrSort := ∪_k StrSort(O(log^k n), O(log^k n))
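To illustrate how streaming and sorting passes alternate, the following minimal Python sketch checks whether all vertex degrees are even. It is not taken from the paper: the function names are hypothetical, the sorting pass is simulated by Python's built-in sort, and the intermediate stream has length 2m. The last streaming pass only keeps a constant number of counters in memory.

```python
def streaming_pass_endpoints(edge_stream):
    """Streaming pass: read the edges left to right and write one item per endpoint."""
    for u, v in edge_stream:
        yield u
        yield v


def sorting_pass(stream):
    """Sorting pass, treated as a black box (assumption: simulated by sorted())."""
    return sorted(stream)


def all_degrees_even(edge_stream):
    """Second streaming pass: equal labels are adjacent after sorting, so one
    running counter per run is enough to check the parity of every degree."""
    current, count = None, 0
    for x in sorting_pass(streaming_pass_endpoints(edge_stream)):
        if x != current:
            if count % 2 == 1:
                return False
            current, count = x, 0
        count += 1
    return count % 2 == 0
```

The sorting pass is what places all occurrences of a vertex label next to each other, so the streaming pass never needs more than the current label and its counter.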
Demetrescu et al. [5] showed for a few graph problems that the sorting steps are not
necessary. In the so-called W-Stream model, which uses only the streaming steps (i.e.
StrSort(p_Str, 0, m)), they show a tradeoff between internal memory and streaming passes
for undirected connectivity and single-source shortest paths in directed graphs.
1.2 Euler tours
The Euler tour problem is one of the fundamental problems of graph theory. Given a
graph G = (V, E), find an Euler tour or state that the graph is not Eulerian. In the RAM
model, finding Euler tours in polynomial time is relatively easy, and there are multiple
well-known algorithms for that task. But the problem gets more complicated when considering
a big data environment in the form of a streaming or external memory model. For the
latter, an algorithm of Atallah and Vishkin [2] for finding Euler tours in the PRAM model is used.
The algorithm has a running time of O(log(n)) and uses n + m processors, where n is
the number of vertices and m is the number of edges in G. Since PRAM algorithms can
be transferred to external memory [3], this result can be remodeled to get an external
memory algorithm solving the Euler tour problem in O(log(n) sort(n + m)) I/Os. While
the different problem “Euler tour on a tree” has been considered in several papers (e.g. [4], also
with a transfer of PRAM algorithms), to the best of our knowledge the classical Euler
tour problem has not been considered in a streaming model before.
1.3 Our contribution
We give the 2-step StrSort-algorithm EulerStr for finding an Euler tour in a given graph
G = (V ,E ) with n := |V | and m := |E |. The first step is a single pass W-stream
algorithm with memory space O(n log(n)), that is, the bound which is usually used
in the semi-streaming environment. The second step is a PL-StrSort algorithm with
O(log(n)) alternating streaming and sorting passes and O(log(n)) memory space. The
stream length will be O(m) the whole time. We use the technique of Atallah and Vishkin
for finding Euler tours in parallel, but with two differences:
• The algorithm of Atallah and Vishkin uses memory space of a size inappropriate
for a streaming environment. We limit the memory space needed in the different
steps using the storage of suitable subgraphs and different standard techniques of
the StrSort model.
• In contrast to the algorithm of Atallah and Vishkin, we do not store the predecessor
edge in the Euler tour for every edge. Instead, we output the edges in the order given
by the Euler tour found. This can be useful for further processing of the Euler tour.
2 Preliminaries
Let G = (V, E) be an undirected graph with vertex set V and edge set E. A walk of
length k is an alternating sequence v_1 − e_1 − v_2 − e_2 − ... − v_k − e_k − v_{k+1} of vertices
and edges, where e_i = {v_i, v_{i+1}} for all i ∈ {1, ..., k}. A trail is a walk without repeated
edges, i.e. for all i, j ∈ {1, ..., k}: i ≠ j ⇔ e_i ≠ e_j. A circuit is a trail with the property
v_1 = v_{k+1}, i.e. a closed trail. An Euler tour is a circuit that uses each edge in E exactly
once. A graph that contains an Euler tour is called Eulerian. A path is a walk without
repeated vertices or edges. A cycle is a circuit with v_i ≠ v_j for all i, j ∈ {1, ..., k} with i ≠ j.
A rooted tree is a tree in which one vertex r is designated as the root. In a rooted tree, the
depth of a vertex v is the length of the unique path to the root. The vertex u adjacent
to v which is on the v-r-path is called the predecessor of v. If for a vertex w, v is the
predecessor of w, then w is called a successor of v. An out-tree is a rooted, directed tree
where all edges point to the respective successor. For a directed edge e⃗ = (u, v), u is
called the tail and v the head of e⃗.
For an undirected graph G = (V, E), each vertex is represented by a distinct number
from the set {1, ..., n} with n := |V|. The input stream consists of the m edges of G, given
in random order.
3 General idea of EulerStr
Let G = (V, E) be an undirected graph. Unless said otherwise, we define n := |V| and
m := |E| for the rest of the paper. The algorithm EulerStr will test whether the graph is
Eulerian and, if it is, will output directed edges in order (u_1, v_1), ..., (u_m, v_m) with the
following properties:
• x_i ∈ V for all x ∈ {u, v}, i ∈ {1, ..., m}
• for all e ∈ E there is exactly one i ∈ {1, ..., m} with e = {u_i, v_i}
• v_i = u_{i+1} for all i ∈ {1, ..., m − 1}, and v_m = u_1
Hence the sequence u_1 − {u_1, v_1} − v_1 − {u_2, v_2} − v_2 − ... − {u_m, v_m} − v_m is a closed trail
that uses each edge exactly once, i.e. an Euler tour. We will often describe walks, circuits
etc. analogously as a sequence of directed edges instead of an alternating sequence of
vertices and undirected edges. This way, when sorting edges we can sort by the label
of either the head or the tail, and do not have to consider the arbitrary inner order of
undirected edges.
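The three output properties can be checked mechanically. The following Python sketch is only an illustration under the assumption that the whole output fits into memory (which the streaming algorithm itself does not require); the function name is hypothetical.

```python
from collections import Counter


def is_euler_tour(edges, tour):
    """Check that the directed edge sequence `tour` uses every undirected edge of
    `edges` exactly once and that consecutive edges are chained cyclically."""
    m = len(edges)
    if len(tour) != m:
        return False
    remaining = Counter(frozenset(e) for e in edges)
    for u, v in tour:
        key = frozenset((u, v))
        if remaining[key] == 0:
            return False  # edge not in G, or used more often than it occurs
        remaining[key] -= 1
    # v_i = u_{i+1} for all i, and v_m = u_1 (indices taken cyclically)
    return all(tour[i][1] == tour[(i + 1) % m][0] for i in range(m))
```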
Remark 1. We use a slight alteration of the algorithm of Atallah and Vishkin [2]. It
consists of three general steps:
1. Partition the graph into q edge-disjoint circuits C_1, ..., C_q.
2. Create an out-tree T = (V′, E′) with V′ = {w_1, ..., w_q} and for all i, j ∈ {1, ..., q}:
(w_i, w_j) ∈ E′ ⇒ C_i and C_j share a common vertex in G.
3. Iteratively: Merge all circuits represented in T by vertices with odd depth with the
circuit represented in T by the predecessor.
Step 1 is easily done in W-Stream with O(n log(n)) memory space, because n edges
fit into internal memory, and every subgraph with n edges contains at least one cycle.
So iteratively, edges can be taken from the input stream until n edges are present in
internal memory. Then, the edges of a circuit can be found, written to the output stream
and deleted from internal memory. If there are edges left in internal memory after the
W-Stream step, the graph is not Eulerian. Alternatively, n variables that keep track of
the degrees of the vertices can be placed in internal memory. This is helpful because of
the following well-known result:
Lemma 1. Let G = (V, E) be an undirected graph. Then G is Eulerian iff every vertex
has even degree and the graph is connected.
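Lemma 1 can be tested in a single pass with O(n) variables. The sketch below is an illustration only: it keeps one degree counter per vertex, as suggested above, and additionally uses a union-find structure for connectivity, which is an assumption of this sketch and not the bookkeeping used later in the paper. Vertices are assumed to be labelled 1, ..., n with n ≥ 1.

```python
def is_eulerian(n, edge_stream):
    """One pass over the edges with O(n) counters: every degree must be even
    and all vertices must end up in the same connected component."""
    degree = [0] * (n + 1)
    parent = list(range(n + 1))  # union-find forest over the vertex labels

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edge_stream:
        degree[u] += 1
        degree[v] += 1
        parent[find(u)] = find(v)

    if any(d % 2 == 1 for d in degree[1:]):
        return False
    root = find(1)
    return all(find(v) == root for v in range(1, n + 1))
```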
The following lemma shows that steps 2 and 3, together with some additional properties,
yield an Euler tour:
Lemma 2. Let G = ⋃_{1≤i≤q} C_i be an Eulerian graph partitioned into q circuits
v_{i_1} − e_i^1 − v_{i_2} − ... − v_{i_{l_i+1}} = v_{i_1} with i ∈ {1, ..., q} and Σ_{i=1}^{q} l_i = m,
where l_i is the length of the circuit C_i.
Let T = (V′, E′) be a rooted tree with V′ = {w_1, ..., w_q}, root w_1 and for i, j ∈ {1, ..., q}:
(w_i, w_j) ∈ E′ ⇒ C_i and C_j share a common vertex in G. For every i ∈ {2, ..., q}, let v_{i_1}
be a vertex that the circuit C_i shares with its predecessor. Then the following recursive
algorithm gives an Euler tour:
Algorithm 1: euler-tree
  S := {2, ..., q} (global);
  output vertex v_{1_1};
  eul-suc(C_1);
Algorithm 2: eul-suc(C_j)
  i := 1;
  repeat
    if w_j has a successor w_k with k ∈ S and v_{j_i} = v_{k_1} then
      S := S \ {k};
      eul-suc(C_k);
    else
      output edge e_j^i and vertex v_{j_{i+1}};
      i := i + 1;
    end
  until i > l_j;
Remark 2. The route in the tree chosen by the algorithm describes an 'Euler tour on a
tree' (for a definition see e.g. [6]).
Proof of Lemma 2: Because of the set S, every vertex w_i in T is considered at most once.
When w_i is considered, with eul-suc(C_i) every edge of C_i is part of the output at some
point. Now we have to show two things:
1. The algorithm runs eul-suc(C_i) for every i ∈ {1, ..., q}.
2. The output is a circuit of G.
With both properties it is shown that the output is an Euler tour. We use induction
over q. For q = 1, the algorithm starts with v_{1_1}, and since C_1 is a circuit that contains
all edges of G in correct order, the output is an Euler tour. Now we assume that both
properties hold for all Eulerian graphs with a partition into q circuits. Let G be an
Eulerian graph with a partition into q + 1 circuits. W.l.o.g. let w_{q+1} be a leaf in the rooted
tree T. Then T \ {w_{q+1}} is a connected graph, therefore G̃ := G \ {e_{q+1}^1, ..., e_{q+1}^{l_{q+1}}} is
connected. When a circuit is deleted from an Eulerian graph and the result is connected,
then this graph is also Eulerian. This graph has a partition into q circuits, so by assumption
the algorithm works for G̃. Let w_j (j ∈ {1, ..., q}) be the predecessor of w_{q+1}. Then at
some point the algorithm runs eul-suc(C_j). Furthermore there is a k ∈ {1, ..., l_j} with
v_{j_k} = v_{(q+1)_1}. In eul-suc(C_j) with variable i = k, the algorithm does not continue with
edge e_j^k until all successors of w_j are taken care of. So at some point eul-suc(C_{q+1})
starts, proving the first property. Since w_{q+1} is a leaf, the algorithm outputs all edges
of C_{q+1} at once in correct order, ending again at vertex v_{j_k}. Therefore, the algorithm
combines an Euler tour of G̃ with the circuit C_{q+1}, resulting in an Euler tour of G,
proving the second property.
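The recursion of Algorithms 1 and 2 can be written down directly. The following Python sketch is an in-memory illustration under these assumptions: each circuit C_i is given as its vertex list [v_{i_1}, ..., v_{i_{l_i}}] (the closing edge back to v_{i_1} is implicit), `children[i]` lists the successors of w_i in T, and the first vertex of every non-root circuit lies on its predecessor circuit. The names `euler_tree`, `circuits` and `children` are hypothetical.

```python
def euler_tree(circuits, children, root=1):
    """Return the Euler tour as a list of directed edges (tail, head)."""
    unvisited = set(circuits) - {root}  # the global set S
    tour = []

    def eul_suc(j):
        verts = circuits[j]
        length = len(verts)
        i = 0
        while i < length:  # same loop as 'repeat ... until i > l_j' (0-based here)
            here = verts[i]
            # successor w_k of w_j with k in S whose first vertex is the current one
            k = next((c for c in children.get(j, ())
                      if c in unvisited and circuits[c][0] == here), None)
            if k is not None:
                unvisited.discard(k)
                eul_suc(k)  # splice in C_k, which returns to `here`
            else:
                tour.append((here, verts[(i + 1) % length]))
                i += 1

    eul_suc(root)
    return tour
```

For deep trees the recursion depth of this sketch can exceed Python's default limit; an explicit stack would avoid that.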
Lemma 2 shows that, if we have an Eulerian graph, a partition into circuits C_1, ..., C_q
and a rooted tree T with the mentioned properties, a vertex w_i in T can be merged with
its predecessor w_j by combining the circuits C_i and C_j, i.e. inserting C_i into C_j at the
right place. For this, we want to make sure that the first vertex of C_i is a common vertex
with C_j, so we do not have to change the order of C_i before combining it with C_j. Notice
that after the merging into a longer circuit C_{j′}, the first vertex of this circuit is still a
common vertex with its predecessor, therefore we have to take care of the order of all
circuits only once. Since in the actual algorithm EulerStr we will store circuits as a sequence
of directed edges, this translates to: The tail of the first directed edge of a circuit C_i has
to be the head of a directed edge of C_j, where w_j is the predecessor of w_i in T.
4 The semi-W-stream step
4.1 Illustrating the step
In this section, we describe the one-pass step of EulerStr with O(n log(n)) memory. In
this pass, we want to perform steps 1 and 2 of Remark 1. For finishing step 2, we will
have to use an additional StrSort(O(1), O(1), log(n)) algorithm, which will be described
in the following section.
As mentioned, in the input stream we have the m edges in random order. The vertices
of G are called {v_1, ..., v_n}. In internal memory we keep the following variables with
O(log(m)) = O(log(n)) space each:
• com_i ∈ {0, ..., n} for i ∈ {1, ..., n}, starting with com_i = 0 for all i ∈ {1, ..., n}
• pre_i ∈ {0, ..., m} for i ∈ {1, ..., n}, starting with pre_i = 0 for all i ∈ {1, ..., n}
• cir ∈ {0, ..., m}, the number of circuits found so far
Additionally, we build a tree T̄ with O(n) vertices in internal memory. It will later be
extended to the desired rooted tree T.
Step 1 of Remark 1 is easily done as explained before. We read up to n edges, find a circuit
C and output the edges in correct order in relation to the circuit, together with the direction in
which the respective edge is traversed. These edges will get 4 log(m) additional memory
space and be called 'graph edges'. In these edges, we store the label cir of the circuit
the edge is in, and the position of the edge in the circuit sequence. Occasionally, we also
output 'information edges'. The purpose and form of these information edges and the
actual memory usage of the graph edges will be explained later.
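The circuit extraction of step 1 can be sketched as follows. Note that this is a variant and not the buffer strategy described above: instead of collecting n edges before searching for a circuit, the sketch keeps a forest of at most n − 1 edges and emits a cycle as soon as a new edge closes one. The output tuples follow the graph edge layout (tail, head, cir, position, 0, 0); the function name and the return values are assumptions of this sketch.

```python
def partition_into_circuits(edge_stream):
    """Return (graph_edges, number_of_circuits, no_edges_left)."""
    forest = {}       # vertex -> set of neighbours in the buffered forest
    graph_edges = []  # stands in for the output stream
    cir = 0

    def tree_path(src, dst):
        """Vertices on the forest path from src to dst, or None if disconnected."""
        parent = {src: None}
        stack = [src]
        while stack:
            x = stack.pop()
            if x == dst:
                path = []
                while x is not None:
                    path.append(x)
                    x = parent[x]
                return path[::-1]
            for y in forest.get(x, ()):
                if y not in parent:
                    parent[y] = x
                    stack.append(y)
        return None

    for u, v in edge_stream:
        path = tree_path(u, v)
        if path is None:  # no cycle closed yet: buffer the edge
            forest.setdefault(u, set()).add(v)
            forest.setdefault(v, set()).add(u)
            continue
        cir += 1
        cycle = path + [u]  # u -> ... -> v, closed by the new edge v -> u
        for pos in range(len(cycle) - 1):
            graph_edges.append((cycle[pos], cycle[pos + 1], cir, pos + 1, 0, 0))
        for a, b in zip(path, path[1:]):  # delete the emitted edges from memory
            forest[a].discard(b)
            forest[b].discard(a)

    no_edges_left = not any(forest.values())
    return graph_edges, cir, no_edges_left
```

If edges remain in the forest at the end of the stream, the forest has a leaf, so some vertex of G has odd degree and the graph is not Eulerian.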
For l ∈ {1, ..., q} let G_l be the graph consisting of all vertices and edges that are used
in at least one of the circuits C_1, ..., C_l. For i ∈ {1, ..., n}, the variable com_i keeps track of the
connected component the vertex v_i is currently in, considering the current graph G_l.
The variable pre_i stores the label of the first circuit found that uses the vertex v_i.
The tree T̄ is constructed as follows: We create a vertex w_l ∈ T̄ every time a found circuit
C_l has at least one of the following properties:
1. pre_i = 0 for some i ∈ {1, ..., n} with v_i ∈ C_l
2. com_i ≠ com_j for some i, j ∈ {1, ..., n} with v_i, v_j ∈ C_l
So for every circuit C_l that contains a vertex not used before, or connects two connected
components in the graph G_{l−1}, a vertex w_l in T̄ is created. For each property, there
can be at most n circuits fulfilling it, so the graph T̄ has O(n) vertices. Edges in T̄ are
built in the following way: Let G_i be the graph that contains all vertices and edges used
by the circuits C_1, ..., C_i. If a circuit C_{i+1} is found that has vertices of the connected
components A_1, ..., A_k in G_i, let v_{i_1}, ..., v_{i_k} ∈ V with v_{i_j} ∈ A_j for all j ∈ {1, ..., k}. Let
C_{j_1}, ..., C_{j_{k′}} be the circuits stated in pre_{i_1}, ..., pre_{i_k}, i.e. the circuits that used the vertices
v_{i_1}, ..., v_{i_k} for the first time. Then the edges {w_{i+1}, w_{j_1}}, ..., {w_{i+1}, w_{j_{k′}}} are added to T̄.
The vertices w_{j_1}, ..., w_{j_{k′}} exist, because the circuits C_{j_1}, ..., C_{j_{k′}} fulfill property 1.
Example:
[Figure 1: Partition into circuits (cycles here). The figure shows the cycles C_1, ..., C_5 on the vertices v_1, ..., v_9.]
Figure 1 gives an example on a graph with nine vertices v_1, ..., v_9. Assume that the circuits
found are the cycles C_1, ..., C_5 in that order. C_1 fulfills property 1, so a vertex w_1 in T̄ is
created. We set pre_i = 1 and com_i = 1 for i ∈ {5, 7, 8}. C_2 only has property 1 and shares
the vertex v_7 with C_1 (this information is stored in pre_7), so com_6 = 1, com_9 = 1 and
w_2 is created in T̄ with edge {w_1, w_2}. Furthermore pre_6 = 2 and pre_9 = 2, because v_6
and v_9 are used for the first time. With C_3, we set pre_i = 3 for i ∈ {1, 2, 3, 4} and have a
new connected component in G_3 with com_i = 3 for i ∈ {1, 2, 3, 4}. We place a vertex w_3
in T̄ without additional edges. C_4 only has property 2 and connects the components '1'
and '3'. Vertices v_1 and v_5 are selected with com_5 = 1 and com_1 = 3. We create a vertex
w_4, and since pre_5 = 1 and pre_1 = 3, we connect the vertex with edges {w_4, w_1} and
{w_4, w_3} in T̄. The circuit C_5 has neither of the two properties, so there is no additional
vertex in T̄. However, to get the extended graph T, we will store an 'information edge' in
the output stream, containing the information that T̄ with vertex w_5 and edge {w_5, w_3}
(selected because v_2 ∈ C_5 and pre_2 = 3) would still be a tree. The result is shown in
Figure 2.
[Figure 2: Creating the graph T̄. The figure shows the tree vertices w_1, w_2, w_3, w_4.]
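The bookkeeping with com_i and pre_i in the example can be replayed with a short Python sketch. It is a simplified illustration, not Algorithm circuit-find itself: circuits are passed as vertex lists, one tree edge is created per touched component, and the vertex sets used for C_4 and C_5 below are hypothetical, since the figure only fixes which components and first-use circuits they touch.

```python
def build_tree(n, circuits):
    """Return (tree_vertices, tree_edges, info_only), where info_only lists the
    pairs (predecessor, circuit) of circuits that get no vertex in the tree."""
    pre = [0] * (n + 1)  # pre[i]: first circuit that used vertex i (0 = unused)
    com = [0] * (n + 1)  # com[i]: component label of vertex i (0 = unused)
    tree_vertices, tree_edges, info_only = [], [], []
    for cir, verts in enumerate(circuits, start=1):
        new = [v for v in verts if pre[v] == 0]
        old = [v for v in verts if pre[v] != 0]
        comps = {}  # touched component label -> one previously used vertex in it
        for v in old:
            comps.setdefault(com[v], v)
        if new or len(comps) > 1:  # property 1 or property 2
            tree_vertices.append(cir)
            for rep in comps.values():
                tree_edges.append((pre[rep], cir))
        else:  # neither property: only an information edge is recorded
            info_only.append((pre[old[0]], cir))
        target = min(comps) if comps else cir
        for v in new:
            pre[v] = cir
            com[v] = target
        for v in range(1, n + 1):  # unite all touched components
            if com[v] in comps:
                com[v] = target
    return tree_vertices, tree_edges, info_only
```

With C_1 = (v_5, v_7, v_8), C_2 = (v_7, v_6, v_9), C_3 = (v_1, v_2, v_3, v_4) and the assumed vertex lists (v_1, v_3, v_5) for C_4 and (v_2, v_4, v_6) for C_5, the sketch reproduces the tree of Figure 2 and the information edge between w_3 and w_5.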
We have to show that the resulting graph is a tree. In that case, the graph can be stored
in internal memory.
Lemma 3. After the streaming procedure, T̄ is a tree.
4.2 Graph edges and information edges
We store two kinds of edges in the stream: graph edges, which are the actual edges in G
with additional information, and information edges, which represent the tree T. A graph
edge e_i^k of circuit C_i has 2 log(n) + 4 log(m) memory space and is at first set up as
follows (l_i is the length of circuit C_i):
e_i^k := (v_{i_k}, v_{i_{k+1}}, i, k, 0, 0) for k ∈ {1, ..., l_i}    (1)
• {v_{i_k}, v_{i_{k+1}}} is the original edge in G.
• e_i^k ∈ C_i and, walking on C_i, e_i^k is passed from v_{i_k} to v_{i_{k+1}}.
• k is the placement of e_i^k in C_i in the order stored in the output stream.
• Later when merging circuits, the last two variables will help to represent the
predecessor circuit C_j and the placement k′ of the edge of C_j behind which the circuit
C_i will be inserted.
Information edges are the edges built in T̄ and later T. They also contain additional
information. Since we need a rooted tree, variables concerning this are placed in these
edges. They have log(n) + 4 log(m) memory space and are built as follows:
f_i^j := (i, j, d_i, v, p_i)    (2)
• f_i^j represents the edge {w_i, w_j} ∈ T, and w_i is the predecessor of w_j in T.
• d_i is the depth of w_i in T.
• v is a common vertex of C_i and C_j in G.
• Similar to graph edges, p_i will be the placement of the edge in C_i which has v
as its head, so when merging C_i and C_j, this can be done by inserting C_j into C_i
behind this edge. But for now, this memory space will be used for storing different
variables.
4.3 The algorithm
Algorithm 3: circuit-find
input : Undirected graph G = ({v_1, ..., v_n}, E) with edges in random order, m := |E|
output: m graph edges and q information edges for q ≤ m
1 com_i := 0 for all i ∈ {1, ..., n};
2 pre_i := 0 for all i ∈ {1, ..., n};
3 cir := 0;
4 s := false, s_cr := false; // indicates whether a vertex in T̄ will be or has been created
5 s_edge := 0; // indicates a potential edge in T̄
6 s_vert := 0; // indicates a common vertex in G
7 T̄ := (V̄′, Ē′), V̄′ := ∅, Ē′ := ∅;
8 S_comp := {0}; // keeps track of conn. comp. concerning the current circuit
9 com∗ := 0;
10 repeat
11   read stream until (n edges are in internal memory) or (end of stream);
12   find circuit C = v_{i_1} − e_i^1 − ... − v_{i_{l_i}} − e_i^{l_i} − v_{i_1} with vertices v_{i′_1}, ..., v_{i′_{l′}} (l_i, l′ ∈ N);
13   if there is no such circuit, return 'graph is not Eulerian';
14   cir := cir + 1;
15   new-test(C);
16   comp-test(C);
17   if s = false then
18     output information edge (s_edge, cir, 0, v_{s_vert}, 1);
19     sort C, s.t. C = v_{i_1} − e_1 − ... − v_{i_{l_i}} − e_{l_i} − v_{i_1} with v_{i_1} = v_{s_vert};
20   end
21   for j := 1 to l_i − 1 do
22     output graph edge (v_{i_j}, v_{i_{j+1}}, cir, j, 0, 0);
23   end
24   output graph edge (v_{i_{l_i}}, v_{i_1}, cir, l_i, 0, 0);
25   delete C in internal memory;
26   s := false, s_cr := false, s_edge := 0, s_vert := 0, S_comp := {0}, com∗ := 0;
27 until (end of stream) and (no edges in internal memory);
28 for i := 1 to n − 1 do
29   if com_i ≠ com_{i+1} then
30     return 'graph is not Eulerian'
31   end
32 end
33 write T̄ as rooted tree;
34 for every w_i ∈ V̄′, let d_i be the depth of w_i in T̄;
35 for every information edge (i, j, 0, v, 0) in internal memory output information edge (i, j, d_i, v, 0);
Algorithm 4: new-test
1 for j := 1 to l′ do
2   if pre_{i′_j} = 0 then
3     s := true;
4     pre_{i′_j} := cir;
5   end
6   else
7     if s_edge = 0 then
8       s_edge := pre_{i′_j};
9       s_vert := i′_j;
10      S_comp := S_comp ∪ {com_{i′_j}};
11      com∗ := com_{i′_j}
12    end
13  end
14 end
15 if s = true then
16   create vertex w_cir, V̄′ := V̄′ ∪ {w_cir};
17   if s_edge ≠ 0 then
18     create edge {w_cir, w_{s_edge}}, Ē′ := Ē′ ∪ {{w_cir, w_{s_edge}}};
19     create information edge (s_edge, cir, 0, v_{s_vert}, 0);
20   end
21   else
22     for j := 1 to l′ do
23       com_{i′_j} := cir
24     end
25   end
26 end
Algorithm 5: comp-test
1 if com∗ ≠ 0 then
2   for j := 1 to l′ do
3     if com_{i′_j} ≠ com∗ then
4       if s = false then
5         s := true;
6         create vertex w_cir, V̄′ := V̄′ ∪ {w_cir};
7         create edge {w_cir, w_{s_edge}}, Ē′ := Ē′ ∪ {{w_cir, w_{s_edge}}};
8         create information edge (s_edge, cir, 0, v_{s_vert}, 0);
9       end
10      if com_{i′_j} ∉ S_comp then
11        create edge {w_cir, w_{pre_{i′_j}}}, Ē′ := Ē′ ∪ {{w_cir, w_{pre_{i′_j}}}};
12        create information edge (pre_{i′_j}, cir, 0, v_{i′_j}, 0);
13        S_comp := S_comp ∪ {com_{i′_j}};
14      end
15    end
16  end
17  for k := 1 to n do
18    if com_k ∈ S_comp \ {com∗} then
19      com_k := com∗;
20    end
21  end
22 end
Remark 3. When algorithm circuit-find has found a circuit C_i in step 12, it is tested whether C_i
uses a vertex of G for the first time (new-test) or connects connected components in G_{i−1}
(comp-test). In new-test, steps 2 to 5 test whether a vertex is used for the first time. If this is
the case, s indicates that a new vertex w_i is created in the tree T̄. Steps 6 to 13 test whether
the circuit uses a vertex used by a circuit C_j before. If w_i is created, an edge {w_i, w_j}
is stored and an information edge is output (steps 17 to 20). S_comp keeps track of the
connected components in G_{i−1} touched by C_i. If C_i only uses new vertices, they will be a
connected component in G_i. This is noted in steps 21 to 25. Algorithm comp-test starts if
C_i uses a vertex used before. Let A_k be the connected component of that vertex in G_{i−1}.
In comp-test it is tested whether C_i uses vertices which are not in A_k and not used for the first
time. If this happens for the first time, and there is not already a vertex w_i in T̄, such a
vertex is created in steps 4 to 9 with the necessary graph and information edge. Otherwise,
just the graph and information edge are made. In steps 17 to 21 the variables com_k are
updated. If after new-test and comp-test there is still no vertex w_i in T̄, in steps 17 to 20
of circuit-find an information edge is output. The last entry is '1', indicating that C_i has
no representative in T̄. In step 19, the circuit is reordered such that the tail of the first edge
is a common vertex with the circuit noted in the information edge. The connectivity of G is
tested in steps 28 to 32. Finally the rooted tree is built, and the stored information edges
are updated and output.
5 PL-StrSort algorithm
5.1 Merging circuits
The information edges indicate a rooted tree T as in Lemma 2. Let us have two circuits
C_i, C_j and an information edge e = (i, j, d, v, p), where w_i is the predecessor of w_j in
T, d is the depth of w_i in T, v ∈ V is a common vertex of C_i and C_j in G and p ∈ N
is the position of an edge in C_i which has v as its head. If v is the tail of the first edge
representing C_j, then the two circuits can be merged in the following way:
The graph edges of C_i stay the same with e_i^k := (v_{i_k}, v_{i_{k+1}}, C_i, k, 0, 0) for k ∈ {1, ..., l_i},
where l_i is the length of the circuit, and the graph edges of C_j are changed to e_j^k :=
(v_{j_k}, v_{j_{k+1}}, C_i, p, C_j, k) for k ∈ {1, ..., l_j}. When sorting these edges by the
last four labels (from left to right), both circuits are placed in the same region because
of the label C_i. Furthermore with label 4, C_j is placed between the edges p and p + 1 of
C_i, and since edge p of C_i has the common vertex v as its head and v_{j_1} = v, the resulting
order is a circuit containing the edges of C_i and C_j. With the labels 5 and 6, the inner order
of C_j is maintained, even if multiple circuits are inserted at position p of circuit C_i.
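The effect of the relabelling can be seen in a small Python sketch. It is an in-memory illustration only, with circuit labels represented by integers and a hypothetical function name; the sort key is the tuple of the last four labels, as described above.

```python
def merge_by_sorting(parent_edges, child_edges, i, j, p):
    """Relabel the graph edges of C_j to (tail, head, i, p, j, k) and sort all
    edges by their last four labels, which splices C_j in behind edge p of C_i."""
    relabelled = [(u, v, i, p, j, k) for (u, v, _c, k, _a, _b) in child_edges]
    return sorted(parent_edges + relabelled, key=lambda e: e[2:])
```

For C_1 = 1 → 2 → 3 → 1 and C_2 = 3 → 4 → 5 → 3 with common vertex v = 3, the edge of C_1 with head v has position p = 2, and sorting yields the edge order (1,2), (2,3), (3,4), (4,5), (5,3), (3,1).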
Getting the information needed to change the graph edges of C_j is the task of
the information edge. But first we have to take care of a few things that could not be
finished in the last algorithm.
5.2 Preparations
We are missing a few key points for the merging to work:
1. Every circuit C_i with w_i ∈ T̄ was output before the predecessor in T was decided.
The order of its graph edges has to be changed, so that the tail of the first
edge is a common vertex with the predecessor in T.
2. The information edges with a vertex not contained in T̄ were output before the
rooted tree was made, so they miss the information about the depth of the
predecessor in T.
3. All information edges lack the last piece of information: the position of the graph edge of
the predecessor circuit behind which the successor circuit will be inserted.
Point 3 will not be a problem. The algorithm will iteratively merge circuits and produce
information edges belonging to a rooted tree T′ with height about half the height of the original
tree T. At that point, the information edges will again miss the information about graph
edge positions.
We will now show StrSort algorithms with O(1) passes and O(log(n)) memory
space for each of problems 1 and 2. Analogously to the strategies in [7] and [1], the sorting
step is used to put edges needing information next to edges having said information, so
both can be put in internal memory for information transfer during the next streaming
step.
5.2.1 Rotating circuits
Let C_j be a circuit with w_j ∈ T̄. If d_j > 0, w_j has a predecessor w_i in T̄. The information
edge f_i^j contains a common vertex v of C_i and C_j, but the order of C_j stored in the graph
edges was not changed according to v during algorithm circuit-find. The order of C_j can
be changed as follows:
• Sort the graph edges by circuit label and placement, and the information edges by
successor circuit, s.t. in the stream a circuit is stored directly behind the information
edge with the corresponding successor circuit.
• While streaming a circuit C_j, keep the information edge f_i^j and the first graph edge
e_j^1 of the circuit in internal memory. Count the number l_j of edges in the circuit,
and find the placement p of the edge with v as its tail. Store both values in
the last two entries of e_j^1.
• Output and delete f_i^j and e_j^1 after reaching the next circuit in the stream (in most
cases an information edge). Continue with the next circuit.
• Sort the same way as before.
• The necessary values l_j and p are stored in e_j^1. In the next streaming step, after
reaching e_j^1 and storing these values, output (v_{j_1}, v_{j_2}, j, ((1 − p) mod l_j) + 1, 0, 0)
and delete e_j^1.
• Read graph edges e_j^k := (v_{j_k}, v_{j_{k+1}}, j, k, 0, 0) and output
(v_{j_k}, v_{j_{k+1}}, j, ((k − p) mod l_j) + 1, 0, 0) for k ∈ {2, ..., l_j}.
• Delete p and l_j. Continue with the next circuit.
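The renumbering ((k − p) mod l_j) + 1 moves the edge with tail v to position 1 and keeps the cyclic order. The following Python sketch applies it to one circuit held in memory; the two streaming passes and the intermediate sort of the procedure above are collapsed into one function, and the function name is hypothetical.

```python
def rotate_circuit(graph_edges, v):
    """Renumber the graph edges (tail, head, c, k, 0, 0) of one circuit so that
    the edge with tail v gets position 1, then return them in the new order."""
    l = len(graph_edges)
    # placement p of the (first) edge that has v as its tail
    p = next(k for (tail, _h, _c, k, _a, _b) in graph_edges if tail == v)
    rotated = [(t, h, c, ((k - p) % l) + 1, 0, 0)
               for (t, h, c, k, _a, _b) in graph_edges]
    return sorted(rotated, key=lambda e: e[3])
```

Python's `%` operator returns a non-negative result for a positive modulus, so k < p needs no special case.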
5.2.2 Information edges and depth
Let C_j be a circuit with w_j ∉ T̄. Then there is exactly one information edge with second
entry j. Let C_i be the stored predecessor circuit and f_i^j be the corresponding information
edge. Then w_i ∈ T̄, and the last entry of f_i^j is '1'. The depth d_i needed in f_i^j is determined by two cases:
• w_i is the root of T̄. Then d_i = 0.
• w_i has a predecessor w_k in T̄. Then the information edge concerning {w_i, w_k}
contains the depth d_k of w_k, and d_i = d_k + 1.
With two simple stream steps and one sort step, f_i^j = (i, j, 0, v, 1) for some v ∈ V will
get the needed information from f_k^i, if it exists:
• Change f_i^j = (i, j, 0, v, 1) to (j, i, 0, v, 1), i.e. swap predecessor and successor; the
swap is marked by the '1' in the last entry of f_i^j.
• Sort the information edges lexicographically according to the successor (the second
entry) and the last entry.
• The information edges with second entry 'i' will now appear consecutively in the
next input stream.
• If before (j, i, 0, v, 1) there is no edge with a '0' as last entry and second entry 'i',
output a depth of 0, i.e. (i, j, 0, v, 0).
• If there is an edge with a '0' as last entry, e.g. (k, i, d_k, v, 0), then for all edges
(j, i, 0, v, 1) with i as second entry output (i, j, d_k + 1, v, 0).
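The swap, sort and stream steps above can be sketched in Python as follows. This is an in-memory illustration with a hypothetical function name; information edges are tuples (i, j, d, v, flag), and the only state carried along the sorted stream is the current second entry and one depth value.

```python
def fill_depths(info_edges):
    """Give every information edge (i, j, 0, v, 1) the depth of its predecessor w_i."""
    # Stream step 1: swap predecessor and successor in the edges marked with 1.
    swapped = [(j, i, d, v, 1) if flag == 1 else (i, j, d, v, flag)
               for (i, j, d, v, flag) in info_edges]
    # Sort step: by second entry, then by last entry (edges with 0 come first).
    swapped.sort(key=lambda e: (e[1], e[4]))
    # Stream step 2: carry the depth over within each group of equal second entry.
    out, current, depth = [], None, 0
    for a, b, d, v, flag in swapped:
        if b != current:
            current, depth = b, 0  # w_b is the root unless an edge with flag 0 follows
        if flag == 0:
            depth = d + 1  # (k, i, d_k, v, 0): the depth of w_i is d_k + 1
            out.append((a, b, d, v, 0))
        else:
            out.append((b, a, depth, v, 0))  # swap back and store the depth
    return out
```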
5.3 The merging step
Now we come to the merging step explained in Section 5.1. Due to algorithm circuit-find
and the two preparation steps, the graph edges and information edges have the following
properties:
1. For the q circuits found, let i ∈ {1, ..., q}. Then circuit C_i of length l_i is represented
by the l_i graph edges e_i^j = (v_{i_j}, v_{i_{j+1}}, i, j, 0, 0) for j ∈ {1, ..., l_i − 1} and
e_i^{l_i} = (v_{i_{l_i}}, v_{i_1}, i, l_i, 0, 0).
2. Let T = (V′, E⃗′) with V′ := {w_1, ..., w_q} and (e⃗_i^j ∈ E⃗′ ⇔ there exists an
information edge with circuit entries i and j in that order). Then T is an out-tree on q
vertices. Let h be the height of T.
3. For i, j ∈ {1, ..., q} let f_i^j be an information edge. Then the edge has the form
f_i^j = (i, j, d_i, v, 0), where w_i is the predecessor of w_j in T, d_i is the depth of w_i and
v is a common vertex of C_i and C_j. Furthermore v_{j_1} = v.
The algorithm will output graph edges and information edges s.t. these properties are
still fulfilled and the out-tree represented by the information edges has height ⌊h/2⌋.
The number of graph edges will stay the same, still representing the edges of G. After
O(log(h)) = O(log(n)) iterations of the algorithm, the underlying out-tree has a height
of 0, so the graph edges form a single circuit, i.e. an Euler tour of G.
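The halving of the height can be illustrated on the tree alone. The following Python sketch is an assumption-laden illustration that ignores graph edges entirely: the tree is given by hypothetical `parent` and `depth` dictionaries, every vertex of odd depth is merged into its predecessor, and the surviving vertices of even depth d are reattached to their former grandparent with new depth d/2.

```python
def contract_odd_levels(parent, depth):
    """Merge every vertex of odd depth into its predecessor and return the
    parent and depth maps of the resulting out-tree."""
    new_parent, new_depth = {}, {}
    for w, d in depth.items():
        if d % 2 == 1:
            continue  # w is merged into its predecessor and disappears
        # the old predecessor (odd depth) was merged into the grandparent
        new_parent[w] = None if d == 0 else parent[parent[w]]
        new_depth[w] = d // 2
    return new_parent, new_depth
```

On a path of height 4 the result has height 2, and on a path of height 3 it has height 1, matching ⌊h/2⌋.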
Algorithm 6: Algorithm tree-merge
input : Graph edges e ji for some i , j ∈ {1, ...,n} and information edges f ji for some
i , j ∈ {1, ...,m} fulfilling the properties above with a graph T of height h
output: Graph edges and information edges representing an out-tree T ′ of height bh/2c
and fulfilling the properties above
1 count := 0;
2 for all f ji = (i , j , di , v , 0) with di odd do
3 change information edge to (j , i , di , v , 1);
4 end
5 sort:
- information edges in front of graph edges
- information edges: (i1, j1, di1 , v1, x1) < (i2, j2, di2 , v2, x2)⇔ (j1 < j2) or
(j1 = j2 and x1 < x2) or (j1 = j2 and x1 = x2 and i1 < i2)
- order of graph edges does not matter
6 stream: for every information edge (i , j , di , v , 0) (with 0 as last entry) do
7 store i in internal memory and output (i , j , di , v , 0) ;
8 as long as information edges of form (i ′, j , dj , v ′, 1) are read, output (i , i ′, dj , v ′, 0)
instead;
9 end
10 sort:
- information edges with odd depth after every other edge, order does not matter
- information edges, even depth: (i1, j1, di1 , v , 0) < (i2, j2, di2 , v ′, 0)⇔ (i1 < i2) or
((i1 = i2) and (v < v ′))
- graph edge and information edge with even depth:
(vij , vi(j+1) , i , j , 0, 0) < (i ′, j ′, di ′ , v ′, 0)⇔ (i < i ′) or ((i = i ′) and (vi(j+1) ≤ v ′))
- graph edges: (vij , vi(j+1) , i , j , 0, 0) < (vi ′j ′ , vi ′(j ′+1) , i ′, j ′, 0, 0)⇔ (i < i ′) or
((i = i ′) and (vi(j+1) < vi ′(j ′+1))) or
((i = i ′) and (vi(j+1) = vi ′(j ′+1)) and (j < j ′))
11 stream: for every graph edge (vij , vij+1 , i , j , 0, 0) do
12 read all information edges of even depth until the next graph edges follows;
13 for each such information edge (i ′, j ′, di ′ , v ′, 0), output (i ′, j ′, di ′ , v ′, j ) instead;
14 end
15 sort:
- graph edges: (vij , vij+1 , i , j , 0, 0) < (vi ′j ′ , vi ′j ′+1 , i ′, j ′, 0, 0)⇔ (i < i ′) or
(i = i ′ and j < j ′)
- information edges: (i1, j1, di1 , v , x1) < (i2, j2, di2 , v ′, x2)⇔ (j1 < j2)
- information edge and graph edge: (i ′, j ′, di ′ , v ′, 0) < (vij , vij+1 , i , j , 0, 0)⇔ (j ′ ≤ j )
16 stream: for every information edge (i ′, j ′, di ′ , v ′, x ) with even di ′ do
17 store i ′ and x in internal memory, delete the information edge without output;
18 as long as graph edges (vij , vij+1 , i , j , 0, 0) are read, output (vij , vij+1 , i ′, x , i , j )
instead;
19 end
20 tree-merge2;
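Steps 2 to 9 of tree-merge can be sketched in Python as follows. This is a simplified in-memory illustration under stated assumptions, not the streaming implementation: the function name relink_odd_depth_edges is hypothetical, information edges are modelled as 5-tuples (i, j, d_i, v, flag), and Python's sorted order stands in for the sorting pass. The single loop keeps only one stored value, mirroring the small internal memory of the streaming pass.

```python
def relink_odd_depth_edges(info_edges):
    """Prepare information edges for the next iteration (steps 2 to 9).

    info_edges: list of tuples (i, j, d_i, v, 0), where circuit i is the
    predecessor of circuit j, d_i is the depth of i, and v is a common vertex.
    """
    # Steps 2-4: flip edges whose predecessor has odd depth and mark them with 1.
    flipped = [
        (j, i, d, v, 1) if d % 2 == 1 else (i, j, d, v, 0)
        for (i, j, d, v, _flag) in info_edges
    ]
    # Step 5: sort by second entry, then by marker, then by first entry.
    flipped.sort(key=lambda e: (e[1], e[4], e[0]))
    # Steps 6-9: one pass that remembers only the last unmarked predecessor.
    out = []
    stored = None
    for (a, b, d, v, flag) in flipped:
        if flag == 0:
            stored = a                      # predecessor of circuit b
            out.append((a, b, d, v, 0))
        else:
            # a is a successor of b; its new predecessor is b's predecessor.
            out.append((stored, a, d, v, 0))
    return out


# Hypothetical example: circuit 1 is the root, 2 is its successor,
# and circuits 3 and 4 are successors of 2.
example = [(1, 2, 0, "a", 0), (2, 3, 1, "b", 0), (2, 4, 1, "c", 0)]
relinked = relink_odd_depth_edges(example)
```

In the example, circuits 3 and 4 end up with circuit 1 as predecessor, which is the circuit that absorbs circuit 2 in the current iteration.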
Algorithm 7: Continuation tree-merge2
1 sort:
- information edges in front of graph edges, order does not matter
- graph edges: (vij , vij+1 , i¯ , j¯ , i , j ) < (vi ′j ′ , vi ′j ′+1 , i¯ ′, j¯ ′, i ′, j ′)⇔ (i¯ < i¯ ′) or
(i¯ = i¯ ′ and j¯ < j¯ ′) or (i¯ = i¯ ′ and j¯ = j¯ ′ and i < i ′) or
(i¯ = i¯ ′ and j¯ = j¯ ′ and i = i ′ and j < j ′)
2 stream:
3 for every information edge (i , j , di , v , 0) do
4 change to (i , j , ((di − 1)/2), v , 0);
5 end
6 repeat
7 count := 2;
8 read graph edge (vij , vij+1 , i , j , 0, 0);
9 store i , output (vij , vij+1 , i , 1, 0, 0) and delete graph edge; repeat
10 read graph edge (vi ′j ′ , vi ′j ′′ , i , x , i ′, y), output (vi ′j ′ , vi ′j ′′ , i , count , 0, 0) and delete
graph edge;
11 count := count + 1;
12 until a graph edge is read that does not have i as its third entry;
13 until end of stream;
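The effect of tree-merge2 on the graph edges can be illustrated by the following Python sketch. It is a simplified in-memory version under stated assumptions, not the streaming procedure itself: the function name renumber_after_merge is hypothetical, graph edges are 6-tuples (u, w, target circuit, insert position, old circuit, old position), and untouched edges carry 0, 0 in the last two entries as in the algorithm.

```python
from itertools import groupby


def renumber_after_merge(graph_edges):
    """Sort edges as in step 1 of tree-merge2, then number them consecutively."""
    # Sort by target circuit, insert position, old circuit, old position.
    # Edges of the target circuit itself end in (0, 0), so they come before
    # the edges inserted at the same position.
    ordered = sorted(graph_edges, key=lambda e: (e[2], e[3], e[4], e[5]))
    out = []
    for target, group in groupby(ordered, key=lambda e: e[2]):
        # New positions 1, 2, 3, ... within each resulting circuit.
        for count, (u, w, *_rest) in enumerate(group, start=1):
            out.append((u, w, target, count, 0, 0))
    return out


# Hypothetical example: circuit 1 is a-b-c-a, circuit 2 is b-d-b.
# Circuit 2 is inserted after position 1 of circuit 1 (the edge ending in b).
edges = [
    ("a", "b", 1, 1, 0, 0), ("b", "c", 1, 2, 0, 0), ("c", "a", 1, 3, 0, 0),
    ("b", "d", 1, 1, 2, 1), ("d", "b", 1, 1, 2, 2),
]
merged = renumber_after_merge(edges)
```

In the example, the merged circuit visits a, b, d, b, c, a, and each edge ends where the next one starts.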
Remark 4. Since a circuit Ci is merged with its predecessor circuit iff di is odd, the
information edges with odd predecessor depth are not used in this iteration. Instead, these
edges have to be prepared for the next iteration; steps 5 to 9 serve that purpose. Information
edges with odd predecessor depth store the predecessor of the predecessor, because
that will be the predecessor in the next iteration. In steps 10 to 14, the information edges
concerning circuit merges learn the position at which the successor circuit has to
be inserted. The information edges share this knowledge with the graph edges in steps 15
to 19. In tree-merge2, the circuit insertions take place, and the graph edges are renumbered
according to their new circuit and position.
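How an information edge learns its insert position in steps 10 to 14 can be sketched in Python as follows. This is an illustrative in-memory version under stated assumptions: the function name assign_insert_positions is hypothetical, graph edges are 6-tuples (u, w, i, j, 0, 0), information edges are 5-tuples (i, j, d_i, v, 0), the tuple length is used to tell them apart, and graph edges are simply passed through unchanged.

```python
def assign_insert_positions(graph_edges, info_edges):
    """Tag every even-depth information edge with an insert position."""

    def key(e):
        if len(e) == 6:                     # graph edge (u, w, i, j, 0, 0)
            return (0, e[2], e[1], 0, e[3])
        i, _j, d, v, _x = e
        if d % 2 == 0:                      # right after graph edges of circuit i
            return (0, i, v, 1, 0)          # whose end vertex is at most v
        return (1, 0, 0, 0, 0)              # odd depth: after every other edge

    stream = sorted(graph_edges + info_edges, key=key)   # step 10
    out, last_pos = [], None
    for e in stream:                        # steps 11-14
        if len(e) == 6:
            last_pos = e[3]                 # position of the last graph edge seen
            out.append(e)
        elif e[2] % 2 == 0:
            i, j, d, v, _x = e
            # For valid input, v lies on circuit i, so the preceding graph
            # edge belongs to circuit i and ends in v.
            out.append((i, j, d, v, last_pos))
        else:
            out.append(e)
    return out


# Hypothetical example: circuit 1 is 10-20-30-10, circuit 2 is 20-40-20,
# circuit 3 is a successor of circuit 2 (odd predecessor depth).
graph = [
    (10, 20, 1, 1, 0, 0), (20, 30, 1, 2, 0, 0), (30, 10, 1, 3, 0, 0),
    (20, 40, 2, 1, 0, 0), (40, 20, 2, 2, 0, 0),
]
info = [(1, 2, 0, 20, 0), (2, 3, 1, 40, 0)]
tagged = assign_insert_positions(graph, info)
```

In the example, the information edge for circuit 2 receives position 1, the position of the edge of circuit 1 that ends in the common vertex 20, while the odd-depth information edge is left untouched at the end of the stream.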
Lemma 4. Including the preparation algorithms of Section 5.2, algorithm tree-merge
is a PL-StrSort algorithm with O(log(n)) alternating streaming and sorting passes and
O(log(n)) memory space.
Theorem 5. Algorithm EulerStr, consisting of circuit-find, the preparation steps and
tree-merge, has the following properties:
1. In an undirected graph it finds an Euler tour, if one exists.
2. The first part is a single-pass W-stream algorithm with O(n log(n)) memory space.
3. The second part is a PL-StrSort algorithm.
4. The stream never exceeds a length of O(m).
6 Conclusion
We have presented an algorithm for finding Euler tours in undirected graphs in the
StrSort model. It uses a single-pass preparation step with O(n log(n)) memory space,
followed by a PL-StrSort algorithm. This result raises several open questions:
• Can the preparation step be replaced by a StrSort algorithm using O(log(n))
passes and memory space? In this case, the Euler tour problem could be solved
entirely by a PL-StrSort algorithm. However, as implied by Ruhl [7], finding cycles
might be difficult.
• Are there further problems that can be solved by a PL-StrSort algorithm once a
single pass with larger memory has been performed? Such a preparation step might
be a useful addition to the StrSort model.
• Since the algorithm of Atallah and Vishkin can be used for directed graphs, can our
algorithm be altered to work on them? A direct transfer is not possible, because
we cannot find directed cycles in one pass with only O(n log(n)) memory space. This
would require a way of finding directed cycles in the StrSort model.
• With the algorithm of Atallah and Vishkin, an external memory algorithm can be
designed which uses O(log(n) sort(n + m)) I/O steps for finding an Euler tour.
Since for O(log(n)) memory space the StrSort model is more restrictive than the
external memory model, can our technique be transferred to external memory to
improve the current result? Again, this would require running the preparation step
with less memory space and probably more passes.
References
[1] Gagan Aggarwal, Mayur Datar, Sridhar Rajagopalan, and Matthias Ruhl. On the
streaming model augmented with a sorting primitive. In Proceedings of the 45th
Annual IEEE Symposium on Foundations of Computer Science, FOCS ’04, pages
540–549, Washington, DC, USA, 2004. IEEE Computer Society.
[2] Mikhail Atallah and Uzi Vishkin. Finding Euler tours in parallel. J. Comput. Syst.
Sci., 29(3):330–337, December 1984.
[3] Yi-Jen Chiang, Michael T. Goodrich, Edward F. Grove, Roberto Tamassia, Darren
Erik Vengroff, and Jeffrey Scott Vitter. External-memory graph algorithms.
In Proceedings of the Sixth Annual ACM-SIAM Symposium on Discrete Algorithms,
SODA ’95, pages 139–149, Philadelphia, PA, USA, 1995. Society for Industrial and
Applied Mathematics.
[4] Camil Demetrescu, Bruno Escoffier, Gabriel Moruz, and Andrea Ribichini. Adapting
parallel algorithms to the W-stream model, with applications to graph problems.
Theor. Comput. Sci., 411(44-46):3994–4004, October 2010.
[5] Camil Demetrescu, Irene Finocchi, and Andrea Ribichini. Trading off space for passes
in graph streaming problems. ACM Trans. Algorithms, 6(1):6:1–6:17, December 2009.
[6] Dinesh P. Mehta and Sartaj Sahni. Handbook of Data Structures and Applications.
Chapman & Hall/CRC Computer and Information Science Series. Chapman &
Hall/CRC, 2004.
[7] Jan Matthias Ruhl. Efficient Algorithms for New Computational Models. PhD thesis,
Cambridge, MA, USA, 2003. AAI0805714.
