A logical reconstruction of Batcher’s mergers, or, bitonicity is a red herring by Hinze, Ralf & Martin, Clare
Hinze, R. and Martin, C. (2017) 'A Logical Reconstruction of Batcher’s Mergers Or: Bitonicity is a Red 
Herring', Journal of Universal Computer Science, 23 (1), pp. 21-41. 
 
http://www.jucs.org/jucs_23_1/a_logical_reconstruction_of 
 
 
This document is the Version of Record. 
License: https://creativecommons.org/licenses/by-nc-nd/4.0  
Available from RADAR: https://radar.brookes.ac.uk/radar/items/51b735e3-2eb7-4f3b-adaa-31b2ccac49cc/1/ 
 
  
 
Copyright © and Moral Rights are retained by the author(s) and/or other copyright owners unless otherwise waived in
a license stated or linked to above. A copy can be downloaded for personal non-commercial research or study, without
prior permission or charge. This item cannot be reproduced or quoted extensively from without first obtaining
permission in writing from the copyright holder(s). The content must not be changed in any way or sold commercially
in any format or medium without the formal permission of the copyright holders.
A Logical Reconstruction of Batcher’s Mergers
Or: Bitonicity is a Red Herring
Ralf Hinze
(Institute for Computing and Information Sciences
Radboud University, 6525EC Nijmegen, The Netherlands
ralf@cs.ru.nl)
Clare Martin
(Department of Computing and Communication Technologies
Oxford Brookes University, Wheatley, Oxford, OX33 1HX, England
cemartin@brookes.ac.uk)
Abstract: Almost half a century after Batcher wrote his seminal paper on sorting
networks, we revisit the key algorithmic design decisions for oblivious merging to re-
discover his schemes in a disciplined way. The design space of sorting networks is ex-
plored, resulting in a systematic reconstruction of schemes that appear in the literature
in various guises.
Key Words: Hardware design, functional programming, parallel algorithms
Category: D3.2, D3.3
1 Introduction
To iterate is human, to recurse divine.
L. Peter Deutsch
In 1968, K.E. Batcher introduced two related schemes for merging networks: the
bitonic merger and the merge exchange network. The former method requires
more comparisons but appears to be the more popular of the two [Batcher 1968].
Both are instances of comparison networks, which can only perform comparisons
between single pairs of inputs, but these operations can be executed in parallel.
The concept was originally devised for describing hardware implementations in
the 1950s [Knuth 1998, O’Connor and Nelson 1962] but is equally applicable
to parallel algorithms on multiple processors. The model has a rich underlying
theory and has become an active topic of research since its introduction. Pro-
cedures presented in this model are oblivious in the sense that the comparisons
they perform are independent of the input data.
The aim of this paper is to explore the design space for oblivious merging
systematically, starting from minimal assumptions. So, whilst most other formal
treatments of these algorithms have focused on correctness proofs, the emphasis
here is on derivation from first principles, with simultaneous verification. The
result is a comprehensive overview of various schemes and presentations, with
the connections between them made precise.
Journal of Universal Computer Science, vol. 23, no. 1 (2017), 21-41
submitted: 29/6/16, accepted: 3/12/16, appeared: 28/1/17 © J.UCS
In particular, our contribution is to show explicitly how the two mergers of
Batcher are derived by similar calculations from identical assumptions, differing
only in one initial design decision. The effects of using different strategies to
divide the input are also explored and used to relate various rearrangements of
Batcher’s methods. Correctness proofs are often expressed in terms of the zero-
one principle [Knuth 1998]. The style adopted here favours the fundamental
monotonicity property of comparison networks instead, as the resulting proofs
and derivations are more enlightening. On a related note, we also ﬁnd Batcher’s
notion of bitonicity dispensable—both mergers can be derived using monotonic-
ity alone. In summary, we show how the remarkable intuition of Batcher can be
explained in a methodical way through the judicious use of algebraic manipula-
tion.
Batcher’s mergers have been presented in a variety of sometimes compet-
ing styles. These can be roughly categorized along the following dimensions:
algebraic versus diagrammatic, imperative versus functional, recursive versus it-
erative, and sequential versus parallel. For this enterprise—deriving the mergers
from ﬁrst principles—we found that an algebraic or functional style using recur-
sion equations as implementable speciﬁcations of the algorithms worked best.
This practice, now so routine for functional programmers, is part of the abun-
dant legacy of David Turner [Turner 1982]. His language designs, SASL, KRC,
and Miranda, emphasized elegance and clarity. We hope that our derivations
are in his spirit, conveying the beauty of the equational approach to algorithm
design.
We begin with some preliminary notation and background in Section 2. Then
we explore the design space for oblivious merging in Sections 3 and 4, deriving
two mergers in the process. We claim that the ﬁrst is Batcher’s bitonic merger,
and Sections 5 and 6 are devoted to proving this claim. The second merger is
instantly recognizable from previous work [Hinze and Martin 2017a] as the merge
exchange scheme. Building on these mergers, Section 7 explores the design space
for oblivious sorting and relates the resulting schemes to the literature. Section 8
continues to review related work and, ﬁnally, Section 9 concludes.
2 Preliminaries
2.1 Lattices and order
Sorting networks work well if the underlying structure is a distributive lattice
instead of a total order. The output is not necessarily a permutation of the
input in this setting, but it is still an ordered, generalized permutation [Bove
and Coquand 2006]. We assume only the existence of a fixed partial order ≤
22 Hinze R., Martin C.: A Logical Reconstruction ...
and lattice operators meet ↓ and join ↑, which distribute over each other and are
deﬁned by:
x ≤ a ∧ x ≤ b ⇐⇒ x ≤ a ↓ b (1a)
a ↑ b ≤ x ⇐⇒ a ≤ x ∧ b ≤ x (1b)
If a and b are comparable, then these operators are simply the minimum and
maximum. It is also useful to postulate that our lattices are bounded by a bottom
element ⊥ and a top element ⊤:
⊥ ≤ a ∧ a ≤ ⊤ (2)
Algebraically, bottom is the unit of join and, dually, top is the unit of meet.
2.2 Sequences
We write sequences using angle brackets, e.g. 〈1〉 is a singleton sequence and
〈1, 3, 1, 2, 6, 4〉 is a sequence of length 6. For readability, we shall omit the angle
brackets on the singleton sequences 〈⊥〉 and 〈⊤〉. We construct sequences in two
ways: using concatenation and interleaving. In the former case, we have:
〈x0, . . . , xm−1〉 · 〈y0, . . . , yn−1〉 = 〈x0, . . . , xm−1, y0, . . . , yn−1〉
For emphasis, we write x ‖ y for x · y if both arguments have the same length
(or, are implicitly required to have the same length). Interleaving is deﬁned for
equal-length arguments only:
〈x0, . . . , xn−1〉 ⋈ 〈y0, . . . , yn−1〉 = 〈x0, y0, . . . , xn−1, yn−1〉
Together, these operators satisfy the following interchange law:
(u ‖ v) ⋈ (x ‖ y) = (u ⋈ x) ‖ (v ⋈ y) (3)
We also use both ‖ and ⋈ as patterns on the left-hand side of definitions: x ‖ y
is halving, dividing an input sequence into a lower half x and an upper half y,
while x ⋈ y is uninterleaving, unzipping the input into the sub-sequence x of
elements at even positions and the sub-sequence y of elements at odd positions
(assuming sequence indexing starts from 0 not 1).
The first element of a sequence x is denoted by head x and, dually, the last
element by last x, with tail x and init x the respective remaining parts:
〈head x 〉 · tail x = x = init x · 〈last x 〉
For convenience, the lifting of binary relations, such as ≤, and functions, like
↓ and ↑, to sequences, is denoted by the same symbol as the original:
〈x0, . . . , xn−1〉R 〈y0, . . . , yn−1〉 ⇐⇒ x0Ry0 ∧ · · · ∧ xn−1Ryn−1
〈x0, . . . , xn−1〉 ⊕ 〈y0, . . . , yn−1〉 = 〈x0 ⊕ y0, . . . , xn−1 ⊕ yn−1〉
Liftings interact with concatenation according to another interchange law:
(u⊕ v) · (x⊕ y) = (u · x)⊕ (v · y) (4)
provided that u has the same length as v , and x has the same length as y .
2.3 Ordered sequences
We can use the lifted order to capture that a sequence is ordered :
x ordered ⇐⇒ ⊥ · x ≤ x · ⊤ (5)
Notice that if x is non-empty, then condition (5) is equivalent to init x ≤ tail x.
Concatenations of non-empty sequences enjoy the split property:
x · y ordered ⇐⇒ x ordered ∧ last x ≤ head y ∧ y ordered (6)
while interleavings enjoy the zig-zag property, so-called because of its pictorial
representation, see [Hinze and Martin 2017a]:
x ⋈ y ordered ⇐⇒ x ≤ y ∧ ⊥ · y ≤ x · ⊤ (7)
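The definitions of this section can be tried out on ordinary numbers, where meet and join are simply min and max. The following is a minimal sketch under that assumption; the names `interleave`, `leq`, `ordered` and `zigzag` are illustrative and do not come from the paper, and infinities stand in for the bottom and top elements.

```python
import math

BOT, TOP = -math.inf, math.inf  # stand-ins for the bottom and top elements


def interleave(x, y):
    """Interleave two equal-length sequences: x0, y0, x1, y1, ..."""
    assert len(x) == len(y)
    return [e for pair in zip(x, y) for e in pair]


def leq(x, y):
    """The order lifted pointwise to equal-length sequences."""
    return len(x) == len(y) and all(a <= b for a, b in zip(x, y))


def ordered(x):
    """Definition (5): bottom prepended to x is below x with top appended."""
    return leq([BOT] + list(x), list(x) + [TOP])


def zigzag(x, y):
    """Right-hand side of the zig-zag property (7)."""
    return leq(x, y) and leq([BOT] + list(y), list(x) + [TOP])
```

For example, `ordered(interleave([1, 3], [2, 4]))` and `zigzag([1, 3], [2, 4])` both hold, whereas both fail for `[2, 3]` and `[1, 4]`.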
2.4 Comparison Networks
Merging and sorting networks are data oblivious algorithms in the sense that
the comparisons used to merge or sort a sequence are the same regardless of the
input data. Both are instances of comparison networks. The primary component
of such a network is a comparator, depicted as a box by Batcher, but more
commonly as a so-called Knuth diagram like that below, where horizontal lines
represent wires, and comparators are vertical connections. Data ﬂows from left
to right along the wires. Each comparator outputs the smaller of its two input
values on the lower wire, and the larger value on the upper one:
[Figure: a single comparator with inputs a and b and outputs a ↓ b and a ↑ b.]
In general, a comparison network consists of a number of horizontal wires, which
are vertically connected using comparators. Below are some examples of small
sorting networks for four inputs 〈a1, a2, a3, a4〉:
[Figure: three sorting networks, each taking inputs 〈a1, a2, a3, a4〉 to outputs 〈b1, b2, b3, b4〉.]
Note that many of the comparators can operate in parallel. Consider the second
diagram. From the layout we can easily infer that the ﬁrst two and the last two
comparators can act in parallel. Perhaps less obviously, the same holds for the
two comparators in the middle. They are only drawn beside each other to avoid
overlapping wires. The diagrams below are alternative drawings of the second
and third network.
[Figure: alternative drawings of the second and third network, each taking inputs 〈a1, a2, a3, a4〉 to outputs 〈b1, b2, b3, b4〉.]
We identify two diagrams if one can be transformed into the other by sliding
comparators horizontally, without moving connectors (depicted by ﬁlled circles)
past each other.
To illustrate the use of the combinators introduced in Section 2.2, we de-
ﬁne two fundamental building blocks of merging and sorting networks: low-high
cleaners (also known as half-cleaners) and even-odd cleaners.
low-high-clean (x ‖ y) = (x ↓ y) ‖ (x ↑ y)
even-odd-clean (x ⋈ y) = (x ↓ y) ⋈ (x ↑ y)
[Figure: the low-high cleaner and the even-odd cleaner, each drawn on eight wires numbered 000 to 111.]
Notice that the two cleaners are in a sense dual to each other: numbering the
inputs in binary (most signiﬁcant digit ﬁrst), the low-high cleaner connects in-
puts 0w and 1w, while the even-odd cleaner connects inputs w0 and w1.
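The two cleaners can be sketched on Python lists of numbers, with min and max playing the roles of meet and join. This is only an illustration under that assumption; the function names mirror the paper's low-high-clean and even-odd-clean, and input lengths are assumed to be even.

```python
def low_high_clean(z):
    """Compare wire i of the lower half with wire i of the upper half."""
    h = len(z) // 2
    x, y = z[:h], z[h:]
    lows = [min(a, b) for a, b in zip(x, y)]
    highs = [max(a, b) for a, b in zip(x, y)]
    return lows + highs  # concatenation of the two halves


def even_odd_clean(z):
    """Compare each even-indexed wire with its odd-indexed neighbour."""
    x, y = z[0::2], z[1::2]
    out = []
    for a, b in zip(x, y):
        out += [min(a, b), max(a, b)]  # interleaving of mins and maxes
    return out
```

On the input `[5, 1, 7, 3, 2, 6, 0, 4]` the low-high cleaner compares wires four apart and yields `[2, 1, 0, 3, 5, 6, 7, 4]`, while the even-odd cleaner compares adjacent wires and yields `[1, 5, 3, 7, 2, 6, 0, 4]`.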
Comparator networks enjoy a fundamental monotonicity property [Knuth
1998, Hinze and Martin 2017a] that is central to all of our correctness proofs:
every network nw transforms greater inputs to greater outputs.
x ≤ y =⇒ nw x ≤ nw y (8)
3 Deriving Batcher’s Bitonic Merger
Given these prerequisites, let us now explore the design space for oblivious merg-
ing, starting from ﬁrst principles. A merging network takes two ordered inputs
to an ordered output, as formalized by the speciﬁcation:
x ordered ∧ y ordered =⇒ x ⊎ y ordered (9)
Although this does not state explicitly that the output is a permutation of the
two inputs, any network consisting solely of comparators does indeed produce a
permutation in a generalized sense [Bove and Coquand 2006].
Now, applying an inductive approach to problem solving, the ﬁrst question is
how to reduce a given merging problem to simpler sub-problems. For simplicity,
we restrict attention to input sequences with length an exact power of two. There
are many ways to divide the input, including two principled approaches: each
sequence can either be halved or uninterleaved. We choose the latter option,
as the former seems to go nowhere; roughly speaking, the zig-zag property (7)
is more pleasant to use than the split property (6). This may also explain the
observation [Misra 1994] that most discussions of the principles of parallel sorting
prefer interleaving to halving. So, the ﬁrst assumption is that the inputs to be
merged are treated as interleaved sequences, say s ⋈ t and u ⋈ v.
The next choice is how to combine the sub-sequences in the recursive calls.
There are two options: given the inputs s ⋈ t and u ⋈ v, we can either recursively
combine s with u or with v, partnering t with the remainder. We pursue the first
route in Section 4; the second is explored below. We obtain as an initial sketch:
〈a〉 ⊎ 〈b〉 = 〈a ↓ b, a ↑ b〉
(s ⋈ t) ⊎ (u ⋈ v) = (s ⊎ v) ? (t ⊎ u)
The base case is uncontentious: if both input sequences are singletons, then
the network is a single comparator. For the inductive step, we observe that the
merge of interleavings is not quite the same as the interleaving of the merges, for
example, (〈3〉 ⋈ 〈4〉) ⊎ (〈1〉 ⋈ 〈2〉) ≠ (〈3〉 ⊎ 〈2〉) ⋈ (〈4〉 ⊎ 〈1〉). In other words, the
placeholder ‘?’ above is not just ⋈. But perhaps we can clean things up using a
few additional comparators? The goal of the subsequent calculations is to derive
the additional circuitry denoted by ‘?’.
We can assume that the recursive invocations are proper mergers. In particu-
lar, we make use of the fact that their output is ordered (10b), and that bottom
and top elements are propagated to the front and to the rear respectively (10c).
For reference, the pre-condition of (9) is also reiterated in (10a):
s ⋈ t ordered ∧ u ⋈ v ordered (10a)
s ⊎ v ordered ∧ t ⊎ u ordered (10b)
(⊥ · w) ⊎ z = ⊥ · (w ⊎ z) ∧ (w · ⊤) ⊎ z = (w ⊎ z) · ⊤ (10c)
for all ordered sequences w and z. In case you find the last property unmotivated,
recall that the definition of ‘x ordered’ (5) involves both bottom ⊥ and top ⊤.
Property (10c) makes precise how merge interacts with them.
Since the arguments of merge are given as interleavings, we begin by applying
the zig-zag property (7) to the pre-condition of merge (10a):
s ⋈ t ordered ∧ u ⋈ v ordered
⇐⇒ { zig-zag property (twice) (7) }
s ≤ t ∧ ⊥ · v ≤ u · ⊤ ∧ ⊥ · t ≤ s · ⊤ ∧ u ≤ v
=⇒ { monotonicity of comparison networks (twice) (8) }
s ⊎ (⊥ · v) ≤ t ⊎ (u · ⊤) ∧ (⊥ · t) ⊎ u ≤ (s · ⊤) ⊎ v
⇐⇒ { property of merge (10c) }
⊥ · (s ⊎ v) ≤ (t ⊎ u) · ⊤ ∧ ⊥ · (t ⊎ u) ≤ (s ⊎ v) · ⊤
Abbreviating s ⊎ v by x and t ⊎ u by y, our goal is to establish ‘x ? y ordered’,
deriving the placeholder ‘?’ in the process. It stands to reason that the unknown
circuitry is some interleaving; so we work towards a situation where we can
apply the zig-zag property (7). Introducing the induction assumption (10b), we
continue:
x ordered ∧ ⊥ · x ≤ y · ⊤ ∧ ⊥ · y ≤ x · ⊤ ∧ y ordered
⇐⇒ { definition of ordered (twice) (5) }
⊥ · x ≤ x · ⊤ ∧ ⊥ · x ≤ y · ⊤ ∧ ⊥ · y ≤ x · ⊤ ∧ ⊥ · y ≤ y · ⊤
⇐⇒ { characterization of minimum (1a) and maximum (1b) }
(⊥ · x) ↑ (⊥ · y) ≤ (x · ⊤) ↓ (y · ⊤)
⇐⇒ { interchange law (4) }
(⊥ ↑ ⊥) · (x ↑ y) ≤ (x ↓ y) · (⊤ ↓ ⊤)
⇐⇒ { idempotence: a ↓ a = a and a ↑ a = a }
⊥ · (x ↑ y) ≤ (x ↓ y) · ⊤
⇐⇒ { minimum is smaller than maximum: a ↓ b ≤ a ↑ b }
x ↓ y ≤ x ↑ y ∧ ⊥ · (x ↑ y) ≤ (x ↓ y) · ⊤
⇐⇒ { zig-zag property (7) }
(x ↓ y) ⋈ (x ↑ y) ordered
So the additional circuitry is just a single column of comparators: the even-
odd cleaner of Section 2.4. It is convenient to introduce an inﬁx operator for
the circuitry: x ↕ y = even-odd-clean (x ⋈ y). We can then substitute this new
operator for the placeholder in our original sketch of the merger and by the
preceding calculation we guarantee that the output is sorted.
〈a〉 ⊎ 〈b〉 = 〈a ↓ b, a ↑ b〉 (11a)
(s ⋈ t) ⊎ (u ⋈ v) = (s ⊎ v) ↕ (t ⊎ u) (11b)
x ↕ y = (x ↓ y) ⋈ (x ↑ y)
We claim that this is Batcher’s bitonic merger. This assertion is veriﬁed in Sec-
tions 5 and 6, but ﬁrst let us explore the second design option mentioned at
the start of this section. Batcher called his second arrangement the “odd-even
merger”, but it is also known as the merge exchange network [Knuth 1998].
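Equations (11a) and (11b) translate almost literally into a recursive function. The sketch below assumes numeric inputs (so that min and max serve as meet and join) and two sorted lists of equal power-of-two length; the names `even_odd_cleaner` and `bitonic_merge` are illustrative.

```python
def even_odd_cleaner(x, y):
    """The infix cleaner: interleave the pointwise mins with the pointwise maxes."""
    out = []
    for a, b in zip(x, y):
        out += [min(a, b), max(a, b)]
    return out


def bitonic_merge(x, y):
    """Merge two sorted lists of equal power-of-two length, following (11)."""
    assert len(x) == len(y)
    if len(x) == 1:                      # (11a): a single comparator
        return [min(x[0], y[0]), max(x[0], y[0])]
    s, t = x[0::2], x[1::2]              # uninterleave the first input
    u, v = y[0::2], y[1::2]              # uninterleave the second input
    # (11b): evens are merged with odds, then one column of comparators
    return even_odd_cleaner(bitonic_merge(s, v), bitonic_merge(t, u))
```

For instance, `bitonic_merge([1, 4, 6, 7], [2, 3, 5, 8])` evaluates to `[1, 2, 3, 4, 5, 6, 7, 8]`: the recursive calls return `[1, 3, 6, 8]` and `[2, 4, 5, 7]`, which the cleaner combines.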
4 Deriving Batcher’s Merge Exchange Network
The bitonic merger “mixes” even and odd sub-sequences in the recursive calls.
The only other option is to recurse on the even sequences and on the odd ones:
〈a〉 ⊎ 〈b〉 = 〈a ↓ b, a ↑ b〉
(s ⋈ t) ⊎ (u ⋈ v) = (s ⊎ u) ? (t ⊎ v)
We aim to derive a circuit to substitute for the placeholder in this speciﬁcation
in a similar way as we did in Section 3. Now that s is combined with u instead
of v , the inductive assumption (10b) becomes:
s ⊎ u ordered ∧ t ⊎ v ordered (12)
The derivation then proceeds as before. First we massage the pre-condition of
merge (10a). A similar calculation to that in Section 3 produces:
s ⊎ u ≤ t ⊎ v ∧ ⊥ · ⊥ · (t ⊎ v) ≤ (s ⊎ u) · ⊤ · ⊤
Let us again introduce some shortcuts for the results of the recursive calls. This
time we choose to replace s ⊎ u by 〈a〉 · x and t ⊎ v by y · 〈b〉. The reason for picking
this generalization instead of simply x and y as in Section 3 is that head (s ⊎ u)
is the overall minimum and, dually, last (t ⊎ v) the overall maximum. This fact
actually drops out naturally from the ﬁnal calculation. Using these shortcuts the
pre-condition can be further simpliﬁed.
〈a〉 · x ≤ y · 〈b〉 ∧ ⊥ · ⊥ · y · 〈b〉 ≤ 〈a〉 · x · ⊤ · ⊤
⇐⇒ { pointwise ordering, bottom and top (2) }
〈a〉 · x ≤ y · 〈b〉 ∧ ⊥ · y ≤ x · ⊤
⇐⇒ { pointwise ordering }
a ≤ head y ∧ init x ≤ tail y ∧ last x ≤ b ∧ ⊥ · y ≤ x · ⊤
⇐⇒ { init x ≤ tail y ⇐⇒ ⊥ · x ≤ y · ⊤ }
a ≤ head y ∧ ⊥ · x ≤ y · ⊤ ∧ last x ≤ b ∧ ⊥ · y ≤ x · ⊤
Next we introduce the induction assumption (12).
〈a〉 · x ordered ∧ a ≤ head y ∧ ⊥ · x ≤ y · ⊤ ∧ last x ≤ b
∧ ⊥ · y ≤ x · ⊤ ∧ y · 〈b〉 ordered
⇐⇒ { split property (twice) (6) }
a ≤ head x ∧ x ordered ∧ a ≤ head y ∧ ⊥ · x ≤ y · ⊤
∧ last x ≤ b ∧ ⊥ · y ≤ x · ⊤ ∧ y ordered ∧ last y ≤ b
⇐⇒ { see last calculation in Section 3 }
a ≤ head x ∧ a ≤ head y ∧ x ↕ y ordered
∧ last x ≤ b ∧ last y ≤ b
⇐⇒ { characterization of minimum (1a) and maximum (1b) }
a ≤ head x ↓ head y ∧ x ↕ y ordered ∧ last x ↑ last y ≤ b
⇐⇒ { head and last distribute over lifted operations }
a ≤ head (x ↓ y) ∧ x ↕ y ordered ∧ last (x ↑ y) ≤ b
⇐⇒ { head (x ↕ y) = head (x ↓ y) and last (x ↕ y) = last (x ↑ y) }
a ≤ head (x ↕ y) ∧ x ↕ y ordered ∧ last (x ↕ y) ≤ b
⇐⇒ { split property (6) }
〈a〉 · (x ↕ y) · 〈b〉 ordered
Again, a single column of comparators does the job. However, one fewer comparator
is used than in Section 3, as a is the overall minimum and b the overall
maximum. The resulting circuit is called an odd-even cleaner, denoted by ⥮.
Substituting this operator for the placeholder in our original sketch, we obtain:
〈a〉 ⊎ 〈b〉 = 〈a ↓ b, a ↑ b〉
(s ⋈ t) ⊎ (u ⋈ v) = (s ⊎ u) ⥮ (t ⊎ v)
(〈a〉 · x) ⥮ (y · 〈b〉) = 〈a〉 · (x ↕ y) · 〈b〉
This is identical to the definition of Batcher’s exchange merger from previous
work [Hinze and Martin 2017a]. Quite amazingly, by systematically exploring
the design space, we claim to have obtained both of Batcher’s mergers. It is
also crystal clear that the exchange merger requires fewer comparators than the
bitonic merger as it builds on the odd-even cleaner ⥮ rather than the even-odd
cleaner ↕. It now remains to confirm that the bitonic merger really is the same
as that invented by Batcher. This is the subject of the next two sections.
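The exchange merger can be sketched in the same way as the bitonic one; only the pairing of sub-sequences and the cleaner change. As before, this is an illustration on numbers with hypothetical names (`odd_even_cleaner`, `exchange_merge`, `comparator_count`), assuming sorted inputs of equal power-of-two length. The small counting function follows the recursion directly: a cleaner applied to two sequences of length n uses n comparators in the even-odd case and n − 1 in the odd-even case.

```python
def even_odd_cleaner(x, y):
    """Interleave the pointwise mins with the pointwise maxes."""
    out = []
    for a, b in zip(x, y):
        out += [min(a, b), max(a, b)]
    return out


def odd_even_cleaner(p, q):
    """First element of p and last element of q pass through untouched."""
    return [p[0]] + even_odd_cleaner(p[1:], q[:-1]) + [q[-1]]


def exchange_merge(x, y):
    """Merge two sorted lists by recursing on evens with evens, odds with odds."""
    assert len(x) == len(y)
    if len(x) == 1:
        return [min(x[0], y[0]), max(x[0], y[0])]
    return odd_even_cleaner(exchange_merge(x[0::2], y[0::2]),
                            exchange_merge(x[1::2], y[1::2]))


def comparator_count(n, saved):
    """Comparators needed to merge two length-n inputs.

    saved = 0 models the even-odd cleaner, saved = 1 the odd-even cleaner.
    """
    return 1 if n == 1 else 2 * comparator_count(n // 2, saved) + n - saved
```

Merging two sequences of length 4 thus takes `comparator_count(4, 0)`, that is 12, comparators with the bitonic scheme and `comparator_count(4, 1)`, that is 9, with the exchange scheme.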
5 Deriving the Bitonic Sorter
Batcher did not introduce a bitonic merger in his seminal paper [Batcher 1968],
but what he called a bitonic sorter. This is actually something of a misnomer,
since such networks do not sort arbitrary inputs. Instead they sort only bitonic
sequences: ones that ﬁrst increase then decrease, or can be circularly shifted to
such.
We will now show how the deﬁnition of the merger calculated in Section 3 is
related to Batcher’s bitonic sorter, taking two steps to do so. We start with
a recap of Batcher’s construction in Section 5.1, as presented, for example,
in [Cormen et al. 2001]. Next in Section 5.2 we derive an algebraic deﬁnition of
the bitonic sorter from our merger, further simplifying the design in the process.
Finally in Section 6, we show that the two deﬁnitions do indeed coincide.
5.1 Cormen et al.’s diagrammatic presentation
Of Batcher’s two designs, the bitonic merger seems to be the more popular
one, perhaps because it enjoys an attractive diagrammatic decomposition, see
diagram on the left below.
The circuit is composed of several stages, each corresponding to a low-high-
cleaner, as deﬁned in Section 2.4. For example, the leftmost 4 comparators in
the left-hand diagram above constitute the cleaner for 8 inputs. When applied
to a bitonic input sequence, the cleaner produces output with smaller numbers
in the bottom half and larger ones in the top, and both halves bitonic. (The
original name for the low-high-cleaner, half-cleaner, stems from the fact that if
the input contains only zeros and ones, at least one half of the output will be
clean: consisting solely of either zeros or ones.)
The shading in the diagram on the left above indicates how low-high-cleaners
are combined recursively to create a bitonic sorter:
bisort 〈a〉 = 〈a〉
bisort (x ‖ y) = bisort (x ↓ y) ‖ bisort (x ↑ y)
The merger is then deﬁned in terms of the bitonic sorter. The result is illustrated
in the network on the right above, formed by modifying that on the left. It is
based on the intuition that two ordered sequences x and y can be merged by
applying a bitonic sorter to the ﬁrst concatenated to the reverse of the second:
x ⊎ y = bisort (x ‖ rev y) (14)
This reversal of the second half of the input can be performed implicitly, hence
the rearrangement of the leftmost 4 comparators in the diagram on the right.
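The halving definition of the bitonic sorter and relationship (14) can be sketched as follows, again on numbers and with input lengths assumed to be powers of two. The name `merge_via_bisort` is illustrative only.

```python
def bisort(z):
    """Halving bitonic sorter: a low-high cleaner, then recurse on both halves."""
    if len(z) == 1:
        return list(z)
    h = len(z) // 2
    x, y = z[:h], z[h:]
    lows = [min(a, b) for a, b in zip(x, y)]
    highs = [max(a, b) for a, b in zip(x, y)]
    return bisort(lows) + bisort(highs)


def merge_via_bisort(x, y):
    """Relationship (14): merge by sorting x followed by the reverse of y."""
    return bisort(list(x) + list(reversed(y)))
```

Here `merge_via_bisort([1, 4, 6, 7], [2, 3, 5, 8])` gives the sorted sequence, whereas `bisort([1, 0, 1, 0])` returns `[0, 1, 0, 1]`: the input is not bitonic, so the network does not sort it.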
5.2 Misra’s algebraic presentation
Misra uses relationship (14) to apply a correctness proof of the bitonic sorter
to the merger [Misra 1994]. Here we do the opposite. We derive the bitonic
sorter from the bitonic merger, simultaneously establishing its correctness. This
order reversal leads us to question the necessity for the very existence of the
misleadingly named bitonic sorter or the associated notion of bitonicity.
Rearranging (14), the bitonic sorter is speciﬁed as a one-argument version of
the bitonic merger, with the twist that the second argument is reversed:
bisort (x ‖ y) = x ⊎ rev y (15)
In the following derivation we distinguish between a base case, two wires,
and an inductive case, more than two wires. For reasons that become clear from
the calculation, the point of departure is in both cases an interleaving:
Base case:
bisort (〈a〉 ⋈ 〈b〉)
= { singletons }
bisort (〈a〉 ‖ 〈b〉)
= { specification (15) }
〈a〉 ⊎ rev 〈b〉
= { definition of reverse }
〈a〉 ⊎ 〈b〉
= { definition of merge (11a) }
〈a ↓ b, a ↑ b〉
= { definition of even-odd cleaner }
〈a〉 ↕ 〈b〉
Inductive case:
bisort ((s ‖ t) ⋈ (u ‖ v))
= { interchange law (3) }
bisort ((s ⋈ u) ‖ (t ⋈ v))
= { specification (15) }
(s ⋈ u) ⊎ rev (t ⋈ v)
= { reverse and interleaving }
(s ⋈ u) ⊎ (rev v ⋈ rev t)
= { definition of merge (11b) }
(s ⊎ rev t) ↕ (u ⊎ rev v)
= { specification (twice) (15) }
bisort (s ‖ t) ↕ bisort (u ‖ v)
We could use the resulting equalities as deﬁning equations. A moment’s reﬂec-
tion, however, reveals that we can streamline the deﬁnition:
bisort 〈a〉 = 〈a〉
bisort (x ⋈ y) = bisort x ↕ bisort y
We have introduced a base case for singletons and generalized the inductive case.
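The streamlined definition can be sketched directly: uninterleave the input, recurse on the even and odd sub-sequences, and finish with one even-odd cleaner. The sketch assumes numeric inputs of power-of-two length, and the name `bisort_interleaved` is illustrative.

```python
def bisort_interleaved(z):
    """Interleaving bitonic sorter: recurse on evens and odds, then clean."""
    if len(z) == 1:
        return list(z)
    x = bisort_interleaved(z[0::2])   # elements at even positions
    y = bisort_interleaved(z[1::2])   # elements at odd positions
    out = []
    for a, b in zip(x, y):            # even-odd cleaner on the two results
        out += [min(a, b), max(a, b)]
    return out
```

Applied to the bitonic input `[1, 4, 6, 7, 8, 5, 3, 2]`, the two recursive calls return `[1, 3, 6, 8]` and `[2, 4, 5, 7]`, and the final cleaner yields `[1, 2, 3, 4, 5, 6, 7, 8]`.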
6 Relating Diagrams and Algebra
We are ﬁnally in a position to reconcile art with mathematics, diagrams with
algebra. Reconciliation is called for, as the recursive decomposition of the bitonic
sorter is quite different in the two approaches, the diagrammatic presentation
[Cormen et al. 2001] and the algebraic presentation [Misra 1994].
diagrammatic presentation:
bisort 〈a〉 = 〈a〉
bisort (x ‖ y) = bisort (x ↓ y) ‖ bisort (x ↑ y)
algebraic presentation:
bisort 〈a〉 = 〈a〉
bisort (x ⋈ y) = bisort x ↕ bisort y
In the deﬁnition on the left, the divide step requires additional circuitry (the low-
high-cleaner), while the conquer step is free. On the right-hand side, the divide
step is free, while the conquer step requires additional circuitry (the even-odd-
cleaner). These diﬀerences are, however, illusory: the structure of the circuits
is identical. The diagram on the left is obtained from the diagram on the right
by shifting the comparators that span ﬁve wires as far as possible to the left.
Moreover, the correspondence does not depend on the speciﬁcs of the underlying
comparator circuit—it is purely structural; it holds for any gate with two inputs
and two outputs. What follows is an algebraic proof of this claim.
Up to this point we have only conducted pointwise calculations. However,
experience shows that for structural proofs a point-free argument is preferable.
To this end we lift the operators we have seen before to function spaces:
(f ‖ g) x = f x ‖ g x      (f ⋈ g) x = f x ⋈ g x
We also require projection functions as a substitute for the use of concatenation
and interleaving in patterns:
low (x ‖ y) = x
high (x ‖ y) = y
even (x ⋈ y) = x
odd (x ⋈ y) = y
Constructors and projections are related by the following universal properties:
f = low ◦ h ∧ g = high ◦ h ⇐⇒ f ‖ g = h
f = even ◦ h ∧ g = odd ◦ h ⇐⇒ f ⋈ g = h
where ◦ is function composition. You may recognize the similarity to categorical
products. Indeed, we can use the ingredients above to deﬁne the product arrow:
f × g = f ◦ low ‖ g ◦ high      f ⊗ g = f ◦ even ⋈ g ◦ odd
The universal properties imply the following functor laws:
id × id = id (16a)
(f ◦ g) × (h ◦ k) = (f × h) ◦ (g × k) (16b)
id ⊗ id = id (16c)
(f ◦ g) ⊗ (h ◦ k) = (f ⊗ h) ◦ (g ⊗ k) (16d)
Finally, the interchange laws below make precise how the two products interact:
(f ⋈ g) ‖ (h ⋈ k) = (f ‖ h) ⋈ (g ‖ k) (17a)
(f ⊗ g) × (h ⊗ k) = (f × h) ⊗ (g × k) (17b)
We illustrate the use of these point-free combinators by redeﬁning the low-
high and even-odd cleaners which were deﬁned pointwise in Section 2.4. They
are obtained by repeatedly “squaring” the underlying comparator circuit, for
example:
(cmp ⊗ cmp) ⊗ (cmp ⊗ cmp)      (cmp × cmp) × (cmp × cmp)
In general, the low-high cleaner is cmp^(k⊗), while the even-odd cleaner is cmp^(k×), where
f^(0×) = f      f^((k+1)×) = f^(k×) × f^(k×)
f^(0⊗) = f      f^((k+1)⊗) = f^(k⊗) ⊗ f^(k⊗)
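The two products and the repeated "squaring" can be sketched with higher-order functions on lists. The names `prod_halve`, `prod_interleave` and `power` are hypothetical stand-ins for the halving product, the interleaving product and the iterated product; `cmp` is a comparator on a two-element list of numbers.

```python
def cmp(z):
    """A single comparator on two wires."""
    a, b = z
    return [min(a, b), max(a, b)]


def prod_halve(f, g):
    """Halving product: f acts on the lower half, g on the upper half."""
    def h(z):
        m = len(z) // 2
        return f(z[:m]) + g(z[m:])
    return h


def prod_interleave(f, g):
    """Interleaving product: f acts on even positions, g on odd positions."""
    def h(z):
        x, y = f(z[0::2]), g(z[1::2])
        return [e for pair in zip(x, y) for e in pair]
    return h


def power(prod, f, k):
    """Square f under the given product k times."""
    for _ in range(k):
        f = prod(f, f)
    return f


low_high_clean_8 = power(prod_interleave, cmp, 2)  # low-high cleaner, 8 wires
even_odd_clean_8 = power(prod_halve, cmp, 2)       # even-odd cleaner, 8 wires
```

On `[5, 1, 7, 3, 2, 6, 0, 4]` the first network produces `[2, 1, 0, 3, 5, 6, 7, 4]` and the second `[1, 5, 3, 7, 2, 6, 0, 4]`, matching the pointwise definitions of the two cleaners.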
We can use these combinators to express the algebraic deﬁnitions of the
bitonic sorters in a point-free style:
bisort_0 = id                                      bisort_0 = id
bisort_(k+1) = (bisort_k × bisort_k) ◦ cmp^(k⊗)    bisort_(k+1) = cmp^(k×) ◦ (bisort_k ⊗ bisort_k)
This presentation very clearly exhibits the symmetry between the two definitions.
In a sense they are dual: we obtain one from the other simply by exchanging ×
and ⊗. Turning to the equivalence proof, we first abstract away from the specifics
of the application:
f_0 = id                             g_0 = id
f_(k+1) = (f_k × f_k) ◦ c^(k⊗)       g_(k+1) = c^(k×) ◦ (g_k ⊗ g_k)
To establish f = g, we show that f also satisfies the recursion equation of g:
f_(k+1) = c^(k×) ◦ (f_k ⊗ f_k) (18)
The proof proceeds by induction over k. Base case:
f_1
= { definition of f }
(f_0 × f_0) ◦ c^(0⊗)
= { definition of f }
(id × id) ◦ c^(0⊗)
= { functor law (16a) }
c^(0⊗)
= { definition of c^⊗ }
c
= { definition of c^× }
c^(0×)
= { functor law (16c) }
c^(0×) ◦ (id ⊗ id)
= { definition of f }
c^(0×) ◦ (f_0 ⊗ f_0)
Inductive step:
f_(k+2)
= { definition of f }
(f_(k+1) × f_(k+1)) ◦ c^((k+1)⊗)
= { induction assumption (18) }
((c^(k×) ◦ (f_k ⊗ f_k)) × (c^(k×) ◦ (f_k ⊗ f_k))) ◦ c^((k+1)⊗)
= { functor law (16b) }
(c^(k×) × c^(k×)) ◦ ((f_k ⊗ f_k) × (f_k ⊗ f_k)) ◦ c^((k+1)⊗)
= { definition of c^× }
c^((k+1)×) ◦ ((f_k ⊗ f_k) × (f_k ⊗ f_k)) ◦ c^((k+1)⊗)
= { interchange law (17b) }
c^((k+1)×) ◦ ((f_k × f_k) ⊗ (f_k × f_k)) ◦ c^((k+1)⊗)
= { definition of c^⊗ }
c^((k+1)×) ◦ ((f_k × f_k) ⊗ (f_k × f_k)) ◦ (c^(k⊗) ⊗ c^(k⊗))
= { functor law (16d) }
c^((k+1)×) ◦ (((f_k × f_k) ◦ c^(k⊗)) ⊗ ((f_k × f_k) ◦ c^(k⊗)))
= { definition of f }
c^((k+1)×) ◦ (f_(k+1) ⊗ f_(k+1))
The base case is entirely straightforward. The central manoeuvre in
the inductive step is the application of the interchange law (17b).
(As an aside, observe that the terms surrounding the rewrite also exhibit a nice
symmetric recursion structure, where some work is done before the recursive
calls and some work is done afterwards.)
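Since the equivalence is purely structural, it can be spot-checked with a gate that has nothing to do with sorting. The sketch below writes both recursion schemes pointwise, parameterised by an arbitrary two-input, two-output gate; the names `lift`, `f` and `g` follow the proof loosely, and the gate used in the check is an arbitrary choice made here for illustration.

```python
def lift(gate, x, y):
    """Apply a two-input, two-output gate to corresponding elements of x and y."""
    pairs = [gate(a, b) for a, b in zip(x, y)]
    return [p[0] for p in pairs], [p[1] for p in pairs]


def f(gate, z):
    """Halving scheme: a column of gates first, then recurse on both halves."""
    if len(z) == 1:
        return list(z)
    h = len(z) // 2
    lo, hi = lift(gate, z[:h], z[h:])
    return f(gate, lo) + f(gate, hi)


def g(gate, z):
    """Interleaving scheme: recurse on evens and odds, then a column of gates."""
    if len(z) == 1:
        return list(z)
    lo, hi = lift(gate, g(gate, z[0::2]), g(gate, z[1::2]))
    return [e for pair in zip(lo, hi) for e in pair]


def odd_gate(a, b):
    """A deliberately non-commutative gate, unrelated to comparison."""
    return (a - b, a + 2 * b)
```

For any input whose length is a power of two, `f(odd_gate, z)` and `g(odd_gate, z)` agree, as they do for the comparator gate `lambda a, b: (min(a, b), max(a, b))`.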
7 Relation to Sorting Networks
We have derived two mergers from ﬁrst principles, using an inductive approach
based on uninterleaving the input sequences. We then showed that the resulting
schemes correspond to those proposed by Batcher. Let us now apply the same
approach to sorters. Once again there are two disciplined ways to sub-divide
the input: halving and uninterleaving. This time we will explore both strate-
gies because examples of almost all of the resulting arrangements appear in the
literature, in various guises.
7.1 The Sorters
The ﬁrst step is to rewrite the binary merge operator as an equivalent function
of one argument. The resulting two versions of the merger are denoted as follows:
merge‖ (x ‖ y) = x ⊎ y      merge⋈ (x ⋈ y) = x ⊎ y
The associated Knuth diagrams for circuits with 8 inputs are presented below,
where (a) and (b) depict the exchange merger, while (c) and (d) represent the
bitonic one. In each case the diagram on the left uses halving and that on the
right interleaving; the diﬀerence amounts to a bit reversal permutation of the
input.
[Figure: Knuth diagrams for 8 inputs. Merge exchange: (a) merge‖, (b) merge⋈. Bitonic merge: (c) merge‖, (d) merge⋈.]
Each of these four merging networks gives rise to a sorter:
sort⊕ 〈a〉 = 〈a〉
sort⊕ (x ⊕ y) = merge⊕ (sort⊕ x ⊕ sort⊕ y)
where ⊕ is instantiated either to ‖ or ⋈. The sorters that use halving represent
Batcher’s original methods. The interleaved merge exchange sorter has appeared
before [Codish and Zazon-Ivry 2010], as has the bitonic sorter [Misra 1994].
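The sorter scheme can be sketched for the exchange merger with both ways of dividing the input. The merge function is the same in both cases; only the split differs. As in the earlier sketches, numbers stand in for lattice elements, input lengths are assumed to be powers of two, and the function names are illustrative.

```python
def exchange_merge(x, y):
    """Batcher's exchange merger on two sorted lists of equal length."""
    if len(x) == 1:
        return [min(x[0], y[0]), max(x[0], y[0])]
    p = exchange_merge(x[0::2], y[0::2])
    q = exchange_merge(x[1::2], y[1::2])
    out = [p[0]]                          # overall minimum passes through
    for a, b in zip(p[1:], q[:-1]):       # odd-even cleaner
        out += [min(a, b), max(a, b)]
    return out + [q[-1]]                  # overall maximum passes through


def sort_halving(z):
    """Sorter that divides the input into a lower and an upper half."""
    if len(z) == 1:
        return list(z)
    h = len(z) // 2
    return exchange_merge(sort_halving(z[:h]), sort_halving(z[h:]))


def sort_interleaving(z):
    """Sorter that divides the input into even and odd positions."""
    if len(z) == 1:
        return list(z)
    return exchange_merge(sort_interleaving(z[0::2]),
                          sort_interleaving(z[1::2]))
```

Both functions sort, for example, `[5, 1, 7, 3, 2, 6, 0, 4]` into `[0, 1, 2, 3, 4, 5, 6, 7]`, although the comparisons they perform are wired differently.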
7.2 The Twist
The mergers can be mechanically divided into two sub-components by consid-
ering the base case separately from the inductive one. This arrangement is par-
ticularly interesting for the exchange merger, which has an intriguing twist. In
that case, the factorized deﬁnitions are as follows:
merge‖ = pair‖ ◦ low-high-clean      merge⋈ = pair⋈ ◦ even-odd-clean
where pair⊕ (x ⊕ y) = x ⊍ y is merge with a trivial base case:
〈a〉 ⊍ 〈b〉 = 〈a, b〉
(s ⋈ t) ⊍ (u ⋈ v) = (s ⊍ u) ⥮ (t ⊍ v)
The function pair⋈ is a “pair sorter”, which sorts sequences of sorted pairs. Its
halving analogue, pair‖, behaves similarly. The Knuth diagrams for the mergers
now have slightly diﬀerent layouts from before:
[Figure: Knuth diagrams of the twisted merge exchange: (a′) merge‖, (b′) merge⋈.]
The non-base cases of the corresponding sorters become:
sort‖ = pair‖ ◦ low-high-clean ◦ (sort‖ × sort‖)
sort⋈ = pair⋈ ◦ even-odd-clean ◦ (sort⋈ ⊗ sort⋈)
Now, here is the twist. Since each component preserves orderedness, the positions
of the first two phases can be swapped:
sort‖ = pair‖ ◦ (sort‖ × sort‖) ◦ low-high-clean
sort⋈ = pair⋈ ◦ (sort⋈ ⊗ sort⋈) ◦ even-odd-clean
The resulting interleaved sorter is a recursive expression of Parberry’s pairwise
sorting network [Parberry 1992, Hinze and Martin 2017b]. We have not seen the
version using halving in the literature.
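The swapped arrangement can be sketched as follows: first a single cleaner, then two recursive sorts, and finally the pair sorter, which is the exchange merger without comparators in its base case. The sketch covers both the interleaved and the halving variant; all names are illustrative, inputs are numbers, and lengths are assumed to be powers of two.

```python
def pair_merge(x, y):
    """Exchange merger with a trivial base case.

    Assumes x and y are sorted and x[i] <= y[i] for every i.
    """
    if len(x) == 1:
        return [x[0], y[0]]               # no comparator needed here
    p = pair_merge(x[0::2], y[0::2])
    q = pair_merge(x[1::2], y[1::2])
    out = [p[0]]
    for a, b in zip(p[1:], q[:-1]):       # odd-even cleaner
        out += [min(a, b), max(a, b)]
    return out + [q[-1]]


def twisted_sort(z, interleaved=True):
    """Cleaner first, then sort the mins and the maxes, then pair-merge them."""
    if len(z) == 1:
        return list(z)
    h = len(z) // 2
    x, y = (z[0::2], z[1::2]) if interleaved else (z[:h], z[h:])
    lows = [min(a, b) for a, b in zip(x, y)]
    highs = [max(a, b) for a, b in zip(x, y)]
    return pair_merge(twisted_sort(lows, interleaved),
                      twisted_sort(highs, interleaved))
```

The precondition of `pair_merge` holds because the cleaner makes every low no greater than the corresponding high, and sorting both sequences preserves that pointwise order by monotonicity (8).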
The Knuth diagrams below show the diﬀerence between various layouts of
the interleaved exchange sorter: (e) is the original from Section 7.1 and (f ) has
the phases swapped, as in Parberry’s adaptation. The third arrangement, (g),
corresponds to two accounts of “Batcher’s Baﬄer” with correctness proofs in a
traditional imperative style [Gries 1986, Dijkstra 1987]. These latter two are the
earliest expositions of the interleaved exchange sorter of which we are aware.
[Figure: Knuth diagrams of (e) the merge exchange sorter, (f) Parberry’s twist and (g) Batcher’s Baffler.]
The altered layout of (g) compared to (e) reﬂects the iterative nature of these
presentations. Algebraically, it can be computed by unfolding the recursion then
rearranging the components using the interchange law (16d), so for example:
sort⋎ ⋎ sort⋎
= (pair⋎ ⋎ pair⋎) ◦ (even-odd-clean ⋎ even-odd-clean)
◦ ((sort⋎ ⋎ sort⋎) ⋎ (sort⋎ ⋎ sort⋎))
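The unfolding step can be checked on concrete inputs with a small Python sketch. It assumes lists of power-of-two length as sequences; `lift` (a hypothetical name) applies one function to the even-indexed elements and another to the odd-indexed ones, and `compose` chains functions right to left. The remaining definitions follow the interleaved merge exchange sorter and are likewise illustrative.

```python
def interleave(x, y):
    """Alternate the elements of two equal-length lists."""
    return [v for xy in zip(x, y) for v in xy]


def lift(f, g):
    """Apply f to the even-indexed and g to the odd-indexed elements."""
    return lambda z: interleave(f(z[0::2]), g(z[1::2]))


def compose(*fs):
    """compose(f, g, h)(z) == f(g(h(z)))."""
    def run(z):
        for f in reversed(fs):
            z = f(z)
        return z
    return run


def even_odd_clean(z):
    """Compare-exchange positions (0,1), (2,3), ..."""
    out = list(z)
    for i in range(0, len(out) - 1, 2):
        if out[i] > out[i + 1]:
            out[i], out[i + 1] = out[i + 1], out[i]
    return out


def odd_even_clean(a, b):
    """Interleave a and b, then compare-exchange positions (1,2), (3,4), ..."""
    out = interleave(a, b)
    for i in range(1, len(out) - 1, 2):
        if out[i] > out[i + 1]:
            out[i], out[i + 1] = out[i + 1], out[i]
    return out


def pair(z):
    """Pair sorter: merge with a trivial base case."""
    if len(z) == 2:
        return list(z)
    x, y = z[0::2], z[1::2]
    return odd_even_clean(pair(interleave(x[0::2], y[0::2])),
                          pair(interleave(x[1::2], y[1::2])))


def sort(z):
    """Interleaved merge exchange sorter in point-free style."""
    if len(z) == 1:
        return list(z)
    return compose(pair, even_odd_clean, lift(sort, sort))(z)


# Left-hand side: two sorters running side by side on the interleaved parts.
lhs = lift(sort, sort)

# Right-hand side: the same computation regrouped phase by phase.
rhs = compose(lift(pair, pair),
              lift(even_odd_clean, even_odd_clean),
              lift(lift(sort, sort), lift(sort, sort)))
```

Both sides are intended for inputs of length at least four, so that each interleaved part falls under the non-base case of the sorter. Repeating the regrouping at every level yields a phase-by-phase, iterative description of the network.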
It is natural to ask whether the phases can also be swapped for the bitonic
mergers. They have a similar decomposition involving a pair-sorter:
merge‖ = pair′‖ ◦ low-high-clean ◦ (id × rev)
merge⋎ = pair′⋎ ◦ even-odd-clean ◦ (id ⋎ rev)
where pair′⊕ has the same definition as pair⊕ except that the odd-even cleaner ⥌
is replaced by the even-odd one. Unfortunately the components are not as
well-behaved as those for the exchange merger. Neither id × rev nor id ⋎ rev preserves
orderedness, and so a similar interchange of sub-parts is not possible.
There are, of course, many other possible permutations of the input if we re-
move the restriction to halving and interleaving. For example, there are 315 dif-
ferent rearrangements of the 8-key merge exchange sorter alone [Al-Haj Baddar
and Batcher 2011].
8 Related Work
Batcher’s algorithms were the ﬁrst systematic methods for designing sorting net-
works of arbitrary input size. For decades, they have attracted enormous interest
from academia and industry alike; at the time of writing there are over 2500 ci-
tations of the original paper. Much of the related work concerns performance
and bounds. Batcher’s merge exchange sorter has been proven to use a minimal
number of comparators for input sizes n ≤ 8 [Knuth 1998]. Van Voorhis showed
how greater economy of comparators is possible for larger input sizes by dividing
the input into more than two groups [Van Voorhis 1971] and the results are still
the best known for some values. The challenge of proving that a given number of
comparators is minimal for sorting networks of input size n > 8 is still an open
problem.
There is also a substantial body of literature on the design, modelling and
veriﬁcation of Batcher’s networks. We therefore begin by categorizing a small
selection of examples according to the approach taken, and then refer the reader
to other work for more comprehensive reviews.
Imperative. Both Gries and Dijkstra produced formal correctness proofs of the
merge exchange sorter, lovingly called “Batcher’s Baffler”, in an imperative
style [Gries 1986, Dijkstra 1987]. Neither used the zero-one principle, and the
proofs are quite lengthy in comparison to their functional counterparts.
Relational. The bitonic merger was described and analysed in the relational
language Ruby [Sheeran 1991] and shown to be related to the balanced merger.
More recently, relations were used to identify the relationship between Par-
berry’s pairwise network and Batcher’s merge exchange sorter [Codish and
Zazon-Ivry 2010]. The recursive, relational presentation is simpler and more
straightforward to follow than Parberry’s description.
Functional. A functional description of bitonic sort appeared as an illustration
of an algebraic model of divide-and-conquer algorithms for parallel comput-
ers [Mou and Hudak 1988] but this did not include any correctness proofs.
An elegant and succinct correctness proof of bitonic sort was given subse-
quently using functions on the powerlist data structure [Misra 1994]. The
idea of using parametricity in Haskell as an alternative to Knuth’s zero-one
principle was introduced later [Day et al. 1999]. Bitonic sort was included as
a motivating example to demonstrate how a veriﬁcation of correctness could
be performed on Boolean input and then generalised to more complex types.
Machine-checked Formal Proofs. One of the first formal proofs of correctness
of bitonic sort using an automated reasoning system [Couturier 1998]
was performed in the prototype veriﬁcation system PVS [Owre et al. 1992];
the proof did not use the zero-one principle. Lava, a tool for hardware de-
sign and veriﬁcation, was used later [Claessen et al. 2003] to verify both the
bitonic and merge exchange algorithms using the zero-one principle. This ac-
count also mentioned the close relationship between the two mergers, in the
sense that their deﬁnitions diﬀer only in the choice of cleaner. The diﬀerence
here is our derivation from ﬁrst principles and precise relation to other ex-
positions. Constructive type theory was used to verify bitonic sort in Agda,
via the zero-one principle [Bove and Coquand 2006]. Agda was also used
to verify bitonic sort using parametricity [Dybjer et al. 2004], in a method
that combined testing, model checking and theorem proving. Braibant and
Chlipala followed a similar approach to Bove and Coquand to conduct a for-
mal proof in Coq, but with the added dimension of connecting to an actual
hardware implementation in the Fe-Si language [Braibant and Chlipala 2013].
There is a further distinction within these various paradigms between recur-
sive and iterative approaches. Sheeran comments that recursive descriptions can
be more comprehensible, as they oﬀer more insight into how the algorithm was
designed [Sheeran 1989]. This is borne out by the contrast between the elegance
of recursive butterﬂy circuits [Jones and Sheeran 1991] and the complexity of the
derivations of an iterative merge exchange sort by Gries and Dijkstra. Parberry’s
method is an exception as the original design was iterative, but we argue that
the recursive expression of the algorithm in Section 7 is more transparent. An
iterative approach to the bitonic sort is given in Obsidian [Claessen et al. 2012],
which is quite easy to follow.
In the broader context, the interest in hardware design among the functional
programming community that was sparked in the 1980s is still strong [Sheeran
2015]. This is partly because hardware design is essentially a form of parallel
programming and recent growth in the use of parallel machines has driven a
demand for related tools. As a consequence, the development of data-parallel
functional programming languages has ﬂourished [Blelloch 1993, Chakravarty
et al. 2007]. Functional languages are well-suited for designing, analysing and
validating parallel algorithms for many reasons. These include their high level of
abstraction, use of higher-order functions to structure descriptions, associated
formal transformation and verification methods, and ease of algebraic manipulation
for equational reasoning. We refer the reader to [Gammie 2013] for an excellent
and comprehensive review of the long tradition of describing circuits using func-
tional programming techniques. Gammie also includes a summary of other formal
methods for hardware design including algebraic techniques, relational models
and diagrammatic methods involving “boxes and wires”. For more details of the
early history see also [Knuth 1998] and [Sheeran 2005].
9 Conclusion
To conclude, the “bitonicity” property invented by Batcher is really a red her-
ring. Yet he showed extraordinary insight to conceive both of his schemes using
diagrams alone, without the beneﬁt of calculation. The algebraic method per-
mits a clean derivation of the mergers from minimal assumptions. It shows they
are the two canonical choices, and the proofs rely solely on the monotonicity
property of comparison networks. This rigorous approach also unmasks the re-
semblance between the two mergers. In both cases, all of the work is done by the
conquer step, while the divide step is free. In contrast, Batcher’s presentation of
the circuits shows a surprising inconsistency: the burden of the work is assumed
by the divide step for one, and conquer for the other.
Batcher names two advantages of the bitonic sorter over the exchange sorter.
First, it is flexible in the sense that one network can accommodate input
lists of various lengths. Second, it is modular, so a network can be split up
into several identical modules. We question whether these advantages are really
valid. Both sorters appear to be equally flexible and have identical modularity,
as evidenced by their recursive decomposition.
A ﬁnal argument in favour of the symbolic presentation is the fresh perspec-
tive aﬀorded by the point-free view. This high-level stance is useful for exposing
duality between diﬀerent presentations of the bitonic sorter, as well as making
precise the connection between the exchange and pairwise sorters [Hinze and
Martin 2017b].
Acknowledgements
We are grateful to the anonymous referees for constructive suggestions, including
the addition of some historical perspective to the introduction and the removal
of “rabbits” from the derivations. Thanks also to Jeremy Gibbons for pointing us
towards Dijkstra’s derivation of “Batcher’s Baﬄer” and David Gries for sharing
an earlier manuscript.
References
[Al-Haj Baddar and Batcher 2011] Al-Haj Baddar, S. W., Batcher, K. E.: Designing
Sorting Networks; Springer, 2011.
[Batcher 1968] Batcher, K. E.: “Sorting networks and their applications”; Proceedings
of the April 30–May 2, 1968, Spring Joint Computer Conference; 307–314; ACM,
1968.
[Blelloch 1993] Blelloch, G. E.: “Nesl: A nested data-parallel language”; Technical re-
port; CMU-CS-93-129, Carnegie Mellon University (1993).
[Bove and Coquand 2006] Bove, A., Coquand, T.: “Formalising bitonic sort in type
theory”; J.-C. Filliâtre, C. Paulin-Mohring, B. Werner, eds., Types for Proofs and
Programs; volume 3839 of Lecture Notes in Computer Science; 82–97; Springer,
2006.
[Braibant and Chlipala 2013] Braibant, T., Chlipala, A.: “Formal veriﬁcation of hard-
ware synthesis”; Computer Aided Veriﬁcation; volume 8044 of Lecture Notes in
Computer Science; 213–228; Springer, 2013.
[Chakravarty et al. 2007] Chakravarty, M. M., Leshchinskiy, R., Peyton Jones, S.,
Keller, G., Marlow, S.: “Data parallel Haskell: a status report”; Proceedings of the
2007 workshop on Declarative Aspects of Multicore Programming; 10–18; ACM,
2007.
[Claessen et al. 2003] Claessen, K., Sheeran, M., Singh, S.: “Functional hardware de-
scription in Lava”; The Fun of Programming. Cornerstones of Computing; 151–176;
Palgrave, 2003.
[Claessen et al. 2012] Claessen, K., Sheeran, M., Svensson, B. J.: “Expressive array
constructs in an embedded GPU kernel programming language”; Proceedings of the
7th Workshop on Declarative Aspects and Applications of Multicore Programming;
21–30; ACM, 2012.
[Codish and Zazon-Ivry 2010] Codish, M., Zazon-Ivry, M.: “Pairwise cardinality net-
works”; Proceedings of the 16th International Conference on Logic for Program-
ming, Artiﬁcial Intelligence, and Reasoning; volume 6355 of Lecture Notes in Com-
puter Science; 154–172; Springer, 2010.
[Cormen et al. 2001] Cormen, T. H., Leiserson, C. E., Rivest, R. L., Stein, C.: Intro-
duction to Algorithms; The MIT Press, Cambridge, Massachusetts, 2001; third
edition.
[Couturier 1998] Couturier, R.: “Formal engineering of the bitonic sort using PVS”;
2nd Irish Workshop in Formal Methods; British Computer Society Electronic
Workshops in Computing, Cork, Ireland, 1998.
[Day et al. 1999] Day, N. A., Launchbury, J., Lewis, J.: “Logical abstractions in
Haskell”; Proceedings of the 1999 Haskell Workshop; Utrecht University Depart-
ment of Computer Science, Technical Report UU-CS-1999-28, 1999.
[Dijkstra 1987] Dijkstra, E. W.: “A heuristic explanation of Batcher’s Baﬄer”; Science
of Computer Programming; volume 9; 213–220; Elsevier, 1987.
[Dybjer et al. 2004] Dybjer, P., Haiyan, Q., Takeyama, M.: “Verifying Haskell pro-
grams by combining testing, model checking and interactive theorem proving”;
Information and Software Technology; 46 (2004), 15, 1011–1025.
[Gammie 2013] Gammie, P.: “Synchronous digital circuits as functional programs”;
ACM Computing Surveys; 46 (2013), 2, 21.
[Gries 1986] Gries, D.: Unpublished manuscript; 1986.
[Hinze and Martin 2017a] Hinze, R., Martin, C.: “Functional Pearl: Batcher’s odd-even
merging network revealed”; Journal of Functional Programming; (2017a); to appear.
[Hinze and Martin 2017b] Hinze, R., Martin, C.: “Functional Pearl: Parberry’s pairwise
sorting network revealed”; (2017b); in submission.
[Jones and Sheeran 1991] Jones, G., Sheeran, M.: “The study of butterﬂies”; Proceed-
ings of the IVth Higher Order Workshop, Banﬀ 1990; 54–65; Springer, 1991.
[Knuth 1998] Knuth, D. E.: The Art of Computer Programming, Volume 3: Sorting
and Searching; Addison-Wesley, 1998; 2nd edition.
[Misra 1994] Misra, J.: “Powerlist: A structure for parallel recursion”; ACM Transac-
tions on Programming Languages and Systems; 16 (1994), 6, 1737–1767.
[Mou and Hudak 1988] Mou, Z. G., Hudak, P.: “An algebraic model for divide-and-
conquer and its parallelism”; The Journal of Supercomputing; 2 (1988), 3, 257–278.
[Owre et al. 1992] Owre, S., Rushby, J. M., Shankar, N.: “PVS: A prototype veriﬁca-
tion system”; D. Kapur, ed., Proceedings of the Eleventh International Conference
on Automated Deduction (CADE); volume 607 of Lecture Notes in Artiﬁcial In-
telligence; 748–752; Springer-Verlag, 1992.
[OConnor and Nelson 1962] O’Connor, D. G., Nelson, R. J.: “Sorting system with N-line
sorting switch”; (1962); US Patent Number 3,029,413.
[Parberry 1992] Parberry, I.: “The pairwise sorting network”; Parallel Processing Let-
ters; 2 (1992), 205–211.
[Sheeran 1989] Sheeran, M.: “Describing butterﬂy networks in Ruby”; Proceedings of
the Glasgow Workshop on Functional Programming; 182–205; Springer Workshops
in Computing, 1989.
[Sheeran 1991] Sheeran, M.: “Sorts of butterﬂies”; Proceedings of the IVth Higher
Order Workshop, Banﬀ 1990; 66–76; Springer, 1991.
[Sheeran 2005] Sheeran, M.: “Hardware design and functional programming: a perfect
match”; Journal of Universal Computer Science; 11 (2005), 7, 1135–1158.
[Sheeran 2015] Sheeran, M.: “Functional programming and hardware design: Still in-
teresting after all these years”; Proceedings of the 20th ACM SIGPLAN Interna-
tional Conference on Functional Programming; ICFP 2015; 165–165; ACM, New
York, NY, USA, 2015.
[Turner 1982] Turner, D. A.: “Recursion equations as a programming language”; Dar-
lington, Henderson, Turner, eds., Functional Programming and its Applications.
An advanced course; Cambridge University Press, 1982.
[Van Voorhis 1971] Van Voorhis, D. C.: “A generalization of the divide-sort-merge
strategy for sorting networks”; Technical report; CS-TR-71-237, Stanford Univer-
sity, CA, USA (1971).