Efficiency Limits for Value-Deviation-Bounded Approximate Communication by Stanley-Marbell, Phillip & Rinard, M
IEEE EMBEDDED SYSTEMS LETTERS, VOL. X, NO. Y, SEPTEMBER 2015
Efficiency Limits for Value-Deviation-Bounded
Approximate Communication
Phillip Stanley-Marbell and Martin Rinard
Abstract—Transferring data between integrated circuits accounts
for a growing proportion of system power in wearable and
mobile systems. The dynamic component of the power dissipated in
this data transfer can be reduced by reducing signal transitions.
Techniques for reducing signal transitions on communication
links have traditionally targeted parallel buses and therefore
cannot be applied when the transfer interfaces are serial
buses.
In this article, we address the best-case effectiveness of
techniques that reduce signal transitions on serial buses while
also allowing some error in the numeric interpretation of the
transmitted data. For many embedded applications, exchanging
numeric accuracy for power reduction is a worthwhile tradeoff.
We present a study of the efficiency of these
value-deviation-bounded approximate serial data encoders (VDBS
data encoders) and proofs of their properties.
The bounds and proofs we present yield new insights into
the best possible tradeoffs between dynamic power reduction
and approximation error that can be achieved in practice. The
insights are important regardless of whether actual practical
VDBS data encoders are implemented in software or in hardware.
Index Terms—Approximate Computing, Approximate Communication,
Bounds, Data Encoding.
I. INTRODUCTION
Wearable and health-tracking devices dissipate ever-larger
fractions of their power on sensor activation
and on data transfer between processors and sensor integrated
circuits. Since package and circuit board capacitances do not
improve with semiconductor process advances, the fraction
will likely continue to grow relative to components such as
processors. For reasons of space and cost, the data transfer
happens over serial interfaces, and not over parallel buses. This
precludes the use of traditional low-power bus data
encodings [1, 2]. To address this challenge, value-deviation-bounded
approximate serial data encoders (VDBS data encoders) [3]
reduce the dynamic power dissipation of serial buses when
deviations in the values being transmitted are tolerable.
This article presents a study of the efficiency of VDBS
data encoders, as well as proofs of properties of VDBS data
encoders. Our analyses provide an essential yardstick for
evaluating practical VDBS data encoding algorithms that may
be proposed in the future. The proofs of properties of VDBS
data encoders on the other hand provide important new insights
into how VDBS data encoders fit into the existing body of
research on reducing dynamic power of both serial and parallel
buses.
[Figure 1, two panels. Each panel shows an l = 8 bit word sent over a
serial link to a destination, with m = 13 and the lower ⌈log2(m)⌉ bits
marked.
(a) Without VDBS encoding: the source sends s = 64 (decimal) =
01000000 (binary).
(b) With VDBS: fewer transitions. The encoder sends t = 63 (decimal) =
00111111 (binary) in place of s.]
Fig. 1. In this example, assume the tolerable deviation, m, is 13 (i.e., 5 %
of 255). VDBS encoding halves the number of transitions while incurring a
value deviation, |s − t|, of just 0.39 % of the full-scale range. All bits except
the most-significant bit are modified, not just the lower ⌈log2(m)⌉.
II. DEFINITIONS
Two essential components in the formulation of VDBS
encoders are:
1) The number of serial transitions that occur when a
single value s is transmitted over a serial link (two
transitions in Figure 1(a)). We call this the serial transition
count (STC) of the word or value s.
2) The difference in serial transition counts between two
words (a difference of one between s and t in Figure 1).
Throughout this work, the values considered will be unsigned.
Definition 1 (Serial transition count function, #δ(s)).
Let s be an l-bit unsigned integer with bits s_0, s_1, . . . , s_{l−1},
from least- to most-significant bit. Then, we define #δ(s), the
number of signal transitions in the serialization of s, as
$$\#_{\delta}(s) = \sum_{i=0}^{l-2} s_i \oplus s_{i+1}.$$
Definition 2 (Serial transition count difference, ∆s,t).
Let s and t be two l-bit words. Then, we define ∆s,t as the
absolute value of their difference in serial transition counts:
$$\Delta_{s,t} = \left|\#_{\delta}(s) - \#_{\delta}(t)\right|.$$
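Definitions 1 and 2 translate directly into a few lines of code. The following is a minimal sketch, not the authors' implementation; the function names `stc` and `stc_difference` are hypothetical and chosen here only for illustration.

```python
def stc(s: int, l: int) -> int:
    """Serial transition count of an l-bit unsigned word (Definition 1)."""
    if not 0 <= s < (1 << l):
        raise ValueError("s must fit in l bits")
    # XOR each bit s_i with its neighbour s_{i+1}, for i = 0 .. l-2.
    return sum(((s >> i) ^ (s >> (i + 1))) & 1 for i in range(l - 1))


def stc_difference(s: int, t: int, l: int) -> int:
    """Serial transition count difference between s and t (Definition 2)."""
    return abs(stc(s, l) - stc(t, l))


# Values from Figure 1: s = 64 (01000000) and t = 63 (00111111), l = 8.
print(stc(64, 8), stc(63, 8), stc_difference(64, 63, 8))  # 2 1 1
```

The printed values match Figure 1: two transitions for s, one for t, and a difference of one.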
III. PROPERTIES OF FUNCTION #δ(n)
For an l-bit value n, the properties of the serial transition
count (STC) function #δ(n), which we explore next, give
insights into the efficiency limits of VDBS encoders.
Proposition 1 (Maximum serial transition count pattern).
When l is even, the maximum serial transition count occurs
when the l-bit word has l/2 0s and the same number of 1s.
Proof (Maximum serial transition count pattern).
To maximize the serial transition count, there should be a
transition in moving between every neighboring pair of bit
positions. Thus, when l is even, words with maximum serial
transition count alternate between 0 and 1, and therefore have
l/2 0s and the same number of 1s.
P. Stanley-Marbell and M. Rinard are with the Computer Science and
Artificial Intelligence Laboratory (CSAIL), Department of Electrical Engineering
and Computer Science, Massachusetts Institute of Technology, Cambridge,
MA, 02139. E-mail: psm@mit.edu, rinard@csail.mit.edu
[Figure 2: the words 170 (decimal) = 10101010 (binary) and 85 (decimal) =
01010101 (binary), each shown with all transitions removed by replacing
it with 00000000 or 11111111. The induced errors are m = 170 and m = 85.]
Fig. 2. The maximum serial transition counts for l-bit values occur when
they have alternating 0s and 1s in their binary representations.
Corollary 1 (Maximum serial transition count basis values).
There are two values with maximum serial transition count.
When l is even, these values are
$$\hat{b}_1 = \sum_{i=0}^{\frac{l}{2}-1} 2^{2i} = \frac{1}{3}\left(2^l - 1\right) \qquad (1)$$
and
$$\hat{b}_2 = 2^l - 1 - \sum_{i=0}^{\frac{l}{2}-1} 2^{2i} = \frac{2}{3}\left(2^l - 1\right).$$
This follows directly from Proposition 1. For example,
Figure 2 illustrates how, for l = 8, the maximal-serial-
transition-count words are 85 and 170.
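The closed forms of Corollary 1 can be checked against an exhaustive search for small even l. This is an illustrative sketch only; `stc` and `max_stc_words` are hypothetical helper names, not part of the original work.

```python
def stc(s: int, l: int) -> int:
    """Serial transition count of an l-bit word (Definition 1)."""
    return sum(((s >> i) ^ (s >> (i + 1))) & 1 for i in range(l - 1))


def max_stc_words(l: int) -> list:
    """All l-bit words whose serial transition count is maximal."""
    best = max(stc(s, l) for s in range(1 << l))
    return [s for s in range(1 << l) if stc(s, l) == best]


for l in (2, 4, 6, 8):
    b1 = sum(1 << (2 * i) for i in range(l // 2))   # sum of 2^(2i)
    b2 = (1 << l) - 1 - b1                          # bitwise complement of b1
    assert b1 == ((1 << l) - 1) // 3
    assert b2 == 2 * ((1 << l) - 1) // 3
    assert max_stc_words(l) == [b1, b2]

print(max_stc_words(8))  # [85, 170]
```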
Lemma 1 (Maximum serial transition count).
For every l-bit word n, #δ(n) ≤ l − 1.
Proof (Maximum serial transition count).
The number of bits (l) in a word is a natural number. When
l is 1, there are no transitions in the word, by definition
of the serial transition count. For all other l, the maximum
serial transition count occurs when all adjacent bits of the
word differ. There are four cases in which this could happen,
corresponding to whether l is even or odd, and whether the
least-significant bit (LSB) is a 1 or a 0.
First, consider the cases when l is even. When l is even,
there are l/2 ones and l/2 zeros. If the LSB is 0, there will
be one transition in moving from the LSB towards the
most-significant bit (MSB), and each of the remaining l/2 − 1 bits
which are 0 will have two associated transitions. There will
therefore be a total of 1 + 2(l/2 − 1) = l − 1 transitions. A similar
argument applies if the LSB is 1.
Next, consider the cases when l is odd. When l is odd, there
are either ⌊l/2⌋ bits which are 1 and ⌈l/2⌉ bits which are 0, or
vice versa. The bit polarity appearing in the LSB will occur
(l − 1)/2 + 1 times, and the opposite polarity to the LSB will occur
(l − 1)/2 times.
There will be one transition moving out of the LSB towards
the MSB, followed by transitions in the remaining l − 1 bits.
Since l is odd, it follows that l − 1 is even. But we showed
above that such an even number of bits could contain at most
(l − 1) − 1 transitions. Thus, when l is odd, the maximum
number of transitions is also 1 + (l − 1) − 1. That is, the
maximum number of transitions is l − 1.
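Lemma 1 covers both even and odd l, which can be confirmed by enumeration for small word sizes. The sketch below is an assumption-labeled check, not a substitute for the proof; `stc` and `alternating_word` are hypothetical names.

```python
def stc(s: int, l: int) -> int:
    """Serial transition count of an l-bit word (Definition 1)."""
    return sum(((s >> i) ^ (s >> (i + 1))) & 1 for i in range(l - 1))


def alternating_word(l: int, lsb: int) -> int:
    """l-bit word whose bits alternate, with the given least-significant bit."""
    return sum(((i + lsb) & 1) << i for i in range(l))


for l in range(1, 11):                      # odd and even word sizes
    most = max(stc(s, l) for s in range(1 << l))
    assert most == l - 1                    # Lemma 1
    # Both alternating patterns (LSB = 0 and LSB = 1) reach the maximum.
    assert stc(alternating_word(l, 0), l) == l - 1
    assert stc(alternating_word(l, 1), l) == l - 1
```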
Theorem 1 (Serial transitions and Gray code).
Let s be an l-bit integer, let GrayCode(s) denote the sth
value in Gray code order for l-bit values, and let #1(n)
denote the count of 1s in an l-bit integer n. Then, #δ(s) =
#1(GrayCode(s)).
[Figure 3:
l = 0: L_0 = {}
l = 1: L^0_0 = {0}, L^1_0 = {1}, L_1 = {0, 1}
l = 2: L^0_1 = {00, 01}, L^1_1 = {11, 10}, L_2 = {00, 01, 11, 10}
l = 3: L^0_2 = {000, 001, 011, 010}, L^1_2 = {110, 111, 101, 100},
L_3 = {000, 001, 011, 010, 110, 111, 101, 100}
l = 4: L^0_3 = {0000, 0001, 0011, 0010, 0110, 0111, 0101, 0100},
L^1_3 = {1100, 1101, 1111, 1110, 1010, 1011, 1001, 1000},
L_4 = {0000, 0001, 0011, 0010, 0110, 0111, 0101, 0100, 1100, 1101, 1111, 1110,
1010, 1011, 1001, 1000}]
Fig. 3. Illustration of the construction of the list L_l of l-bit strings in Gray
code order, for l = 1, l = 2, and l = 3.
For example, for l = 8 and s = 30 (00011110 in binary),
GrayCode(s) = 17 (00010001 in binary) and #1(GrayCode(s)) = 2.
We will use the following, the Gray code theorem of
Wilf [4], in the proof of Theorem 1. We include a self-
contained adaptation of Wilf’s original proof here so that our
discussion stands on its own.
Theorem 2 (Wilf’s Gray Code Theorem).
Let s be an l-bit integer with bits s_0, s_1, . . . , s_{l−1}, from least-
to most-significant bit. Let g be the sth l-bit integer in Gray
code order, with bits g_0, g_1, . . . , g_{l−1}, from least- to
most-significant bit. For l = 1 we have g_0 = s_0. In general, for
l ≥ 2,
$$g_i \equiv s_i + s_{i+1} \pmod{2} \quad (i = 0, \ldots, l-2)$$
and
$$g_{l-1} = s_{l-1}.$$
For example, consider the 8-bit value 63. The string of rank
63 in the 8-bit Gray code, that is, the 63rd Gray code value,
can be constructed as follows: for the ith bit with i < l − 1, simply
take the ith and (i + 1)th bits of 63 and add them modulo 2; the
most-significant bit is copied unchanged.
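The bit-by-bit rule of Theorem 2 can be sketched as follows. The function name `gray_from_bits` is hypothetical; the comparison against `s ^ (s >> 1)` uses the standard shift-and-XOR form of the binary reflected Gray code, which is not stated in the text above but computes the same bits.

```python
def gray_from_bits(s: int, l: int) -> int:
    """Rank-s l-bit Gray code word, built bit by bit as in Theorem 2."""
    if l < 1 or not 0 <= s < (1 << l):
        raise ValueError("need l >= 1 and s representable in l bits")
    bits = [(s >> i) & 1 for i in range(l)]                  # s_0 .. s_{l-1}
    g = [(bits[i] + bits[i + 1]) % 2 for i in range(l - 1)]  # g_0 .. g_{l-2}
    g.append(bits[l - 1])                                    # g_{l-1} = s_{l-1}
    return sum(bit << i for i, bit in enumerate(g))


g = gray_from_bits(63, 8)
print(g, format(g, "08b"))  # 32 00100000

# The same word results from the usual shift-and-XOR formulation.
assert all(gray_from_bits(s, 8) == s ^ (s >> 1) for s in range(256))
```

For the value 63 (00111111 in binary), only bit 5 has a differing neighbor above it, so the rank-63 Gray code word is 00100000.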
Proof (Wilf’s Gray Code Theorem).
Let Ll be the list of l-bit strings in Gray code order. L0 is
the empty list. The list Ll can be constructed recursively as
follows:
• Let L^0_{l−1} be the list obtained by prefixing every element
of L_{l−1} with an additional 0.
• Let L^1_{l−1} be the list obtained by prefixing every element
of the list L_{l−1}, in reverse order, with an additional 1.
• L_l is the concatenation of L^0_{l−1} and L^1_{l−1}.
The construction of the list L_l is illustrated in Figure 3 for
l = 1, l = 2, and l = 3. By construction therefore, the 2^l
entries for an l-bit Gray code will be identical to the first 2^l
entries for an (l+1)-bit Gray code; we use this property below.
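The recursive construction above can be sketched in a few lines. One assumption is made here and should be read as such: the base list L_0 is represented as a list holding the single empty string, so that prefixing produces {0} and {1} at l = 1 as in Figure 3. The name `gray_list` is hypothetical.

```python
def gray_list(l: int) -> list:
    """List of l-bit strings in Gray code order, built by prefix and reflect."""
    if l == 0:
        return [""]  # assumption: L_0 holds one empty string
    prev = gray_list(l - 1)
    zero_half = ["0" + w for w in prev]            # prefix 0, same order
    one_half = ["1" + w for w in reversed(prev)]   # prefix 1, reverse order
    return zero_half + one_half


print(gray_list(3))
# ['000', '001', '011', '010', '110', '111', '101', '100']

# The entry of rank s agrees with the bit rule of Theorem 2,
# written here in its shift-and-XOR form.
for l in range(1, 9):
    words = gray_list(l)
    assert all(int(words[s], 2) == s ^ (s >> 1) for s in range(1 << l))
```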
We prove by induction on l that the property of Theorem 2
holds for all l-bit integers s. When l = 0, L_l is the empty list,
and the property we seek to prove is vacuously true. Suppose
the property of Theorem 2 holds for all strings on the list L_{l−1}.
By construction of L_l, we know the property must also hold
for the first 2^{l−1} items on L_l. Suppose then, that s ≥ 2^{l−1}.
Let s′ = 2^l − 1 − s. Then the property of Theorem 2 holds
for the string that has Gray code rank s′, since s′ is by its
definition less than 2^{l−1}.
Again by construction of the Gray code lists L_l from L_{l−1},
it is the case that for any such pair of ranks s and
s′ = 2^l − 1 − s, the first l − 1 bits (bits 0 through l − 2) of the
Gray code strings g and g′ with ranks s and s′ are identical.
Furthermore, the most-significant bits, g_{l−1} and g′_{l−1}, of
these corresponding strings, have the relation
$$g_{l-1} \equiv 1 + g'_{l-1} \pmod{2}.$$
At the same time, the binary representations of the integers s
and s′ have the relation
$$s_i \equiv 1 + s'_i \pmod{2} \quad (i = 0, \ldots, l-1).$$
Complementing both s′_i and s′_{i+1} leaves their sum modulo 2
unchanged, so g_i ≡ s_i + s_{i+1} (mod 2) for i = 0, . . . , l − 2, and
g_{l−1} ≡ 1 + s′_{l−1} ≡ s_{l−1} (mod 2). The property of Theorem 2
therefore continues to hold for all strings on the list L_l.
We now use Wilf’s Gray code theorem to prove the property
of Theorem 1, which relates properties of transitions within
a single word, s, when serialized, to properties of the rank-s
Gray code.
Proof (Serial transitions and Gray code).
The proof is a direct result of Theorem 2. Let g be the Gray
code representation for l-bit integer s. That is, g is the rank-s
l-bit Gray code. The number of 1s in g, #1(g), is
$$\begin{aligned}
\#_1(g) &= \sum_{i=0}^{l-1} g_i \\
        &= \sum_{i=0}^{l-2} \left(s_i + s_{i+1} \pmod{2}\right), \quad \text{from Theorem 2} \\
        &= \sum_{i=0}^{l-2} \left(s_i \oplus s_{i+1}\right).
\end{aligned}$$
But this is exactly the #δ(s) from Definition 1.
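The relation between serial transitions and the count of 1s in the Gray code can be checked numerically. The sketch below is an illustration under stated assumptions, with hypothetical helper names `stc`, `gray`, and `ones`. Because Theorem 2 gives g_{l−1} = s_{l−1}, the check tracks the most-significant bit of s separately: the count of 1s in the Gray code equals #δ(s) plus s_{l−1}, which coincides with the statement of Theorem 1 whenever the most-significant bit of s is 0, as in the example s = 30.

```python
def stc(s: int, l: int) -> int:
    """Serial transition count of an l-bit word (Definition 1)."""
    return sum(((s >> i) ^ (s >> (i + 1))) & 1 for i in range(l - 1))


def gray(s: int) -> int:
    """Gray code word of rank s (shift-and-XOR form of Theorem 2)."""
    return s ^ (s >> 1)


def ones(n: int) -> int:
    """Count of 1 bits in a non-negative integer."""
    return bin(n).count("1")


l = 8
for s in range(1 << l):
    msb = (s >> (l - 1)) & 1
    # Bits g_0 .. g_{l-2} count the transitions; g_{l-1} copies s_{l-1}.
    assert ones(gray(s)) == stc(s, l) + msb

print(gray(30), ones(gray(30)), stc(30, 8))  # 17 2 2
```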
IV. BOUNDS ON SERIAL TRANSITION COUNT REDUCTION
We can reduce the number of serial transitions in words
without changing the word size, by introducing errors into the
values represented by words. The maximum number of serial
transitions we can remove by doing so is limited:
Property 1 (Bound on serial transition count difference).
For any two l-bit words s and t, the serial transition count
difference, ∆s,t, is less than or equal to l − 1.
Proof (Bound on serial transition count difference).
By construction, the serial transition count, #δ(s), for a
non-negative integer s is a natural number. Therefore, the largest
serial transition count difference will occur when either #δ(s)
is zero and #δ(t) takes on the maximum value in the codomain
of #δ(t), or vice versa. From Lemma 1, this maximum value
is l − 1. Thus the maximum serial transition count difference,
∆s,t, is l − 1.
Across all possible l-bit words, the deviation induced when
transitions are reduced by the maximum of l − 1 is bounded:
Property 2 (Minimum and maximum deviation at maximum
serial transition count difference).
Let s and t be two l-bit words with l even. If s and t differ
in serial transition count by the maximum possible amount
(l − 1), then their difference in numeric value is bounded by:
$$\min_{\Delta_{s,t} = l-1} \left\{|s - t|\right\} = \frac{1}{3}\left(2^{\Delta_{s,t}+1} - 1\right), \qquad (2)$$
and
$$\max_{\Delta_{s,t} = l-1} \left\{|s - t|\right\} = \frac{2}{3}\left(2^{\Delta_{s,t}+1} - 1\right). \qquad (3)$$
Proof (Minimum and maximum deviation at maximum serial
transition count difference).
Follows directly from Corollary 1.
For example, for l = 8, we have from Lemma 1 that
the maximal serial transition count difference is l − 1 = 7.
The minimum deviation between two words which have this
maximum serial transition count difference, from Property 2,
is 85. Therefore, to reduce the serial transition count of an 8-
bit word by 7 transitions, one cannot do so with a replacement
word that deviates from it by less than 85.
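The 8-bit figures quoted above can be reproduced by enumerating all pairs of words. This is a brute-force sketch for illustration only; `stc` is a hypothetical helper name and the enumeration is feasible only for small l.

```python
from itertools import product


def stc(s: int, l: int) -> int:
    """Serial transition count of an l-bit word (Definition 1)."""
    return sum(((s >> i) ^ (s >> (i + 1))) & 1 for i in range(l - 1))


l = 8
words = range(1 << l)
counts = [stc(s, l) for s in words]

# Property 1: the largest difference in serial transition count.
max_delta = max(counts) - min(counts)

# Property 2: value deviations of all pairs at that largest difference.
devs = [abs(s - t) for s, t in product(words, repeat=2)
        if abs(counts[s] - counts[t]) == l - 1]

print(max_delta, min(devs), max(devs))  # 7 85 170
```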
The bounds of Property 2 are only specified for the case
of maximal changes in serial transition count, not for any
arbitrary reduction in serial transition count. General bounds
across all possible values of serial transition count reduction
are desirable, because they would enable us to answer
questions such as:
• By how much can serial transition counts differ
for a given value deviation? This will be captured by
Definition 3 and Theorem 3 below.
• By how much can values differ for a given difference in
serial transition count? Property 2 answers this question
for the restricted case of a serial transition count
difference of l − 1. The answer for the general case will be
captured by Definition 4 below.
Definition 3 (Serial transition difference bound function).
Given an l-bit integer m, let f(m) be a function yielding the
maximum amount by which the serial transition counts of two
unsigned l-bit words s and t can differ if |s − t| = m. That is,
$$f(m) = \max_{|s-t| = m} \left\{\Delta_{s,t}\right\}.$$
Why f(m) is important: The function f(m) is interesting
because, if one had an exact expression or tight bounds for
f(m), then an algorithm that searched for the serial-transition-
reducing encoding for a value s could terminate as soon as it
found a value t such that ∆s,t = f(m), since no better value
than t is possible.
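The early-termination idea described above can be sketched with an exhaustive search. This is not an algorithm from the original work: the names `f_table` and `encode` are hypothetical, f(m) is obtained by brute force, and the stopping bound is taken as the largest f(k) over all k ≤ m, an assumption made here because the sketched encoder accepts any deviation up to m rather than exactly m.

```python
def stc(s: int, l: int) -> int:
    """Serial transition count of an l-bit word (Definition 1)."""
    return sum(((s >> i) ^ (s >> (i + 1))) & 1 for i in range(l - 1))


def f_table(l: int) -> list:
    """f(m) of Definition 3 for m = 0 .. 2^l - 1, by enumeration."""
    n = 1 << l
    counts = [stc(s, l) for s in range(n)]
    f = [0] * n
    for s in range(n):
        for t in range(s + 1):
            f[s - t] = max(f[s - t], abs(counts[s] - counts[t]))
    return f


def encode(s: int, m: int, l: int, f: list) -> int:
    """Hypothetical encoder: fewest-transition t with |s - t| <= m."""
    bound = max(f[: m + 1])          # assumption: cumulative bound up to m
    base = stc(s, l)
    best, best_count = s, base
    for d in range(1, m + 1):        # try nearer values first
        for t in (s - d, s + d):
            if 0 <= t < (1 << l):
                c = stc(t, l)
                if c < best_count:
                    best, best_count = t, c
                    if base - best_count == bound:
                        return best  # bound reached: nothing better exists
    return best


f = f_table(8)
print(encode(64, 13, 8, f))  # 63, as in Figure 1
print(encode(85, 85, 8, f))  # 0, found by early termination
```

For many inputs the bound is never reached and the search simply exhausts its candidates, so the benefit of early termination depends on how tight f(m) is for the word being encoded.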
Theorem 3 (Bound on f(m)).
The function f(m) of Definition 3, for any l-bit value m (with
l even), is not monotone. The best linear monotone bound on
f(m) is f(m) ≤ l − 1.
[Figure 4, two panels.
(a) Illustration of f(m): horizontal axis m, from 0 to 2^l − 1; vertical
axis f(m), from 0 to l − 1. The plot marks f(0) = 0 and f(2^l − 1) = 0,
the points m = (1/3)(2^l − 1) and m = (2/3)(2^l − 1), and regions labeled
"f(m) is undefined in these regions since m = 0 when s = t".
(b) Numerical evaluation of f(m).]
Fig. 4. The function f(m) yielding the amount by which the serial transition
counts of two words s and t can differ if |s − t| = m, is not monotone.
Proof (Bound on f(m)).
Let s and t be two unsigned l-bit words, and let m be |s − t|, a
value in the domain of f. If m is 0, then s is identical to t, and
must have identical serial transition count, thus #δ(s) = #δ(t)
and therefore f(0) = 0. If m is 2^l − 1, then either s is 2^l − 1
and t is zero, or vice versa. In both cases, their serial transition
counts are 0 by definition, that is #δ(s) = #δ(t) = 0. Thus,
when m is 2^l − 1, f(m) = 0.
From Corollary 1 and Lemma 1, the maximum value of
f(m) is l − 1, and it occurs at two values, b̂_1 and b̂_2 from
Equation 1. Both b̂_1 and b̂_2 are greater than 0 and less than
2^l − 1. Since f(0) is 0, f(b̂_1) is l − 1, f(b̂_2) is l − 1, and f(2^l − 1)
is 0, it follows that f(m) is not monotone.
From Corollary 1 and Lemma 1, since there are two values
of m for which f(m) takes on its maximum value of l − 1,
it follows that the tightest linear bound on f(m) must pass
through these points. Thus the tightest linear bound on f(m)
is l − 1.
Figure 4(a) illustrates several properties of f(m), and
Figure 4(b) shows an empirical exact enumeration of f(m) across
all possible unsigned 8-bit values. The maximum value of m
is 2^l − 1 and the maximum value of f(m) is l − 1, as indicated
by the shaded region in Figure 4(a). There can be no reduction
in serial transition count when the accompanying deviation in
value is 0, and thus f(0) = 0. Similarly, when the deviation
induced by encoding is 2^l − 1 (i.e., the original and encoded
values are 0 and 2^l − 1 or vice versa), there can be no reduction
in serial transition count, and thus f(2^l − 1) = 0. The maxima
of f(m) occur at m = (1/3)(2^l − 1) and m = (2/3)(2^l − 1).
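The endpoint values and the location of the maxima described above can be reproduced for l = 8 by enumeration. The following is a sketch under the same brute-force assumption as before, with hypothetical helper names `stc` and `f_table`; it is not the procedure used to produce Figure 4(b).

```python
def stc(s: int, l: int) -> int:
    """Serial transition count of an l-bit word (Definition 1)."""
    return sum(((s >> i) ^ (s >> (i + 1))) & 1 for i in range(l - 1))


def f_table(l: int) -> list:
    """f(m) of Definition 3 for every m from 0 to 2^l - 1."""
    n = 1 << l
    counts = [stc(s, l) for s in range(n)]
    f = [0] * n
    for s in range(n):
        for t in range(s + 1):
            f[s - t] = max(f[s - t], abs(counts[s] - counts[t]))
    return f


l = 8
f = f_table(l)
full = (1 << l) - 1
peaks = [m for m, value in enumerate(f) if value == l - 1]

print(f[0], f[full], peaks)  # 0 0 [85, 170]

# f rises from 0 to l - 1 and returns to 0, so it cannot be monotone.
assert f[0] < f[peaks[0]] > f[full]
```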
Definition 4 (Value deviation bound functions).
Let g(d) be the minimum amount by which two integers s and
t can differ if their difference in serial transition count, ∆s,t,
is d. Similarly, let ĝ(d) be the maximum amount by which two
integers s and t can differ if their difference in serial transition
count, ∆s,t, is d. That is,
$$g(d) = \min_{\Delta_{s,t} = d} \left\{|s - t|\right\}, \quad \text{and} \quad \hat{g}(d) = \max_{\Delta_{s,t} = d} \left\{|s - t|\right\}.$$
[Figure 5, two panels.
(a) Illustration of g(d) and ĝ(d): horizontal axis d, from 0 to l − 1;
vertical axis ĝ(d), g(d), from 0 to 2^l − 1, with the levels
(1/3)(2^l − 1) and (2/3)(2^l − 1) marked and regions labeled
"undefined in these regions".
(b) Numerical evaluation: g(d), ĝ(d).]
Fig. 5. At minimum serial transition count (STC) difference, d = 0, either
s and t are identical, or they are different but take on values s = 0 and
t = 2^l − 1 or vice versa.
Figure 5(a) illustrates several properties of g(d) and ĝ(d),
and Figure 5(b) shows an empirical exact enumeration of g(d)
and ĝ(d) for unsigned 8-bit values. When there is no difference
in serial transition count (d = 0 in the figures) the original and
encoded values may be identical (g(0) = 0) or may be 0 and
2^l − 1 or vice versa (ĝ(0) = 2^l − 1). At the maximum possible
serial transition count reduction (d = l − 1), the incurred
deviation cannot be reduced below (1/3)(2^l − 1); the worst-case
deviation at this maximal-transition-reduction point is however
also limited, at (2/3)(2^l − 1).
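The endpoint values of g(d) and ĝ(d) discussed above can be reproduced for l = 8 by enumerating all pairs of words. This is an illustrative brute-force sketch; the names `stc`, `g_min`, and `g_max` are hypothetical.

```python
def stc(s: int, l: int) -> int:
    """Serial transition count of an l-bit word (Definition 1)."""
    return sum(((s >> i) ^ (s >> (i + 1))) & 1 for i in range(l - 1))


l = 8
n = 1 << l
counts = [stc(s, l) for s in range(n)]

g_min = {}  # g(d): smallest |s - t| among pairs with difference d
g_max = {}  # g-hat(d): largest |s - t| among pairs with difference d
for s in range(n):
    for t in range(s + 1):
        d = abs(counts[s] - counts[t])
        dev = s - t
        g_min[d] = min(g_min.get(d, dev), dev)
        g_max[d] = max(g_max.get(d, dev), dev)

print(g_min[0], g_max[0])          # 0 255
print(g_min[l - 1], g_max[l - 1])  # 85 170
```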
V. SUMMARY AND DISCUSSION
Value-deviation-bounded approximate serial data encoders
(VDBS data encoders) reduce the power for transmission of a
word over a serial communication interface. They achieve this
by reducing signal level transitions in the transmitted word at
the expense of numeric deviations in the values transmitted.
This article provided insight into:
1) The reduction in serial transitions (and hence dynamic
power) that can be achieved at the cost of induced error
(Proposition 1, Lemma 1).
2) The relation between the count of serial transitions within
a single word, and Gray codes, which minimize
transitions between consecutive words (Theorem 1).
3) Definition of the bound on transition reduction that can be
achieved for a given value deviation (Definition 3). The
relation between transition reduction and value deviation
is not monotone, and Theorem 3 provides the tightest
linear monotone bound.
4) Definition of bounds on the maximum and minimum
numeric value deviation that any VDBS data encoder will
induce for a given reduction in serial transition counts
(Definition 4).
The properties, proofs, and bounds are important regardless
of whether actual practical VDBS data encoders are
implemented in software or in hardware.
REFERENCES
[1] W.-C. Cheng and M. Pedram. Memory bus encoding for low power: a
tutorial. ISQED ’01, pages 199–204, 2001.
[2] M. R. Stan and W. P. Burleson. Bus-invert coding for low-power i/o.
IEEE TVLSI, 3(1):49–58, Mar. 1995.
[3] P. Stanley-Marbell and M. Rinard. Value-deviation-bounded serial data
encoding for energy-efficient approximate communication. Technical
Report MIT-CSAIL-TR-2015-022, MIT Computer Science and Artificial
Intelligence Laboratory (CSAIL), June 2015.
[4] H. S. Wilf and A. Nijenhuis. Combinatorial algorithms: an update.
SIAM, 1989.
