Adaptive Low-Power Transmission Coding for Serial Links in Network-on-Chip  by Ren, Xianglong et al.
Procedia Engineering 29 (2012) 1618 – 1624
1877-7058 © 2011 Published by Elsevier Ltd.
doi:10.1016/j.proeng.2012.01.183
Available online at www.sciencedirect.com
2012 International Workshop on Information and Electronics Engineering (IWIEE) 
Adaptive Low-Power Transmission Coding for Serial Links 
in Network-on-Chip 
Xianglong Ren∗, Deyuan Gao, Xiaoya Fan, Jianfeng An 
School of Computer Science & Engineering, Northwestern Polytechnical University, 127 Youyi Xilu, Xi’an 710072, China 
 
Abstract 
The traditional Serialized Low Energy Transmission Coding (SILENT) is effective in serial links and sequential data 
patterns. However, it is ineffective in the multiplexed channel used on packet-switched networks because the data 
packets are packed and not sequential. In this paper, we propose a coding scheme for the overhead region of SILENT 
and present an adaptive mechanism that switches between the two coding schemes when the data pattern changes, so as 
to minimize the power dissipation of serial links in network-on-chip. Experimental results indicate that our scheme for 
the overhead region is effective in reducing switching activity for data patterns that fall in that region. We also 
evaluate the efficiency of the adaptive coding method with two multimedia applications. The results show that the 
proposed method adapts to more data patterns and is better suited to serial communication in network-on-chip. 
 
© 2011 Published by Elsevier Ltd. Selection and/or peer-review under responsibility of Harbin University 
of Science and Technology 
 
Keywords: Low-Power; Serial Links; Transmission Coding; Network-on-Chip 
1. Introduction 
With the rapid advances of semiconductor technology, more and more resources can be integrated on a 
single chip. This implies that the number of on-chip modules will increase, and so will the number of on-
chip buses connecting these modules [1]. Furthermore, as the feature size rapidly diminishes into the 
nano-scale, long and multi-bit bus lines which have been widely used before have several problems such 
as skew, crosstalk, wiring difficulty, and large area [2]. Skew and jitter make it difficult to raise the 
frequency of a bit-parallel bus. Furthermore, crosstalk between adjacent lines results in data-dependent 
signal delay and noise, which ultimately makes the link unreliable [2]. The area cost of wide on-chip 
buses is significant, and the metal resources required by a complex on-chip interconnect fabric will 
soon outgrow those available on chip. 
 
∗ Corresponding author. Tel.: +86-029-88451195-2; fax: +86-029-88451195-2. 
E-mail address: xianglong.ren@gmail.com. 
Open access under CC BY-NC-ND license.
 
The Network-on-Chip (NoC) is considered a likely solution to the issues mentioned above [3-6]. A NoC 
employs a modular structure consisting of a number of routers, which are connected by bit-parallel 
buses, to reduce the complexity of the interconnect fabric. 
Serial communication is another technique for alleviating these problems. It was proposed in [7] to 
reduce the number of interconnects by multiplexing and serializing every m bits onto a single 
interconnect. The serial link was also proposed in [8] to reduce bus power consumption [1]: for the same 
area as a bit-parallel bus, the interconnect pitch of a serial-link bus increases, and the interconnect 
capacitance is therefore reduced. This technique is used in high-performance memory I/O and, more 
recently, in on-chip interconnection networks [2]. 
3D stacking was also proposed in an attempt to reduce the complexity and length of on-chip 
interconnects [1]. Several dies are bonded together, and vertical through-silicon vias (TSVs) are used 
to interconnect the different dies. Because the number of TSVs is limited, a serial link is more suitable 
than parallel links for transferring large amounts of data between stacked dies. 
From the discussion above, serial links appear likely to become part of the on-chip interconnect 
fabric in the future. However, to provide the same throughput as an m-bit parallel interconnect, the 
serial link must operate m times faster [9]. In addition, serializing m bits onto a single wire may 
increase the overall data Switching Activity (SA) and hence the power consumption. Reducing the SA 
of the serial interconnect is therefore necessary for energy efficiency. This paper 
proposes an adaptive low-power transmission coding method for serial links in NoC. 
The organization of the paper is as follows. In Section 2, a brief description of previously reported 
work is provided. Section 3 describes the proposed adaptive low-power transmission coding algorithm for 
serial links in NoC. Section 4 presents the experiment results of the proposed technique for both synthetic 
data patterns and real on-chip traces. Section 5 provides a summary and conclusions. 
2. Previous Works and Motivation 
There has been considerable research on low-power coding for parallel buses [10, 11]. However, those 
parallel bus coding methods are not applicable to serial links, and little work has been reported on 
power reduction in serial links [1, 2, 12]. 
The technique presented in [12] orders the n parallel bits on a serial interconnect such that the SA is 
minimized. This approach is most effective when the statistics of the parallel bus traces are known in 
advance, but such statistics are hard to obtain for general-purpose processors such as CPUs and DSPs. 
Paper [1] presents a quantitative analysis of the SA of serial links and employs two differential coding 
schemes based on the location of the encoder/decoder on the link to reduce the activity factor. 
Paper [2] presents a transmission coding method, referred to as SILENT, which employs differential 
encoding to reduce the SA of serial links. It XORs the data word to be serialized with the previous 
data word, so that transitions on the serial interconnect are caused only by the differences between 
successive words, and the SA is thereby reduced. 
Paper [2] performs efficiency analysis of their coding scheme. Since the power consumption in the 
communications depends on the data patterns to be sent, they evaluate the power consumption with all 
possible variations from a random data word. The result shows that, besides the power-saving range, there 
is also a power-overhead range, and the width of this range is one-third of the total range width. In other 
words, for about one-third of all data patterns, the serial communication with SILENT coding will 
consume more power than that without it. 
In this paper, we propose an adaptive low-power transmission coding for serial links that aims to 
suit more data patterns in network-on-chip. The proposed method is based on the work in [2]. To 
distinguish our approach from SILENT [2], we refer to it as ALPTCS (Adaptive Low-Power 
Transmission Coding for Serial links in NoC). 
3. Adaptive Low-Power Transmission Coding  
The conventional parallel bus coding methods used between a processor and memories are ineffective in 
the multiplexed channels used in packet-switched NoCs, since the data packets are packed and not 
sequential [13]. Furthermore, the low-energy coding SILENT proposed in [2] for serial links in the 
on-chip network consumes more power than an uncoded link when the data pattern is in its overhead region. 
Therefore, we propose a second scheme, Coding for the Overhead Region in SILENT (CORS), and present 
an adaptive mechanism that switches between the two coding schemes when the data pattern changes, so as 
to minimize the power dissipation. 
3.1. Adaptive switching mechanism 
For the sake of performance, we predict α (see note a) for the current data word in order to classify its 
data pattern. The data pattern manager (DPM) is the adaptive mechanism that switches between coding 
schemes. It uses a 2-bit prediction scheme and always predicts that α lies in 
$[0, \lceil n/3 \rceil) \cup (\lceil 2n/3 \rceil, n]$ (see note b). The state transitions of the DPM depend on 
its current state and the actual value of α. Figure 1 shows the state machine of the DPM, where "taken" 
means that the prediction holds, that is, $\alpha \in [0, \lceil n/3 \rceil) \cup (\lceil 2n/3 \rceil, n]$, 
and "not taken" means $\alpha \in [\lceil n/3 \rceil, \lceil 2n/3 \rceil]$. The present state of the DPM 
determines the encoding scheme used for the current data word. 
 
Figure 1. The states in a 2-bit prediction scheme. 
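The behavior of the DPM can be sketched in Python as a 2-bit saturating counter. This is only an illustration under stated assumptions: the state numbering (0 to 3, with states 2 and 3 selecting SILENT), the initial state, the reading of the thresholds as ⌈n/3⌉ and ⌈2n/3⌉, and the names `DataPatternManager`, `scheme` and `update` are all hypothetical, and the exact transitions are those drawn in Figure 1, which this sketch does not claim to reproduce.

```python
class DataPatternManager:
    """2-bit saturating-counter sketch of the DPM (state layout is assumed)."""

    def __init__(self, n, state=3):
        self.lo = (n + 2) // 3       # ceil(n/3): lower edge of the overhead region
        self.hi = (2 * n + 2) // 3   # ceil(2n/3): upper edge of the overhead region
        self.state = state           # 0..3; assumed: 2 and 3 predict "taken"

    def scheme(self):
        """Coding scheme selected by the present state for the current word."""
        return "SILENT" if self.state >= 2 else "CORS"

    def update(self, alpha):
        """Move one state toward the actual outcome observed for this word."""
        taken = alpha < self.lo or alpha > self.hi
        self.state = min(self.state + 1, 3) if taken else max(self.state - 1, 0)
        return taken


dpm = DataPatternManager(n=32)
for alpha in (4, 16, 18, 15, 30, 2):   # hypothetical transition counts
    print(dpm.scheme(), alpha)          # scheme is chosen before alpha is known
    dpm.update(alpha)
```

With a counter of this kind, a single word in the overhead region does not immediately switch the scheme; two consecutive mispredictions are needed, which is the usual motivation for a 2-bit rather than a 1-bit predictor.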
In the serial links of a NoC, a flit can be used as the basic unit for coding, that is, as the data word 
mentioned above. In addition, the receiver needs to know which scheme was used to encode the flit it is 
currently receiving. Accordingly, a flag is transmitted to the receiver by adding it to the head unit of the flit. 
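To make the role of the flag concrete, the following Python sketch encodes each flit with either scheme and decodes it using only the transmitted flag. It is an illustration, not the authors' hardware: the representation of a flit as a `(flag, payload)` pair, the convention that `flag = 1` marks CORS, the zero value used when no history exists, and the function names are assumptions.

```python
def encode_flits(words, use_cors):
    """Encode each word with SILENT (XOR with the previous word) or
    CORS (XOR with the word two steps earlier), chosen per flit.

    Returns (flag, payload) pairs; flag 1 marks CORS (assumed convention).
    """
    flits = []
    for t, word in enumerate(words):
        lag = 2 if use_cors[t] else 1
        ref = words[t - lag] if t >= lag else 0   # missing history reads as 0
        flits.append((int(use_cors[t]), word ^ ref))
    return flits


def decode_flits(flits):
    """Recover the original words using only the flags and payloads."""
    decoded = []
    for t, (flag, payload) in enumerate(flits):
        lag = 2 if flag else 1
        ref = decoded[t - lag] if t >= lag else 0
        decoded.append(payload ^ ref)
    return decoded


words = [0x51, 0x04, 0xAE, 0x05, 0x52, 0x07]          # hypothetical 8-bit flits
choice = [False, False, True, True, False, True]       # hypothetical DPM decisions
assert decode_flits(encode_flits(words, choice)) == words
```

Because the decoder keeps its own history of decoded words, the flag is the only side information it needs, whichever sequence of schemes the sender chooses.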
3.2. Transmission coding for overhead region in SILENT 
We first introduce the notation used throughout this paper, as shown in Table 1. 
Table 1. Notation used in this paper 
 
Notation              Description 
$b^{(t)}[n-1:0]$      n-bit data word from a sender at time t 
$B^{(t)}[n-1:0]$      n-bit encoded data word at time t 
$D^{(t)}[n-1:0]$      n-bit data word deserialized from the serial data link at time t 
$d^{(t)}[n-1:0]$      n-bit decoded data word at time t 
 
a. α is the number of bit transitions between successive data words. 
b. n is the total number of bits in a data word. 
3.2.1. Coding 
The encoder works as follows:  
 $B^{(t)}[i] = b^{(t)}[i] \oplus b^{(t-2)}[i], \quad i \in [0, n-1]$  (1) 
The encoded word $B^{(t)}$ is obtained by XORing the current data word with the word two time steps 
earlier, that is, the word preceding its immediate predecessor. After encoding, the encoded data words 
are serialized onto the transmission channel. 
SILENT coding is inefficient, and can even cost more than an uncoded serial wire, when the number of 
transitions α lies in $[\lceil n/3 \rceil, \lceil 2n/3 \rceil]$, because the correlation between successive 
data words is weak in that range. The CORS scheme we propose for this region instead exploits the 
correlation between the current data word and the word two time steps earlier. After the encoded data 
words produced by CORS are serialized, long runs of zeros or ones appear more often on the wire. Figure 2 
illustrates the advantage of CORS. In this example, from t2 to t5, the serial wire makes 6 transitions 
with CORS encoding, compared with 17 for the raw data without any coding and 28 with SILENT encoding. 
By reducing the number of transitions on the serial wire, the transmission power can be saved 
proportionally. 
(a) Data words from sender 
        t0   t1   t2   t3   t4   t5 
b[7]     0    0    1    0    0    0 
b[6]     1    0    0    0    1    0 
b[5]     0    0    1    0    0    0 
b[4]     1    0    0    0    1    0 
b[3]     0    0    1    0    0    0 
b[2]     0    1    1    1    0    1 
b[1]     0    0    1    0    1    1 
b[0]     1    0    0    1    0    1 
α             4/8  4/8  5/8  5/8  4/8 
#Tr      5    3    6    3    7    1 
 
(b) Encoded data by SILENT 
        t0   t1   t2   t3   t4   t5 
b[7]     0    0    1    1    0    0 
b[6]     1    1    0    0    1    1 
b[5]     0    0    1    1    0    0 
b[4]     1    1    0    0    1    1 
b[3]     0    0    1    1    0    0 
b[2]     0    1    0    0    1    1 
b[1]     0    0    1    1    1    0 
b[0]     1    1    0    1    1    1 
#Tr      5    8    7    7    6    8 
 
(c) Encoded data by CORS 
        t0   t1   t2   t3   t4   t5 
b[7]     0    0    1    0    1    0 
b[6]     1    0    1    0    1    0 
b[5]     0    0    1    0    1    0 
b[4]     1    0    1    0    1    0 
b[3]     0    0    1    0    1    0 
b[2]     0    1    1    0    1    0 
b[1]     0    0    1    0    0    1 
b[0]     1    0    1    1    0    0 
#Tr      5    3    1    2    1    2 
 
Figure 2. (a) original data words, (b) encoded data words with SILENT, (c) encoded data words with CORS. 
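The transition counts in Figure 2 can be checked with a short Python sketch. The data words are taken from panel (a); the serialization order (b[7] first), the convention that the transition at a word boundary is charged to the later word, the idle wire level of 0 before t0, and the helper names `xor_encode` and `transitions_per_word` are assumptions chosen for this illustration.

```python
# Data words b(t0)..b(t5) from Figure 2(a), written b[7]..b[0].
WORDS = [0b01010001, 0b00000100, 0b10101110,
         0b00000101, 0b01010010, 0b00000111]
N = 8  # bits per word


def xor_encode(words, lag):
    """XOR each word with the word `lag` steps earlier; missing history reads as 0.

    lag=1 corresponds to SILENT, lag=2 to CORS (equation (1)).
    """
    return [w ^ (words[t - lag] if t >= lag else 0)
            for t, w in enumerate(words)]


def transitions_per_word(words, n=N):
    """Count wire transitions for each word, sent MSB first (assumed order).

    The transition at the boundary with the previous word is charged to the
    current word; the wire is assumed to idle at 0 before t0.
    """
    counts, prev_bit = [], 0
    for w in words:
        count = 0
        for i in range(n - 1, -1, -1):
            bit = (w >> i) & 1
            count += int(bit != prev_bit)
            prev_bit = bit
        counts.append(count)
    return counts


raw = transitions_per_word(WORDS)
silent = transitions_per_word(xor_encode(WORDS, 1))
cors = transitions_per_word(xor_encode(WORDS, 2))
print(raw, silent, cors)
print(sum(raw[2:]), sum(silent[2:]), sum(cors[2:]))  # totals from t2 to t5
```

Under these assumptions the per-word counts agree with the #Tr rows of the three panels, and the totals from t2 to t5 are 17, 28 and 6 for the raw, SILENT and CORS streams respectively.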
3.2.2. Decoding 
After deserialization at the receiver end, the decoder works as follows: 
 $d^{(t)}[i] = D^{(t)}[i] \oplus d^{(t-2)}[i]$  (2) 
We assume for convenience that there is no bit-error on the transmission channel and the latency is 
zero, thus, $D^{(t)}$ is identical to $B^{(t)}$. 
3.2.3. Proof 
The original data word from a sender unit, $b^{(t)}$, can be recovered by the decoder, as the following 
proves: 
From (2),  
 $d^{(t)} = D^{(t)} \oplus d^{(t-2)} = B^{(t)} \oplus d^{(t-2)}$ 
Then, from (1), we get 
 $d^{(t)} = (b^{(t)} \oplus b^{(t-2)}) \oplus d^{(t-2)}$  (3) 
Note that (3) is a recursive equation of $d^{(t)}$. If $d^{(t-2)}$ is replaced recursively by (3), we get 
 
$$
\begin{aligned}
d^{(t)} &= (b^{(t)} \oplus b^{(t-2)}) \oplus d^{(t-2)} \\
        &= (b^{(t)} \oplus b^{(t-2)}) \oplus [(b^{(t-2)} \oplus b^{(t-4)}) \oplus d^{(t-4)}] \\
        &= (b^{(t)} \oplus b^{(t-4)}) \oplus d^{(t-4)}
\end{aligned}
$$
 
By replacing $d^{(t-i)}$ terms recursively by (3) until t−i becomes 0 or 1, the following expression is 
obtained for $d^{(t)}$: 
 
$$
d^{(t)} =
\begin{cases}
b^{(t)} \oplus b^{(1)} \oplus d^{(1)}, & \text{when } t \text{ is odd} \\
b^{(t)} \oplus b^{(0)} \oplus d^{(0)}, & \text{when } t \text{ is even}
\end{cases}
$$
 
If we assume that $b^{(0)} = d^{(0)}$ and $b^{(1)} = d^{(1)}$ at time $t = 0$ and $t = 1$ respectively, $d^{(t)} = b^{(t)}$. 
To guarantee the initial condition $b^{(0)} = d^{(0)}$ and $b^{(1)} = d^{(1)}$, the $b^{(0)}$ and $b^{(1)}$ at the encoder, and $d^{(0)}$ and 
$d^{(1)}$ at the decoder, are set as zeros. This can be simply done by hardware resetting the flip-flops in both of 
the encoder and decoder before starting transmission. 
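The recovery property and the need for matching reset values can be exercised with a small Python model of equations (1) and (2). This is a software sketch rather than the hardware codec: the two-register history model, the `reset` parameter and the function names are assumptions made for illustration, and the channel is taken to be error-free as in the text.

```python
def cors_encode(words, reset=(0, 0)):
    """Equation (1): B(t) = b(t) XOR b(t-2), using two history registers."""
    older, newer = reset            # hold b(t-2) and b(t-1)
    out = []
    for b in words:
        out.append(b ^ older)
        older, newer = newer, b     # shift the history by one word
    return out


def cors_decode(stream, reset=(0, 0)):
    """Equation (2): d(t) = D(t) XOR d(t-2), using two history registers."""
    older, newer = reset            # hold d(t-2) and d(t-1)
    out = []
    for D in stream:
        d = D ^ older
        out.append(d)
        older, newer = newer, d
    return out


words = [0x51, 0x04, 0xAE, 0x05, 0x52, 0x07]   # hypothetical 8-bit data words
encoded = cors_encode(words)

# Matching reset values at both ends: every word is recovered.
assert cors_decode(encoded) == words

# A decoder whose older register starts at 0xFF instead of 0 corrupts
# every second word by that same value, as the closed form predicts.
skewed = cors_decode(encoded, reset=(0xFF, 0))
assert skewed == [w ^ 0xFF if t % 2 == 0 else w for t, w in enumerate(words)]
```

The second check shows why the reset matters: a mismatch in one initial register never decays, because it is carried forward through every word of the same parity.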
4. Experimental Results 
The power and latency overhead introduced by the codec is trivial, since the implementation of the 
ALPTCS encoder and decoder is lightweight. To evaluate the efficiency of CORS and ALPTCS in 
reducing power consumption, we applied them to various data patterns in the overhead region and to two 
multimedia applications, respectively. We take the transition activity as the measure of efficiency, 
because dynamic power consumption is proportional to transition activity. 
4.1. Performance analysis of CORS 
The energy dissipated in communication depends closely on the data patterns being sent. We therefore 
generate all possible variations from a random data word with uniform distribution and select those in 
the overhead region of SILENT to evaluate the efficiency of CORS. Figure 3 compares the average power 
consumption of the serial link with SILENT and with CORS coding. 
[Figure 3 plot, titled "CORS VS. SILENT": percentage reduction in SA (y-axis, from −20 to 70) against the number of transitions between successive data words (x-axis), with one series for CORS and one for SILENT.] 
 
Figure 3. Comparison of CORS and SILENT in overhead region. 
The x-axis represents the number of transitions between successive 32-bit data words, using uniformly 
generated traffic of almost 1M samples. A value of n on the x-axis means that n arbitrary bits out of 32 
have changed from their previous values. The y-axis shows the percentage reduction in SA with CORS and 
with SILENT, relative to the SA of the data words without any coding. As shown in Figure 3, in the 
overhead region, CORS coding gives an 8% to 60% reduction in SA compared with SILENT coding. 
Based on these results, CORS is quite effective in reducing SA on the serial interconnect for data 
patterns in the overhead region. 
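A measurement of this kind can be sketched in Python as follows. The traffic generator here is an assumption: each word is derived from its predecessor by inverting exactly k randomly chosen bits, which is only one possible reading of "uniformly generated traffic", so the numbers this sketch prints should not be expected to match Figure 3. The stream length, the seed, the MSB-first bit order and all function names are likewise hypothetical.

```python
import random


def flip_k_bits(word, k, n, rng):
    """Return `word` with exactly k of its n bits inverted (positions uniform)."""
    mask = 0
    for pos in rng.sample(range(n), k):
        mask |= 1 << pos
    return word ^ mask


def make_stream(length, k, n, rng):
    """Random start word, then each word differs from the previous one in k bits."""
    words = [rng.getrandbits(n)]
    while len(words) < length:
        words.append(flip_k_bits(words[-1], k, n, rng))
    return words


def xor_encode(words, lag):
    """lag=1 models SILENT, lag=2 models CORS; missing history reads as 0."""
    return [w ^ (words[t - lag] if t >= lag else 0)
            for t, w in enumerate(words)]


def wire_transitions(words, n):
    """Total transitions on the serial wire, MSB first, idle level 0 (assumed)."""
    total, prev = 0, 0
    for w in words:
        for i in range(n - 1, -1, -1):
            bit = (w >> i) & 1
            total += int(bit != prev)
            prev = bit
    return total


def sa_reduction(words, lag, n):
    """Percentage reduction in SA relative to the uncoded stream."""
    base = wire_transitions(words, n)
    if base == 0:
        return 0.0
    return 100.0 * (base - wire_transitions(xor_encode(words, lag), n)) / base


rng = random.Random(2012)
for k in (11, 16, 22):  # points inside the overhead region for n = 32
    stream = make_stream(2000, k, 32, rng)
    print(k, round(sa_reduction(stream, 1, 32), 1), round(sa_reduction(stream, 2, 32), 1))
```

Each printed row gives k followed by the SILENT and CORS reductions for that synthetic stream; a negative value means the coded stream toggles more than the uncoded one.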
4.2. On-chip network for multimedia application 
To evaluate the efficiency of adaptive coding (ALPTCS) in reducing power consumption, we 
constructed a NoC simulation platform based on the open-source simulation environment SoCLib [14]. As 
indicated in Figure 4, the platform consists of a MIPS processor, on-chip memories, a display component 
(TTY) and other components such as DSPs, RAM disks and a frame buffer controller. These components 
are connected to each other by a 3×3 mesh network. 
 
[Figure 4 diagram: nine routers (R), each attached to one component: MIPS, RAM 1, RAM DISK1, DSP 1, TTY, Frame Buffer Ctl., RAM 2, DSP 2 and RAM DISK2; a measure point is marked.] 
 
Figure 4. The NoC simulation platform we constructed. 
We ran the following two embedded multimedia programs on the MIPS processor in order: a JPEG decoder 
and an H.264 video decoder. The JPEG decoder handles a 160×120 image, and the H.264 decoder processes a 
176×144 video frame. For each application, we performed the evaluation twice with different test data. 
We set up a measurement point to observe the communication between the MIPS processor and the other 
components in the platform, as shown in Figure 4. While the application runs on the processor, we record 
all its communication through the measurement point in a trace file. We then analyzed all four trace 
files and derived the number of transitions on the serial links for the following coding methods: uncoded, 
SILENT and ALPTCS. The resulting transition activity reduction (power reduction) is given in Table 2. 
Table 2. Comparison of the total number of transitions 
 
APPs      Uncoded     SILENT                ALPTCS 
          #Tr. (k)    #Tr. (k)    %Red.     #Tr. (k)    %Red. 
JPEG-1    14.1        10.8        23.4      10.1        27.9 
JPEG-2    10.2        6.9         31.6      6.2         38.3 
H264-1    1532.7      1016.1      33.7      977.8       36.2 
H264-2    2467.5      2070.2      16.1      1862.9      24.5 
 
Table 3. Area overhead for the encoder and decoder (number of NAND gates). 
 
Codec     SILENT                  ALPTCS 
          En.    De.    Total     En.    De.    Total 
4-bit     60     60     120       277    152    429 
8-bit     120    120    240       432    305    737 
 
The results in Table 2 confirm that the reduction in transitions depends on the data patterns: 
even for the same application, results vary widely for different input data. 
Moreover, we used Synopsys Design Compiler to produce gate-level netlists of the encoders and 
decoders for SILENT and for our proposed ALPTCS. Table 3 shows the area overhead 
(measured as the number of NAND gates) of the 4-bit and 8-bit bus encoders and decoders. We believe 
this overhead is acceptable for the proposed ALPTCS coding scheme. 
From Tables 2 and 3, we conclude that although the area overhead of the ALPTCS codec is 
about three times that of SILENT, the reduction in transitions is significant. Across all four tests we 
performed, ALPTCS is superior to SILENT in reducing transitions, with a reduction that is 2.5 to 8.4 
percentage points higher than that of SILENT. 
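The summary figures above follow directly from the tabulated values, as the short Python check below shows. The dictionaries simply restate the %Red. columns of Table 2 and the total gate counts of Table 3; the variable names are chosen for this illustration.

```python
# %Red. columns of Table 2: application -> (SILENT, ALPTCS)
reduction = {
    "JPEG-1": (23.4, 27.9),
    "JPEG-2": (31.6, 38.3),
    "H264-1": (33.7, 36.2),
    "H264-2": (16.1, 24.5),
}
# Total gate counts of Table 3: codec width -> (SILENT, ALPTCS)
area = {"4-bit": (120, 429), "8-bit": (240, 737)}

# Gain of ALPTCS over SILENT, in percentage points of transition reduction.
gain = {app: round(alptcs - silent, 1)
        for app, (silent, alptcs) in reduction.items()}

# Area of the ALPTCS codec relative to the SILENT codec.
ratio = {width: round(alptcs / silent, 1)
         for width, (silent, alptcs) in area.items()}

print(gain)
print(min(gain.values()), max(gain.values()))
print(ratio)
```

The smallest gain comes from H264-1 and the largest from H264-2, and the area ratio is somewhat above three for both codec widths.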
5. Conclusion 
On-chip serial communication has many advantages over parallel communication, such as reduced skew, 
crosstalk, area cost and wiring difficulty. However, a serial link tends to dissipate more power than a 
parallel bus because of bit multiplexing. In this paper, we propose a coding scheme for the overhead 
region of SILENT and present an adaptive mechanism that switches between the two coding schemes when 
the data pattern changes, so as to minimize the power dissipation of serial links in network-on-chip. 
The results show that the proposed method adapts to more data patterns and is better suited to serial 
communication in network-on-chip. The proposed method can be used for SoCs and NoCs in the nano-scale 
era, where power dissipation is a primary concern. 
Acknowledgments 
This work is supported by Natural Science Foundation (60736012, 60773223, 61003037 and 61173047) 
and National 863 Program (2009AA01Z110). 
References 
[1] Ghoneima, M., et al., Reducing the data switching activity on serial link buses, in 7th International Symposium on Quality 
Electronic Design(ISQED '06). 2006. p. 427-432. 
[2] Kangmin, L., L. Se-Joong and Y. Hoi-Jun, SILENT: serialized low energy transmission coding for on-chip interconnection 
networks, in ICCAD '04. 2004. p. 448- 451. 
[3] Kumar, S., et al., A network on chip architecture and design methodology, in Proc. IEEE Computer Society Annual 
Symposium on VLSI. 2002, IEEE Press: Pittsburgh, PA , USA. p. 105-112. 
[4] Hemani, A., et al. Network on Chip: An architecture for billion transistor era. in Proc. IEEE NorChip. 2000. p. 1-8. 
[5] Benini, L. and G. De Micheli, Networks on chip: a new paradigm for systems on chip design, in Proc. Design, Automation 
and Test in Europe Conference and Exhibition. 2002. IEEE Press: Paris , France. p. 418-419. 
[6] Dally, W. J. and B. Towles, Route packets, not wires: on-chip interconnection networks, in Proc. Design Automation 
Conference. 2001, ACM Press: Las Vegas. p. 684- 689. 
[7] Se-Joong, L., et al., An 800MHz star-connected on-chip network for application to systems on a chip, in IEEE International 
Solid-State Circuits Conference(ISSCC’03). 2003. p. 468- 469. 
[8] Ghoneima, M., et al., Serial-link bus: a low-power on-chip bus architecture, in ICCAD’05. 2005. p. 541- 546. 
[9] Dobkin, R.R., et al. Parallel vs. serial on-chip communication. In SLIP '08 . 2008, ACM: New York, NY, USA. p. 1-8. 
[10] Aghaghiri, Y., F. Fallah and M. Pedram, Irredundant address bus encoding for low power, in Intl. Symp. on Low Power 
Electronics and Design. 2001. p. 182-187. 
[11] Stan, M.R. and W.P. Burleson, Low-power encodings for global communication in CMOS VLSI. IEEE Trans. on Very 
Large Scale Integration (VLSI) Systems. 1997. p. 444-455. 
[12] Kedia, A. and R. Saleh, Power Reduction of On-Chip Serial Links, in ISCAS’07. 2007. p. 865-868. 
[13] Yoo, H.J., K. Lee and J.K. Kim, Low-power NoC for high-performance SoC design. 2008: CRC. 
[14] SoCLib simulation environment. http://www.soclib.fr/trac/dev. 
