





This is the peer reviewed version of the following article: 
 
Radonjic, A., Vujicic, V., 2019. Integer codes correcting sparse byte errors. 















This work is licensed under a Creative Commons Attribution Non Commercial No 




Integer Codes Correcting Sparse Byte Errors 
 
Aleksandar Radonjic* and Vladimir Vujicic 
Institute of Technical Sciences of the Serbian Academy of Sciences and Arts, Belgrade, Serbia  
E-mails: sasa_radonjic@yahoo.com, vujicicv@yahoo.com (*Corresponding author) 
 
 
Abstract: In public optical networks, the data are scrambled with $x^u + 1$ self-synchronous 
scramblers (SSSs). The reason for this is to avoid long strings of ones or zeros, which might 
affect the receiver synchronization. Unfortunately, the use of SSSs is always related to the 
problem of duplication of channel errors. More precisely, each error occurring during the 
transmission will be duplicated u bits later. In this paper, we present a low-cost solution to 
this problem based on integer codes capable of correcting sparse byte errors. 
Keywords: Integer codes, sparse byte errors, error correction, public optical networks. 
1. Introduction 
 In many communication networks the data are randomized before transmission. The reason 
for this is to avoid long strings of 1s or 0s, which might affect the receiver synchronization. In 
public optical networks (PONs), such as synchronous optical network (SONET) and high-level 
data link control (HDLC) [1], the process of randomization is performed using self-synchronous 
scramblers (SSSs). These devices modify the data by XORing two bits that are spaced u bits 
apart (u = 29 or 43) [6]-[10]. The resulting sequence is then sent to the receiver, which performs 
the corresponding operation to recover the original unscrambled data. 
Although this procedure is simple to implement, it has the drawback of error duplication: 
each error occurring during transmission is duplicated u bits later. In the networks mentioned 
above, where random errors dominate [2]-[5], this causes the appearance of two errors, which 
cannot be corrected by standard cyclic redundancy check (CRC) codes. To overcome this 
problem, researchers have proposed the use of modified CRC codes (m-CRCCs) [6]-[10]. These 
codes were preferred over other codes (e.g. BCH codes) due to their lower decoding complexity. 
In practice, however, such a solution would be complex to implement, since it requires 
modifying the existing network hardware (SONET terminals, HDLC controllers, etc.). 
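The error duplication described above can be illustrated with a minimal sketch. It assumes a scrambler of the form s[n] = d[n] XOR s[n-u] with an all-zero initial state and a descrambler d[n] = s[n] XOR s[n-u]; the function names and the test pattern are hypothetical and are not taken from any standard.

```python
def scramble(bits, u):
    """Self-synchronous scrambler x^u + 1: s[n] = d[n] XOR s[n-u] (zero initial state)."""
    out = []
    for n, d in enumerate(bits):
        prev = out[n - u] if n >= u else 0
        out.append(d ^ prev)
    return out


def descramble(bits, u):
    """Matching descrambler: d[n] = s[n] XOR s[n-u] (zero initial state)."""
    return [s ^ (bits[n - u] if n >= u else 0) for n, s in enumerate(bits)]


U = 43                                              # scrambler span u
data = [(n * 7 + 3) % 5 % 2 for n in range(200)]    # arbitrary test pattern
line = scramble(data, U)
line[60] ^= 1                                       # one channel error at bit 60
received = descramble(line, U)

# Positions where the descrambled data differ from the original data.
error_positions = [n for n, (a, b) in enumerate(zip(data, received)) if a != b]
print(error_positions)                              # [60, 103]
```

A single channel error at position 60 appears twice after descrambling, at positions 60 and 60 + u = 103, which is the two-error pattern the codes in this paper are meant to handle.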
Bearing this in mind, in this paper, we present a class of integer codes that are suitable for 
use in modern PONs. Compared to m-CRCCs, these codes have three advantages. First, they use 
integer and lookup table operations, which are supported by all processors [11]. As a result, they 
can be implemented "for free" (in software), i.e. without modifying the network hardware. The 
second advantage is that the proposed codes can correct three types of errors within a b-bit byte: 
single errors, double errors and triple-adjacent errors. Hence, they are more powerful than the 
codes suggested in [6]-[10]. Finally, the proposed codes can be interleaved without delay and 
without using dedicated hardware. Thanks to this, it is possible to construct simple codes 
capable of correcting errors affecting several consecutive bytes. 
The organization of this paper is as follows: Section 2 deals with the construction of integer 
codes capable of correcting sparse byte (SB) errors. The error control procedure and theoretical 
decoding throughputs for these codes are described and evaluated in Section 3. Finally, Section 
4 concludes the paper. 
2. Codes Construction 
In this section, we start with five definitions that are related to the construction of integer 
codes capable of correcting SB errors. 
Definition 1. An error is called an SB error if, within a b-bit byte, one bit, two random bits 
or three adjacent bits are in error. 
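Definition 2. Let $Z_{2^b-1} = \{0, 1, \ldots, 2^b - 2\}$ be the ring of integers modulo $2^b - 1$ and let 
$B_i = \sum_{n=0}^{b-1} a_n \cdot 2^n$ be the integer representation of a b-bit byte, where $a_n \in \{0, 1\}$ and 1 ≤ i ≤ k. 
Then, the code C(b, k, c), defined as 
$$C(b, k, c) = \left\{ (B_1, B_2, \ldots, B_k, B_{k+1}) \in Z_{2^b-1}^{k+1} : \sum_{i=1}^{k} C_i \cdot B_i \equiv B_{k+1} \pmod{2^b - 1} \right\} \qquad (1)$$
is a (kb + b, kb) integer code, where $c = (C_1, C_2, \ldots, C_k, 1) \in Z_{2^b-1}^{k+1}$ is the coefficient vector and 
$B_{k+1} \in Z_{2^b-1}$ is an integer. 
Definition 3. Let $x = (B_1, B_2, \ldots, B_k, B_{k+1}) \in Z_{2^b-1}^{k+1}$, $y = (\underline{B}_1, \underline{B}_2, \ldots, \underline{B}_k, \underline{B}_{k+1}) \in Z_{2^b-1}^{k+1}$ and 
$e = y - x = (e_1, e_2, \ldots, e_k, e_{k+1}) \in Z_{2^b-1}^{k+1}$ be respectively, the sent 
codeword, the received codeword and the error vector. Then, the syndrome S of the received 
codeword is defined as 
$$S = \left[ \sum_{i=1}^{k} C_i \cdot \underline{B}_i - \underline{B}_{k+1} \right] \pmod{2^b - 1} = \left[ \sum_{i=1}^{k} C_i \cdot e_i - e_{k+1} \right] \pmod{2^b - 1} \qquad (2)$$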
Definition 4. A (kb + b, kb) integer code is called SB error correcting (SBEC) if it can 
correct error vectors from the set E = {($e_i$, 0,..., 0, 0),..., (0, 0,..., $e_i$, 0), (0, 0,..., 0, $e_i$)}, where 
$e_i \in \{\pm 2^r,\ \pm 2^s \pm 2^t,\ (\pm 2^2 \pm 2^1 \pm 2^0) \cdot 2^m\}$, 0 ≤ r ≤ b – 1, 0 ≤ s < t ≤ b – 1 and 0 ≤ m ≤ b – 3. 
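Definition 5. The error set for a (kb + b, kb) integer SBEC code is defined by 
$$\xi = s_1 \cup s_2 \cup s_3 \qquad (3)$$
where 
$$s_1 = \bigcup_{i=1}^{k+1} \left\{ (\pm 2^r \cdot C_i) \pmod{2^b - 1} : 0 \le r \le b - 1 \right\} \qquad (4)$$
$$s_2 = \bigcup_{i=1}^{k+1} \left\{ [(\pm 2^r \pm 2^s) \cdot C_i] \pmod{2^b - 1} : 0 \le r < s \le b - 1 \right\} \qquad (5)$$
$$s_3 = \bigcup_{i=1}^{k+1} \left\{ [(\pm 2^2 \pm 2^1 \pm 2^0) \cdot 2^m \cdot C_i] \pmod{2^b - 1} : 0 \le m \le b - 3 \right\} \qquad (6)$$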
With these definitions, we are ready to state the following theorem. 
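Theorem 1. The sets $s_1$ and $s_3$ are subsets of $s_2$. 
Proof. An element α of $s_1$ takes the form $(\pm 2^r \cdot C_i) \pmod{2^b - 1}$, where 0 ≤ r ≤ b – 1. The 
set $s_2$ contains elements β taking the form $[(\pm 2^r \pm 2^s) \cdot C_i] \pmod{2^b - 1}$, where 0 ≤ r < s ≤ b – 1. 
If s = r + 1, β takes two forms: $\beta_1 = (\pm 2^r \cdot C_i) \pmod{2^b - 1}$ (terms of opposite sign) and 
$\beta_2 = (\pm 3 \cdot 2^r \cdot C_i) \pmod{2^b - 1}$ (terms of equal sign), where 0 ≤ r ≤ b – 2. On the other hand, if 
r = 0, s = b – 1 and the two terms have opposite signs, β takes the form 
$\beta_3 = [\pm (1 - 2^{b-1}) \cdot C_i] \pmod{2^b - 1} = (\pm 2^{b-1} \cdot C_i) \pmod{2^b - 1}$, because $2^b \equiv 1 \pmod{2^b - 1}$. 
From this it is easy to conclude that every α is of the form $\beta_1$ or $\beta_3$, i.e. that $s_1 \subseteq s_2$. 
Similarly, an element γ of $s_3$ takes the form $[(\pm 2^2 \pm 2^1 \pm 2^0) \cdot 2^m \cdot C_i] \pmod{2^b - 1}$, where 
0 ≤ m ≤ b – 3. Note that this element is of the form $(\pm \delta \cdot 2^m \cdot C_i) \pmod{2^b - 1}$, where 
$\delta \in \{1, 3, 5, 7\}$. Now, if δ = 1, then $\gamma_1 = (\pm 2^m \cdot C_i) \pmod{2^b - 1}$ is an element of $s_1 \subseteq s_2$. 
If δ = 3, then $\gamma_2 = (\pm 3 \cdot 2^m \cdot C_i) \pmod{2^b - 1} = [\pm (2^{m+1} + 2^m) \cdot C_i] \pmod{2^b - 1} \in s_2$. 
If δ = 5, then $\gamma_3 = (\pm 5 \cdot 2^m \cdot C_i) \pmod{2^b - 1} = [\pm (2^{m+2} + 2^m) \cdot C_i] \pmod{2^b - 1} \in s_2$. 
Finally, if δ = 7, then $\gamma_4 = (\pm 7 \cdot 2^m \cdot C_i) \pmod{2^b - 1} = [\pm (2^{m+3} - 2^m) \cdot C_i] \pmod{2^b - 1} \in s_2$ 
(for m = b – 3 the term $2^{m+3} = 2^b$ reduces to $2^0$, so the element is still of the form required by $s_2$). 
In all cases, $\gamma \in s_2$. Hence, $s_3 \subseteq s_2$. □ 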
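Theorem 1 can be checked numerically for small byte lengths. The sketch below is only a sanity check, not part of the proof: it enumerates the three sets of Definition 5 for a single coefficient (c = 1 is assumed here to keep the example small), and the helper name `error_sets` is hypothetical.

```python
from itertools import product


def error_sets(b, c=1):
    """Return (s1, s2, s3) of Definition 5 for one coefficient c, modulo 2^b - 1."""
    mod = (1 << b) - 1
    signs = (1, -1)
    # s1: single-bit errors, +-2^r.
    s1 = {(x * (1 << r) * c) % mod for r in range(b) for x in signs}
    # s2: two-bit errors, +-2^r +- 2^s with r < s.
    s2 = {((x * (1 << r) + y * (1 << s)) * c) % mod
          for r in range(b) for s in range(r + 1, b)
          for x, y in product(signs, repeat=2)}
    # s3: three adjacent bit errors, (+-4 +- 2 +- 1) * 2^m.
    s3 = {((4 * x + 2 * y + z) * (1 << m) * c) % mod
          for m in range(b - 2)
          for x, y, z in product(signs, repeat=3)}
    return s1, s2, s3


for b in range(8, 17):
    s1, s2, s3 = error_sets(b)
    assert s1 <= s2 and s3 <= s2                  # Theorem 1
    assert len(s2) == 2 * (b - 1) ** 2 - 2        # distinct syndromes per coefficient
```

For each b from 8 to 16 the sets s1 and s3 are contained in s2, and s2 has 2·(b – 1)^2 – 2 distinct nonzero elements per coefficient.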
Now we can prove the main theorem of this section. 
Theorem 2. The codes defined by (1) can correct all SB errors only if there exist k mutually 
different coefficients $C_i \in Z_{2^b-1} \setminus \{0, 1\}$ such that 
$$|\xi| = |s_2| = (2 \cdot (b - 1)^2 - 2) \cdot (k + 1)$$
where |A| is the cardinality of A. 
























Proof. Let $s_2 = R_1 \cup R_2 \cup \ldots \cup R_{2b-2}$, where 
$$
\begin{aligned}
R_1 &= \{ [\pm (2^{r+1} - 2^r) \cdot C_i] \pmod{2^b - 1} : 0 \le r \le b - 2,\ 1 \le i \le k + 1 \} \\
R_2 &= \{ [\pm (2^{r+1} + 2^r) \cdot C_i] \pmod{2^b - 1} : 0 \le r \le b - 2,\ 1 \le i \le k + 1 \} \\
R_3 &= \{ [\pm (2^{r+2} - 2^r) \cdot C_i] \pmod{2^b - 1} : 0 \le r \le b - 3,\ 1 \le i \le k + 1 \} \\
R_4 &= \{ [\pm (2^{r+2} + 2^r) \cdot C_i] \pmod{2^b - 1} : 0 \le r \le b - 3,\ 1 \le i \le k + 1 \} \\
&\ \ \vdots \\
R_{2b-5} &= \{ [\pm (2^{r+b-2} - 2^r) \cdot C_i] \pmod{2^b - 1} : 0 \le r \le 1,\ 1 \le i \le k + 1 \} \\
R_{2b-4} &= \{ [\pm (2^{r+b-2} + 2^r) \cdot C_i] \pmod{2^b - 1} : 0 \le r \le 1,\ 1 \le i \le k + 1 \} \\
R_{2b-3} &= \{ [\pm (2^{r+b-1} - 2^r) \cdot C_i] \pmod{2^b - 1} : r = 0,\ 1 \le i \le k + 1 \} \\
R_{2b-2} &= \{ [\pm (2^{r+b-1} + 2^r) \cdot C_i] \pmod{2^b - 1} : r = 0,\ 1 \le i \le k + 1 \}
\end{aligned}
$$
The syndromes caused by SB errors will be nonzero and mutually different only if there exist k 
different coefficients $C_i \in Z_{2^b-1} \setminus \{0, 1\}$ such that 
$$
\begin{aligned}
& R_1 \cap R_2 \cap \ldots \cap R_{2b-2} = \emptyset, \\
& |R_{2j-1}| = |R_{2j}| = 2 \cdot (b - j) \cdot (k + 1), \quad j = 1, 2, \ldots, b - 1.
\end{aligned}
$$
Further, if we compare the sets $R_2$, $R_3$, $R_{2b-2}$ and $R_{2b-5}$ we can note that $R_{2b-5} \subseteq (R_{2b-2} \cup R_2)$ and 
$R_3 \subseteq R_2$. As a result, it follows that 
$$|\xi| = |s_2| = \sum_{n=1}^{2b-2} |R_n| - (|R_3| + |R_{2b-5}|) = (2 \cdot (b - 1)^2 - 2) \cdot (k + 1)$$
Conversely, if the codes satisfy the above condition, then they can correct all SB errors. Therefore, 
these codes are (kb + b, kb) integer SBEC codes. □  
Now, by knowing the cardinality of s2, we can derive the upper bound on code length. 
Theorem 3. For any (kb + b, kb) integer SBEC code it holds that 
$$k \le \left\lfloor \frac{2^b - 2}{2 \cdot (b - 1)^2 - 2} \right\rfloor - 1$$
Proof. From Definition 2 we know that the total number of nonzero syndromes is $2^b - 2$. In 
addition, from Theorem 2 we know that the set $s_2$ has $(2 \cdot (b - 1)^2 - 2) \cdot (k + 1)$ nonzero elements. 
Consequently, we have the inequality 
$$(2 \cdot (b - 1)^2 - 2) \cdot (k + 1) \le 2^b - 2$$
wherefrom it follows that 
$$k \le \left\lfloor \frac{2^b - 2}{2 \cdot (b - 1)^2 - 2} \right\rfloor - 1 \qquad □$$
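The bound of Theorem 3 is easy to evaluate. The short sketch below reads it as k ≤ ⌊(2^b – 2) / (2·(b – 1)^2 – 2)⌋ – 1 and computes it for b = 8, ..., 16; the function name `max_k_bound` is hypothetical.

```python
def max_k_bound(b):
    """Upper bound on k from Theorem 3: floor((2^b - 2) / (2(b-1)^2 - 2)) - 1."""
    return (2 ** b - 2) // (2 * (b - 1) ** 2 - 2) - 1


bounds = {b: max_k_bound(b) for b in range(8, 17)}
print(bounds)
# {8: 1, 9: 3, 10: 5, 11: 9, 12: 16, 13: 27, 14: 47, 15: 83, 16: 145}
```

These values coincide with the "Theoretical bound" row of Table 1.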
To illustrate the applicability of Theorems 2-3 we have conducted an exhaustive computer 
search. Our first goal was to compare the obtained results with the theoretical bounds (Table 1), 
while the second goal was finding the coefficients Ci (Table 2) for 32-bit codes (these codes are 
perfectly suited for implementation on modern 32/64-bit processors [11]).  
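The search itself is not listed in the paper. A minimal sketch of the kind of test such a search could apply is given below, assuming that a coefficient set is accepted when all syndromes produced by SB errors in the k data bytes and in the check byte (coefficient 1) are nonzero and mutually different. The names `multipliers` and `is_sbec` are hypothetical.

```python
from itertools import product


def multipliers(b):
    """Residues of all SB error values (Definition 4) modulo 2^b - 1."""
    mod = (1 << b) - 1
    signs = (1, -1)
    out = {(x * (1 << r)) % mod for r in range(b) for x in signs}
    out |= {(x * (1 << s) + y * (1 << t)) % mod
            for s in range(b) for t in range(s + 1, b)
            for x, y in product(signs, repeat=2)}
    out |= {((4 * x + 2 * y + z) << m) % mod
            for m in range(b - 2)
            for x, y, z in product(signs, repeat=3)}
    return out


def is_sbec(b, coeffs):
    """True if SB errors in any single byte give nonzero, mutually different syndromes."""
    mod = (1 << b) - 1
    errs = multipliers(b)
    seen = set()
    # The check byte behaves like a byte with coefficient 1 (the error set is symmetric in sign).
    for c in list(coeffs) + [1]:
        synd = {(c * e) % mod for e in errs}
        if 0 in synd or len(synd) != len(errs) or seen & synd:
            return False
        seen |= synd
    return True


# Smallest single coefficient accepted for b = 32 (assumed search order: increasing c).
first = next(c for c in range(2, 64) if is_sbec(32, [c]))
print(first)   # 19
```

Under these assumptions the smallest accepted coefficient for b = 32 is 19, which agrees with the first entry of Table 2.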
 
3. Error Control Procedure and Theoretical Decoding Throughputs 
The error control procedure for the proposed codes is similar to that described in [12]-[18]. 
In short, it consists of two steps: obtaining the error correction data from the syndrome table and 
executing the operation 
$$B_i = (\underline{B}_i - e) \pmod{2^b - 1} \qquad (7)$$
where $\underline{B}_i$ is the received byte, 1 ≤ i ≤ k + 1, $e = \pm 2^r \pm 2^s$ and 0 ≤ r < s ≤ b – 1. To generate the 
syndrome table it is necessary to substitute the values of b and $C_i$ (Table 2) into (5). In this way, 
exactly |ξ| (Theorem 2) relationships between the syndrome (element of the set ξ), error location (i) 
and error vector (e) are established (Fig. 1). Accordingly, when S ≠ 0, the decoder's task is to find 
the entry whose first b bits match the syndrome S. If the elements of ξ are sorted in increasing 
order, this task will be completed after $n_{TL}$ table lookups, where $1 \le n_{TL} \le \lfloor \log_2 |\xi| \rfloor + 2$ [19]. 
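The two-step procedure can be sketched as follows. This is an illustrative toy, not the authors' implementation: it assumes b = 32, uses the first four coefficients of Table 2 (k = 4), stores the table as a sorted Python list searched with `bisect`, and all function names are hypothetical.

```python
from bisect import bisect_left
from itertools import product

B = 32                        # byte width b
MOD = (1 << B) - 1            # all arithmetic is modulo 2^b - 1
COEFFS = [19, 23, 25, 27]     # first four entries of Table 2 (k = 4)


def sb_errors(b=B):
    """Integer error values of Definition 4 for one b-bit byte."""
    signs = (1, -1)
    errs = {x << r for r in range(b) for x in signs}
    errs |= {(x << s) + (y << t) for s in range(b) for t in range(s + 1, b)
             for x, y in product(signs, repeat=2)}
    errs |= {(4 * x + 2 * y + z) << m for m in range(b - 2)
             for x, y, z in product(signs, repeat=3)}
    return errs


def build_table(coeffs):
    """Sorted list of (syndrome, byte index, error mod 2^b - 1)."""
    errs = sb_errors()
    entries = {}
    weights = list(coeffs) + [-1]          # the check byte enters S with weight -1
    for loc, w in enumerate(weights):
        for e in errs:
            entries.setdefault((w * e) % MOD, (loc, e % MOD))
    return sorted((s, loc, e) for s, (loc, e) in entries.items())


def encode(data, coeffs):
    """Append the check byte B_{k+1} = sum(C_i * B_i) mod (2^b - 1)."""
    return list(data) + [sum(c * x for c, x in zip(coeffs, data)) % MOD]


def decode(received, coeffs, table):
    """Compute the syndrome, look it up by binary search, apply equation (7)."""
    k = len(coeffs)
    s = (sum(c * x for c, x in zip(coeffs, received)) - received[k]) % MOD
    if s == 0:
        return list(received[:k])
    idx = bisect_left(table, (s,))
    if idx == len(table) or table[idx][0] != s:
        raise ValueError("syndrome not in table: uncorrectable error")
    _, loc, e = table[idx]
    fixed = list(received)
    fixed[loc] = (fixed[loc] - e) % MOD    # equation (7)
    return fixed[:k]


table = build_table(COEFFS)
data = [0x12345678, 0x0BADF00D, 0, MOD - 1]    # data bytes must lie in [0, 2^b - 2]
word = encode(data, COEFFS)
word[1] ^= (1 << 3) | (1 << 17)                # two random bit errors in the second byte
assert decode(word, COEFFS, table) == data
```

Only the residue of e modulo 2^b – 1 is stored, since equation (7) depends on e only through that residue.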
To illustrate the effectiveness of the above approach, suppose that the data packet has K = 
6·b·k = 192·k data bits (b = 32; k = 32, 64, 96 and 128) and that each network node is equipped 
with a six-core processor (Fig. 2) having the following parameters [20]: 
1) clock rate: CR = $3.3 \cdot 10^9$ Hz, 
2) integer addition/subtraction latency: 1 cycle, 
3) integer multiplication latency: 3 cycles, 
4) modulo reduction latency: 1 cycle, 
5) 32-bit comparison latency: 1 cycle, 
6) 128-bit shift latency: 1 cycle,  
 
                         
 
  Fig. 1. Bit-width of one syndrome table entry. 
 
  
Table 1. Number of Coefficients for Some Integer SBEC Codes. 
                         b = 8   b = 9   b = 10   b = 11   b = 12   b = 13   b = 14   b = 15   b = 16 
Theoretical bound        1       3       5        9        16       27       47       83       145 
Computer-search result   0       1       1        3        6        10       16       27       43 
 
Table 2. First 128 Coefficients in $[2, 2^{32} - 2]$ for 32-bit Integer SBEC Codes. 
19 23 25 27 29 37 39 41 47 49 53 59 61 67 71 77 
79 83 89 97 101 103 107 109 113 121 131 137 139 149 151 157 
163 167 173 179 181 191 193 197 199 211 223 227 229 233 239 251 
263 269 271 277 281 283 289 293 307 311 313 317 331 337 347 349 
353 357 359 361 365 367 373 379 383 389 397 401 409 419 421 431 
433 437 439 443 449 457 461 463 465 467 475 479 487 491 499 503 
521 523 529 541 547 551 557 563 569 571 575 577 587 593 599 601 
607 613 617 619 621 625 631 641 643 647 653 659 661 667 673 675 
 
7) L1 cache (32 KB per core) access latency: 4 cycles, 
8) L2 cache (256 KB per core) access latency: 12 cycles, 
9) L3 cache (15 MB shared) access latency: 34 cycles. 
In addition, let us assume that the coefficients Ci (Table 2) are stored in each of the six L1 
caches and that the syndrome table is placed into the L3 cache (Fig. 2). In that case, instead of 
one, the decoder (processor) will (in parallel) compute the values of six syndromes: 
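• Core 1 
$$S_1 = \left[ \sum_{i=1}^{k} C_i \cdot \underline{B}_{6 \cdot (i-1)+1} - \underline{B}_{6 \cdot k+1} \right] \pmod{2^{32} - 1} \qquad (8)$$
• Core 2 
$$S_2 = \left[ \sum_{i=1}^{k} C_i \cdot \underline{B}_{6 \cdot (i-1)+2} - \underline{B}_{6 \cdot k+2} \right] \pmod{2^{32} - 1} \qquad (9)$$
⋮ 
• Core 6 
$$S_6 = \left[ \sum_{i=1}^{k} C_i \cdot \underline{B}_{6 \cdot (i-1)+6} - \underline{B}_{6 \cdot k+6} \right] \pmod{2^{32} - 1} \qquad (10)$$
Here $\underline{B}_n$ denotes the n-th received byte of the packet. 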
If we add to this K/128 = 1.5·k shift operations, we see that each core requires T1 = 9.5·k + 1 
clock cycles (k accesses to the L1 cache, k integer multiplications, k - 1 integer additions, 1.5·k 
shift operations, 1 integer subtraction and 1 modulo reduction) to calculate the values of all 
syndromes. If one or more syndromes are non-zero, the decoder will additionally perform nTL 
table lookups, nTL comparisons, 1 integer addition and 1 modulo reduction. In our case, six such 
operations can be executed in parallel in T2 = 35·nTL + 2 clock cycles. So, if we sum up both 
processing times, we come to the conclusion that the decoder requires 
$$T_{total} = T_1 + T_2 = 9.5 \cdot k + 35 \cdot n_{TL} + 3 \qquad (11)$$
clock cycles to process K data bits, i.e. one second to decode 
     
        






$$G = \frac{CR}{T_{total} / K} = \frac{3.3 \cdot 10^9 \cdot K}{9.5 \cdot k + 35 \cdot n_{TL} + 3} = \frac{3.3 \cdot 10^9 \cdot 192 \cdot k}{9.5 \cdot k + 35 \cdot n_{TL} + 3} \qquad (12)$$
data bits. From (12) it is easy to calculate that the theoretical throughput of the decoder varies 
between 22.48 Gbps and 43.05 Gbps (Table 3). This means that all codes from Table 3 have the 
potential to be used in 10G networks (e.g. 10G SONET and HDLC networks) [1]. Besides this, 
from Table 3 we see that the code with the code rate 4096/4128 has a theoretical throughput above 
40 Gbps. This fact makes it a good candidate for use in 40G networks (e.g. 40G SONET) [1]. 
Finally, from (8)-(10) we see that the analyzed codes are interleaved at the byte level. Thanks to 
this, they are able both to protect up to 24576 bits and to correct SB errors spanning up to 192 
bits (six consecutive 32-bit bytes). Such a solution is not only more reliable than those in 
[6]-[10], but also much simpler to implement (Table 4). 
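The throughput figures can be reproduced with a few lines of arithmetic. The sketch below assumes the worst-case number of table lookups, n_TL = ⌊log2 |ξ|⌋ + 2 with |ξ| = (2·(b – 1)^2 – 2)·(k + 1), together with the cycle count (11) and the throughput expression (12); the function names are hypothetical.

```python
CLOCK_HZ = 3.3e9      # clock rate CR
B = 32                # byte width b
CORES = 6             # six interleaved codewords, one per core


def n_tl_max(k, b=B):
    """Worst-case number of table lookups: floor(log2 |xi|) + 2."""
    xi = (2 * (b - 1) ** 2 - 2) * (k + 1)        # |xi| from Theorem 2
    return (xi.bit_length() - 1) + 2


def throughput_bps(k, n_tl, b=B):
    """Theoretical decoding throughput G in bits per second."""
    data_bits = CORES * b * k                    # K = 6*b*k
    cycles = 9.5 * k + 35 * n_tl + 3             # equation (11)
    return CLOCK_HZ * data_bits / cycles         # equation (12)


for k in (32, 64, 96, 128):
    n = n_tl_max(k)
    print(k, n, round(throughput_bps(k, n) / 1e9, 2))
```

For k = 32, 64, 96 and 128 this gives n_TL = 17, 18, 19 and 19 and throughputs of about 22.48, 32.68, 38.50 and 43.05 Gbps, in line with Table 3.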
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
4. Conclusion 
In this paper, we presented a new class of integer error control codes. We have shown that 
these codes have three characteristics: first, they can correct sparse byte errors, second, they 
operate under integer arithmetic, and third, they can be interleaved without delay and without 
using additional hardware. Thanks to these features, the presented codes are well suited to be 
used in practice, especially in optical networks such as SONET and HDLC. 
Table 3. Memory requirements and theoretical decoding throughputs for some six-byte 
interleaved integer SBEC codes. 
Code           k     Memory requirements   Memory requirements   Number of        Theoretical 
                     for storing the       for storing the       table lookups    throughput 
                     coefficients          syndrome table 
(1056, 1024)   32    6 x 128 B             0.55 MB               1 ≤ nTL ≤ 17     22.48 Gbps 
(2080, 2048)   64    6 x 256 B             1.11 MB               1 ≤ nTL ≤ 18     32.68 Gbps 
(3104, 3072)   96    6 x 384 B             1.65 MB               1 ≤ nTL ≤ 19     38.50 Gbps 
(4128, 4096)   128   6 x 512 B             2.23 MB               1 ≤ nTL ≤ 19     43.05 Gbps 
 
Table 4. Comparison of codes used or proposed for use in modern PONs. 
Main characteristics | CRC codes | m-CRCCs [6]-[10] | Integer SBEC codes 
… | Detection of single, double, triple and burst errors of length up to 16 bits | Correction of single and duplicate errors, and detection of double and triple errors | Correction of SB errors affecting several consecutive bytes 
… | … (packet header only) | No (packet header only) | Yes 
Processing of data bits | Modulo-2 operations | Modulo-2 operations | Integer and lookup table operations 
Type of … | … | … | … 












References 
[1] R. Ramaswami, K. Sivarajan and G. Sasaki, Optical Networks: A Practical Perspective, 3rd 
ed., Elsevier, Inc., 2010. 
[2] CCITT Study Group XVIII Contribution D21, “Observations of Error Characteristics of 
Fiber Optic Transmission Systems,” Jan. 1989. 
[3] W. Grover and D. Moore, “Design and Characterization of an Error-Correcting Code for 
the SONET STS-1 Tributary,” IEEE Trans. Commun., vol. 38, pp. 467-476, Apr. 1990. 
[4] W. Grover, “Effect of Error Correcting Code Using D-S3 Framing Bits on Measured 
Dribble Error Pattern of 565 Mb/s Fibre Optic Transmission System,” Elect. Lett., vol. 28, 
no. 20, pp. 1869-1870, Sept. 1992. 
[5] M. Cheung, W. Grover and W. Krzymien, “Combined Framing and Error Correction 
Coding for DS3 Signal Format,” IEEE Trans. Commun., vol. 43, nos. 2-4, pp. 1365-1374, 
Feb-Apr. 1995. 
[6] N. Figueira, “Networking Device and Method for Making Cyclic Redundancy Check 
(CRC) Immune to Scrambler Error Duplication,” U.S. Patent 6,609,226, Aug. 19, 2003. 
[7] S. Gorshe, “Cyclic Redundancy Check Circuit for Use with Self-Synchronous Scramblers,” 
U.S. Patent 7,353,446, Nov. 17, 2005. 
[8] S. Gorshe, “Analysis of the Interaction Between Linear Cyclic Error Correcting Codes and 
Self-Synchronous Payload Scramblers,” IEEE Trans. Commun., vol. 56, no. 11, pp. 1800-
1806, Nov. 2008. 
[9] S. Gorshe, “Forward Error Correction with Self-Synchronous Scramblers,” U.S. Patent 
7,913,151, Mar. 22, 2011. 
[10] D. Ferguson et al., “System, Apparatus, and Method for Increasing Resiliency in 
Communications,” U.S. Patent 7,986,717, Jul. 26, 2011. 
[11] R. Giladi, Network Processors: Architecture, Programming, and Implementation, Elsevier, 
Inc., 2008. 
[12] A. Radonjic and V. Vujicic, “Integer Codes Correcting Burst Errors within a Byte,” IEEE 
Trans. Comput., vol. 62, no. 2, pp. 411-415, Feb. 2013. 
[13] A. Radonjic et al., “Integer Codes Correcting Double Asymmetric Errors,” IET Commun., 
vol. 10, no. 14, pp. 1691-1696, Sep. 2016. 
[14] A. Radonjic and V. Vujicic, “Integer Codes Correcting Spotty Byte Asymmetric Errors,” 
IEEE Commun. Lett., vol. 20, no. 12, pp. 2338-2341, Dec. 2016. 
[15] A. Radonjic and V. Vujicic, “Integer Codes Correcting High-Density Byte Asymmetric 
Errors,” IEEE Commun. Lett., vol. 21, no. 4, pp. 694-697, Apr. 2017. 
[16] A. Radonjic and V. Vujicic, “Integer Codes Correcting Single Errors and Burst Asymmetric 
Errors within a Byte,” Inform. Process. Lett., vol. 121, pp. 45-50, May 2017. 
[17] A. Radonjic, “(Perfect) Integer Codes Correcting Single Errors,” IEEE Commun. Lett., vol. 
22, no. 1, pp. 17-20, Jan. 2018. 
[18] A. Radonjic and V. Vujicic, “Integer Codes Correcting Burst and Random Asymmetric 
Errors within a Byte,” J. Franklin Inst., vol. 355, no. 2, pp. 981-996, Jan. 2018. 
[19] K. Mehlhorn and P. Sanders, Algorithms and Data Structures: The Basic Toolbox, 
Springer, 2008. 
[20] A. Fog, “The Microarchitecture of Intel, AMD and VIA CPUs,” Tech. Univ. Denmark, 
Denmark, Jan. 2016. 
