Single Bit Error Correction Implementation in CRC-16 on FPGA by Shukla, Sunil & Bergmann, Neil
SINGLE BIT ERROR CORRECTION IMPLEMENTATION IN CRC-16 
ON FPGA 
Sunil Shukla, Neil W. Bergmann 
School of ITEE, The University of Queensland, Australia 
{sunil, bergmann}@itee.uq.edu.au 
Abstract 
Framing protocols employ cyclic redundancy
check (CRC) to detect errors incurred during
transmission. Generally the whole frame is protected
using CRC and, upon detection of an error,
retransmission is requested. But certain protocols
demand single bit error correction capabilities
for the header part of the frame, which often plays an
important role in receiver synchronization. At a
speed of 10 Gbps, header error correction
implementation in hardware can be a bottleneck.
This paper presents a hardware-efficient way of
implementing CRC-16 over 16 bits of data, multiple
bit error detection and single bit error correction on
an FPGA device.
1. Introduction 
The Internet is growing rapidly in terms of
number of users and amount of bandwidth used.
Besides the transmission and switching speeds, the
per-packet operations necessary for Internet Protocol
(IP) packet forwarding are the current limiting
factors. As transmission speeds are continually
increasing, IP packet processing overheads have
become the main bottleneck [5]. Often, IP packets
are encapsulated in frames protected by a cyclic
redundancy check (CRC) code. CRC is the most
preferred method of encoding because it provides
very efficient protection against commonly occurring
burst errors. CRCs can detect all one-bit and two-bit
errors as well as all errors with an odd number of
bits [2]. The most commonly used framing techniques
are PPP, HDLC and GFP. Generic Framing Procedure
(GFP) is a recently proposed technique for framing.
The advantage of this technique is that it does not
use any special code to indicate the beginning and
end of a frame. Frame delineation is based on the
packet length that is transmitted at the beginning of
each frame. The 16-bit packet length is protected by
CRC-16 and transmitted as the core header. Single bit
error correction capability is required from the receiver.
Besides the packet length, a GFP frame also has a type
header following the core header, which is also protected
by CRC-16, and the receiver is expected to correct
single bit errors in the type header as well. This can be a
bottleneck at a speed of 10 Gbps if the core is
implemented on FPGA. A lot of work has been
reported on hardware implementation of error
detection using CRC, but there is no published
method for error correction in CRC in hardware. In
this paper we propose a technique for CRC-16
error detection and single bit error correction which
is hardware optimized and works at a relatively high
frequency. This paper focuses on the implementation of
this method on FPGA. We target FPGA
because timing issues arise more often in FPGA and
this technique utilizes the large resources available in
FPGA as Block RAM. The focus of the paper is on
single bit error correction for header bits protected
by CRC-16. The paper is organized into two parts,
viz. CRC-16 implementation in hardware and the
proposed technique for CRC error detection and
single bit error correction.
0-7803-8652-3/04/$20.00 © 2004 IEEE
2. CRC-16 Implementation in Hardware 
The generator polynomial used for CRC-16
calculation is x^16 + x^12 + x^5 + 1 in the X.25 standard and
x^16 + x^[…] + x^[…] + 1 in CCITT. In this paper, we will
be referring to the polynomial defined in the X.25
standard, but the results can be extended to any 16-bit
generator polynomial with slight modification. CRC
can be implemented in hardware via techniques such
as serial implementation, parallel implementation or
look-up-table-based implementation. The look-up table
approach involves storing CRC values for all
possible input combinations. Thus for 16-bit input
data, we need to store 2^16 (65536) values of 16 bits
each, i.e. a storage space of 1M bits. Serial implementation
uses Linear Feedback Shift Registers (LFSR) in hardware.
In an LFSR, division is performed by left shifting and
subtraction by the XOR operation. Serial
implementation is hardware efficient but is not
feasible at higher frequencies. In the case of parallel
implementation, the division process is reducible to a
set of equations involving the XOR operation. Parallel
implementation of CRC is fast because it involves
two levels of logic. The optimized equations of the
resulting checksum in CRC-16 are summarized in Fig. 1.
ICFPT 2004
C(15) = E(11) ⊕ E(10) ⊕ E(7) ⊕ E(3)
C(14) = E(10) ⊕ E(9) ⊕ E(6) ⊕ E(2)
C(13) = E(9) ⊕ E(8) ⊕ E(5) ⊕ E(1)
C(12) = E(15) ⊕ E(8) ⊕ E(7) ⊕ E(4) ⊕ E(0)
C(11) = E(15) ⊕ E(14) ⊕ E(11) ⊕ E(10) ⊕ E(6)
C(10) = E(14) ⊕ E(13) ⊕ E(10) ⊕ E(9) ⊕ E(5)
C(9)  = E(15) ⊕ E(13) ⊕ E(12) ⊕ E(9) ⊕ E(8) ⊕ E(4)
C(8)  = E(15) ⊕ E(14) ⊕ E(12) ⊕ E(11) ⊕ E(8) ⊕ E(7) ⊕ E(3)
C(7)  = E(15) ⊕ E(14) ⊕ E(13) ⊕ E(11) ⊕ E(10) ⊕ E(7) ⊕ E(6) ⊕ E(2)
C(6)  = E(14) ⊕ E(13) ⊕ E(12) ⊕ E(10) ⊕ E(9) ⊕ E(6) ⊕ E(5) ⊕ E(1)
C(5)  = E(13) ⊕ E(12) ⊕ E(11) ⊕ E(9) ⊕ E(8) ⊕ E(5) ⊕ E(4) ⊕ E(0)
C(4)  = E(15) ⊕ E(12) ⊕ E(8) ⊕ E(4)
C(3)  = E(15) ⊕ E(14) ⊕ E(11) ⊕ E(7) ⊕ E(3)
C(2)  = E(14) ⊕ E(13) ⊕ E(10) ⊕ E(6) ⊕ E(2)
C(1)  = E(13) ⊕ E(12) ⊕ E(9) ⊕ E(5) ⊕ E(1)
C(0)  = E(12) ⊕ E(11) ⊕ E(8) ⊕ E(4) ⊕ E(0)
Figure 1 CRC-16 Equations
where
E(i) = D(i) XOR C_prev(i),
'⊕' indicates the XOR operator,
D(i) is the i-th bit of input data,
C_prev(i) is the i-th bit of the previous CRC result. In our
case, since the data width is 16 bits, C_prev refers to the
initial state of the CRC, which may be either all zeros
or all ones.
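
The equations of Figure 1 can be cross-checked against a bit-serial (LFSR-style) computation. The Python sketch below is only an illustration and is not the VHDL used in this work; the names `crc16_serial`, `crc16_parallel` and `TAPS` are hypothetical, and MSB-first processing of the 16-bit word is assumed.

```python
POLY = 0x1021  # x^16 + x^12 + x^5 + 1 (the x^16 term is implicit)

def crc16_serial(data, c_prev=0x0000):
    """Bit-serial CRC-16 over one 16-bit word, MSB first."""
    crc = c_prev
    for i in range(15, -1, -1):
        feedback = ((crc >> 15) ^ (data >> i)) & 1
        crc = (crc << 1) & 0xFFFF      # left shift: one division step
        if feedback:
            crc ^= POLY                # subtraction is XOR
    return crc

# TAPS[k] lists the indices i of the E(i) terms XORed to form C(k) in Figure 1.
TAPS = {
    15: (11, 10, 7, 3),
    14: (10, 9, 6, 2),
    13: (9, 8, 5, 1),
    12: (15, 8, 7, 4, 0),
    11: (15, 14, 11, 10, 6),
    10: (14, 13, 10, 9, 5),
    9:  (15, 13, 12, 9, 8, 4),
    8:  (15, 14, 12, 11, 8, 7, 3),
    7:  (15, 14, 13, 11, 10, 7, 6, 2),
    6:  (14, 13, 12, 10, 9, 6, 5, 1),
    5:  (13, 12, 11, 9, 8, 5, 4, 0),
    4:  (15, 12, 8, 4),
    3:  (15, 14, 11, 7, 3),
    2:  (14, 13, 10, 6, 2),
    1:  (13, 12, 9, 5, 1),
    0:  (12, 11, 8, 4, 0),
}

def crc16_parallel(data, c_prev=0x0000):
    """Evaluate the sixteen XOR equations of Figure 1 in one step."""
    e = data ^ c_prev                  # E(i) = D(i) XOR C_prev(i)
    crc = 0
    for k, taps in TAPS.items():
        bit = 0
        for i in taps:
            bit ^= (e >> i) & 1
        crc |= bit << k
    return crc

if __name__ == "__main__":
    for c_prev in (0x0000, 0xFFFF):    # all-zeros and all-ones initial state
        assert all(crc16_serial(d, c_prev) == crc16_parallel(d, c_prev)
                   for d in range(0, 0x10000, 251))
    print(hex(crc16_parallel(0x0001)))  # 0x1021
```

With data bit 0 set and an all-zeros initial state, the result 0x1021 has bits 0, 5 and 12 set, the three equations in which E(0) appears.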
3. Proposed method for CRC-16 Error 
Detection and Correction 
In this paper, we present a unique
way of implementing multiple bit error detection and
single bit error correction using CRC for a data width
of 16 bits. Let F_tr be the transmitted frame, in which
the checksum is appended after 16 bits of data. We can
express F_tr as
F_tr = D_tr & C_tr
where
& - concatenation operator
D_tr - transmitted 16-bit data
C_tr - transmitted 16-bit checksum
At the receiver side, let F_re be the received frame
such that
F_re = D_re & C_re
where C_re indicates the received checksum and D_re
represents the received data. The receiver again calculates
the CRC on the received data. Let C_cal indicate the CRC
calculated over D_re at the receiver side. If no error
has occurred during transmission then C_re and C_cal
are equal. But if some bit(s) are in error, then C_re and
C_cal will mismatch. Here we are concerned with
just single bit errors. There can be two cases: the
single bit error can be either in the data, D_re, or in the
checksum, C_re. If the single bit error is in one
of the checksum bits, then we need only detect it.
So the real concern is to correct the data when one of
the data bits is in error. If we refer to Fig. 1, we
see that the checksum calculation involves XOR
operations on a combination of data bits. If a single bit
of data flips, then all the checksum bits in which that
data bit is used will be inverted. For example, data
bit 0 is used in checksum bits 0, 5 and 12. So if there
is an error in data bit 0, then the calculated checksum
and the received checksum will differ in positions 0, 5
and 12. Let C_xorpattern = C_cal XOR C_re. If we consider
that only one bit in data is in error then we will have
16 unique patterns for C_xorpattern, each corresponding
to an individual data bit in error. We have written a C
program and found the patterns for the XOR result.
Table 1. XOR pattern for data bit in error
| Data Bit |
If there is a single bit error in a checksum bit, then we
will obtain the following XOR patterns.
Table 2. XOR pattern for checksum bit in error
| MSB 8 | LSB 8 |
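
The XOR patterns for single bit errors follow directly from the generator polynomial, so they can be regenerated with a few lines of code. The sketch below is in Python rather than the C program mentioned above, and is only an illustration: the helper names `crc16` and `xor_pattern` and the example data word are assumptions, and an all-zeros initial CRC state is assumed.

```python
def crc16(data, c_prev=0x0000):
    """CRC-16 (x^16 + x^12 + x^5 + 1) over one 16-bit word."""
    reg = data ^ c_prev
    for _ in range(16):
        reg <<= 1
        if reg & 0x10000:
            reg ^= 0x11021             # clear bit 16 and subtract the polynomial
    return reg

def xor_pattern(d_re, c_re):
    """C_xorpattern = C_cal XOR C_re for a received frame."""
    return crc16(d_re) ^ c_re

d_tr = 0xA5C3                          # arbitrary example data word
c_tr = crc16(d_tr)

# Flip one data bit at a time, then one checksum bit at a time.
data_patterns = [xor_pattern(d_tr ^ (1 << k), c_tr) for k in range(16)]
crc_patterns = [xor_pattern(d_tr, c_tr ^ (1 << k)) for k in range(16)]

if __name__ == "__main__":
    for k, p in enumerate(data_patterns):
        print(f"data bit {k:2d}: MSB 8 = {p >> 8:08b}  LSB 8 = {p & 0xFF:08b}")
    for k, p in enumerate(crc_patterns):
        print(f"CRC bit  {k:2d}: MSB 8 = {p >> 8:08b}  LSB 8 = {p & 0xFF:08b}")
    # All 32 single bit error patterns are distinct.
    assert len(set(data_patterns + crc_patterns)) == 32
```

Because the CRC is linear, the patterns do not depend on the chosen data word; a checksum bit in error simply produces a pattern with that one bit set.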
The XOR patterns are unique for a single bit error
occurring anywhere, either in data or in checksum.
For correction purposes, we just have to find the
bit in error. If that bit is a CRC bit we need not do
anything, but if it is a data bit then we need to flip
it. Once the bit in error is known, we obtain
a bit sequence with which the received data
is XORed. For example, if bit two is in error, then the bit
sequence is "0000000000000100". This pattern is
XORed with the received data, which simply flips
bit two. We have stored these bit sequences in
memory.
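
The correction step can be sketched in software before any memory optimization is applied. The Python below is a minimal illustration, assuming a plain dictionary keyed on the full 16-bit XOR pattern instead of the Block RAM organization of the next section; `crc16`, `XOR_SEQ` and `correct` are hypothetical names.

```python
def crc16(data, c_prev=0x0000):
    """CRC-16 (x^16 + x^12 + x^5 + 1) over one 16-bit word."""
    reg = data ^ c_prev
    for _ in range(16):
        reg <<= 1
        if reg & 0x10000:
            reg ^= 0x11021
    return reg

# XOR pattern -> 16-bit "XORing bit sequence".
# Data bit k in error: only bit k is set, so the XOR flips that bit back.
XOR_SEQ = {crc16(1 << k): 1 << k for k in range(16)}
# Checksum bit in error: the data is already correct, so the sequence is zero.
XOR_SEQ.update({1 << k: 0 for k in range(16)})

def correct(d_re, c_re):
    """Return the corrected data, or None for a multiple bit error."""
    pattern = crc16(d_re) ^ c_re
    if pattern == 0:
        return d_re                    # no error
    seq = XOR_SEQ.get(pattern)
    if seq is None:
        return None                    # not a single bit error pattern
    return d_re ^ seq

if __name__ == "__main__":
    d_tr = 0x1234
    c_tr = crc16(d_tr)
    print(hex(correct(d_tr ^ 0b100, c_tr)))   # bit two flipped back: 0x1234
    print(correct(d_tr ^ 0x0003, c_tr))       # two data bits in error: None
```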
4. Memory Design Considerations 
In this section, memory design parameters and
programming are discussed in detail. FPGAs have
abundant memory available in the form of Block
RAM. We will show in this section that one Block
RAM is sufficient for the whole processing. In fact, the two
ports of a single Block RAM can serve
two CRC correction engines simultaneously, as the
Block RAM is used as ROM with the configuration
parameters initialized during its generation using
Xilinx CORE Generator.
4.1. Memory Addressing
The memory is accessed using the XOR
patterns. Each 16-bit XOR pattern is unique among
the 32 cases. But using 16 bits of addressing implies
65536 locations, which is not desirable. To minimize
the number of locations required, the XOR pattern is
divided into two parts of 8 bits each. If we observe
the XOR patterns in Table 1, the lower 8 bits have a
repetitive pattern; hence the upper 8 bits of the XOR
pattern, which have no repeated value, are used for
accessing memory. If we observe the XOR patterns in
Table 2, for half of the cases the upper 8 bits are zero
and for the remaining cases the lower 8 bits are zero.
Thus we can't use the upper 8 bits for addressing
memory for the first 8 cases. In those cases, the lower 8
bits are used for addressing memory. Thus there is a
muxing of the address: whenever the upper 8 bits of the
XOR pattern are zero, the lower 8 bits are used for
accessing memory. In this way a maximum of 256
locations are required. If we observe the MSB 8-bit
patterns for data and CRC, we find that not all
the 32 patterns are unique. There are overlapping
patterns of 16, 32 and 64. These patterns will come
as address bits when there is a single bit error in data at
bit position 0, 1 and 2 respectively, and will also
appear when there is a single bit error in CRC at bit
position 4, 5 and 6 respectively and again at 12, 13 and
14 respectively. These cases can be distinguished
easily. If the lower or upper 8 bits of the XOR pattern
are all zeros, then it is a probable case of single bit
error in CRC; else it is a probable case of single bit
error in data. We have protection for such
overlapping cases in our data structure.
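
The address muxing and the overlapping addresses can be checked with a short Python sketch. It is illustrative only; `crc16` and `rom_address` are hypothetical names, and bit positions are counted from 0 at the LSB.

```python
def crc16(data, c_prev=0x0000):
    """CRC-16 (x^16 + x^12 + x^5 + 1) over one 16-bit word."""
    reg = data ^ c_prev
    for _ in range(16):
        reg <<= 1
        if reg & 0x10000:
            reg ^= 0x11021
    return reg

def rom_address(pattern):
    """Upper 8 bits address the memory unless they are zero; then the lower 8 bits do."""
    upper, lower = pattern >> 8, pattern & 0xFF
    return upper if upper != 0 else lower

# Address produced by a single bit error in data bit k / checksum bit k.
data_addr = [rom_address(crc16(1 << k)) for k in range(16)]
crc_addr = [rom_address(1 << k) for k in range(16)]
overlap = sorted(set(data_addr) & set(crc_addr))

if __name__ == "__main__":
    print("data addresses:", sorted(data_addr))
    print("CRC addresses: ", sorted(set(crc_addr)))
    for a in overlap:
        crc_bits = [k for k in range(16) if crc_addr[k] == a]
        print(f"address {a}: data bit {data_addr.index(a)}, CRC bits {crc_bits}")
```

Each of the eight CRC addresses is shared by two checksum bits (k and k + 8), which is why the untouched half of the XOR pattern still has to be compared after the lookup.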
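4.2. Memory Data Structure
The data structure for memory is shown in Fig. 2.
| No Error (1) | CRC bit error (1) | Data bit error (1) | XORing bit seq. (16) | Match Pattern (8) |
Figure 2 Memory Data Structure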
The lower 8 bits of the XOR pattern are stored in the
memory location as "Match Pattern" for addresses 3,
6, 13, 16, 18, 27, 32, 36, 51, 64, 72, 102, 129, 137,
145 and 204. For addresses 1, 2, 4, 8 and 128, which
represent the 8-bit XOR pattern in the case of a CRC bit in
error (excluding 16, 32 and 64), "Match Pattern" is all
zeros. The 16-bit "XORing bit sequence" is stored with
only the bit that corresponds to the bit
position in error set to '1' and all other bits set
to '0'. For example, for location 72, which indicates a
probable case of data bit error at bit position 6, the match
pattern will be "11000100" and the XORing bit
sequence will be "0000000001000000". Two
additional bits, "CRC bit error" and "Data bit error",
are used to indicate whether the stored information
relates to a single bit error in data or in CRC. For
locations 3, 6, 13, 16, 18, 27, 32, 36, 51, 64, 72, 102,
129, 137, 145 and 204 "Data bit error" is set to '1'
and at all other locations it is set to '0'. For locations
1, 2, 4, 8, 16, 32, 64 and 128 "CRC bit error" is set to
'1' and at all other locations it is set to '0'. At
location 0, the "No error" bit is set to '1' and at all other
locations it is set to '0'. Thus the last three bits in
memory are used to decide the type of error. The
data width of memory will be 27 bits (bits 26 down to 0).
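
The memory contents described above can be generated programmatically. The following Python sketch is an illustration under stated assumptions, not the CORE Generator initialization actually used: the word layout (bit 26 "No error", bit 25 "CRC bit error", bit 24 "Data bit error", bits 23:8 "XORing bit sequence", bits 7:0 "Match Pattern") is inferred from Figure 2 and the bit numbers used in Section 5, and `build_rom` and the constant names are hypothetical.

```python
NO_ERR = 1 << 26      # "No error" flag
CRC_ERR = 1 << 25     # "CRC bit error" flag
DATA_ERR = 1 << 24    # "Data bit error" flag

def crc16(data, c_prev=0x0000):
    """CRC-16 (x^16 + x^12 + x^5 + 1) over one 16-bit word."""
    reg = data ^ c_prev
    for _ in range(16):
        reg <<= 1
        if reg & 0x10000:
            reg ^= 0x11021
    return reg

def build_rom():
    """Return the 256 x 27-bit ROM image as a list of integers."""
    rom = [0] * 256
    rom[0] = NO_ERR
    for k in range(16):
        pattern = crc16(1 << k)            # XOR pattern for data bit k in error
        addr, match = pattern >> 8, pattern & 0xFF
        rom[addr] |= DATA_ERR | ((1 << k) << 8) | match
    for addr in (1, 2, 4, 8, 16, 32, 64, 128):
        rom[addr] |= CRC_ERR               # match pattern stays zero unless shared
    return rom

if __name__ == "__main__":
    rom = build_rom()
    word = rom[72]
    print(f"match pattern   : {word & 0xFF:08b}")           # 11000100
    print(f"XORing bit seq. : {(word >> 8) & 0xFFFF:016b}")  # 0000000001000000
    print(f"flags (26..24)  : {word >> 24:03b}")             # 001
```

Locations that are never written keep all three flags at '0', which the decision logic of Section 5 treats as a multiple bit error.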
5. Error Handling 
In this section we discuss how single and
multiple bit errors are detected and handled. The
decision is based upon the following algorithm.
BEGIN
  -- bit-26: "No error", bit-25: "CRC bit error", bit-24: "Data bit error"
  Reset flag no-error
  Reset flag single-bit-error
  Reset flag single-bit-crc-error
  Reset flag single-bit-data-error
  Reset flag multiple-bit-error
  IF bit-26 = '1' THEN
    Set flag no-error
    Transmit received data
  ELSIF bit-25 = '1' AND bit-24 = '0' THEN
    IF Cxorpattern(7:0) = "00000000" THEN
      Set flag single-bit-error
      Set flag single-bit-crc-error
      Transmit received data
    ELSE
      Set flag multiple-bit-error
      Transmit received data
    END IF
  ELSIF bit-25 = '0' AND bit-24 = '1' THEN
    IF Cxorpattern(7:0) = "Match Pattern" THEN
      Set flag single-bit-error
      Set flag single-bit-data-error
      XOR received data with 16 bit "XORing bit sequence"
    ELSE
      Set flag multiple-bit-error
      Transmit received data
    END IF
  ELSIF bit-25 = '1' AND bit-24 = '1' THEN
    -- overlapping addresses 16, 32 and 64
    IF Cxorpattern(7:0) = "00000000" THEN
      Set flag single-bit-error
      Set flag single-bit-crc-error
      Transmit received data
    ELSIF Cxorpattern(7:0) = "Match Pattern" THEN
      Set flag single-bit-error
      Set flag single-bit-data-error
      XOR received data with 16 bit "XORing bit sequence"
    ELSE
      Set flag multiple-bit-error
      Transmit received data
    END IF
  ELSE
    -- no flag set at this location
    Set flag multiple-bit-error
    Transmit received data
  END IF
END
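
The decision algorithm can be exercised end to end with a small software model. The Python sketch below is illustrative and rests on two assumptions: the 27-bit word layout inferred from Figure 2, and the reading that the byte of the XOR pattern not used as the address is the one compared with zero or with the match pattern. The pseudocode names Cxorpattern(7:0); when the lower byte is used as the address, the sketch compares the upper byte instead, following Section 4.1. All function and constant names are hypothetical.

```python
NO_ERR, CRC_ERR, DATA_ERR = 1 << 26, 1 << 25, 1 << 24

def crc16(data, c_prev=0x0000):
    """CRC-16 (x^16 + x^12 + x^5 + 1) over one 16-bit word."""
    reg = data ^ c_prev
    for _ in range(16):
        reg <<= 1
        if reg & 0x10000:
            reg ^= 0x11021
    return reg

def build_rom():
    """256 x 27-bit ROM: flags, XORing bit sequence, match pattern."""
    rom = [0] * 256
    rom[0] = NO_ERR
    for k in range(16):
        pattern = crc16(1 << k)
        rom[pattern >> 8] |= DATA_ERR | ((1 << k) << 8) | (pattern & 0xFF)
    for addr in (1, 2, 4, 8, 16, 32, 64, 128):
        rom[addr] |= CRC_ERR
    return rom

ROM = build_rom()

def handle(d_re, c_re):
    """Return (flag, data to transmit) for one received frame."""
    pattern = crc16(d_re) ^ c_re
    upper, lower = pattern >> 8, pattern & 0xFF
    # Address mux: upper byte unless it is zero; "other" is the byte left over.
    addr, other = (upper, lower) if upper else (lower, upper)
    word = ROM[addr]
    match, seq = word & 0xFF, (word >> 8) & 0xFFFF
    if word & NO_ERR:
        return "no-error", d_re
    if word & CRC_ERR and other == 0:
        return "single-bit-crc-error", d_re
    if word & DATA_ERR and other == match:
        return "single-bit-data-error", d_re ^ seq
    return "multiple-bit-error", d_re

if __name__ == "__main__":
    d_tr = 0xBEEF
    c_tr = crc16(d_tr)
    print(handle(d_tr, c_tr)[0])              # no-error
    print(handle(d_tr ^ (1 << 6), c_tr)[0])   # single-bit-data-error
    print(handle(d_tr, c_tr ^ (1 << 4))[0])   # single-bit-crc-error
    print(handle(d_tr ^ 0x0003, c_tr)[0])     # multiple-bit-error
```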
6. Hardware Implementation 
The algorithm has been implemented and
verified on a Xilinx Virtex-II FPGA device. The code
was written in VHDL and synthesized using
Leonardo Spectrum. The device used for
implementation is 2V40cs144 with speed grade 5
and wire load model xcv2-40-5-wc. The results
obtained are summarized in Table 3.
Table 3 Hardware Implementation Results
| Device                | Area      | Speed   |
| Virtex-II (2V40cs144) | 52 slices | 233 MHz |
For a GFP IP core giving a throughput of 10 Gbps,
there can be 9 such CRC single bit error correction
engines in the receiver, assuming a data interface of 64
bits. Hence the effect of area saving in the design of
this block is multiplied by 9.
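
The figures above can be put side by side with a little arithmetic. The Python sketch below is a back-of-the-envelope check only; it assumes that one 64-bit word is processed per clock cycle, and the variable names are hypothetical.

```python
LINE_RATE_BPS = 10e9        # 10 Gbps throughput
INTERFACE_BITS = 64         # data interface width
ENGINE_CLOCK_MHZ = 233      # reported speed, Table 3
SLICES_PER_ENGINE = 52      # reported area, Table 3
ENGINES = 9                 # engines in the receiver

# Clock rate needed to carry 10 Gbps over a 64-bit interface.
required_clock_mhz = LINE_RATE_BPS / INTERFACE_BITS / 1e6
# Area of all engines together.
total_slices = SLICES_PER_ENGINE * ENGINES
# Ratio of the reported engine speed to the required clock rate.
margin = ENGINE_CLOCK_MHZ / required_clock_mhz

if __name__ == "__main__":
    print(f"required clock: {required_clock_mhz:.2f} MHz")   # 156.25 MHz
    print(f"total slices:   {total_slices}")                 # 468
    print(f"speed margin:   {margin:.2f}x")
```

Under this assumption the reported 233 MHz is above the 156.25 MHz that a 64-bit interface needs at 10 Gbps.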
7. Conclusion
In this paper, we have described how single bit
error correction can be employed in the case of CRC-16
in a very efficient way on FPGA. This approach is
efficient both in terms of hardware and speed. The
additional hardware required is very simple. This
technique also works efficiently for ASIC designs.
8. References 
[1] Ross N. Williams, "Painless Guide to CRC Error
Detection Algorithms".
[2] Norman Matloff, "Cyclic Redundancy Checking".
[3] Adrian Simionescu, "Computing CRC in Parallel for
Ethernet".
[4] Giuseppe Campobello, Giuseppe Patane, Marco Russo,
"Parallel CRC Realization".
[5] Florian Braun, Marcel Waldvogel, "Fast Incremental
CRC Updates for IP over ATM Networks".
