Abstract-Increasing integration density allows designers to build very complex systems on a single chip. However, as integration approaches its limits, circuit reliability has emerged as a critical concern. The loss of reliability increases with process, voltage and temperature (PVT) variations. Faults can appear in circuits, affect the system behaviour and lead to a system failure. It is therefore increasingly important to build fault-tolerant, resilient systems. This paper¹ proposes a new fault-tolerant scheme, the Duplication with Syndrome based Correction (DSC) scheme. Two criteria are considered to evaluate the proposed scheme: the reliability (the probability that no error appears at the output of the architecture) and the hardware efficiency of the architecture. Results show that the DSC scheme reduces the complexity by 32% compared to the classical Triple Modular Redundancy (TMR) scheme, while maintaining a level of reliability close to that of TMR. The paper also presents an example signal processing application in which the DSC is used to protect the correlation function and the filters inside the tracking loops of a Global Positioning System (GPS) receiver.
I. INTRODUCTION
Due to the increasing demand for higher performance and functionality at reduced area and cost, transistors have been scaled down over the past four decades. This scaling has increased the number of transistors per unit area and improved the performance and power consumption of circuits. Today, chips employ billions of transistors, include multiple processor cores on a single silicon die, run at clock speeds measured in gigahertz, and deliver more than 4 million times the performance of the first chip [1]. Moreover, since power consumption is proportional to the square of the supply voltage $V_{dd}$, voltage scaling started in the late 1980s in order to reduce the power consumption of circuits. Over the last decades, $V_{dd}$ has been scaled from 5 V to 3.3 V and then to 2.5 V, and it is predicted to be reduced to 0.64 V in 2028 [2].
The increase of integration density with technology scaling has allowed designers to build very complex systems on a single chip, thereby reducing the cost of circuits in terms of area and power consumption and improving their speed and performance. However, as integration approaches its limits, circuit reliability has emerged as a critical concern [3]. A fault or a set of faults may affect the system behaviour if they are not masked, and they can cause a system failure. Therefore, it is extremely important to protect systems from the impact of faults in order to achieve acceptable reliability while maintaining low complexity, cost and power.
Fig. 1 : The TMR concept
¹ This work has received a French government support granted to the COMIN Labs excellence laboratory and managed by the National Research Agency in the "Investing for the Future" program under reference ANR-10-LABX-07-01. It has also received support from the Brittany Region.
Fault tolerance is the set of measures and techniques that aim to maintain the correct service delivered by a system even in the presence of errors due to PVT variations coupled with the continued advancement of CMOS technology. In the literature, a considerable number of architectures have been proposed to mask or mitigate errors in different contexts; what they have in common is that they all use redundancy to detect different types of errors. Redundancy takes two forms: spatial and temporal. Spatial redundancy refers to the replication of blocks, functions or data in a system. In temporal redundancy, the same operation is repeated multiple times, and the presence of errors in the system is detected by comparing the results obtained at different instants.
John von Neumann pioneered the idea of using redundancy to improve the reliability of systems in the 1950s [4]. The well-known Triple Modular Redundancy (TMR) then appeared as a similar approach with less complexity. In a TMR system, the original module is replicated three times, and error correction is achieved by a majority vote [5]. Fig. 1 illustrates the TMR concept. One of the reasons for its popularity is its high ability to protect circuits from all types of errors. However, it is a very costly technique, with an overhead of more than 200% compared to the original module. For extremely critical applications, such as space, avionics and healthcare applications, where the system cost is less important than its reliability, TMR can be used to protect circuits. Otherwise, there is a growing need for resilient techniques that consume less power while approaching the performance of TMR. In the context of signal processing applications, to reduce the area overhead, the authors of [6] propose to add only one additional module to the original module. The additional module gives a reduced-precision estimate of the output of the original function and consumes less power than the original. The final output is chosen between the output of the original module and the output of the reduced replica. This scheme is referred to as Algorithmic Noise Tolerance (ANT) and is shown in Fig. 2. Error correction codes have also been proposed to protect memories [7] and were later applied to interconnect networks [8]. All these methods are based on spatial redundancy. Architectures based on temporal redundancy usually use the double sampling technique to detect the presence of errors and re-compute operations as the recovery procedure [9].
Fig. 2 : The ANT scheme
978-1-5386-0446-5/17/$31.00 © 2017 IEEE
Temporal redundancy comes with a throughput penalty and is suitable only for applications that tolerate spending extra time to recompute operations. This paper presents a new resilient scheme, named the Duplication with Syndrome based Correction (DSC) scheme. The proposed scheme is based on the duplication of a module and on correction using a syndrome-based approach. The remainder of this paper is organized as follows. Sec. II introduces the DSC scheme in detail and describes its error correction capability. Sec. III presents an example application in which the scheme can be used, together with the evaluation methodology. Sec. IV provides the results of the comparison between the DSC and the classical TMR schemes.
II. PRINCIPLE OF THE DSC METHOD
Before describing the DSC method, let us define the minimal hypotheses on the targeted signal processing application that are required to apply it.
A. Definitions
Definition 1: Let us consider (G,+), a set G with an operation (+) that combines any two elements $x_1$ and $x_2$ to form another element denoted $x_1 + x_2$. (G,+) is a group if the following four properties are satisfied:
• Closure: For all $x_1$, $x_2$ in G, the result of the operation, $x_1 + x_2$, is also in G.
• Associativity: For all $x_1$, $x_2$ and $x_3$ in G, $(x_1 + x_2) + x_3 = x_1 + (x_2 + x_3)$.
• Identity element: There exists an element e in G such that, for every element $x_1$ in G, $e + x_1 = x_1 + e = x_1$. Such an element is unique.
• Inverse element: For each $x_1$ in G, there exists an element $x_2$ in G, commonly denoted $-x_1$, such that $x_1 + x_2 = x_2 + x_1 = e$, where e is the identity element.
Definition 2: Let us consider two groups (A,+) and (B,+). A group homomorphism is a function Ψ: (A,+) → (B,+) such that, for all $x_1$ and $x_2$ in A,
$\Psi(x_1 + x_2) = \Psi(x_1) + \Psi(x_2)$.
There are many examples of group homomorphisms in the context of signal processing applications, for example multiplication by a given value or polynomial, filtering (convolution), matrix product, etc. These operations can be performed in any set (real numbers, integers modulo P, Galois fields, etc.).
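As a quick numerical illustration of Definition 2, the following sketch checks the homomorphism property for two of the operations listed above: FIR filtering over the integers and multiplication by a constant modulo P. The filter taps, the modulus, the constant and all helper names are arbitrary choices made for this illustration only.

```python
import random


def convolve(x, h):
    """Full linear convolution of two integer sequences (FIR filtering)."""
    y = [0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def add(u, v):
    """Element-wise sum of two sequences of equal length."""
    return [a + b for a, b in zip(u, v)]


random.seed(0)
h = [1, -2, 3]  # hypothetical filter taps
x1 = [random.randint(-50, 50) for _ in range(8)]
x2 = [random.randint(-50, 50) for _ in range(8)]

# Filtering: F(x1 + x2) must equal F(x1) + F(x2).
filter_is_homomorphism = (
    convolve(add(x1, x2), h) == add(convolve(x1, h), convolve(x2, h))
)

# Multiplication by a constant in the integers modulo P.
P, a = 17, 5  # hypothetical modulus and constant


def mod_mul(v):
    return (a * v) % P


u, v = 11, 9
modular_is_homomorphism = (
    mod_mul((u + v) % P) == (mod_mul(u) + mod_mul(v)) % P
)
```

Both flags evaluate to true because the operations are linear with respect to the group addition, which is the only property the DSC scheme relies on.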
B. Three Duplication with Syndrome based Correction scheme (3-DSC)
In the literature, all fault-tolerant architectures propose to add extra redundancy in order to detect the occurrence of faults. However, today, many operations and functions exist in several copies inside the same design. Let us assume that a design contains three identical group homomorphisms F, processing in parallel three independent inputs, denoted $x_1$, $x_2$ and $x_3$, to generate $y_1 = F(x_1)$, $y_2 = F(x_2)$ and $y_3 = F(x_3)$ (see Fig. 3a). The classical approach is to triplicate each operation independently and to add a majority voter to mask errors, as shown in Fig. 3b. The design, which initially contained three group homomorphisms, then contains nine group homomorphisms and three voters. In this paper, we propose a resilient scheme that contains only six group homomorphisms F and a syndrome-based corrector, instead of three voters, to mask errors. The proposed scheme is composed of the three original group homomorphisms and three additional redundant group homomorphisms F computing $z_4$, $z_5$ and $z_6$, as shown in Fig. 4. In this figure, $y_k = F(x_k)$, for k = 1, 2 and 3, are replaced by $z_k = F(x_k)$, for k = 1, 2 and 3, since the final $y_k$ values are estimated after the correction mechanism. Owing to the group homomorphism structure of the function, in the case of error-free computation, the three equations (1) are all verified.
Note that, in the case of computation with rounding and/or saturation noise, the strict equality can be relaxed to a distance compatible with the computation noise. The status of these three equalities can be represented by a triplet of Boolean values, or syndromes, ($s_1$ $s_2$ $s_3$), with $s_1$ (respectively $s_2$ and $s_3$) equal to 1 if the first (respectively second and third) equation is not fulfilled.
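One set of check equations consistent with the syndrome patterns discussed in this section can be written as follows. The assumption made here is that the three redundant modules operate on the cyclic pairwise differences of the inputs; only the first relation, $z_4 = z_1 - z_2$, is stated explicitly later in this section, and the signs of the other two are a plausible choice rather than a confirmed one.

```latex
\begin{align*}
z_4 &= F(x_1 - x_2) = z_1 - z_2, & s_1 &= 0 \text{ if this holds, } 1 \text{ otherwise},\\
z_5 &= F(x_2 - x_3) = z_2 - z_3, & s_2 &= 0 \text{ if this holds, } 1 \text{ otherwise},\\
z_6 &= F(x_3 - x_1) = z_3 - z_1, & s_3 &= 0 \text{ if this holds, } 1 \text{ otherwise}.
\end{align*}
```

With this form, a fault in $z_1$ violates the first and third relations (syndrome 101) and a fault in $z_3$ violates the second and third (syndrome 011), which matches the cases discussed in the text.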
In a more robust version, it is also possible to further consider Z r , the parity check sum, defined as
In the case of no error, the syndromes are all equal to zero and the values of $z_1$, $z_2$ and $z_3$ are simply copied to the outputs $y_1$, $y_2$ and $y_3$.
If a single fault occurs in one of the six modules, the value of the syndromes allows the error to be detected and corrected. Let us consider, for example, that an error occurs in the computation of $F(x_1)$ and that the first operator outputs $\tilde{z}_1$, with $\tilde{z}_1 \neq z_1$. Replacing $z_1$ by the erroneous value $\tilde{z}_1$ in (1) violates the first and the last equations of (1), so that the three syndromes are equal to ($s_1$ $s_2$ $s_3$) = (101). In this case, it is still possible to recover the exact value $y_1$ by processing $y_1 = z_4 + z_2$. This example can be generalised to any single error, as shown in Table I. Finally, if more than one error occurs, in most cases the three equations will not be fulfilled. In that case, an error is detected and other correction/mitigation mechanisms at a higher level may be activated, but those mechanisms are out of the scope of this paper². Nevertheless, it should be noted that a double error event of the same amplitude may not be detected. For example, if the function F is applied to integers and both $z_1$ and $z_2$ are affected by an additive noise of the same amplitude n, i.e. $\tilde{z}_1 = z_1 + n$ and $\tilde{z}_2 = z_2 + n$, then the equation $z_4 = \tilde{z}_1 - \tilde{z}_2$ remains satisfied. The syndrome is thus (011), which implies the erroneous correction of $z_3$. Note that, in the case of such a double event of the same amplitude, a TMR system will also output a wrong value.
Table I : Correction for each single faulty variable
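The detection and correction steps described above can be sketched as follows. This is a minimal behavioural illustration, not a description of the hardware corrector. It assumes that the redundant modules compute F on the cyclic differences $x_1 - x_2$, $x_2 - x_3$ and $x_3 - x_1$ (only $z_4 = z_1 - z_2$ is stated explicitly in the text). The function name `three_dsc`, the `faults` argument used to inject additive errors and the example function F are hypothetical.

```python
def three_dsc(x, F, faults=None):
    """Behavioural sketch of the 3-DSC scheme.

    x      : tuple (x1, x2, x3) of inputs
    F      : group homomorphism applied by every module
    faults : optional dict {module index 1..6: additive error}
    Returns (outputs (y1, y2, y3), syndrome (s1, s2, s3)).
    """
    x1, x2, x3 = x
    # z[1..3]: original modules; z[4..6]: assumed redundant modules.
    z = [None, F(x1), F(x2), F(x3), F(x1 - x2), F(x2 - x3), F(x3 - x1)]
    for k, err in (faults or {}).items():
        z[k] += err  # inject a fault at the output of module k

    # A syndrome bit is 1 when the corresponding check equation fails.
    s = (
        int(z[4] != z[1] - z[2]),
        int(z[5] != z[2] - z[3]),
        int(z[6] != z[3] - z[1]),
    )

    y = [z[1], z[2], z[3]]
    if s == (1, 0, 1):    # z1 is faulty: rebuild it from z4 and z2
        y[0] = z[4] + z[2]
    elif s == (1, 1, 0):  # z2 is faulty: rebuild it from z5 and z3
        y[1] = z[5] + z[3]
    elif s == (0, 1, 1):  # z3 is faulty: rebuild it from z6 and z1
        y[2] = z[6] + z[1]
    # (100), (010), (001): a redundant module is faulty, outputs are intact.
    # (111): more than one error, detected but not corrected in this sketch.
    return tuple(y), s


def F(v):
    return 7 * v  # hypothetical homomorphism: multiplication by a constant


inputs = (3, -5, 11)
expected = tuple(F(v) for v in inputs)

# A single fault in any of the six modules is corrected.
single_fault_ok = all(
    three_dsc(inputs, F, {k: 4})[0] == expected for k in range(1, 7)
)

# Two faults of the same amplitude on z1 and z2 give syndrome (011)
# and trigger a wrong correction of z3, as noted in the text.
y_double, s_double = three_dsc(inputs, F, {1: 4, 2: 4})
```

The double-fault case at the end reproduces the undetected event discussed above: the first check equation still holds, so the corrector wrongly blames $z_3$.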
To conclude this section, we should mention that the ANT technique presented in [6] can be adapted to the proposed scheme. Instead of computing $z_4$, $z_5$ and $z_6$ in full precision, it is possible to compute them in reduced precision to save area and power. The drawback is that, in the case of an error occurring in $z_1$ for example, the reconstructed value $y_1 = z_4 + z_2$ will have a degraded precision due to the reduced precision of $z_4$.
C. Generalisation: N-DSC scheme
In this section, a resilient scheme for a design with N group homomorphisms is proposed as an extension of the 3-DSC (N > 3). The extension is straightforward: to the N functions $z_q = F(x_q)$, q = 1 . . . N, N redundant functions $z_{N+q}$, q = 1 . . . N, are added.
² In [10], in case of error, the last correct value is given to a feedback filter in order to limit the propagation of the error, for example.
Similarly to the 3-DSC case, in the case of no error, the following N equations are fulfilled,
N syndromes $s_q$, q = 1 . . . N, can be defined, with $s_q = 0$ if the q-th equation is fulfilled and $s_q = 1$ otherwise.
From the local syndromes $s_q$ and $s_{q+1}$, it is possible to determine whether the computation of $z_q$ is correct. Indeed, if, due to an error, $\tilde{z}_q$ is output instead of $z_q$, both syndromes $s_q$ and $s_{q+1}$ are equal to 1. In that case, the correct value of $y_q$ can be estimated as $y_q = z_{N+q} + z_{q-1}$. The hardware required to perform these operations is shown in Fig. 5.
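The local correction rule can be sketched in a few lines. The sketch assumes that the q-th check equation is $z_{N+q} = z_q - z_{q-1}$ with cyclic indexing, which is the form implied by the correction $y_q = z_{N+q} + z_{q-1}$ and by the pair of syndromes $s_q$ and $s_{q+1}$ raised by a fault in $z_q$. The function name `n_dsc`, the fault-injection argument and the example values are hypothetical, and indices are 0-based in the code.

```python
def n_dsc(x, F, faults=None):
    """Behavioural sketch of the N-DSC scheme (0-based indices).

    x      : list of N inputs
    F      : group homomorphism applied by every module
    faults : optional dict {module index 0..2N-1: additive error};
             indices 0..N-1 address z_q, N..2N-1 the redundant modules.
    Returns (outputs y, syndromes s).
    """
    n = len(x)
    z = [F(v) for v in x]
    # Assumed redundant modules; x[q - 1] wraps around to x[N-1] for q = 0.
    r = [F(x[q] - x[q - 1]) for q in range(n)]
    for k, err in (faults or {}).items():
        if k < n:
            z[k] += err
        else:
            r[k - n] += err

    # s[q] = 1 when the q-th check equation is violated.
    s = [int(r[q] != z[q] - z[q - 1]) for q in range(n)]

    y = list(z)
    for q in range(n):
        # z_q is declared faulty when both of its local syndromes are raised.
        if s[q] and s[(q + 1) % n]:
            y[q] = r[q] + z[q - 1]
    return y, s


def F(v):
    return 3 * v  # hypothetical homomorphism


x5 = [4, -1, 7, 0, 9]
expected5 = [F(v) for v in x5]

# Any single fault among the 2N modules leaves the outputs correct.
single_fault_ok = all(
    n_dsc(x5, F, {k: 5})[0] == expected5 for k in range(2 * len(x5))
)

# Two faults whose local syndromes do not interact are both corrected.
x8 = list(range(8))
y8, s8 = n_dsc(x8, F, {0: 5, 4: -2})
```

A fault in a redundant module raises a single syndrome, so no output is modified, while faults whose syndrome pairs touch or overlap fall outside what this simple rule can resolve.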
To summarise, any single error in the N-DSC module is detected and corrected. Two errors that appear in the N-DSC module can be corrected if and only if their corresponding local syndromes do not interact. However, two errors that appear in two adjacent redundant modules $z_{N+q}$ and $z_{N+q-1}$ lead to a wrong estimation of $y_q$. Indeed, if $z_{q-1}$, $z_q$ and $z_{q+1}$ are correct while $\tilde{z}_{N+q-1}$ and $\tilde{z}_{N+q}$ are faulty, the corresponding syndromes are $s_{q-1} = 1$ and $s_q = 1$, and thus, according to the correction mechanism, $y_q$ will be estimated as $y_q = \tilde{z}_{N+q} + z_{q-1}$. This type of error can result in a non-correctable error, which may be problematic if the result of function F is used to feed a feedback loop. In this case, we propose to split the original design into blocks of 3, 4 or 5 functions F and to use the 3-DSC, 4-DSC and 5-DSC schemes to protect each block.
A. Introduction to GPS
GPS is a well-known technology that allows both the physical position and the absolute time of a receiver to be determined. The position in time and in space is determined thanks to precise distance measurements with at least four GPS satellites. Each GPS satellite transmits a navigation message at 50 bits/s using CDMA (Code Division Multiple Access) technology. The analytical expression of the signal transmitted by a satellite a is:
where:
• $d_a(t)$: navigation message of the a-th satellite,
• $c_a(t)$: a-th Coarse/Acquisition (C/A) satellite code with a Binary Phase Shift Keying (BPSK) modulation (i.e. $c_a \in \{-1, 1\}$),
• $f_{L1}$: the carrier frequency in the L1 GPS band (Open Service).
A GPS receiver has to demodulate the navigation messages of the different satellites in view in order to perform the distance measurements. This involves two essential and sequential processes: acquisition and tracking. Acquisition is the process by which the receiver identifies which satellites are in view. It is a three-dimensional search to determine the GPS satellite identifier (which is the index of its associated C/A code), the code phase (represented by τ), and the carrier frequency offset due to the Doppler effect (represented by $f_d$).
Since satellites are in continuous motion, the distance between any satellite and the receiver changes over time. In addition, the carrier frequency of the received signal also changes constantly due to Doppler shifts. Therefore, once acquired, GPS signals have to be tracked over time. Note that any GPS receiver design contains at least four channel tracking modules; each module tracks a unique GPS satellite.
To track a satellite's signal, each tracking module is composed of a correlation module and two tracking loops (the carrier tracking loop and the code tracking loop). The carrier tracking loop aligns the locally generated carrier with the incoming signal, while the code tracking loop ensures the time alignment of the locally generated codes. Each loop is made of discriminators, filters and generators. The correlation function is computed every 10 ms to compare the local signals with the incoming signal. A maximum correlation output is achieved when the two signals are aligned. A simplified representation of the channel tracking module is given in Fig. 6.
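The statement that the correlation is maximal when the local code is aligned with the incoming signal can be illustrated with a short noise-free sketch. A random ±1 sequence stands in for a real C/A code (which is a 1023-chip Gold code); the seed, the delay value and the function name are arbitrary choices made for this illustration, and carrier and navigation data are ignored.

```python
import random


def circular_correlation(received, local):
    """Correlation of the received chips with the local code for every lag."""
    n = len(local)
    return [
        sum(received[(i + lag) % n] * local[i] for i in range(n))
        for lag in range(n)
    ]


random.seed(1)
CODE_LENGTH = 1023  # number of chips in one C/A code period
code = [random.choice((-1, 1)) for _ in range(CODE_LENGTH)]  # stand-in code

true_delay = 200  # hypothetical code phase, in chips
received = [code[(k - true_delay) % CODE_LENGTH] for k in range(CODE_LENGTH)]

corr = circular_correlation(received, code)
# The lag with the largest correlation is the estimated code phase.
estimated_delay = max(range(CODE_LENGTH), key=corr.__getitem__)
```

At the aligned lag every chip product equals 1, so the correlation reaches the code length, whereas misaligned lags sum products of mixed sign and stay well below it.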
B. Robustness of the Correlation Function
Faults occurring when computing the correlation function can produce errors at its output. Because of the feedback loops, these errors propagate over time and corrupt the generated carrier and codes. Loss of signal tracking can result, forcing the receiver to restart the initial signal acquisition procedure. It is therefore important to deal with the impact of faults when they appear in the correlation process. For each tracking channel, the incoming signal is first multiplied by the generated carrier and then by three generated codes, $c_E$, $c_P$ and $c_L$. Since, as mentioned earlier, any GPS receiver design contains at least four channel tracking modules, we duplicate the first multiplication of the four channels. Error correction is performed to determine the correct $X_{P1}$, $X_{P2}$, $X_{P3}$ and $X_{P4}$, as shown in Fig. 7. Then, the three multiplications in each tracking channel are duplicated and protected independently, as illustrated in Fig. 8.
C. Evaluation
To compare the proposed method to TMR, we use the approach defined in [11]. This approach characterises an architecture built on unreliable hardware by two criteria: the reliability (the probability $P_{No-error}$ that no error appears at the output of the architecture) and the hardware efficiency of the architecture (defined as the normalised number of operations per unit area and unit time). To compute the efficiency, the nature of the computation inside the area unit is not specified; it can be a simple multiplier or a more complex operation such as an FFT or an iterative system. If we consider an operation that takes n area units and m clock cycles to be executed, the efficiency is expressed as,
In the following part, we will focus on the evaluation of the 3-DSC scheme as shown in Fig.7 .
Triple Modular Redundancy (TMR): Let $P_M$ be the error probability of a single module during one clock cycle, and $n_v$ the area cost of the voter. The resulting error probability of the TMR is expressed as,
The efficiency of the TMR is determined by,
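For orientation, the TMR error probability can be sketched under two simplifying assumptions that are made here only for illustration: module failures are independent, and the voter itself is fault-free. In that case the majority vote is wrong when at least two of the three modules fail. The function name below is hypothetical, and an expression that also accounts for voter faults would differ from this one.

```python
def tmr_error_probability(p_m):
    """Probability that at least two of three independent modules fail.

    p_m is the error probability of a single module; the voter is
    assumed fault-free (an assumption of this sketch).
    """
    exactly_two_fail = 3 * p_m**2 * (1 - p_m)
    all_three_fail = p_m**3
    return exactly_two_fail + all_three_fail


p_single = 1e-3  # single-module error probability used in Sec. IV
p_tmr = tmr_error_probability(p_single)
```

For a single-module error probability of $10^{-3}$, this gives a TMR error probability of roughly $3 \times 10^{-6}$, dominated by the term in which exactly two modules fail.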
The normalised hardware efficiency is defined as,
• when $P_{cs} = 0$ and $z_2$ and $z_3$ are non-faulty (i.e. S = 101),
• when $z_1$ is non-faulty.
The error probability in one of the DSC scheme outputs is,
The efficiency of the DSC scheme is,
where $n_c$ and $m_c$ represent the number of area units and time units, respectively. The normalised hardware
IV. PERFORMANCE COMPARISON
The TMR and DSC schemes were evaluated in terms of hardware efficiency and reliability. Table II details the logic synthesis results, in terms of number of cells, obtained using Synopsys Design Compiler in a 45 nm technology. The probability that no error appears at the output of each scheme is compared in Fig. 9 for varying values of the probability that an error occurs at the output of a single module. From this figure, we can see that the DSC scheme provides robustness close to that of the TMR method. Now, given a probability of error at the output of a single module fixed to $10^{-3}$, the results of the analysis of the normalised hardware efficiency of each method are summarised in Fig. 10 as a function of the error probability.
[Fig. 10 axes: the hardware efficiency; the error probability at the output of the scheme]
V. CONCLUSION
Our results show a 32% improvement in complexity with the proposed 3-DSC scheme compared to the classical TMR scheme. Moreover, the reliability offered by the 3-DSC is maintained close to that of TMR. As an example, the 3-DSC was used in the context of a GPS tracking application to protect internal components such as the correlation function and the filters. It has also been shown in this paper that the 3-DSC can be extended to a more general resilient scheme, named N-DSC, in which N redundant modules are added to an original design that contains N identical modules operating on N different data.
