Abstract-The presence of different noise sources and the continuous increase in crosstalk in deep submicrometer technology have raised concerns about on-chip communication reliability, leading to the incorporation of crosstalk avoidance techniques in error control coding schemes. This brief proposes a joint crosstalk avoidance and adaptive error control scheme that reduces power consumption by providing communication resiliency appropriate to the runtime noise level. By switching between shielding and duplication as the crosstalk avoidance technique and between hybrid automatic repeat request and forward error correction as the error control policy, three modes of error resiliency are provided. The results show that, in the reduced mode, the scheme achieves up to 25.3% power savings at 3-mm wire length as compared to the original nonadaptive scheme, at the cost of only 3.4% power overhead in the high protection mode.
redundancy check, Hamming codes, and Hamming-based product codes were proposed to handle LVI transient faults only [3]-[5], while crosstalk avoidance code schemes and noncoding techniques including shielding, skewed transition, and repeater insertion were proposed to handle TVI only [2], [6].
On the other hand, a class of work proposed crosstalk-aware codes that join error control with crosstalk avoidance to address both LVI and TVI faults. The earliest schemes, duplicate-add-parity (DAP) [6], dual rail [7], boundary shift code [8], and modified dual rail code [9], achieved single error correction (SEC) while reducing the worst-case crosstalk-induced bus delay (CIBD) to $(1 + 2\lambda)\tau_0$ through duplication and parity calculation over data bits. Note that $\lambda$ is the ratio of wire coupling capacitance to bulk capacitance and $\tau_0$ is the crosstalk-free wire delay [1]. These coding schemes achieve the dual function of speeding up bus signal arrival and increasing resilience against single logic level flips while reducing power consumption.
Recently, more powerful crosstalk-aware codes to detect/correct multibit errors have been proposed to address the steady growth of noise in ultra-DSM technology. The Hamming product code with skewed transitions [10] achieved multibit error detection; however, it is restricted to two errors per row. A crosstalk avoidance and double error correction scheme was proposed in [11], encoding the data using Hamming SEC and passing the resultant codeword into the DAP encoder. This was later enhanced to the joint crosstalk avoidance and triple error correction (JTEC) scheme and to the JTEC with simultaneous quadruple error detection (JTEC-SQED) scheme [12]. This high error detection/correction capability enabled a lower voltage swing, thus effectively reducing power while maintaining the target reliability.
These crosstalk-aware multibit error control codes are designed to achieve the target reliability under the predicted worst-case noise conditions, whereas the noise level actually varies during operation due to temperature and supply voltage variations [1], [10], [13]. Accordingly, some works [10], [13] attempt to adapt the detection/correction strength based on the noise level to minimize the power and/or energy consumption. However, they lack the crosstalk-aware capability.
In this brief, we consider network on chip (NoC), an emerging paradigm for on-chip communication. We propose an adaptive crosstalk-aware multibit error control code to further reduce the bus power consumption by providing appropriate communication resiliency based on the runtime noise level. For a given noise scenario, an appropriate combination of crosstalk avoidance approach and error control policy is selected for power-efficient and reliable transmission. The proposed scheme switches between three sets, namely, duplication and hybrid automatic repeat request (HARQ), shielding and HARQ, and shielding with forward error correction (FEC). The rest of this brief is organized as follows. Section II describes the proposed adaptive scheme. Section III presents the reliability provided by the different schemes. Section IV evaluates the schemes from power and area perspectives. Section V concludes this brief.

1549-7747 © 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
II. PROPOSED ADAPTIVE CROSSTALK-AWARE MULTIBIT ERROR CONTROL CODE
The common approach of arbitrarily combining different coding schemes leads to a high-complexity design [13]. The proposed approach instead exploits the inherent characteristics of joint wire duplication and error control policies, together with systematic codec integration, to reduce hardware complexity while providing adaptable error protection levels and maintaining a CIBD of $(1 + 2\lambda)\tau_0$.
The proposed scheme works in three modes: a normal mode, duplicated SECDED (D_SECDED), and two power-saving modes, shielded SECDED (S_SECDED) and shielded SEC (S_SEC). The normal or D_SECDED mode provides the highest error protection at no power saving, while the two power-saving modes (S_SECDED and S_SEC) provide moderate and low error protection, leading to moderate and high power savings, respectively.
In the D_SECDED mode, the JTEC-SQED scheme is selected, which uses duplication and HARQ-based Hamming SEC-DED as the crosstalk avoidance and error control, respectively. JTEC-SQED is employed as it achieves high error protection, i.e., triple error correction and quadruple error detection, and it will be used in high-noise conditions. The drawback of this scheme is the increased bus power consumption due to the switching of the duplicated wires and complex decoding algorithm [12] .
In the power-saving modes, the crosstalk avoidance approach is switched to the shielding technique, while the error control schemes used are HARQ-based Hamming SECDED and FEC-based Hamming SEC for the S_SECDED and S_SEC modes, respectively. The S_SECDED mode provides single error correction with double error detection, while the S_SEC mode provides single error correction only; they are therefore used in moderate- and low-noise conditions, respectively. In these two modes, power reduction is achieved through the nonswitching shield wires and the inactive hardware logic at both the encoder and decoder that results from the simpler error control coding. Note that shielding is an alternative crosstalk avoidance approach that uses the same number of wires as duplication, and both limit the worst-case CIBD to $(1 + 2\lambda)\tau_0$.
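The three operating modes described above can be summarized as a small lookup table. The following Python sketch is illustrative only: the names `Mode`, `ModeConfig`, and `MODE_TABLE` are hypothetical and are not taken from the original design, which is implemented in hardware.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    D_SECDED = "D_SECDED"  # normal mode, intended for high-noise conditions
    S_SECDED = "S_SECDED"  # power-saving mode, moderate-noise conditions
    S_SEC = "S_SEC"        # power-saving mode, low-noise conditions


@dataclass(frozen=True)
class ModeConfig:
    crosstalk_avoidance: str  # "duplication" or "shielding"
    code: str                 # Hamming "SECDED" or "SEC"
    policy: str               # "HARQ" or "FEC"

    @property
    def uses_retransmission(self) -> bool:
        # Only the HARQ-based modes can request a retransmission.
        return self.policy == "HARQ"


# Mode -> (crosstalk avoidance, error control code, error control policy),
# as described in the text.
MODE_TABLE = {
    Mode.D_SECDED: ModeConfig("duplication", "SECDED", "HARQ"),
    Mode.S_SECDED: ModeConfig("shielding", "SECDED", "HARQ"),
    Mode.S_SEC: ModeConfig("shielding", "SEC", "FEC"),
}
```

Keeping the crosstalk avoidance choice and the error control choice as separate fields mirrors the way the scheme combines the two independently.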
In NoC, it is common to place the encoder and decoder as separate pipeline stages to preserve the operating frequency at the cost of increased packet latency [12]. Fig. 1(a) shows the adaptive scheme encoder, where data are first encoded using the Hamming SECDED encoder, and then, one of the crosstalk avoidance approaches is applied before the codeword is sent on the bus. In pipelined on-chip communication, a set of flip-flops precedes the crosstalk control. The Hsiao code is implemented for the encoder block because it avoids the long chain of XOR gates otherwise needed to calculate the overall parity and reduces decoder complexity [8]. A characteristic of the Hsiao code is its ability to work as an SEC code by dropping any one of the (h + 1) parity bits [12]. Reconfiguration between the SECDED and SEC codes is achieved through the mode [0] control bit.
The crosstalk control block applies duplication in the D_SECDED mode, while shielding is applied in the S_SECDED and S_SEC modes by setting mode [1] control bit to 1 and 0, respectively. By duplicating the SECDED codeword, the Hamming distance is doubled from originally 4 to 8, and with proper decoding strategy, triple error correction and quadruple error detection are achieved [8] . Shielding SECDED or SEC codeword does not increase the Hamming distance, and thus, their Hamming distances are 4 and 3, respectively. The resulting codeword in the three modes is shown in Fig. 1(b) . Note that, in the S_SECDED and S_SEC modes, shield wires are grounded.
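The effect of duplication and shielding on the Hamming distance can be checked by brute force on a small code. The sketch below is an illustration under stated assumptions: it uses a toy Hamming(7,4) code extended with an overall parity bit rather than the 32-bit Hsiao code used in the design, and the function names are hypothetical.

```python
from itertools import product


def hamming_sec(d):
    """Toy Hamming(7,4) SEC encoder: 4 data bits -> 7-bit codeword."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming_secded(d):
    """Extend the SEC codeword with an overall parity bit (SECDED)."""
    cw = hamming_sec(d)
    return cw + [sum(cw) % 2]


def duplicate(cw):
    """Duplication: every codeword bit drives two adjacent wires."""
    return [b for bit in cw for b in (bit, bit)]


def shield(cw):
    """Shielding: every codeword bit is followed by a grounded wire."""
    return [b for bit in cw for b in (bit, 0)]


def min_distance(encode):
    """Minimum Hamming distance over all pairs of distinct codewords."""
    words = [encode(list(d)) for d in product((0, 1), repeat=4)]
    return min(
        sum(a != b for a, b in zip(u, v))
        for i, u in enumerate(words)
        for v in words[i + 1:]
    )


print(min_distance(hamming_secded))                          # 4
print(min_distance(lambda d: duplicate(hamming_secded(d))))  # 8
print(min_distance(lambda d: shield(hamming_secded(d))))     # 4
print(min_distance(lambda d: shield(hamming_sec(d))))        # 3
```

Both wire arrangements use twice as many wires as codeword bits, but only duplication adds distance, which is why the D_SECDED mode can correct and detect more errors than the shielded modes.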
A retransmission buffer is required for the HARQ-type schemes, i.e., D_SECDED and S_SECDED. In the FEC-type scheme, i.e., S_SEC, the retransmission buffer is not required and can be disabled to save power. Clock and power gatings are the alternatives to disable the buffer, and in our implementation, clock gating is employed. Fig. 1(c) shows the proposed adaptive scheme decoder that supports reconfiguration among the three modes. In the D_SECDED mode, the JTEC-SQED scheme implements a complex decoding algorithm that requires calculating the syndromes of both codeword copies A and B and correcting them. Two sets of (k + h + 1)-register arrays at the decoder input are used to store both codeword copies. In the S_SECDED and S_SEC modes, the decoding process is much simpler than JTEC-SQED as both modes require a single codeword copy and correct a single error only. In our implementation, copy A codeword is chosen to be decoded; thus, blocks that process copy B codeword including the register array and JTEC-SQED decision logic can be disabled to save power.
In the S_SECDED mode, blocks related to codeword correction (i.e., error correction and syndrome calculation) and retransmission request signal generation (i.e., odd/even and decision logic SECDED) are active. In the S_SEC mode, blocks related to codeword correction are active, while other blocks including 1 bit in copy A register array are disabled.
Mode [1] control signal together with the signal from the JTEC-SQED Decision Logic block selects decoded data from corrected copy A when in the S_SECDED and S_SEC mode or either corrected copy A or B in the D_SECDED mode. The retransmission request signal is selected from the respective decision logic of the JTEC-SQED or SECDED block, and in the case of S_SEC, a value of 0 is inserted, all controlled by the two mode signals.
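The output selection described above amounts to two multiplexers controlled by the operating mode. A minimal behavioral sketch in Python follows; the function and argument names are hypothetical stand-ins for the hardware signals, and the mode is passed as a string instead of the two mode control bits.

```python
def select_outputs(mode, corrected_a, corrected_b,
                   jtec_sqed_pick_b, jtec_sqed_retx, secded_retx):
    """Return (decoded_data, retransmission_request) for one decoded word.

    corrected_a / corrected_b: corrected codeword copies A and B
    jtec_sqed_pick_b: JTEC-SQED decision logic chooses copy B over copy A
    jtec_sqed_retx / secded_retx: retransmission requests from each decision logic
    """
    if mode == "D_SECDED":
        # Either corrected copy may be chosen by the JTEC-SQED decision logic.
        data = corrected_b if jtec_sqed_pick_b else corrected_a
        retx = jtec_sqed_retx
    elif mode == "S_SECDED":
        # Only copy A is decoded; SECDED decision logic may request retransmission.
        data = corrected_a
        retx = secded_retx
    elif mode == "S_SEC":
        # FEC only: the retransmission request is forced to 0.
        data = corrected_a
        retx = False
    else:
        raise ValueError(f"unknown mode: {mode}")
    return data, retx
```

For example, `select_outputs("S_SEC", 0x2A, 0x00, True, True, True)` returns `(0x2A, False)`, since the signals related to copy B and to retransmission are ignored in that mode.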
Noise behavior can be monitored by counting the number of errors affecting a victim wire working in half the voltage swing of normal wires [1] , [13] , [14] as shown in Fig. 1(d 
On-chip communication reliability can be assessed using the mean time to failure (MTTF). This metric can be measured using the residual word error probability $P_{res}$, which is the probability that the decoder accepts a word having one or more errors as they pass undetected in the first transmission or during the retransmission; thus, $P_{res}$ can be given by [13]

$$P_{res} = \frac{P_{und}}{1 - P_{ret}} \qquad (2)$$

where $P_{und}$ is the word undetected error probability, $P_{ret}$ is the retransmission probability, and $P_{und} + P_{ret} \le 1$. Each coding scheme has a different $P_{und}$ model proportional to its detection/correction capability. For DAP, JTEC, and JTEC-SQED, the undetected error probabilities are as follows [6], [12]:
The $P_{und}$ of JTEC and JTEC-SQED represents the upper limit of the undetected error probability, assuming that all four errors in JTEC are wrongly corrected and all five errors in JTEC-SQED are wrongly corrected or undetected. For S_SEC (Hamming SEC) and S_SECDED (Hamming SECDED), $P_{und}$ is given in [15] and approximated for a small $\varepsilon$ to
For the FEC-based schemes, $P_{ret} = 0$, while for Hamming SECDED and JTEC-SQED, it is proportional to $\varepsilon^2$ and $\varepsilon^4$, respectively, which can be neglected for small values of $\varepsilon$. Fig. 2 shows the achieved residual error probability for the different schemes working at $V_{SW} = 1$ V at different noise deviations $\sigma_N$. Under iso-$V_{SW}$, different error protection strengths lead to a range of tolerable noise levels. JTEC-SQED achieves the highest reliability as it has the lowest residual error probability owing to its higher error detection, represented by $P_{und}$ being proportional to $\varepsilon^5$. Both DAP and Hamming SEC achieve similar reliability levels due to their ability to correct a single error. Hamming SECDED provides an intermediate protection between Hamming SEC and JTEC-SQED.
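The closed form in (2) can be read as a sum over retransmission attempts. As a brief sketch, assuming that every attempt is independent and sees the same $P_{und}$ and $P_{ret}$ (an assumption made here for illustration), a word is accepted with residual errors if it is retransmitted $i$ times and then passes undetected:

```latex
P_{res} \;=\; \sum_{i=0}^{\infty} P_{ret}^{\,i}\, P_{und}
        \;=\; \frac{P_{und}}{1 - P_{ret}}, \qquad 0 \le P_{ret} < 1 .
```

With $P_{ret} = 0$, as in the FEC-based schemes, the sum has a single term and $P_{res} = P_{und}$. When $P_{ret}$ is negligible, as argued above for small $\varepsilon$, $P_{res} \approx P_{und}$ for the HARQ-based schemes as well.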
IV. RESULTS
To evaluate the proposed scheme, it is compared to three schemes: DAP, JTEC, and JTEC-SQED, considering NoC communication between two routers. A 32-bit flit size is assumed, passing through a three-stage pipeline: encoder, link traversal, and decoder, with one cycle per stage. A target reliability of $P_{res} = 10^{-18}$ results in about 65 days MTTF for an 8 × 8 mesh NoC running at 800-MHz frequency. Every two neighbor routers share one monitoring block in the adaptive scheme.

Fig. 3(a) shows the achieved residual word error probability with increased noise deviation $\sigma_N$ for the different schemes. Each scheme is designed to work at the $V_{SW}$ that achieves $P_{res} = 10^{-18}$ at a maximum $\sigma_N$ of 0.12 V, as detailed in Table I. For the adaptive scheme, all modes are designed to work at the voltage swing at which the D_SECDED mode achieves the target $P_{res}$ at 0.12-V noise deviation. Each mode can maintain the minimum target reliability $P_{res} \le 10^{-18}$ up to a certain $\sigma_N$, as indicated by the intersection of the mode's residual word error probability curve with the horizontal line. The S_SEC mode can achieve $P_{res} \le 10^{-18}$ for $\sigma_N$ up to 0.079 V. Above this noise deviation level, the scheme must switch to a higher detection mode (i.e., S_SECDED) that maintains the target reliability for $\sigma_N$ up to 0.097 V. For a higher $\sigma_N$, the scheme switches to the D_SECDED mode, which can achieve the target reliability up to the maximum noise deviation designed for (i.e., $\sigma_N = 0.12$ V). Thus, the switching points between the three modes are the noise deviations $\sigma_{N1} = 0.079$ V and $\sigma_{N2} = 0.097$ V. The corresponding error thresholds are obtained from Error threshold = BER × Sampling window, where BER is found using (1) for $\sigma_{N1}$ and $\sigma_{N2}$ at $V_{SW} = 0.51$ V. For a 14-bit sampling counter (i.e., a sampling window of $2^{14}$ cycles), the two thresholds corresponding to $\sigma_{N1}$ and $\sigma_{N2}$ are 10 and 69 errors, respectively, thus requiring a 7-bit error counter.

The negligible retransmission effect in all considered schemes led to a similar network performance, as shown in Fig. 3(b). The figure also shows how network latency is increased when coding is incorporated due to the added encoding and decoding pipeline stages per router.
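The threshold and MTTF figures above can be approximated with a few lines of arithmetic. The Python sketch below rests on assumptions that are not confirmed by the text: that (1) has the common Gaussian-noise form BER = Q(V_SW / (2 σ_N)), that the MTTF is the reciprocal of the network-wide residual error rate with one word per link per cycle, and that an 8 × 8 mesh has 224 unidirectional links. The comparison rule in `select_mode` is also a guess at the monitor behavior, and all function names are hypothetical.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def bit_error_rate(v_sw, sigma_n):
    """ASSUMED form of (1): BER = Q(V_SW / (2 * sigma_N))."""
    return q_function(v_sw / (2.0 * sigma_n))


def error_threshold(v_sw, sigma_n, counter_bits=14):
    """Error threshold = BER x sampling window (2**counter_bits cycles)."""
    return bit_error_rate(v_sw, sigma_n) * (1 << counter_bits)


def select_mode(error_count, low_threshold=10, high_threshold=69):
    """Pick the lowest-power mode whose threshold is not exceeded (assumed rule)."""
    if error_count <= low_threshold:
        return "S_SEC"
    if error_count <= high_threshold:
        return "S_SECDED"
    return "D_SECDED"


def mttf_days(p_res, freq_hz, num_links):
    """ASSUMED model: MTTF = 1 / (P_res * f * number of links), in days."""
    return 1.0 / (p_res * freq_hz * num_links) / 86400.0


t1 = error_threshold(0.51, 0.079)  # close to the quoted threshold of 10
t2 = error_threshold(0.51, 0.097)  # close to the quoted threshold of 69
print(t1, t2)
print(select_mode(5), select_mode(40), select_mode(100))  # S_SEC S_SECDED D_SECDED
print(mttf_days(1e-18, 800e6, 224))  # roughly 65 days
```

Under these assumptions the computed thresholds land near the quoted values of 10 and 69 (small differences would be expected from rounding of $V_{SW}$ and $\sigma_N$), both fit in a 7-bit counter, and the MTTF comes out at about 65 days.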
The different schemes were implemented in Verilog HDL and verified using ModelSim. The encoders/decoders were synthesized for a 1.1-V supply voltage with an 800-MHz target operating frequency using Synopsys Design Compiler with 45-nm Nangate library. Table I compares the number of wires for the different schemes, including the number of switching wires (active) as compared to the total number of wires for the adaptive scheme. The synthesized codec area of the proposed scheme, including the noise monitor, is only 15.6% higher than the original JTEC-SQED scheme as shown in Table I due to the careful design of the reduced protection modes allowing hardware resource sharing. In addition to the noise monitor, this overhead comes from the additional circuitry of the crosstalk control and configuration hardware on the encoder and decoder, respectively, and the clock gating at the encoder and decoder. In general, the FEC-based schemes (i.e., DAP and JTEC) have a smaller encoder area since no retransmission buffers are required.
The average link power can be evaluated using [16] 
where $L$ is the number of wires in the link, $C_L$ and $C_C$ are the wire self-capacitance and the coupling capacitance between wires, respectively, $\alpha_{wire}$ and $\alpha_C$ are the wire self-transition and coupling capacitance transition activity factors, respectively, $V_{SW}$ is the wire supply voltage, and $f$ is the operating frequency. Based on the predictive technology model [17] and using 45-nm technology interconnects of 0.077 μm width and spacing, 0.18 μm thickness, 0.11 μm height, and 3.1 dielectric constant, the self- and coupling capacitances were obtained. The evaluation considers 1-5-mm link lengths. For the adaptive scheme, runtime noise values of $\sigma_N$ = 0.079, 0.097, and 0.12 V are used for power estimation in the S_SEC, S_SECDED, and D_SECDED modes, respectively, while the nonadaptive schemes are independent of the noise level. As indicated by Fig. 4(a), DAP has the highest total power consumption due to its higher $V_{SW}$, which results in high link power consumption. Although JTEC and JTEC-SQED have higher codec complexities as compared to DAP, they achieve overall power savings of 28.5% and 31.2%, respectively, at 3-mm link length. Hardware sharing, induced by careful design of the different protection modes, limits the power consumption of the adaptive scheme including the noise monitor, in the D_SECDED mode, to 3.4% higher than the original JTEC-SQED. On the other hand, the S_SECDED and S_SEC modes achieve power savings of 14.1% and 25.3%, respectively, as compared to the original JTEC-SQED, and larger savings of 40.9% and 48.6%, respectively, are achieved with respect to DAP at 3-mm link length. At a longer link length (i.e., 5 mm), the adaptive scheme achieves larger power savings as compared to DAP, with 36.4%, 46.8%, and 52% for the D_SECDED, S_SECDED, and S_SEC modes, respectively. On the other hand, a shorter link length (i.e., 1 mm) limits the savings of low-$V_{SW}$ operation since the codec complexity overhead dominates the total power consumption.
At 1-mm link length, the D_SECDED mode results in 1.2% overhead with respect to DAP, whereas the S_SECDED and S_SEC modes achieve 17.4% and 35% savings, respectively. The link length affects JTEC and JTEC-SQED power consumption as well, increasing the savings over DAP from 15.0% and 4.7% at 1-mm length to 31.9% and 37.9% at 5-mm length, respectively.

The reduced power consumption in the S_SECDED mode as compared to the D_SECDED mode comes from two sources. The first source is the reduced number of switching wires, 39 compared to 78 in the D_SECDED mode, which reduces the link power consumption by 14.9% as shown in Fig. 4(c). The second source is the reduced decoder operations through disabled hardware blocks that process the copy B codeword, leading to a 39.7% reduction in the decoder power. Clock gating the registers also leads to nonswitching of many parts of the decoder combinational logic, which contributes to this large reduction. As a result, S_SECDED enjoys 17% power savings when compared to the D_SECDED mode at 3-mm link length. Since the S_SEC mode is an FEC-based scheme, clock gating the retransmission buffers leads to a 55.1% reduction in the encoder power when compared to both the S_SECDED and D_SECDED modes. A small reduction of 2.6% in link power as compared to the S_SECDED mode is achieved due to the unused check bit in the link. At the decoder side, the odd/even module, the SECDED Decision Logic module, and one flip-flop are disabled, leading to a 1.7% power reduction with respect to the S_SECDED mode. At short link lengths (i.e., 1 mm), the codec represents a large portion of the total power; thus, the power saving shown in Fig. 4(c) is mainly contributed by the codecs. As the link length increases, the link power dominates, and thus, power savings are largely influenced by the link.
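The savings quoted at 3-mm link length are mutually consistent, which can be checked by composing power ratios. The sketch below uses only percentages stated in the text; the variable and function names are illustrative.

```python
def saving_percent(power_ratio):
    """Convert a power ratio (scheme / reference) into a saving in percent."""
    return (1.0 - power_ratio) * 100.0


# Total power relative to DAP at 3-mm link length (from the text).
jtec_sqed_vs_dap = 1.0 - 0.312       # JTEC-SQED saves 31.2% over DAP

# Adaptive scheme modes relative to the original JTEC-SQED (from the text).
d_secded_vs_jtec_sqed = 1.0 + 0.034  # 3.4% overhead
s_secded_vs_jtec_sqed = 1.0 - 0.141  # 14.1% saving
s_sec_vs_jtec_sqed = 1.0 - 0.253     # 25.3% saving

# Chain the ratios to express each mode relative to DAP.
s_secded_vs_dap = jtec_sqed_vs_dap * s_secded_vs_jtec_sqed
s_sec_vs_dap = jtec_sqed_vs_dap * s_sec_vs_jtec_sqed

# Ratio of two modes that share the same reference.
s_secded_vs_d_secded = s_secded_vs_jtec_sqed / d_secded_vs_jtec_sqed

print(round(saving_percent(s_secded_vs_dap), 1))       # 40.9
print(round(saving_percent(s_sec_vs_dap), 1))          # 48.6
print(round(saving_percent(s_secded_vs_d_secded), 1))  # 16.9
```

The first two results match the 40.9% and 48.6% savings over DAP, and the third agrees with the roughly 17% saving of S_SECDED over D_SECDED.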
From another perspective, the estimated maximum noise deviation governs the minimum $V_{SW}$ required to achieve the target reliability $P_{res}$. A higher $\sigma_N$ enforces the use of a higher $V_{SW}$, which, in turn, quadratically increases the link power consumption. Therefore, schemes with lower $V_{SW}$ benefit more at higher estimated $\sigma_N$ since their power savings will be higher, as shown in Fig. 4(b). The savings of the highest and lowest protection modes of the adaptive scheme, including the noise monitor, at $\sigma_N = 0.08$ V are 8.8% and 39.6% with respect to DAP, respectively. These savings increase to 37.2% and 52.4% at $\sigma_N = 0.16$ V.
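The quadratic dependence of link power on voltage swing can be illustrated numerically. The sketch below assumes the widely used switched-capacitance form P = L (α_wire C_L + α_C C_C) V_SW² f, which is consistent with the symbol definitions given earlier but is an assumption here and not necessarily the exact expression of [16]. The capacitance and activity values are arbitrary placeholders, not data from this brief.

```python
import math


def link_power(num_wires, c_self, c_coupling, alpha_wire, alpha_c, v_sw, freq_hz):
    """ASSUMED switched-capacitance model of average link power, in watts."""
    switched_cap = alpha_wire * c_self + alpha_c * c_coupling
    return num_wires * switched_cap * v_sw ** 2 * freq_hz


# Placeholder parameters (hypothetical, for illustration only).
params = dict(num_wires=78, c_self=100e-15, c_coupling=200e-15,
              alpha_wire=0.25, alpha_c=0.25, freq_hz=800e6)

p_low = link_power(v_sw=0.5, **params)
p_high = link_power(v_sw=1.0, **params)

# Doubling the swing quadruples the link power, whatever the other parameters are.
print(p_high / p_low)
```

Under this model the ratio depends only on the two swing values, which is why a scheme that tolerates the same noise at a lower $V_{SW}$ saves more as the required swing grows.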
V. CONCLUSION
Crosstalk-aware multibit error detection/correction coding schemes suffer from high power consumption as they are designed for worst-case noise conditions. This brief has proposed a coding scheme that jointly provides crosstalk avoidance and adaptively selects the level of error protection based on noise conditions. It was shown that working in the reduced error protection mode achieves noticeable power savings of 25.3% at 3-mm wire length. Adding this adaptivity incurs area and power consumption overhead of 15.6% and 3.4%, respectively. It was noted that higher protection schemes are more effective at higher noise levels as they achieve higher power savings.
