Abstract-An on-chip interconnect communication system consists of drivers, interconnect wires, and receivers. Several on-chip communication system models have been developed for research on on-chip fault-tolerant communication. While most of these models improved the channel modeling, the effects of the drivers and receivers on the whole communication system were largely ignored. In this paper, we introduce a comprehensive system-level framework that captures and integrates the characteristics of the channel as well as the drivers and receivers. The proposed framework offers a methodology to model the on-chip interconnect communication system and provides a flexible and updateable platform to evaluate fault-tolerant communication approaches. Furthermore, the pessimism of worst-case analysis inherent in the current deterministic paradigm is avoided by shifting towards a statistical design flow that reduces the uncertainties caused by process variation.
I. INTRODUCTION
Owing to enhancements in process technology, the number of components integrated into a single System-on-Chip (SoC) is growing steadily. With the resulting larger number of interconnects, the communication between these components increasingly occupies critical system paths and frequently becomes the performance bottleneck [1]. Consequently, it is vital for designers to explore the communication space early in the design flow.
From the communication point of view, the signals of an on-chip interconnect flow from the drivers, through the channel (interconnect wires), to the receivers. The literature identifies different DSM noise sources that affect each of these components of the interconnect communication system [2]. These noise sources can result in functional as well as timing errors.
Two approaches have been identified to increase the reliability of the communication system [3]. The first is 'noise budgeting', which applies worst-case analysis and increases the noise margin in order to mitigate the noise. This approach achieves a high signal-to-noise ratio (SNR) at the expense of high power dissipation [2]. The analysis is pessimistic: it assumes that all noise sources occur simultaneously at their worst possible extreme values, which is misleading in a real design.
The other approach is the fault-tolerant communication strategy, which consists of design techniques that are inherently tolerant to noise and errors. Well-known subcategories are dynamic noise analysis, bus encoding, and channel coding [2, 4]. These methods have shown great success compared with noise budgeting in terms of optimality in speed and power [5] [6] [7] [8]. However, such techniques were largely evaluated through simplistic low-precision channel models that could not show their actual capability [5] [6] [8] [9]. Evaluating fault-tolerant communication techniques fundamentally requires a comprehensive model of the communication system, consisting of the drivers, receivers, and interconnects.
In the early days of semiconductor manufacturing, process variation was not a serious threat to the robustness of integrated circuits. With the emergence of the deep submicron (DSM) era, however, its effects on signal integrity are no longer negligible. Variability gives rise to variance in the timing and power as well as the bit error rate (BER) of SoC designs [10]. Worst-case considerations have been broadly employed to mitigate variation noise, but they result in prohibitive design overheads in terms of speed and power [11]. To move towards fault tolerance and optimality in speed and power, a shift in design paradigm is inevitable, since process variations are vastly exacerbated in future technologies [12].
In this paper, communication system models developed in previous works are reviewed, and a comprehensive system-level framework is proposed based on their drawbacks and advantages. In addition, the uncertainties that process variation introduces into circuit parameters are included by shifting from the current deterministic paradigm towards a statistical design flow. The modeling of the interconnect communication system is studied from a system-level point of view, focusing on functional errors.
The rest of the paper is organized as follows. A review of communication system models used to evaluate fault-tolerant techniques is presented in Section II. In Section III, the proposed system-level framework is discussed. Finally, the conclusion is given in Section IV.
In [13], the authors assumed that every transfer across a wire can be in error with a certain probability ε. The parameter ε depends on the standard deviation of the noise voltage and the operating supply voltage. This channel model lumps together the different on-chip noise sources and treats the error probability as independent of the characteristics of the individual sources. It also assumes static noise margin (SNM) thresholding at the receiver, in which the receiver output is in error when the noise voltage exceeds the receiver threshold voltage. This model has been used in [6] and [9] to evaluate a bus encoding technique and an error-correcting scheme, respectively.
In [8], the authors proposed a channel model that captures both functional and timing errors. Functional errors are captured by adding an additive white Gaussian noise (AWGN) model to the signaling waveform, while timing errors are captured through the variability of the channel cut-off frequency around its nominal value. For functional errors, all sources of noise are lumped into the AWGN model. An error is assumed to occur when the additive noise exceeds half of the voltage swing, which is an SNM thresholding approach.
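The dependence of the error probability on the noise standard deviation and the voltage swing can be illustrated with a minimal sketch. It assumes zero-mean Gaussian noise and a threshold at half the swing, so that the error probability is the Gaussian tail probability Q(V_swing / (2σ)). The function names and the example voltages are illustrative assumptions and are not taken from [8] or [13].

```python
import math


def q_function(x: float) -> float:
    """Tail probability of a standard normal variable: P(Z > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def bit_error_probability(v_swing: float, sigma_noise: float) -> float:
    """Probability that zero-mean Gaussian noise exceeds half the swing.

    Assumption: SNM thresholding at v_swing / 2, all noise lumped into
    a single Gaussian source with standard deviation sigma_noise.
    """
    if sigma_noise <= 0.0:
        raise ValueError("sigma_noise must be positive")
    return q_function(v_swing / (2.0 * sigma_noise))


if __name__ == "__main__":
    # Illustrative values only: lowering the swing raises the error rate.
    for v_swing in (1.2, 1.0, 0.8):
        print(v_swing, bit_error_probability(v_swing, sigma_noise=0.1))
```

Under these assumptions, reducing the swing to save power increases ε, which is the trade-off that fault-tolerant techniques aim to exploit.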
Another channel model, which was originally used for chip-to-chip interconnects and is applicable to on-chip interconnects, is proposed in [5]. This model has two noise generators: an AWGN generator and a crosstalk noise generator. The crosstalk generator models the noise that depends on the interconnect transition patterns, while the AWGN generator models all other noise sources. The crosstalk noise is modeled simply by taking the derivative of the transmitted signal in the discrete time domain and applying a gain factor to adjust the noise amplitude. The values from both noise generators are injected into the signaling waveform, which is later thresholded at the receiver using the SNM thresholding technique.
Finally, [14] proposed a communication system model based on circuit-level simulation. This model captures the effects of the channel as well as the driver and receiver that form the endpoints of the channel. It captures signal degradation at the receiver as a result of transistor switching noise, edge, jitter, and frequency-dependent signal attenuation and crosstalk. Fluctuation in the supply voltage (power supply noise) is modeled as AWGN and injected into both the driver and receiver circuits. This channel model is simulated with Cadence's Spectre device-level analog circuit simulation tool [15].
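A two-generator channel of this kind can be sketched as follows. This is a hedged illustration rather than the implementation of [5]: it assumes a single victim wire disturbed by one aggressor wire, takes the discrete-time derivative of the aggressor's transmitted levels, and uses hypothetical parameter names (`gain`, `sigma`) and default values.

```python
import random


def crosstalk_noise(aggressor_bits, v_swing, gain):
    """Discrete-time derivative of the aggressor levels, scaled by a gain."""
    levels = [b * v_swing for b in aggressor_bits]
    noise = [0.0]  # no transition precedes the first symbol
    for prev, curr in zip(levels, levels[1:]):
        noise.append(gain * (curr - prev))
    return noise


def transmit(victim_bits, aggressor_bits, v_swing=1.0, gain=0.2,
             sigma=0.05, rng=None):
    """Inject crosstalk and AWGN, then apply SNM thresholding at v_swing / 2."""
    rng = rng or random.Random(0)  # fixed seed keeps the sketch repeatable
    xtalk = crosstalk_noise(aggressor_bits, v_swing, gain)
    received = []
    for bit, x in zip(victim_bits, xtalk):
        voltage = bit * v_swing + x + rng.gauss(0.0, sigma)
        received.append(1 if voltage > v_swing / 2.0 else 0)
    return received


if __name__ == "__main__":
    victim = [0, 0, 1, 1, 0, 0]
    aggressor = [0, 1, 1, 0, 0, 1]
    print(transmit(victim, aggressor))
```

Because the crosstalk term is nonzero only when the aggressor switches, the error probability in this sketch depends on the transition pattern, unlike the lumped AWGN models.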
III. PROPOSED FRAMEWORK
In this section, our proposed on-chip communication system model in support of fault-tolerant analysis is presented.
To proceed, we first identify the drawbacks and advantages of the on-chip communication system models presented in the previous section. We then present our proposed framework, which is flexible and updateable and is suitable for evaluating on-chip communication approaches in low-power fault-tolerant communication research.
Most of the models in the literature focus on modeling the channel and ignore the effects contributed by the driver and receiver components [5] [6] [8] [9]. In most cases, noise sources with different characteristics are lumped together into a single noise model, which is misleading [6] [8] [9]. For example, since crosstalk noise, the major contributor in DSM on-chip communication, is strongly transition dependent [3], it must be taken into account separately for a more accurate evaluation. Moreover, the simple crosstalk model proposed in [5] is inadequate to capture the real magnitude of the waveform perturbation.
While the methodology proposed in [14] has high accuracy, it is expensive in terms of computation and system resources. To obtain an acceptable BER estimate when evaluating communication approaches, it is normal practice to simulate the transfer of on the order of 10^15 bits through the communication system, which is far too many for a circuit-level simulator to handle. Another important aspect, which the reviewed models do not consider, is the impact of process variation on the communication system. It has been shown that process variation must be considered at an early stage of every chip design [10].
Fig. 1 shows the block diagram of the proposed system-level framework. The framework is divided into three parts: the driver, the interconnect, and the receiver. In this illustration, we assume N driver-receiver pairs connected through N parallel interconnect wires. The blocks represent the system elements, and the arrows represent the signal path from driver to receiver. Dashed arrows represent power supply signals. The framework is updateable in the sense that the blocks can be modified according to the simulation objective, whether precision or speed. For example, one can substitute a block with an analytical, statistical, or other type of model. The framework is also flexible in the sense that unimportant blocks can be switched off by replacing them with ideal models.
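The updateable and flexible structure described above can be sketched as a chain of replaceable blocks. The sketch below is a simplification under stated assumptions: it models a single wire instead of N parallel wires, omits the power supply block, and uses hypothetical names (`Framework`, `ideal_driver`, `ideal_block`, `snm_receiver`) that do not come from the paper.

```python
from dataclasses import dataclass
from typing import Callable, List

Waveform = List[float]


def ideal_driver(bits: List[int], vdd: float) -> Waveform:
    """Ideal driver: each bit maps directly to 0 V or vdd."""
    return [b * vdd for b in bits]


def ideal_block(waveform: Waveform) -> Waveform:
    """Ideal (switched-off) block: passes the waveform through unchanged."""
    return list(waveform)


def snm_receiver(waveform: Waveform, vdd: float) -> List[int]:
    """Simple receiver: static threshold at vdd / 2."""
    return [1 if v > vdd / 2.0 else 0 for v in waveform]


@dataclass
class Framework:
    """Driver -> interconnect -> PVU -> receiver, each block replaceable."""
    driver: Callable[[List[int], float], Waveform] = ideal_driver
    interconnect: Callable[[Waveform], Waveform] = ideal_block
    pvu: Callable[[Waveform], Waveform] = ideal_block
    receiver: Callable[[Waveform, float], List[int]] = snm_receiver

    def run(self, bits: List[int], vdd: float = 1.0) -> List[int]:
        waveform = self.driver(bits, vdd)
        waveform = self.interconnect(waveform)
        waveform = self.pvu(waveform)
        return self.receiver(waveform, vdd)


if __name__ == "__main__":
    # All blocks ideal: the bit stream is received without error.
    print(Framework().run([1, 0, 1]))
    # Swap in a lossy interconnect model (illustrative 60% attenuation).
    lossy = Framework(interconnect=lambda w: [0.4 * v for v in w])
    print(lossy.run([1, 0, 1]))
```

Replacing one callable changes the precision or speed of that block without touching the rest of the chain, which is the sense in which the framework is updateable.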
The Driver block represents a driver node. The signal rise and fall times are defined in this block. The input bit stream and power supply voltages, Vdd and ground, are inputs to this block and the output is fed into the interconnect block.
The interconnect block models capacitive and inductive crosstalk and, in some proposed models, inter-symbol interference. Its inputs are the waveforms from the driver nodes, and its outputs are either calculated waveforms or simply waveform characteristics such as glitch peaks and durations. The glitch duration is the period during which the output exceeds the next-level threshold. The output of this block is passed directly to the process variation unit (PVU) block for uncertainty calculations.
Several analytical models have been proposed to extract the output waveforms of interconnects affected by capacitive and inductive crosstalk. In [16] and [17], the authors proposed similar methods for fast calculation of output waveforms by decoupling the interconnects according to the transmission line model. This method was later improved in [18] and can be used inside the interconnect block for fast modeling of inductive and capacitive crosstalk effects. Another decoupling method, based on eigenvector matrices and capable of taking inter-symbol interference into account, is proposed in [19]; it is slower because of its iterative process. Besides the models that provide waveform estimation, there are other approaches that estimate waveform characteristics such as glitch peaks and durations [20] [21] [22].
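When the interconnect block outputs a sampled waveform, the glitch peak and duration defined above can be extracted with a short routine. The sketch below assumes a positive glitch on a wire that is nominally low, uniformly spaced samples, and a duration measured as the longest contiguous run of samples above the threshold. The function name and arguments are illustrative.

```python
def glitch_characteristics(samples, dt, threshold):
    """Return (peak, duration) of a positive glitch in a sampled waveform.

    samples   : voltages sampled at a uniform time step
    dt        : time step between samples
    threshold : voltage level the glitch must exceed to count
    """
    if not samples:
        raise ValueError("samples must not be empty")
    peak = max(samples)
    longest_run = current_run = 0
    for v in samples:
        # Count consecutive samples above the threshold.
        current_run = current_run + 1 if v > threshold else 0
        longest_run = max(longest_run, current_run)
    return peak, longest_run * dt


if __name__ == "__main__":
    waveform = [0.0, 0.2, 0.6, 0.7, 0.3, 0.0]
    print(glitch_characteristics(waveform, dt=1.0, threshold=0.5))
```

Passing only these two numbers to the following blocks is cheaper than passing the full waveform, at the cost of discarding the glitch shape.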
The power supply block represents the on-chip power supply grid. This block can be represented by ideal sources with constant voltages or by models that include IR drop and voltage bounce (L di/dt noise). Another way of expressing power supply noise is to add AWGN as an external source to the supply voltage of both the driver and the receiver circuits [10, 14]. LR and LRC models are widely used where the focus of the research is on measuring power supply noise [23] [24] [25]. This block provides the Vdd and ground connections for the driver and receiver blocks. Its outputs play an important role in the dynamic noise margin (DNM) block, or receiver block. For example, the receiver may be a CMOS inverter that compares the incoming voltage with a power-supply-referenced threshold voltage, typically centered midway between the two supply rails. In that case, the signal swing is lower-bounded by noise considerations. More specifically, power supply noise has a large impact because it affects both the signal levels and the switching threshold of the receiver. The latter is also a strong function of manufacturing process variations.
With the emergence of the DSM era, variations in process parameters and device dimensions translate into variations in circuit parameters such as the threshold voltage of transistors (V th) and the capacitance and resistance of interconnect wires [26]. Consequently, to include process variation noise in the proposed framework, the mechanisms by which it affects functional error rates are studied under two categories. First, variations in interconnect parameters, which account for fluctuations in the peak and duration of output signals, are studied under the interconnect component and represented by the PVU block. Second, statistical variations in V th are studied under the receiver component of the communication system (the DNM block), since they add uncertainty to the dynamic noise margins.
The PVU block adds uncertainty to the interconnect output signal in order to capture the effect of process variation on the interconnect wire. The added uncertainty is quantified in [10] based on variations in wire resistivity, width, thickness, and dielectric height as well as variations in temperature, transistor effective length, and power supply. This approach is an appropriate choice for the PVU block because it can estimate the added uncertainty offline for each interconnect configuration. However, much recent literature has been devoted to modeling interconnect variation [11] [27] [28] [29], so various models with different precision and computation overhead are available for this block.
Finally, the DNM thresholding block, magnified in Fig. 2, implements the receiver thresholding approach. Dynamic noise analysis at the receiver nodes filters out many events that static noise analysis would mistakenly flag as faults [4, 21, 30]. To implement DNM thresholding in the framework, lookup tables of DNM for different glitch peaks and durations can be extracted offline for each receiver node by sweeping the glitch voltage and duration [14] for one receiver node using circuit-level simulators. As stated before, however, statistical variations in the threshold voltages of the receiver nodes have to be translated into variations in DNM to capture the effects of process variation noise. Note also that fluctuations in the supply voltage from the power supply block may shift the threshold curve upwards or downwards. A survey of methods for deriving the DNM profile can be found in [31]. Given the waveform or its characteristics from the interconnect, this block decides whether the system's output is a logical one or a zero.
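The lookup-table form of DNM thresholding can be sketched as follows. The table values are placeholders chosen only to show the typical shape of a DNM curve, in which shorter glitches need a higher peak to flip the receiver; they are not simulation results. The linear interpolation, the names `DNM_TABLE`, `dnm_threshold`, and `causes_error`, and the single `threshold_shift` term that stands in for V th variation and supply fluctuation are all assumptions of this sketch.

```python
import bisect

# Placeholder DNM curve, NOT measured data:
# (glitch duration in ps, minimum glitch peak in V that flips the receiver).
DNM_TABLE = [(10.0, 1.10), (50.0, 0.80), (100.0, 0.65), (500.0, 0.55)]


def dnm_threshold(duration, table=DNM_TABLE):
    """Interpolate the minimum failing glitch peak for a given duration."""
    durations = [d for d, _ in table]
    if duration <= durations[0]:
        return table[0][1]
    if duration >= durations[-1]:
        return table[-1][1]
    i = bisect.bisect_right(durations, duration)
    (d0, p0), (d1, p1) = table[i - 1], table[i]
    return p0 + (p1 - p0) * (duration - d0) / (d1 - d0)


def causes_error(peak, duration, threshold_shift=0.0):
    """Decide whether a glitch flips the receiver output.

    threshold_shift moves the whole curve up or down; here it stands in
    for V th variation and supply voltage fluctuation (an assumption).
    """
    return peak > dnm_threshold(duration) + threshold_shift


if __name__ == "__main__":
    # A 0.7 V glitch is tolerated when short but causes an error when long.
    print(causes_error(0.7, 20.0), causes_error(0.7, 500.0))
```

In a statistical flow, `threshold_shift` could be drawn from a distribution for each receiver node, so that the same glitch is evaluated against a varying threshold curve.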
IV. CONCLUSION
A system-level framework that models the on-chip communication system is proposed to support fault-tolerant design by providing a comprehensive evaluation platform. Different models for each component of the framework are presented. Unlike previous models in the literature, the proposed framework is comprehensive: it includes all significant known sources of noise and takes into account the effects of the transmitter and receiver. Furthermore, the pessimism of worst-case analysis inherent in the current deterministic paradigm is avoided by shifting towards a statistical design flow that reduces the uncertainties caused by process variation. The framework is flexible and updateable, since it can be reconfigured or updated for different simulation goals. It can provide fast analysis and is attractive for evaluating fault-tolerant communication approaches.
