Abstract-We propose a method of reducing the switching noise in the substrate of an integrated circuit. The main idea is to design the digital circuits to obtain a periodic supply current with the same period as the clock. This property locates the frequency components of the switching noise at and above the clock frequency. Differential return-to-zero signaling is used to reduce the data dependency of the current. Circuits are implemented in symmetrical precharged DCVS logic with internally asynchronous D registers. A chip holding two versions of a pipelined 16-bit adder was fabricated in a standard 130-nm CMOS technology. The first version employs the proposed method, and the second version uses conventional static CMOS logic circuits and TSPC registers. The respective device counts are 1190 and 684, and the maximum operating frequencies are 450 and 375 MHz. Frequency domain measurements were performed at the substrate node with on-chip generated sinusoidal and pseudo-random data at a clock frequency of 300 MHz. The sinusoidal case resulted in the largest frequency components, where an 8.5 dB/Hz decrease in maximal power is measured for the proposed circuitry at a cost of three times larger power consumption.
I. INTRODUCTION

IN THE ERA of increasing integration, technological advancements have led us to integrate not only digital circuits of high device density, but also analog circuits on the same die [1], [2]. The circuit speed and the number of devices have increased because of the downscaling in device sizes. However, as side effects, the rate of change ($di/dt$) of the supply current increases and the distance between the circuits decreases, causing the analog circuits in particular to experience more problems from power-ground noise [3]. In a mixed-signal single silicon die that accommodates both digital and analog circuits, the digital switching noise becomes a paramount concern for the correct functioning and performance of the system [4]-[8]. The main problem is that digital switching in conjunction with the impedance of the power supply net causes a voltage fluctuation, denoted simultaneous switching noise (SSN), on the digital supply rails.
In conventional designs, substrate contacts are included in each digital gate to bias the substrate to ground voltage. These substrate contacts give low impedance from the digital ground to the substrate surface. Hence the SSN is injected in the substrate region of the digital circuit and spreads through the substrate to other circuits, thereby degrading performance in analog circuits [9], [10]. The SSN is one of the main sources of noise in digital integrated circuits [11], [12]. Therefore, the integrated circuit designer needs to consider SSN on the digital grounds for a mixed-signal system-on-chip.
The purpose of this work is to investigate a method for reducing the SSN in the frequency range below the clock frequency, to the benefit of circuitry that operates with a bandwidth within this range. An experiment is carried out using precharged DCVS logic circuits combined with a novel differential D flip-flop (DFF) in the registers, and the result is compared with conventional static CMOS logic circuits and TSPC registers [13]. Both the DCVS logic and differential register circuits use return-to-zero (RZ) signaling with the aim of drawing periodic currents from the power supply. If the circuit draws an equal current during each clock cycle independent of the input data, the frequency content of the noise produced by the current spikes will be located at and above the clock frequency. A test chip was fabricated with two 16-bit pipelined adders and on-chip test pattern generators. The test chip is used to show that the new circuits reduce the SSN below the clock frequency.
SSN and common techniques to mitigate the associated problems are discussed in Section II. In Section III we present the proposed method and circuits. The test chip design for evaluation of the method is presented in Section IV, and the obtained results are given in Section V. The work is concluded in Section VI.
II. SIMULTANEOUS SWITCHING NOISE
An operating digital circuit contains a large number of gates that rapidly switch from 0 V to the power supply voltage, and vice versa. The switching time is on the order of picoseconds, which requires large supply current spikes. The current is supplied from the battery via printed circuit board (PCB) traces, the lead frame or substrate of the package, off-chip to on-chip bond wires, and on-chip power lines, all with resistive, capacitive, and inductive parasitic impedances. These factors result in the voltage fluctuations on the on-chip power supply lines known as SSN. This effect has always been present in digital circuits, but due to the downscaling of devices it has become a greater concern.
The SSN is consequently related to the impedance present between the ground of the device and the ground of the system [10]. Assuming constant inductance and resistance of the power supply, the voltage fluctuation on the on-chip supply lines is given by

$$ v = L \frac{di}{dt} + R\,i $$

where $L$ and $R$ are the effective inductance and resistance of the power supply lines, respectively, including the pin, bond wire, and PCB trace parasitics, and $i$ is the supply current.
Considering typical inductance and resistance values, which are on the order of nH and Ohms, respectively, the equation shows that the inductance is of primary concern for the SSN since $di/dt$ is very high. The SSN increases with an increasing number of simultaneously switching devices.
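As a rough numerical illustration of why the inductive term tends to dominate, the following sketch evaluates the two contributions $L\,di/dt$ and $R\,i$ for a linear current ramp. The function name and the numbers (1 nH, 1 Ohm, a 10 mA step in 100 ps) are illustrative assumptions and are not measured values from this work.

```python
def supply_bounce(l_eff, r_eff, delta_i, delta_t):
    """Return the (inductive, resistive) voltage terms for a linear current ramp.

    l_eff   -- effective supply inductance in henry (assumed value)
    r_eff   -- effective supply resistance in ohm (assumed value)
    delta_i -- size of the current step in ampere (assumed value)
    delta_t -- duration of the current ramp in seconds (assumed value)
    """
    di_dt = delta_i / delta_t          # slope of the supply current
    v_inductive = l_eff * di_dt        # L * di/dt term
    v_resistive = r_eff * delta_i      # R * i term at the end of the ramp
    return v_inductive, v_resistive


# Hypothetical example: 1 nH, 1 Ohm, 10 mA switched in 100 ps.
v_l, v_r = supply_bounce(l_eff=1e-9, r_eff=1.0, delta_i=10e-3, delta_t=100e-12)
print(f"inductive term: {v_l:.3f} V, resistive term: {v_r:.3f} V")
```

With these assumed numbers the inductive term is roughly ten times the resistive term, which matches the qualitative statement above.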
Assuming a CMOS driver, the amplitude of the supply current spike depends on the following [4] : 1) aspect ratio of transistors; 2) rise and fall time of input; 3) load capacitance compared with critical capacitance; 4) output voltage; 5) short circuit current. After occurrence of the supply current spike there are voltage oscillations at the power lines. These oscillations are caused by the resistance, inductance, decoupling capacitance, and switched capacitance within the package. The corresponding RLC circuit is complex and may have many resonating frequencies with a dominating frequency [14] . The damping factor of such a resonance circuit needs to be sufficient to prevent the amplitude of the noise pulse from increasing by successive switching events.
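The role of the damping factor can be sketched with a single lumped series RLC loop. This is a strong simplification, since the real supply network has many resonances. The formulas $f_0 = 1/(2\pi\sqrt{LC})$ and $\zeta = (R/2)\sqrt{C/L}$ are the standard series-RLC expressions, and the component values below are hypothetical.

```python
import math


def series_rlc(r, l, c):
    """Return (resonance frequency in Hz, damping factor) of a series RLC loop."""
    f0 = 1.0 / (2.0 * math.pi * math.sqrt(l * c))
    zeta = (r / 2.0) * math.sqrt(c / l)
    return f0, zeta


# Hypothetical values: 1 Ohm, 1 nH, and 100 pF of decoupling capacitance.
f0, zeta = series_rlc(r=1.0, l=1e-9, c=100e-12)
regime = "underdamped" if zeta < 1.0 else "critically damped or overdamped"
print(f"f0 = {f0 / 1e6:.0f} MHz, zeta = {zeta:.3f} ({regime})")
```

A damping factor well below one, as in this assumed case, means that the supply rings for several periods after each current spike, so successive switching events can add to the oscillation.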
A. Effect of SSN in Digital Circuits
In digital circuits SSN may cause delay errors, false switching, and erroneous storage. A delay error is caused by the supply bounce affecting the transition time, which can, e.g., cause violation of the setup and hold times of the registers [15]. In false switching, gate outputs that are not meant to switch change state due to the SSN. A disturbance higher than the threshold voltage may be sufficient for an output node to change, or may lead to an ill-timed clock [4]. Hence delay errors and false switching affect the registers adversely, which can result in erroneous storage of data.
B. Effect of SSN in Analog Circuits
In mixed-signal circuits the digital and analog circuits share a common substrate. Hence noise may easily be transmitted from the noisy digital circuits to the sensitive analog circuits. The noise is received by analog substrate contacts and signal nodes through resistive and capacitive coupling, which reduces the circuit performance by direct injection or indirectly by altering transistor threshold voltages [16] . To reduce problems with SSN, the circuits are commonly designed with separate analog and digital power supply lines, physical separation between noisy and noise prone components, and differential analog circuit architecture [17] , [18] .
Analog circuits are in general more sensitive to substrate noise than to classical analog device noise sources (i.e., thermal, flicker, and shot noise). The substrate noise can easily be orders of magnitude larger than the device noise. Therefore, substrate noise severely degrades the performance of analog circuits. For instance, the substrate noise can degrade the signal-to-noise ratio (SNR) and the spurious-free dynamic range (SFDR).
Examples of components that are severely affected by SSN are amplifiers, phase-locked loops (PLLs), and data converters. In amplifiers the noise increases signal distortion and phase errors, which adversely affect linear filters and demodulators in the system. In PLLs, the voltage controlled oscillator (VCO) is particularly sensitive to substrate noise, causing jitter in the PLL output [19], but other components in the loop filter may also be affected adversely [20]. In flash analog-to-digital converters (ADCs) the most important effect is the limitation of the minimum signal level that can be distinguished from noise. Consequently the SSN limits the dynamic range and defines the maximum converter resolution [21]. SSN also introduces timing jitter in the clock generators of switched capacitor circuits [22], [23], which are used, e.g., in ADCs.
C. SSN Reduction Methods
There are several methods employed by circuit designers for reduction of the digital switching noise. In this section, we briefly discuss the most common methods.
To reduce the on-chip to off-chip inductance, many power supply bond wires can be used. However, just multiplying the power supply bond wires will not decrease the SSN efficiently. The designer needs to strategically place power supply interconnects so that the currents are in opposite directions in adjacent interconnects [3]. Double bonding techniques are used to reduce the on-chip to off-chip impedance. The idea is to use two bonding wires instead of one from the on-chip pad to the off-chip interconnect, which halves the parasitic resistance [24], [25]. Selecting a package with reduced impedance will result in lower SSN, but at a higher cost [26].
On-chip decoupling capacitors are commonly used to create a low impedance path that reduces the voltage fluctuations on the power supply lines [27] - [29] . Reduction of the main peak in the power supply rails may be achieved by adding an RLC circuit where the three components are connected in series. This circuit should place the resonance at the same frequency as the impedance peak of the power supply, which will reduce the impedance peak [29] . In reduced supply bounce CMOS logic the digital circuits are implemented in static CMOS logic together with a simple circuit for reducing the SSN on supply lines. In this circuit, transistors are connected in series with the power supply, serving as resistors to damp SSN [30] .
Reducing the power supply voltage reduces the effective gate voltage of the MOS transistors, which leads to smaller drain currents. Hence SSN is reduced, but so is the speed. Timing and sizing of output buffers may be designed in such a way that the propagation delays of the different buffer stages are different, which will reduce the current peak [31], [32]. Another approach is to design a circuit that biases the output to the inverter threshold voltage and loads the output inductively [33]. In bus drivers, data can be coded to reduce SSN [34].
Logic styles aimed at drawing a constant current can be used such that the power supply currents are as constant as possible. The major drawback is high power consumption [35]-[37]. Asynchronous circuits distribute the switching noise in time and frequency more than synchronous circuits. However, asynchronous circuit design is less well supported than synchronous circuit design, which makes the design more challenging [38].
Ascertaining clock skews on a chip can smoothen the resulting current peak, which results in a lower SSN [39] . The clock frequency can be varied slightly to spread the energy of the clock frequency and its harmonics over a wider bandwidth. This technique does however require extra care in designing the interface between clock domains to prevent violation of the timing constraints.
III. NOISE REDUCTION METHOD
The main idea to reduce substrate noise in this work is to use digital circuits that obtain a periodic supply current with the same period as the clock. Consequently the frequency components of the SSN will be located on and above the clock frequency. This property is achieved by eliminating the data dependency of the current draw through symmetric circuit design.
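The spectral argument can be checked with a small numerical sketch. It compares a supply current that repeats identically every clock cycle with one whose per-cycle amplitude follows the data, using a naive DFT. The pulse shape, the 8 samples per cycle, the 32 simulated cycles, and the sinusoidal amplitude modulation are all hypothetical choices made for illustration.

```python
import cmath
import math

SAMPLES_PER_CYCLE = 8
NUM_CYCLES = 32
# Hypothetical shape of the supply current spike within one clock cycle.
PULSE = [1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]


def supply_current(amplitudes):
    """Concatenate one scaled current pulse per clock cycle."""
    return [a * p for a in amplitudes for p in PULSE]


def dft_magnitude(x, k):
    """Magnitude of DFT bin k of the sequence x (unnormalized)."""
    n = len(x)
    return abs(sum(v * cmath.exp(-2j * math.pi * k * i / n)
                   for i, v in enumerate(x)))


def max_below_clock(x):
    """Largest spectral component strictly between DC and the clock frequency.

    With NUM_CYCLES clock periods in the record, bin NUM_CYCLES is the clock.
    """
    return max(dft_magnitude(x, k) for k in range(1, NUM_CYCLES))


# Data-independent case: the same charge is drawn in every cycle.
periodic = supply_current([1.0] * NUM_CYCLES)
# Data-dependent case: the amplitude follows a slow sinusoidal data pattern.
data_dependent = supply_current(
    [1.0 + 0.5 * math.sin(2 * math.pi * c / 16) for c in range(NUM_CYCLES)]
)

print("periodic current, max below clock:      ", max_below_clock(periodic))
print("data-dependent current, max below clock:", max_below_clock(data_dependent))
```

In this idealized model the periodic current has no energy below the clock frequency apart from numerical rounding, while the data-dependent current shows clear components at the data frequency and its images.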
The concept is similar to the reduction of the data dependency of the power supply current in cryptographic system design, which aims to prevent differential power analysis attacks [40]. The main difference is that our application requires the current waveforms of different clock periods to be as identical as possible, while imperfections in cryptographic hardware can be compensated with other techniques such as random masking [41] and current flattening [42]. The proposed approach also differs from the methods discussed in the previous section, which focus on reducing SSN in the time domain, while this work focuses on reducing SSN in the frequency domain.
For combinational logic we propose the use of a precharged DCVS 2-to-1 multiplexer (MUX). All internal nodes are precharged before evaluation to equalize the distribution of charge within each differential node pair, which is necessary to reduce the data dependency of the current. A novel differential DFF circuit is proposed that operates with the precharged DCVS circuit. The latches in the DFF have two symmetrical stages, where the precharged output stage is used to asynchronously reset the input stage. RZ signaling is used between logic and latches. Hence a single output node is always charged and discharged during a clock period, enabling the same current draw due to symmetrical differential loads and branches.
A. Return-to-Zero MUX
The combinational logic is implemented with 2-to-1 MUXs only, since they are good candidates for yielding periodic power supply currents and can realize any function. The MUX is implemented in precharged DCVS logic [43] according to Fig. 1. In this circuit the pull-down network is symmetrical. The inputs consist of a select signal and two data signals, each with its complement, and a clock signal. During the precharge phase both output nodes are discharged, and during the evaluation phase the proper node rises such that the outputs are complementary. Two additional transistors precharge the intermediate node pair to avoid a data-dependent power supply current. If these transistors are excluded, there will be a voltage difference between the intermediate nodes after the precharge phase, which will make the current through the circuit input dependent.
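The logical behavior of the RZ MUX, as opposed to its transistor-level implementation, can be sketched as follows. The function names, the choice that a low clock means precharge, and the convention that a high select signal picks the second data input are assumptions made for illustration. The sketch checks that exactly one of the two output nodes rises in every clock cycle regardless of the data.

```python
from itertools import product


def rz_mux(clk, sel, d0, d1):
    """Behavioral model of a differential RZ 2-to-1 MUX.

    Returns (out, out_bar). Assumption: clk == 0 is the precharge phase,
    in which both output nodes are discharged to zero.
    """
    if clk == 0:
        return 0, 0
    value = d1 if sel else d0
    return value, 1 - value


def rising_nodes_per_cycle(sel, d0, d1):
    """Count the output nodes that go from 0 to 1 between precharge and evaluation."""
    precharge = rz_mux(0, sel, d0, d1)
    evaluate = rz_mux(1, sel, d0, d1)
    return sum(1 for p, e in zip(precharge, evaluate) if p == 0 and e == 1)


# Exactly one node is charged per cycle for every input combination.
counts = {inputs: rising_nodes_per_cycle(*inputs)
          for inputs in product((0, 1), repeat=3)}
print(counts)
```

This models only the data independence of the number of switching nodes. Equal current draw additionally requires the symmetric loads and the precharged intermediate nodes described above.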
B. Return-to-Zero DFF
The differential DFF is designed using two latches connected in a master-slave configuration. The slave uses the true phase of the clock while the master uses the complement of the clock. A special property of the DFF used in this work is the forcing of the outputs to zero during the precharge phase. Within the circuit all the differential nodes always return to the initial voltages in the precharge phase. In this way it is possible to achieve periodic currents on the power supply lines.
The proposed latch, shown in Fig. 2(a) , has complementary signals. The function of the latch is explained below with the help of the timing diagram in Fig. 2(b) .
Initially the signal nodes and are high, the clock is low and nodes and are high. The input goes from low to high and remains low. This event switches on and the node is discharged, which consequently switches on . When the clock goes high the input nodes and return to zero. At the same instant is switched on, causing to discharge through . As soon as starts discharging, will slowly start to conduct which will result in charging of node after delay, and at the same time will start to rise. The transistor sizes are selected so that discharging of is always completed before is charged to the threshold voltage of the inverter . The nodes on the complementary side remain unchanged during these transitions.
When the clock goes low and are precharged, and the output nodes and return to zero. If instead the input remains low and goes high, the circuit will behave in a similar manner, but with the events occurring on the complementary side. There will be no difference in current draw compared with the complementary case since the circuit is symmetric.
Note that the input can only change without causing a conflict in the input stage when the clock is low. Ideally when clock goes from low to high, the input nodes and should return to zero immediately. In a real circuit there will however be some delay, resulting in a small period when the node is erroneously partially discharged before switches off the transistor . Careful sizing of the transistors is required to prevent glitches due to this delay.
IV. TEST CHIP
For evaluation of the noise reduction method, two versions of a pipelined 16-bit ripple carry adder (RCA) were designed using the Cadence toolset. This test circuit was chosen for generality where the behavior of the simple structure is easy to comprehend and draw conclusions from, and the incorporated full adders (FAs) and registers represent the most common building blocks of a digital circuit. Each pipelined 16-bit adder consists of two 8-bit RCAs and pipeline registers according to Fig. 3 . A simple model of the power supply used for simulation of SSN is also shown in Fig. 3 .
The proposed precharged DCVS MUX and RZ register circuits (PDCVS-RZ) were used for one version of the adders. For reference, the other adder version used conventional static CMOS logic and TSPC register circuits (SCMOS-TSPC). The implementation was done in a standard 130-nm epitaxial CMOS technology (i.e., a heavily doped substrate with a thin lightly doped layer on top). The epitaxial layer provides deterrence against latch-up problems in CMOS circuits and improves doping control. The substrate is biased with ordinary substrate contacts within the cells, and the noise is measured via the bottom plate of the package. The bottom plate is glued to the backside of the substrate with conductive glue, yielding an average of the total substrate noise. In the following, we discuss the design of the adder circuits and the circuits used for test and control. 
A. PDCVS-RZ Adder
The first adder is implemented with the proposed approach, using the precharged DCVS MUX with precharging of the intermediate nodes and the RZ DFF circuits. The adder employs MUX-based logic, where each FA consists of three MUXs connected according to Fig. 4. Since differential logic is used, the number of input and output lines of the 16-bit adder is doubled compared with the conventional adder circuit.
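One way to build a full adder from three 2-to-1 MUXs, given that complemented signals are available in differential logic, is sketched below. The connection shown is an assumption and is not necessarily the one in Fig. 4. The helper names `mux`, `full_adder`, and `ripple_add` are hypothetical.

```python
from itertools import product


def mux(sel, d0, d1):
    """Ideal 2-to-1 multiplexer: returns d1 when sel is 1, otherwise d0."""
    return d1 if sel else d0


def full_adder(a, b, cin):
    """Full adder built from three MUXs; 1 - x models the complemented wire."""
    p = mux(a, b, 1 - b)          # propagate signal, p = a XOR b
    s = mux(p, cin, 1 - cin)      # sum = p XOR cin
    cout = mux(p, a, cin)         # carry = cin if a != b, otherwise a
    return s, cout


def ripple_add(x, y, width=16):
    """Ripple-carry addition of two unsigned integers, bit by bit."""
    carry, total = 0, 0
    for i in range(width):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        total |= s << i
    return total, carry


# Exhaustive check of the full adder truth table.
for a, b, cin in product((0, 1), repeat=3):
    s, cout = full_adder(a, b, cin)
    assert s + 2 * cout == a + b + cin

print(ripple_add(1234, 4321))
```

The pipeline registers between the two 8-bit halves are omitted here, since they affect timing but not the computed sum.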
Referring to Fig. 1, the widths of the transistors in the MUX are 0.45 µm for to , and 0.15 µm for and . For the RZ latch, shown in Fig. 2(a), the transistor widths are 0.15 µm for to , 0.30 µm for and , and 0.15 µm for . All channel lengths are at the minimum 0.13 µm. The total transistor count for the proposed adder is 1190.
B. SCMOS-TSPC Adder
The RCA in the static CMOS 16-bit adder is implemented with the static CMOS logic FA shown in Fig. 5(a) and the TSPC DFF circuit shown in Fig. 5(b) [44] . The switch nets of the sum stage in the FA are optimized, requiring four transistors less than a standard FA [45] 
C. Test Circuits
To test the PDCVS-RZ adder and the CMOS circuit we implemented on-chip test pattern generators and control circuitry in order to reduce external noise sources during measurement. The on-chip generators operate at half the system clock rate to relax the design requirements. Fig. 6 shows the block level schematic of the complete on-chip circuitry. The main purpose is to measure the SSN in the substrate due to the PDCVS-RZ adder and the SCMOS-TSPC adder individually.
The circuits are tested by selecting the data generated from either a pseudo-random binary sequence (PRBS) generator or a test vector ROM containing a sine wave pattern, and supplying it to the PDCVS-RZ adder or the SCMOS-TSPC adder one at a time. When the test inputs are fed to the PDCVS-RZ adder, the SCMOS-TSPC adder receives a constant zero at its input. The unused adder is shut down completely, which is achieved by having separate power supplies for the adders. Furthermore, the PRBS, ROM, and select logic circuits are connected to their own power supply line, and the output buffers are supplied with a separate power line. There are also options to disable the output buffers and the clock of any adder. The purpose of this arrangement is to measure the noise without the influence of output buffers. Below we briefly discuss the main on-chip components used for test.
1) Test Vector ROM:
An on-chip ROM with an enable signal was designed to store test patterns for the inputs. The size of the ROM is 8 × 7 bits and the data content is generated by using the equation
We have only stored the data for . The data for is generated by right-shifting the data content two times. The original data was 16 bits wide, but careful inspection of the data and some optimization allowed us to compact it to 7 bits for storage.
2) PRBS: Two pseudo random binary sequence generators (PRBS) with enable signals are used to generate the randomlike input data patterns [46] , [47] . The first PRBS generates a pseudo random sequence of length , and the second PRBS generates a pseudo random sequence of length . The two generator polynomials are and , respectively [47] .
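The principle of such a generator can be sketched with a linear feedback shift register (LFSR). Since the lengths and generator polynomials of the on-chip generators are not reproduced here, the example uses a 7-bit register with feedback from stages 7 and 6, a widely used PRBS7 configuration. This choice, the seed, and the function names are assumptions for illustration only.

```python
def lfsr_step(state):
    """Advance a 7-bit Fibonacci LFSR by one step.

    The feedback bit is the XOR of stages 7 and 6 (bits 6 and 5 of state).
    Returns (new_state, output_bit).
    """
    new_bit = ((state >> 6) ^ (state >> 5)) & 1
    return ((state << 1) | new_bit) & 0x7F, new_bit


def prbs_bits(count, seed=0x7F):
    """Return the first `count` output bits for a nonzero seed."""
    if seed & 0x7F == 0:
        raise ValueError("seed must be nonzero")
    state, bits = seed & 0x7F, []
    for _ in range(count):
        state, bit = lfsr_step(state)
        bits.append(bit)
    return bits


def lfsr_period(seed=0x7F):
    """Count the steps until the register returns to its seed state."""
    state, steps = lfsr_step(seed)[0], 1
    while state != seed:
        state, steps = lfsr_step(state)[0], steps + 1
    return steps


print("period:", lfsr_period())
print("first bits:", prbs_bits(16))
```

A maximal-length register of n bits repeats after 2^n - 1 steps, so this 7-bit example has a period of 127.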
3) Input Data Selector: The input data selector circuit selects data generated from either the PRBS or the test vector ROM, and supplies it to the PDCVS-RZ adder or the SCMOS-TSPC adder. This circuit is implemented with one MUX and one demultiplexer.
4) Clock and Output Enable:
We control clock and output for the PDCVS-RZ and SCMOS-TSPC adders with enable signals.
5) Differential to Single-Ended Signal Converter:
The differential to single-ended signal converter converts the RZ differential data generated by the PDCVS-RZ adder to the single-ended non-RZ signal used at the chip output. For this purpose, we have used TSPC latches.
D. Floor Plan
To control the symmetry of circuits and signals, the floor planning of the test chip was performed manually. The adders are placed in the middle and parallel to each other. The remaining logic is placed around the two adders as shown in Fig. 7 .
E. Layout and Routing
The PDCVS-RZ circuit should be laid out as symmetrically as possible with respect to the differential signals in order to reduce the low frequency SSN. This requirement imposed manual layout and route of the circuit due to the lack of support for differential symmetric routing from automated design tools. To enable a fair comparison between the adders we designed all circuits using the same manual approach. The chip uses five different power supplies, where the first is used for ESD protection, the second for output buffers, the third for test circuits, the fourth for the reference adder, and the fifth for the proposed adder. Decoupling capacitors are added in the unused area of the core and connected to the five power supplies with values 90, 310, 270, 650, and 650 pF, respectively. They consist of parallel-connected unit capacitors of value 250 fF designed using the gate capacitance of a transistor and the parallel plate capacitance obtained by stacking and connecting every second layer of the six available metal layers.
In the interface to the environment, we used vendor-supplied pads with standard ESD protection. Output buffers required to drive the pads are placed close to the pad frame, i.e., distant from the adders to reduce their noise contribution.
Cells within a block are laid out in rows, with all cells in a row having the same height. The supply lines are placed horizontally at the cell bottom and top in metal 1, and can hence be shared between two rows. Within a cell, metal 1, 3, and 5 are used as horizontal routing layers, while poly, metal 2, and metal 4 are used as vertical routing layers. The transistors are placed close together and share source and drain regions wherever possible. Large transistors are laid out using several parallel-connected smaller transistors to maintain the cell height. Substrate contacts are used to bias the substrate and wells of every cell.
The layouts of the PDCVS MUX and the RZ latch are shown in Figs. 8(a) and 8(b), respectively. The respective widths of the cells are 10.7 µm and 14.4 µm, and the height is 4.0 µm for both cells.
The layouts of the static CMOS FA and TSPC DFF are shown in Fig. 9. The power consumption amounts to 147 µW and 66 µW for the PDCVS FA and SCMOS FA, respectively, at a clock frequency of 500 MHz and maximal switch activity at the input.
V. RESULTS
In this section, we present results obtained by simulation as well as by measurement of the different designs on a chip. The simulation results are taken from a prestudy to this work [13] and are based on transistor-level simulations using a sinusoidal input, a nominal power supply voltage of 1.2 V, a clock frequency of 500 MHz, and an example model of a power supply. In the measurements, a similar sinusoidal input is generated on-chip, the same nominal supply voltage of 1.2 V is used, and the clock has a frequency of 300 MHz, for which both adders work well. A microphotograph of the test chip is shown in Fig. 10. The chip has 40 pins in total and is bonded in a JLCC44 package. The die area is 1.0 mm² and the area of the core, visible in darker gray in the center, is 0.074 mm².
A. Simulation Results
The simulation setup is shown in Fig. 3 . For modeling of the power supply, the RLC circuit with , , , , , and is used. In the model of the interconnect impedance between on-chip and off-chip the resistances 1 Ohm and inductances 1 nH are used, serving as an example of a nonideal power supply. A capacitor of 2 pF and a series resistor of 10 Ohm are used to model on-chip decoupling. All the circuits in the shaded box are assumed located on-chip. The main purpose with the simulation is to determine whether the proposed method has the potential of reducing substrate noise. The simulation example does not however model the real test chip in terms of substrate, package parasitics, board level parasitics, etc.
A clock frequency of 500 MHz is used and the supply voltage is set to 1.2 V. A frequency band of 0 to 225 MHz was considered in a comparison of critical noise components. The schematic was created in 130-nm CMOS technology using Cadence and the simulations were performed using Spectre. When testing the circuits, two sinusoidal inputs were applied to the pipelined adder. The power spectral density of the on-chip ground line of the PDCVS-RZ circuit had a largest frequency component of -47 dBm/Hz at a frequency of 187 MHz, while the SCMOS-TSPC adder had a largest frequency component of -30 dBm/Hz at 125 MHz. A difference of 17 dB/Hz in the largest frequency component power was thus achieved between the two implementations, estimated at the on-chip power supply ground lines. The power consumption was estimated to increase by 200% for the PDCVS-RZ adder over the SCMOS-TSPC circuit in [13]. These results show that the proposed method has the potential of reducing the noise below the clock frequency. However, the limits of the performance improvement under process mismatch and technology scaling were not investigated, which would be an interesting and important continuation of this work.
B. Test Setup
The test setup is illustrated in Fig. 11(a) . It consists of a test PCB, a dc power supply, a Marconi 2024 signal generator for clock generation, an HP 16500C logic analyzer for validation of functionality, and an HP 8562E spectrum analyzer for measuring substrate noise. The PCB contains a socket for the packaged chip, decoupling capacitors for power supply, SMA connectors for clock and measurement nodes, termination and bias network for the clock, and pin lists for the digital control and output. Fig. 11(b) depicts the connection of the substrate to the chip package. The substrate noise spreads in the highly conductive layer of the silicon substrate, which is attached to a metal bottom plate and bonded to a separate pin on the package. The pin is routed on the PCB to an SMA contact that is connected to the spectrum analyzer via a 0.5 m long 50-Ohm cable. The shield of the cable is connected to the PCB ground plane. This connection yields a well-terminated transmission line on the receiver end.
All measurements are performed at a supply voltage of 1.2 V. The functionality is validated and the maximum operating frequencies of the PDCVS-RZ and the SCMOS-TSPC adders are 450 and 375 MHz, respectively. The noise measurements presented in the subsequent sections use a common operating frequency of 300 MHz for which both adders work well with some margin.
C. Measurement Results Using Sinusoidal Input
The two adder circuits on the test chip are measured one at a time using a spectrum analyzer connected to the bottom of the substrate. The clock frequency is 300 MHz, which appears as the major component of about -25 dBm/Hz in all measured spectra. There is also a tone at 13 MHz of about -70 dBm/Hz in the spectra that originates from the GPIB interface between the spectrum analyzer and the computer. This tone is marked with a gray color in the plots and is omitted from the discussion since it is only present during transfer of results. Below we compare the power of the remaining noise components in a frequency band of 0 to 290 MHz. Fig. 12(a) shows the PSD of the noise at the substrate node when the PDCVS-RZ adder is operating with sinusoidal inputs generated by the ROM. For this measurement, the outputs are disabled and the frequency spectrum is shown up to the clock frequency. The frequency band of interest and the power of the largest frequency components are indicated with dashed lines. It can be seen that the frequency spectrum has two large components of -71 dBm/Hz in the band of 0 to 290 MHz. The components are situated at 50 and 150 MHz, which are the second harmonic of the sinusoidal inputs and half the clock frequency, respectively. The power consumption is 2.5 mW in this configuration. Fig. 12(b) shows the corresponding spectrum of the SCMOS-TSPC adder. It can be seen that the noise has its main components at the fundamental frequency 18.75 MHz of the sinusoidal inputs and its harmonics. The largest noise components are the first and third harmonics with a PSD of about -63 dBm/Hz. The second harmonic has a PSD of -65 dBm/Hz, the fundamental tone -66 dBm/Hz, and the fourth harmonic -70 dBm/Hz. Note that this set of tones is more harmful to most analog circuits than the corresponding smoother PSD of the PDCVS-RZ adder.
There is also a small component at half the clock frequency, which could be due to nonlinearities in the clock circuitry and noise from the test pattern generator. The power consumption is 0.63 mW for this case. Fig. 12(e) and (f) show the corresponding spectra for a wider frequency band of 0 to 1 GHz. Above 300 MHz we have the first and second clock harmonics with PSDs of -62 dBm/Hz and -73 dBm/Hz, respectively, for the PDCVS-RZ adder, while for the SCMOS-TSPC adder both harmonics have PSDs of -61 dBm/Hz.
D. Measurement Results Using Pseudo-Random Input
In this section, we present and compare the measurement results of the PDCVS-RZ adder and the SCMOS-TSPC adder using pseudo-random inputs. The circuits are measured with a clock frequency of 300 MHz, which results in a major component of about -25 dBm/Hz in the spectra. As in the previous case there is a tone from the GPIB interface of about -70 dBm/Hz at 13 MHz. Again we compare the frequency components of the noise in the frequency band of 0 to 290 MHz, omitting the GPIB tone from the discussion. Fig. 12(c) shows the frequency spectrum of the noise at the substrate node when the PDCVS-RZ adder is operating with inputs from the PRBS generators. The outputs are disabled and the frequency spectrum is shown up to the clock frequency. The frequency band of interest and the power of the largest frequency components are indicated with dashed lines. For this case, there is only one component of about -73 dBm/Hz at 150 MHz, i.e., half the clock frequency. The power consumption is 2.4 mW. Fig. 12(d) shows the corresponding spectrum of the SCMOS-TSPC adder. For this case, there is only one noise component of about -68 dBm/Hz, also at half the clock frequency. The noise floor is also elevated by about 4 dB in the low frequency range up to 50 MHz compared with the previous case. The power consumption is 1.1 mW. Fig. 12(g) and (h) show the corresponding spectra for the wider frequency band of 0 to 1 GHz. The PSDs of the clock harmonics are about -68 dBm/Hz for the PDCVS-RZ adder, and -64 dBm/Hz and -58 dBm/Hz for the first and second clock harmonics of the SCMOS-TSPC adder, respectively.
E. Performance Summary
The main results are presented in Table I. Using a clock frequency of 300 MHz and a sinusoidal input, the largest frequency components of the substrate noise in the range 0 to 290 MHz are -71.2 and -62.7 dBm/Hz for the PDCVS-RZ adder and SCMOS-TSPC adder, respectively. The corresponding results using a random input are -73.0 and -67.5 dBm/Hz, respectively. Hence the performance is limited by the SCMOS-TSPC adder in the sinusoidal input case. For this case, the difference in the largest frequency component of the two adders is 8.5 dB/Hz. This improvement comes at the cost of a three times larger average power consumption and an increased area consumption of 35% for the PDCVS-RZ adder relative to the SCMOS-TSPC adder. Compared with the simulations in the prestudy, the obtained noise reduction is smaller than estimated, while the power consumption agrees with the estimate.
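The arithmetic behind the summary figures can be reproduced with a few lines. The sketch takes the reported PSD levels as negative dBm/Hz values (an assumption about the sign convention) together with the power figures quoted in the measurement sections. The dictionary layout and function names are hypothetical.

```python
# Largest noise components in the 0 to 290 MHz band, in dBm/Hz
# (taken as negative values, which is an assumption about the sign).
psd_dbm_hz = {
    "sine": {"pdcvs": -71.2, "scmos": -62.7},
    "prbs": {"pdcvs": -73.0, "scmos": -67.5},
}
# Measured power consumption in mW for the same configurations.
power_mw = {
    "sine": {"pdcvs": 2.5, "scmos": 0.63},
    "prbs": {"pdcvs": 2.4, "scmos": 1.1},
}


def reduction_db(case):
    """Noise reduction of the proposed adder relative to the reference, in dB."""
    return psd_dbm_hz[case]["scmos"] - psd_dbm_hz[case]["pdcvs"]


def db_to_power_ratio(db):
    """Convert a level difference in dB to a linear power ratio."""
    return 10 ** (db / 10)


def average_power_ratio():
    """Ratio of the power consumption averaged over both input types."""
    pdcvs = sum(case["pdcvs"] for case in power_mw.values())
    scmos = sum(case["scmos"] for case in power_mw.values())
    return pdcvs / scmos


for case in ("sine", "prbs"):
    db = reduction_db(case)
    print(f"{case}: {db:.1f} dB lower, linear factor {db_to_power_ratio(db):.1f}")
print(f"average power ratio: {average_power_ratio():.2f}")
```

Under these assumptions the 8.5 dB figure corresponds to roughly a sevenfold reduction in the power of the largest noise component, and the average power ratio of about 2.8 is consistent with the stated factor of three.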
VI. CONCLUSION
A method of reducing the switching noise in the substrate of an integrated circuit was proposed. The method is based on the idea that digital circuits drawing a periodic supply current with the same period as the clock should ideally have reduced switching noise in the frequency range below the clock frequency, compared with conventional design. This property is obtained by using symmetrical circuit and layout styles, and differential RZ signaling in the digital circuits. Precharged DCVS logic and a new differential RZ DFF were found to be workable for this purpose.
A test chip containing two versions of a 16-bit pipelined ripple-carry adder was fabricated in 130-nm CMOS bulk technology. The proposed version was realized with the new method, and a reference version was realized with conventional techniques for comparison. The conventional implementation used static CMOS logic and TSPC registers.
A logic analyzer was used to validate the functionality of the adders. The maximum operating frequencies were 450 and 375 MHz for the proposed and conventional adder, respectively. Both implementations operate at a power supply voltage of 1.2 V.
For comparison of switching noise, the adders were tested in the range of 0 to 290 MHz using a clock frequency of 300 MHz. The major on-chip source of noise is the output buffers, which were disabled during the noise measurements. The largest frequency components of the substrate noise were observed when the adder circuits were tested with a sinusoidal input. In this case, which limits the performance, the largest frequency components of the proposed adder circuit were 8.5 dB/Hz lower than those of the conventional circuit. In another, nonlimiting case, the adders were tested with pseudo-random inputs, which reduced the largest frequency component of the proposed adder by 1.8 dB/Hz compared with the sinusoidal case. For the conventional adder, the largest frequency component was correspondingly decreased by 4.8 dB/Hz, resulting in a net figure of 5.5 dB/Hz to the advantage of the proposed adder. We conclude that the new technique works as expected in spectral terms; however, the noise reduction is not as large as indicated in the prestudy. The main costs of the proposed adder circuit are three times higher power consumption and a 35% increase in area compared with the conventional adder circuit.
