Abstract-To achieve a 0.5-V low-power high-speed robust bus, a dynamic bus architecture, combined with a dynamic driver and a dynamic receiver that use stacked MOSFETs for small leakage current, is proposed. In particular, the dynamic driver enables high speed even at 0.5 V through increased gate over-drive, obtained by changing the power lines from VDD/2 in the standby mode to VDD in the active mode. Speed is further improved by another proposal, a dummy bus that tracks the bus-voltage detecting point to reduce the bus swing. The robustness of each proposal is investigated by Monte Carlo simulation. Then, a 0.5-V 25-nm FD-SOI 32-bit bus architecture using the proposals is evaluated by simulation. The power-supply bounce noise and its reduction are also investigated through the layout. As a result, the architecture turns out to have the potential to operate a 1-pF bus at a 50-mV swing, 1.2 GHz, and a standby current of 1.1 µA, which is ×3-5 faster and more than two orders of magnitude lower in standby current than the conventional static architecture.
Introduction
Low-power bus architectures for intra- and inter-chip communication have been strongly needed to cope with the ever-increasing interconnect power dissipation caused by larger interconnect capacitances [1]. Indeed, data-bus lines, address lines, control lines, and precharge lines running across a chip inevitably increase the capacitances as chips are integrated on a larger scale. In addition, the number of lines, especially for data buses in RAMs and MPUs, increases with the ever-stronger need for higher throughput, exemplified by bus widths as large as 1024 [2] for 3D integration such as Through-Silicon-Via (TSV) 3D chips. Such buses will continue to be indispensable even with power-supply (VDD) and device scaling, calling for new circuits as well as low-capacitance interconnect technology. The most effective way is to lower the voltage swing on the bus. In line with this, many attempts have been made so far, despite some drawbacks such as limited reduction of the swing, complexity, and large area. Good examples are a flip-flop receiver [3], current-mode signaling [4], and differential buses [5]. Of these, Svensson's static bus architecture [6] is simple and most promising with respect to low-voltage-swing capability, thanks to an inverter-amplifier in a receiver. Even so, its basic characteristics and design issues remain unknown, although optimization of the power dissipation was investigated. The most serious issue of these attempts, however, is the fact that they were evaluated at VDD higher than 1 V and with device feature sizes larger than 0.13 µm. Obviously, considering that the state-of-the-art VDD and devices for SoCs are 0.6 V and 14 nm even at the commercial level [7], the bus architectures must keep up with the scaling because they will eventually be embedded in such SoCs.
978-1-5090-1213-8/16/$31.00 ©2016 IEEE
Hence, the ever-increasing subthreshold leakage (off) current with reducing Vt and the degraded robustness caused by Vt-variations with device scaling [8] are emerging issues for the bus design. Well-known stacked MOSFETs for low leakage current are thus indispensable. Moreover, the use of fully-depleted SOI MOSFETs [9] is vital for smaller Vt-variations. Note that using bulk MOSFETs may be intolerable, considering the standard deviation σ(Vt) of the local Vt-variation [8], given as $\sigma(V_t) = A_{vt}/(LW)^{1/2}$ (Avt: the Pelgrom coefficient). This is because the necessary channel widths W are expected to be almost quadrupled for a given σ(Vt). As for the global Vt-variation, which influences speed, the so-called back-plane control that detects the average Vt [10] is effective. In addition to the above scaling issues, to the best of our knowledge, multi-bit bus architectures, indispensable for data buses in DRAMs and embedded SRAMs in SoCs, remain unreported.
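The width penalty quoted above follows directly from the Pelgrom relation. The sketch below is only illustrative: the numerical values and the assumption that the bulk device has a Pelgrom coefficient twice as large are hypothetical and are not taken from this paper. It merely shows that, at a fixed channel length and a fixed target σ(Vt), doubling Avt requires a fourfold channel width.

```python
import math


def sigma_vt(a_vt: float, length: float, width: float) -> float:
    """Pelgrom relation: sigma(Vt) = Avt / sqrt(L * W)."""
    return a_vt / math.sqrt(length * width)


def required_width(a_vt: float, sigma_target: float, length: float) -> float:
    """Channel width needed to reach sigma_target at a given channel length."""
    return (a_vt / sigma_target) ** 2 / length


# Hypothetical values for illustration only (mV*um, um, mV).
A_VT_REF = 1.0        # assumed reference Pelgrom coefficient
LENGTH = 0.025        # assumed channel length
SIGMA_TARGET = 20.0   # assumed target sigma(Vt)

w_ref = required_width(A_VT_REF, SIGMA_TARGET, LENGTH)
# Assumption: a device whose Pelgrom coefficient is twice as large.
w_double = required_width(2.0 * A_VT_REF, SIGMA_TARGET, LENGTH)
width_ratio = w_double / w_ref

print(f"width ratio for a doubled Avt: {width_ratio:.1f}")
```

Because W scales with the square of Avt, any ratio of Pelgrom coefficients k translates into a k² width penalty under these assumptions.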
In this paper, first, design issues of Svensson's conventional static bus (S-BUS) with the inverter-amplifier are clarified. Then, a dynamic bus architecture (D-BUS) to cope with the issues is proposed. It consists of a dynamic driver, a dynamic receiver, and a dummy bus that reduces the bus swing for high speed and low power. Some of them are then evaluated in terms of speed and robustness by Monte Carlo simulation. The noise issue is also briefly discussed. Finally, a 32-bit bus architecture using the proposals and 25-nm FD-SOI MOSFETs is evaluated by simulation and the layout.
Fig. 1 shows the conventional S-BUS [6]. The bus consists of a gate-source offset-driven CMOS inverter [3] for a driver (DRV), and amplifiers with symmetric CMOS inverters (i.e., with Wp/Wn = 2.5) for a receiver (REC), each trip point of which is set to VDD/2 for maximizing the amplification. DRV operates at power supplies of (VDD/2 + Vs) and (VDD/2 - Vs), so the bus is biased at VDD/2 and swings by signal components ±Vs that correspond to the input IN data "1" (H) or "0" (L). The first inverter IV1 amplifies and discriminates ±Vs, referring to VDD/2. For example, at Vs = 50 mV, a signal component of +50 mV superposed on VDD/2 (point m) is amplified, and a low enough voltage (point l) is thus output at OUT1. In contrast, a signal component of -50 mV is amplified, and a high enough voltage of about 0.5 V (point h) is available at OUT1. There are two major issues, namely, the extremely slow speed of DRV due to the power-supply setting of VDD/2 ± Vs, and the large DC current of REC due to the static operation. For example, the gate-over-drive (GODd, namely, VGS - Vt) of the DRV MOSFETs when driving BUS is as low as 0.046 V at VDD = 0.5 V, Vs = 50 mV, and Vt = 0.254 V (i.e., low Vt (LVT)), because it is given as VDD/2 + Vs - Vt. Such a small GODd also makes the speed quite sensitive to Vt-variations. The large DC current comes from operation at small gate voltages of about VDD/2.
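The gate-over-drive figure quoted above can be checked with a few lines of arithmetic. The following minimal sketch uses the expression VDD/2 + Vs - Vt and the values given in the text; the ±20-mV Vt shift used to illustrate the sensitivity is a hypothetical value, not a figure from this paper.

```python
def gate_overdrive_static(vdd: float, vs: float, vt: float) -> float:
    """Gate-over-drive of the S-BUS driver MOSFET: VDD/2 + Vs - Vt (volts)."""
    return vdd / 2 + vs - vt


VDD = 0.5       # supply voltage from the text (V)
VS = 0.050      # signal component from the text (V)
VT_LVT = 0.254  # low-Vt value from the text (V)

god_nominal = gate_overdrive_static(VDD, VS, VT_LVT)
print(f"nominal gate-over-drive: {god_nominal * 1e3:.0f} mV")

# Hypothetical Vt shifts, chosen only to illustrate the sensitivity.
for delta_vt in (-0.020, 0.020):
    god = gate_overdrive_static(VDD, VS, VT_LVT + delta_vt)
    change = (god - god_nominal) / god_nominal * 100
    print(f"Vt shift {delta_vt * 1e3:+.0f} mV -> {god * 1e3:.0f} mV ({change:+.0f} %)")
```

With such a small nominal over-drive, even a modest Vt shift changes it by a large fraction, which is consistent with the sensitivity to Vt-variations noted above.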
Fortunately, the succeeding inverters, operating at higher VGS with a more amplified input voltage, consume much smaller current. DRV itself also consumes a small enough current, because the Vt is effectively a regular Vt (RVT = 0.454 V) that ensures a low enough off-leakage current, since the source of the LVT MOSFET is at 0.2 V.
Although D-DRV has larger Von(DRV) due to the additional small Md and Ms, the drawback is offset by the larger GOD. Indeed, as seen in Fig. 4(b), the signal developing time tr, the precharge time tf, and thus tcmin, defined at Vs = 50 mV and Vn = 5 mV (noise) for "0" input, as in Figs. 3(c) and (d), are shorter and vary less than those of S-DRV. Fig. 5 shows the signal characteristics. Obviously, tr becomes longer with increasing Vs, as mentioned before, calling for a reduction of the necessary Vs for a fast cycle. The reduction strongly influences the cycle time, since BUS is slowly driven due to a large CB. As for tf, it becomes longer with decreasing Vn for a given Vs of 50 mV, due to the need for a longer precharging time. Here, Vn works as noise for the next cycle, arising from the incomplete precharging voltage. Thus, faster BUS precharging with the help of the accelerator ACC is necessary for a faster cycle.
Design Issues of Conventional Static Bus
Proposed Dynamic Bus Architecture
Here, let us investigate the robustness of a BUS with W(Mnd) = 10×80 nm, when an 841-ps-cycle pulse (P) with an active pulse width tw of 353 ps and an off-pulse width toff of 488 ps is provided by the dummy bus (DM-BSA), as discussed later. Since tw and toff are almost equal to the BUS-driving time tr and the precharging time tf, respectively, the BUS generates a Vs of 37 mV and a Vn of less than 5 mV, as seen in Fig. 5. Hence, if the total noise including other noise components is less than 32 mV (= 37 - 5), the BUS stability is ensured.
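The timing and noise budget above reduce to simple arithmetic, sketched below with the values from the text. The variable names are illustrative only, and the noise figure Vn is taken at its quoted upper bound of 5 mV.

```python
# Pulse timing from the text (picoseconds).
T_W_PS = 353     # active pulse width tw
T_OFF_PS = 488   # off-pulse width toff

cycle_ps = T_W_PS + T_OFF_PS   # full cycle of pulse P
freq_ghz = 1e3 / cycle_ps      # 1 / (cycle in ns)

# Signal and noise levels from the text (millivolts).
VS_MV = 37   # signal component generated on BUS
VN_MV = 5    # upper bound of the residual precharge noise

noise_margin_mv = VS_MV - VN_MV  # budget left for all other noise sources

print(f"cycle = {cycle_ps} ps, frequency = {freq_ghz:.2f} GHz")
print(f"remaining noise budget = {noise_margin_mv} mV")
```

The resulting frequency of about 1.19 GHz is consistent with the 1.2-GHz operation quoted in the abstract.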
B. Dynamic Receiver (D-REC)
It uses the same circuit and LVT MOSFETs as S-BUS, except for the dynamic operation. Major concerns here are the large DC current and the offset voltage Voff(REC) of IV1, neither of which is imposed on the succeeding stages, thanks to their amplified gate voltages. The current is reduced by a dynamically-operated stacked MOSFET (e.g., Mnst in Fig. 6(a)). In this circuit, while the pulse P is at 0.5 V in the first half of the cycle, Mnst turns on for active operation with a quite large current due to operating at about VDD/2. In the second half of the cycle, with P at 0 V (precharge period), the current is reduced to a negligible level (e.g., from 8 µA to 31 nA, see Fig. 8(c)) that is equal to the small subthreshold current of Mnst. Hence, the current is almost halved in the active mode at a 50% duty cycle, while it is negligible in the standby mode. More details will be discussed later. As for Voff(REC), Fig. 7 depicts the IV1 output and its variations vs. W(Mn), obtained by Monte Carlo simulation. Voff(REC) is the deviation from VDD/2, as mentioned previously. For example, for W(Mn) = W(Mnst) = 10×80 nm, Vout1 ranges from 425 mV to 93 mV (see Fig. 7(a)). These correspond to 225 mV and 270 mV if converted to the input (i.e., BUS) (see Fig. 7(b)), implying that the Voff(REC) values are -25 mV and +20 mV, respectively. Since Voff(REC) is the
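The offset and current figures in this subsection can be reproduced as follows. This is a minimal sketch under stated assumptions: the offset is taken as the input-referred voltage minus VDD/2, as defined in the text, and the cycle-averaged current assumes an idealized rectangular current waveform at a 50% duty cycle, which is a simplification introduced here rather than a result from the simulation.

```python
VDD_MV = 500.0  # supply voltage from the text (mV)


def input_offset_mv(v_in_mv: float, vdd_mv: float = VDD_MV) -> float:
    """Offset of the receiver input, defined as the deviation from VDD/2."""
    return v_in_mv - vdd_mv / 2


# Input-referred extremes quoted from Fig. 7(b) (mV).
offsets_mv = [input_offset_mv(v) for v in (225.0, 270.0)]
print("input-referred offsets (mV):", offsets_mv)

# Currents of the first inverter from the text (microamperes).
I_ACTIVE_UA = 8.0        # while pulse P is high
I_PRECHARGE_UA = 0.031   # 31 nA while pulse P is low
DUTY = 0.5               # 50% duty cycle

# Assumption: rectangular current waveform over one cycle.
i_avg_ua = DUTY * I_ACTIVE_UA + (1 - DUTY) * I_PRECHARGE_UA
print(f"cycle-averaged current = {i_avg_ua:.3f} uA")
```

Under this idealization the average is very close to half the active current, in line with the statement that the current is almost halved in the active mode.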
C. Dummy BUS (DM-BSA)
The dummy bus eventually governs the performance of the whole architecture explained later, since the above-described bus operates under the control of the dummy bus. [...] the inverters in D-BUS. Note that PN and PF correspond to P in the previous figures (e.g., Fig. 2). Separate RECDs at the near end and the far end make the design of the architecture simple and robust, making it possible to cancel the bus-delay effect. Hence, the one at the near end can generate pulse PN to control DRVs at the near end (see Fig. 11
Application to 32-bit Buses
A. Architecture and Performances
[...] an input pattern of "0" "1" "0" "1" starts to be discriminated at point S at each cycle. In addition, since the output (OUT0) becomes invalid at point T when it starts to be [...]
B. Noise Reduction
However, it may quickly recover if HVG is able to detect the fluctuation, followed by feedback for stabilization [13].
Ideally, the (VDD + VSS)/2 level of each PSC, ACC, and REC must always be equal, statically and dynamically, to the VDD/2 [...]
C. Summary
Conclusion
To achieve a 0.5-V low-power high-speed robust bus, a dynamic bus architecture, combined with a dynamic driver and a dynamic receiver that use stacked MOSFETs for small leakage current, was proposed. In particular, the dynamic driver [...] The power-supply bounce noise and its reduction were also investigated through the layout. As a result, the architecture turned out to have the potential to operate a 1-pF bus at a 50-mV swing, 1.2 GHz, and a standby current of 1.1 µA, which is ×3-5 faster and more than two orders of magnitude lower in standby current than the conventional static architecture.
