ABSTRACT This paper presents an energy-efficient level shifter design that is capable of converting an extremely low input voltage to the supply voltage level. Featuring a core area as compact as 32.99 µm², the proposed design comprises a front-end current mirror and an output cross-coupled structure. Concretely, the current mirror is used to boost the complementary input signals to a proper level, with the operations of pull-up and pull-down networks well-balanced. The cross-coupled structure is used at the second stage to achieve a full-swing output. In addition, multi-threshold CMOS transistors are employed and optimized to elevate the circuit performance. The prototype was fabricated using a commercial 65-nm CMOS process. Experiments show that a 90-mV low input voltage can be successfully converted to 1.2 V. The energy consumed per conversion is reported to be 31.47 fJ for converting a 0.2-V input to 1.2 V at 10 MHz, and the corresponding propagation delay is measured to be 23.98 ns.
I. INTRODUCTION
The multi-supply voltage technique has been widely adopted in modern low-power circuit design; it partitions a chip into various power domains with different supply voltages [1], [2]. For instance, a near/sub-threshold voltage (V_DDL) is applied in the domains with non-critical signal paths to minimize the power consumption, while a higher voltage (V_DDH) is exploited for the domains with critical paths to maximize the speed (i.e., performance). As a result, for multi-supply voltage circuits, the level shifter is an indispensable component that converts between different voltage levels. Driven by increasingly complicated power supply partitioning and large data widths, the demand for on-chip level shifters has increased significantly. It is therefore increasingly essential to minimize the level shifters' power consumption, silicon area and propagation delay.
Conventional level shifters can be categorized into cross-coupled and current-mirror structures. As shown in Fig. 1(a), the cross-coupled level shifter adopts a pair of PMOS transistors (MP1 and MP2) to form a positive feedback loop and achieve a full output swing. However, when the input voltage is below the threshold voltage of the NMOS transistors, the driving strength of the NMOS pair (MN1 and MN2) becomes much weaker than that of the PMOS pair, and the logic fails to toggle. This can be addressed by upsizing the NMOS transistors, but at the expense of large silicon area, high power consumption and delay degradation. On the other hand, the level shifter based on the current-mirror structure is presented in Fig. 1(b); it utilizes a current mirror to speed up the conversion and reduce the minimum input voltage, with a relaxed contention between the pull-up and pull-down networks. Nevertheless, it suffers from large static power consumption because of the current flowing through MP1 and MN1 when the input is high. A number of implementations have been presented to address the aforesaid contention-related issues [3]-[8]. In [3], a feedback PMOS is used to cut off the standby current once the transition is finished. However, the feedback PMOS can introduce a voltage drop at the internal output node for a logic-high output, which leads to a large leakage power in the output buffer. Meanwhile, it employs a large number of transistors and has a more complex control mechanism. More recently, a level-shifting capacitor has been used to boost the output voltage [7]. In [8], a current generator is adopted, which allows the current to flow only during transitions. The drawback is that the pull-down network is still directly driven by V_DDL, so it has the same problem as the conventional cross-coupled level shifter.
This paper presents a high-performance level shifter design with lower input voltage, lower energy consumption per conversion, and shorter propagation delay. It is a compound design with both current-mirror and cross-coupled structures. Specifically, the current mirror is used to amplify and differentiate the input voltage, with the static power reduced by a proposed negative feedback. The cross-coupled structure at the second stage guarantees a full-swing output. Since the voltage at the input of the cross-coupled stage has been boosted by the current mirror of the first stage, the contention between the pull-down and pull-up networks is well-balanced even with an extremely low input voltage (e.g., lower than the NMOS threshold), leading to improved performance in terms of power consumption and propagation delay. Moreover, multi-threshold CMOS (MTCMOS) transistors are employed to further optimize the design. Fabricated using a standard 65nm CMOS technology, the prototype chip can convert an input voltage as low as 90mV to the supply voltage of 1.2V. Converting a 0.2V input to 1.2V at a frequency of 10MHz, the minimum energy per conversion is measured to be 31.47fJ, and the propagation delay is reported to be as short as 20.4ns.
The rest of this paper is organized as follows. In Section II, the design and principle of the proposed level shifter are presented. Measurement results are elaborated in Section III. Finally, the conclusion is drawn in Section IV.
II. PROPOSED LEVEL SHIFTER DESIGN
The schematic of the proposed level shifter is illustrated in Fig. 2, with multi-threshold CMOS transistors adopted. The differential inputs IN and IN_NOT are generated by the low-V_th inverter buffer, leading to a low latency and a reduced minimum allowable input voltage. In order to balance the contention of the following cross-coupled level shifter, the front current mirror is used to raise the differential voltages to right above the threshold of the low-V_th NMOS. A feedback PMOS (MP3) in the current mirror is utilized to eliminate the standby current. MP3 is implemented by a high-V_th PMOS transistor to further reduce its leakage current. The pull-down network consists of MN1, MN2, MN3 and MN4, all of which are realized with low-V_th transistors. The active current of the pull-down network is limited by the high-V_th PMOS transistors MP4 and MP5.
As presented in Fig. 3, the operation principle of the proposed level shifter is further elaborated with the transient simulation waveform at each node. The low supply voltage V_DDL and the high supply voltage V_DDH are set to be 0.2V and 1.2V, respectively. In this simulation, the frequency of the input signal is 1MHz. While IN is high and IN_NOT is low, MN1 is turned on. The current I_1 flows through MP1, MP3 and MN1. This current is mirrored to MP2. As MN2 is off, node A will be charged until MP3 is turned off. The standby current is reduced due to the feedback of MP3. On the contrary, while IN is low and IN_NOT is high, MN1 is turned off and MN2 is turned on. Node A will be discharged, while the voltage at node B will rise to V_DDH − |V_ds,MP1| − |V_ds,MP3|, where V_ds,MP1 and V_ds,MP3 are the drain-to-source voltages of MP1 and MP3, respectively. As a result, before going to the cross-coupled stage, IN and IN_NOT have been boosted by the current mirror at the front stage. The boosted voltage should be optimized.
It can be neither too high (the current mirror would need more power to raise it) nor too low (the cross-coupled structure would require a much stronger pull-down network to ensure a correct conversion). Based on extensive simulations, IN and IN_NOT are boosted to ∼680mV. Finally, the cross-coupled structure elevates the voltage to 1.2V, as shown in Fig. 3. In this way, the input voltages of the cross-coupled structure become high enough to ensure that the drive strength of the pull-down network is close to that of the pull-up network, which is vital for a successful flip.
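As a rough check of the headroom involved, the node-B expression above can be rearranged to estimate the combined drain-to-source drop across MP1 and MP3. This is only a sketch: it assumes the ∼680mV boosted level quoted in the text is the high level reached at node B, and the function name `combined_drop` is illustrative rather than taken from the paper.

```python
# Sketch under stated assumptions: the ~680 mV boosted level is taken to be
# the node-B high level, so that
#   V_B = V_DDH - |V_ds,MP1| - |V_ds,MP3|
# can be rearranged for the combined drop across MP1 and MP3.

V_DDH = 1.2     # high supply voltage in volts (from the text)
V_BOOST = 0.68  # approximate boosted level in volts (from the text)


def combined_drop(v_ddh: float, v_node_b: float) -> float:
    """Return |V_ds,MP1| + |V_ds,MP3| implied by the node-B expression."""
    return v_ddh - v_node_b


drop = combined_drop(V_DDH, V_BOOST)
print(f"implied combined drop across MP1 and MP3: {drop * 1e3:.0f} mV")
```

The implied value is an estimate derived from two rounded figures, not a number reported by the authors.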
The design is optimized through the MTCMOS technique and subthreshold sizing. On one hand, the MTCMOS technique is applied in the proposed design to alleviate the contention issue. More specifically, the pull-up network is implemented with high-V_th transistors while the pull-down network is realized with low-V_th transistors. On the other hand, sizing is a relatively weak knob in the subthreshold region, as it affects the current only linearly. However, the V_th of the transistors can also be adjusted through the channel length because of short-channel effects, which strengthens the effect of sizing on the contention balance and the current reduction. For example, the NMOS pair (MN1 and MN2) has a large W/L ratio to help it pull nodes A/B low, while the longer channel length of the PMOS devices (MP4 and MP5) makes it easier for the NMOS devices to complete the transition.
Fig. 4 shows the nominal (TT), best (FF) and worst (SS) corner simulation results of the propagation delay and the total power dissipation of the proposed level shifter in the V_DDL domain using 65nm CMOS technology at 27 °C. The optimized transistor sizes and the threshold voltages of the transistors used are listed in Table 1 and Table 2, respectively. The input is a clock at the frequency of 1MHz. It is observed that the proposed level shifter works correctly at all the PVT corners for an input V_DDL in the range of 0.1V to 1.2V. We also re-implemented [3] using the same 65nm CMOS process with the optimized transistor parameters under the same conditions (i.e., the typical PVT corner with V_DDH = 1.2V, V_DDL = 0.2V, f_clk = 1MHz). The simulation results show that the leakage power and total power of the proposed design are 11 times and 2.5 times lower than those of [3], respectively, while the propagation delay is comparable to that of [3].
The statistical performance of the proposed level shifter is evaluated with 1000 runs of Monte Carlo simulation under a 200mV input at 1MHz and 27 °C. Fig. 5 shows the propagation delay and total power consumption histograms of the Monte Carlo simulation results. The fitted lognormal curves for the two histograms imply that both the delay and the power consumption of the proposed design are lognormally distributed. The average delay is 21.56ns, with a standard deviation of 14.20ns. To eliminate the impact of the process, the average delay is also normalized to the FO4 delay at 0.2V, which gives 2.23 FO4. The mean total power is found to be 63.01nW, with a standard deviation of 15.46nW. The normalized standard deviation values (σ/µ) of the delay and the power consumption are 0.66 and 0.25, respectively.
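The normalized standard deviations quoted above follow directly from the reported means and standard deviations. The short sketch below reproduces that arithmetic; the FO4 delay it prints is only the value implied by dividing the mean delay by 2.23 FO4, not a figure stated in the paper, and the helper name `normalized_std` is illustrative.

```python
# Reproduce the sigma/mu figures from the reported Monte Carlo statistics.
# All inputs are the numbers quoted in the text; the FO4 delay is an implied
# value (an assumption), not a reported one.


def normalized_std(mean: float, std: float) -> float:
    """Return the normalized standard deviation sigma/mu."""
    return std / mean


delay_mean_ns, delay_std_ns = 21.56, 14.20
power_mean_nw, power_std_nw = 63.01, 15.46

delay_cv = normalized_std(delay_mean_ns, delay_std_ns)
power_cv = normalized_std(power_mean_nw, power_std_nw)

# Mean delay expressed as 2.23 FO4 implies this FO4 delay at 0.2 V.
implied_fo4_ns = delay_mean_ns / 2.23

print(f"delay sigma/mu = {delay_cv:.2f}")
print(f"power sigma/mu = {power_cv:.2f}")
print(f"implied FO4 delay at 0.2 V = {implied_fo4_ns:.2f} ns")
```

The delay spread is considerably larger than the power spread, which is what one would expect from the exponential dependence of subthreshold current on threshold-voltage variation.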
III. EXPERIMENTAL RESULTS
The proposed level shifter design was fabricated using a standard 65nm CMOS technology. Fig. 6 shows both the microphotograph and the layout of the prototype chip, with the core area as compact as 32.99µm². Ten sample chips were measured at room temperature (25 °C). Fig. 7 shows the measured input/output waveform of one prototype chip, where a 0.2V input signal (f_clk = 1MHz) is successfully converted to 1.2V. To evaluate the power consumption of the proposed level shifter design, V_DDL was swept from 0.1V to 1.2V with a 1MHz input. Fig. 8 shows the measured total power consumption of the ten samples as a function of V_DDL. A relatively low power consumption is observed with V_DDL ranging from 0.2V to 0.7V, while a larger power consumption appears when V_DDL is below 0.2V. This is because the driving strength of MN1 and MN2 in the current mirror is limited by the small V_DDL, so node A or B at the input of the cross-coupled structure cannot be fully discharged. With a larger voltage for logic '0' at node A or B, a higher power is dissipated in the low-V_th transistor MN3 or MN4. Additionally, the power consumption increases when V_DDL exceeds 0.7V. This is due to the high power consumed by the low-V_th input inverter buffer (MP6 and MN5). However, the measured power when V_DDL is larger than 0.7V differs from the simulation result in Fig. 4. This may be attributed to the parasitic capacitance and resistance, which are not considered in the simulation model for the high V_DDL domain. According to Fig. 8, the average minimum power consumption was measured to be 0.12µW for converting the 1MHz input signal from 0.2V to 1.2V. Moreover, the leakage power of the proposed design was measured with varied V_DDL as well. As plotted in Fig. 9, the leakage power is lowest with V_DDL ranging from 0.2V to 0.7V, where it is measured to be ∼1.6nW. With V_DDL larger than 0.7V, the measured leakage power grows.
The leakage current is caused by a short-circuit current through MP2 and MN2 when IN = 0 and IN_NOT = 1. Since the gate voltage of MP2 cannot reach a full V_DDH, MP2 and MN2 are both partially turned on. This leakage can be reduced by increasing the length of MP2 and MN4 to enlarge the V_th through short-channel effects, resulting in a smaller short-circuit current, or by replacing MN4 with a normal- or high-V_th NMOS. However, both methods will degrade the performance in terms of the propagation delay and the minimum input voltage. Another method to reduce the leakage is power gating. A sleep transistor can be used to reduce the leakage power at the expense of more silicon area and a voltage drop.
The delay due to the output buffer is excluded from the measured propagation delay of the proposed level shifter. It is observed that the propagation delay decreases exponentially as V_DDL increases, which means the proposed level shifter is able to work at a higher frequency with an increased V_DDL. When V_DDL is 0.2V, the mean propagation delay is measured to be 23.98ns. The measurement has been repeated at a lower temperature (i.e., 0 °C). The proposed level shifter is capable of operating properly even at 0 °C. The low temperature results in a smaller device current for subthreshold operation. The delay at 0 °C is 1.51 times longer than that at 25 °C. Furthermore, for converting a 0.2V input to 1.2V, we characterize the power consumption and the energy consumption per conversion with respect to the working frequency (Fig. 11). The power consumption is observed to be proportional to the input frequency. The maximum working frequency for a V_DDL of 0.2V was measured to be around 16MHz, and the average minimum energy consumption per conversion was measured to be 31.47fJ, corresponding to the working frequency of 10MHz.
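The relation between the reported power and energy figures can be sketched as follows. This is a minimal illustration that assumes energy per conversion is defined as average power divided by input frequency (one conversion per clock period), a definition the text does not spell out. The 314.7nW result is only the power implied by 31.47fJ at 10MHz under that assumption, not a measured value, and the function names are illustrative.

```python
# Assumption: energy per conversion = average power / input frequency,
# i.e. one conversion is counted per input clock period.


def energy_per_conversion(power_w: float, freq_hz: float) -> float:
    """Return the energy per conversion in joules."""
    return power_w / freq_hz


def power_from_energy(energy_j: float, freq_hz: float) -> float:
    """Return the average power in watts implied by an energy per conversion."""
    return energy_j * freq_hz


# Measured average minimum power at 1 MHz (0.2 V to 1.2 V), from the text.
e_1mhz = energy_per_conversion(0.12e-6, 1e6)

# Power implied by the reported 31.47 fJ per conversion at 10 MHz.
p_10mhz = power_from_energy(31.47e-15, 10e6)

print(f"energy per conversion at 1 MHz: {e_1mhz * 1e15:.1f} fJ")
print(f"implied average power at 10 MHz: {p_10mhz * 1e9:.1f} nW")
```

Under this definition the energy per conversion is higher at 1MHz than at 10MHz, which is consistent with leakage and other frequency-independent power being spread over fewer conversions at low frequency.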
The figures of merit (FoMs) of the level shifter are summarized and compared with the state-of-the-art implementations in Table 3. Since chips fabricated using processes with other feature sizes may not be directly comparable to the proposed design, the two implementations using the same technology node are considered here [4], [11]. The measured propagation delay of 23.98ns, minimum allowable V_DDL of 90mV and energy consumption per conversion of 31.47fJ in this work are among the best of the physical implementations. The occupied silicon area of this work is 32.99µm². The leakage power is relatively large. However, in comparison with the design from [11], which was fabricated using the same technology (65nm), our proposed design consumes about 40% less leakage power.
IV. CONCLUSION
In this paper, a novel level shifter with high energy efficiency is designed and fabricated. Featuring a low allowable input voltage, the proposed design employs a current mirror at the first stage to amplify the input voltage. The full-swing output is achieved by the following cross-coupled structure. A feedback PMOS is inserted to reduce the standby leakage current of the current mirror. MTCMOS devices are adopted to further reduce the circuit latency and power/energy consumption. Furthermore, the proposed level shifter, with its core area as compact as 32.99µm², is fabricated using a standard 65nm MTCMOS technology. The measurement results show that the energy consumption per conversion, minimum input voltage and propagation delay are 31.47fJ, 90mV and 23.98ns, respectively. The proposed implementation is suitable for a wide range of applications with multi-supply circuitry, especially the interfaces between subthreshold and normal-voltage modules.
