We present novel asynchronous VLSI comparator schemes based on the recently proposed reconfigurable shift switch logic and traditional (precharged) CMOS domino logic. As a by-product of the computation, the schemes always produce a semaphore that indicates the end of the domino process; this requires no additional delay and only a minimal number of additional devices. For a large percentage of inputs, the computation is much faster than in traditional synchronous comparators because the inherent speed of the circuits is fully utilized. The schemes are also simple, area-compact, and stable.
INTRODUCTION
Recently, shift switch logic (defined by state signals and shift switches; see [6] [7] [8] [9] and below) has been proposed. It has been shown that this logic is an efficient alternative to traditional switching logic (i.e., binary signals and logic gates) for the design of a number of arithmetic devices, including parallel counters, multipliers, and fast adders [5] [6] [7] [8] [9] [10].
In [6] [7] [8] [9] 
PIE is set to 0 (PIE is set to 1), all GP1 switches and comp units are precharged as described above. The precharge phase is as follows: each propagation gate in GP1 is off and will be kept off if the corresponding comp unit bit evaluation signal cb(j)o (g1j, g0j, pj) = (0, 0, 1) (1 ≤ j ≤ 6).
This ensures a stable discharge during the evaluation phase, i.e., the domino discharging can
The bus (with GP1s) is then partitioned (by the propagation gates), and the worst case of comp evaluation signal propagation (or discharging) starts from gate A and/or B and passes through all 6 GP1s to produce cgO. The worst-case delay of the comparator is Tc (time to discharge a comp unit) + 7TGP (time to discharge seven cascaded pass-transistors, including A or B) + Tinv (an inverter delay, low to high). The best-case delay of the comparator is Tc + Tae + Tinv.
4. THE 32-BIT SCHEME: SHIFT SWITCH WITH DOMINO LOGIC
A 32-bit comparator can be constructed from several GP switch blocks organized in two levels (see Fig. 8). The first level consists of six blocks: blockA(0), i.e., the 7-bit comparator, and blockA(i) (for 1 ≤ i ≤ 5). The second level is a single block termed blockB.
BlockA(i) (for 1 ≤ i ≤ 5; evaluation phase is shown).
The output bits and the semaphore are produced at about the same time. This is the unique feature of our scheme. More significantly, the best-case delay of the comparator can be specified as three cascaded pass-transistor delays plus two inverter delays (or 3TGP + 2Tinv), the
