This paper addresses the problem of deriving a register-transfer level (RTL) model from a transistor-level circuit. Using existing techniques, the transistor-level circuit is converted into a relation that describes the evolution of the signals in the circuit with respect to the simulator clock. This simulation relation is then manipulated to derive the stable behavior of the circuit. Given this stable behavior and information about the clocking scheme, we determine if the circuit is combinational, asynchronous or synchronous. For combinational and synchronous circuits we derive an equivalent register-transfer level model. This development enables full-custom circuit designers to use tools that were until now available only to designers working at the gate level. The algorithm has been successfully used to characterize several custom designs, as well as the entire AT&T standard-cell library.
Introduction
Circuits that can be described at the register-transfer level constitute a large percentage of digital designs, and analysis and optimization tools are widely available for such circuits. In contrast, fewer tools support higher level reasoning about transistor level integrated circuit designs. This paper describes a technique that, given a netlist of transistors, derives an equivalent register-transfer level description; the technique first identifies if a given circuit is combinational or synchronous before attempting to derive an equivalent model.
This research was initially motivated in the context of developing formal verification techniques for custom VLSI designs, where there was a need to reason about transistor-level circuit descriptions. Fortuitously, the ability to perform such circuit extraction additionally enabled full-custom designers to use commercial EDA tools developed for gate-level ASIC design (like fast logic-simulation, fault-simulation, test pattern generation, timing analysis). Furthermore, logic simulation is at least an order of magnitude faster than switch level simulation. Having an automatically extracted gate-level equivalent circuit from a transistor representation allows simulation based validation to be sped up considerably, reducing design time.
In Section 2 we review some previous work on developing logic models for transistor circuits and contrast them with the current approach. In Section 3, we then describe our algorithm for extracting logic models from transistor circuits. The algorithm has been used to derive the functionality of gates in a standard-cell library and fragments of full-custom designs used in production. These experimental results are presented in Section 4. Finally, in Section 5 we present conclusions and comments.
Background
Pattern matching has been proposed [4] as an approach for extracting equivalent logic circuits from transistor netlists. This technique is effective when circuits are designed in a restricted style. However, since it is difficult to capture all the different circuit variations and optimization tricks employed by designers, pattern-matching techniques are not very effective for full-custom design styles. Our extraction approach is based on the circuit functionality and not on the structure of the netlist, and consequently does not suffer from this limitation. Our approach can complement pattern matching: modules that are not recognized by template matching may be input to the abstraction tool we have developed.
There has been prior work [1, 3, 2] in transforming transistor netlists into equivalent logic netlists for improving the speed of switch-level simulation. Switch-level simulators differ in the effects that are modeled (transistor strengths, directionality, charge storage, etc.), the delay assumptions for the transistors, as well as the accuracy to which the node voltages are represented (binary, ternary, four-valued, etc.). However, they all use models that describe the evolution of the circuit from one time instant to the next. We call such models simulation models. Each simulation cycle is represented by one occurrence of the simulation clock, which is used by the simulator to schedule different parts of the circuit for evaluation. A computation of unit-delay type produces the logic value for a node at the next simulation cycle, whereas a computation of zero-delay type determines the node value to be used in the current simulation cycle. During each simulation cycle the relevant equations are evaluated to determine the node voltages. Unit-delay elements are associated with feedback arcs and charge-storage nodes in the circuit.
Synchronous digital circuits are described using a register-transfer level model (RTL model) and our goal is to derive such a model, if possible, from a transistor netlist. An RTL model describes the circuit behavior as an interconnection of logic gates and memory devices (latches and flip-flops) whose behavior is pre-defined. The state of the circuit is the set of values stored in the memory devices. The state changes in response to input values and the previous state when the clock signals change. When using an RTL model the user is concerned mainly with the evolution of the state of the system and a clock-cycle-by-clock-cycle description of the outputs.
Even though both the simulation model and the RTL model represent the circuit behavior as finite-state machines (FSMs), the FSMs differ in their interpretation. The configuration in the simulation model is a microscopic representation of circuit state, while the state in an RTL model is a macroscopic representation. It is important to understand the distinction between the simulation model and the RTL model in order to appreciate how our logic extraction differs from that of [3, 2]. This difference is illustrated in Figure 1. In the simulation model, a new configuration is computed by evaluating the logic at every simulation tick, denoted by a time interval $\delta$, until the circuit response is unchanged. Due to the fine granularity of time, the simulation model can represent the transient values at the nodes in the circuit in addition to their final values. Also, since events on all signals (clock and data inputs) are processed in the same manner, the simulation model can represent both synchronous and asynchronous circuits. On the other hand, in an RTL model the clocks have a special status since the equations describing the circuit state are evaluated only when the clock inputs change. This evaluation computes the new state of the circuit, which is stored in the registers for use during the next clock cycle. Since the circuit behavior is described in terms of state changes, there is no modeling of the transient voltages in the circuit. RTL models represent synchronous circuits and the interpretation of the RTL model depends on the clocking scheme. Thus to derive an RTL model from a simulation model, the clocking scheme must be specified so that the appropriate analysis can be performed.
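The difference between the two views can be illustrated with a minimal sketch. The circuit below is a hypothetical chain of two inverters with a unit delay on each node; it is not taken from the paper, and the function names `step` and `settle` are ours. The simulation model records every tick, while the RTL view keeps only the configuration reached once the response is unchanged.

```python
def step(config, a):
    """One simulation tick: each node reads the values from the previous tick."""
    n1, n2 = config
    # Hypothetical circuit: n1 = NOT a, n2 = NOT n1, unit delay on both nodes.
    return (1 - a, 1 - n1)


def settle(config, a, max_ticks=16):
    """Apply `step` until the configuration stops changing; return the trace."""
    trace = [config]
    for _ in range(max_ticks):
        nxt = step(trace[-1], a)
        if nxt == trace[-1]:
            return trace
        trace.append(nxt)
    raise RuntimeError("no stable configuration reached (possible oscillation)")


# The circuit is stable at (n1, n2) = (1, 0) while a = 0; then a changes to 1.
trace = settle((1, 0), a=1)   # microscopic view: [(1, 0), (0, 0), (0, 1)]
final_config = trace[-1]      # macroscopic view: only (0, 1) matters
```

The intermediate configuration (0, 0) is a transient that only the simulation model exposes.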
A previous attempt to address the extraction is described in [5]. The simulation relation was manipulated to abstract the transient behavior and the resulting description was mapped to edge-triggered flip-flops. This work could not recognize level-sensitive latches; further, it was the user's responsibility to determine if the extracted functionality was a complete description of the circuit. In our current method we first propose conditions to check for synchronous and combinational behavior, and only when the simulation model fits these conditions do we derive equivalent RTL models based on level-sensitive latches.
Terminology and notation
The characteristic function of a set $A$ is a Boolean function $A(x)$ such that $A(x) = 1$ if and only if $x \in A$. Sets will be represented by their characteristic functions. Since a relation is a set of n-tuples, it is represented by its characteristic function as well.
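As a small illustration of this convention, the sketch below represents a set of bit vectors and a binary relation by Python predicates that play the role of characteristic functions. The helper names `characteristic` and `enumerate_set` are ours, and explicit enumeration is used only for clarity.

```python
from itertools import product


def characteristic(subset):
    """Return the characteristic function of `subset` as a 0/1-valued predicate."""
    members = frozenset(subset)
    return lambda x: 1 if x in members else 0


# A set of 2-bit vectors and its characteristic function.
A = {(0, 1), (1, 0)}
chi_A = characteristic(A)

# A relation is a set of tuples, so it is represented the same way.
# Here R relates each 1-bit value to its complement.
R = {((0,), (1,)), ((1,), (0,))}
chi_R = characteristic(R)


def enumerate_set(chi, width):
    """Recover a set of `width`-bit vectors from its characteristic function."""
    return {v for v in product((0, 1), repeat=width) if chi(v)}
```

Here `chi_A((0, 1))` evaluates to 1 and `chi_A((0, 0))` to 0, and `enumerate_set(chi_A, 2)` gives back the original set.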
Extracting RTL models
The overall approach we adopt is shown in Figure 2; the operations and tests it involves are detailed in the subsections that follow. The transistor netlist is first processed to obtain a unit-step relation that describes the evolution of the circuit configuration (Section 3.1). We stop the analysis if this relation represents an oscillatory circuit or a circuit whose output is not binary valued for some binary input condition (Section 3.2). For stable circuits, we then derive a stable unit-step relation that abstracts away the transient behavior (Section 3.3). We examine the stable unit-step relation to determine if the circuit is combinational or sequential (Sections 3.4 and 3.5). Combinational circuits are converted into an acyclic interconnection of logic gates. Sequential circuits may be either synchronous or asynchronous. If a designer wants to distinguish between these, he/she can define a clocking scheme for the circuit to constrain the circuit behavior. If the circuit is synchronous under the clocking scheme, an equivalent RTL model is derived.
Deriving a unit-step relation
We use Tranalyze [2] as a pre-processor to derive the simulation model. Tranalyze uses a 4-valued representation for the signals and produces an interconnection of zero-delay logic functions and unit-delay elements. Unit-delay elements represent feedback arcs or charge-storage nodes in the circuit. Associated with each unit-delay element $n_i$ are 4-valued variables $x_i$ and $y_i$: $x_i$ represents the value at the output of $n_i$ and $y_i$ represents the value at the input of $n_i$. Variables $x_i$ and $y_i$ are represented using pairs of Boolean variables, $(x_i^a, x_i^b)$ and $(y_i^a, y_i^b)$ respectively. The present-configuration, denoted [...]
Stability Test
After building the unit-step relation we determine if the circuit is stable or oscillatory. We assume that if there exists a stable binary configuration corresponding to an input condition, then the circuit will settle in that configuration when the corresponding input is applied. This gives us an easy way to check if the circuit is stable: all we need to check is that there is a stable binary configuration corresponding to every input.
Circuits that have oscillations will not have a stable configuration corresponding to some input and will fail this test. The algorithm we describe here will not derive equivalent RTL descriptions for oscillatory circuits.
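The stability test can be sketched on an explicit (non-BDD) representation. The sketch assumes the unit-step relation is given as a set of triples (input, present configuration, next configuration), that each node value is a pair of bits, and that the binary values are the pairs whose two bits differ. The labels `H` and `L` and all function names are hypothetical, and only inputs that appear in the relation are considered.

```python
H, L = (1, 0), (0, 1)   # hypothetical two-bit codes for the two binary values


def is_binary(config):
    """A configuration is binary if every node code is (0, 1) or (1, 0)."""
    return all(a != b for (a, b) in config)


def stable_binary_configs(relation):
    """Map each input to the set of binary configurations that are fixed points."""
    result = {}
    for (i, x, y) in relation:
        result.setdefault(i, set())
        if x == y and is_binary(x):
            result[i].add(x)
    return result


def passes_stability_test(relation):
    """True if every input has at least one stable binary configuration."""
    return all(configs for configs in stable_binary_configs(relation).values())


# Node n = NAND(en, n): stable when en = 0, oscillates when en = 1.
oscillator = {(0, (H,), (H,)), (0, (L,), (H,)),
              (1, (H,), (L,)), (1, (L,), (H,))}
# Node n = a (a buffer): stable for both input values.
buffer_rel = {(0, (L,), (L,)), (0, (H,), (L,)),
              (1, (L,), (H,)), (1, (H,), (H,))}
```

With these relations, `passes_stability_test(oscillator)` is false because input 1 has no fixed point, while `passes_stability_test(buffer_rel)` is true.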
Computing a stable unit-step relation
The next step in the extraction is to abstract the transient behavior. Consider the fragment of a simulation relation shown in Figure 3(a). Configurations A and D are stable binary configurations, i.e., there is some input for which the circuit will remain in these configurations until the input changes. Thus as long as the input is $i_1$ and the circuit configuration is A, the circuit configuration remains A. Now assume that the circuit input changes to $i_2$. The circuit goes through intermediate configurations B and C before stabilizing at configuration D. Since we are not concerned with modeling the transient behavior in the RTL model (we assumed that the circuit has time to settle after every input change), we replace the configuration space with the reduced configuration space shown in Figure 3(c), where there is a direct transition from one stable configuration to the other on the given input. This mimics the analysis that a human performs when trying to understand the circuit operation: they simulate it until they get a stable circuit configuration in response to an input change.
The stable unit-step relation, denoted $R_s$, represents the transition structure obtained by eliminating the transient configurations from the unit-step relation $R_\delta$. To compute $R_s$, which represents the steady-state response of the circuit to an input, we use the algorithm described in Figure 4. The basic idea is to replace consecutive transitions by a single transition and repeat the process until all transitions occur between stable binary configurations. A similar technique was proposed in [5].
A stable transition originates at a stable binary configuration [...]
The classification of the circuits is based on characteristics of the stable unit-step relation as well as the clocking scheme. Whether we perform the synchronous test or the combinational test depends on whether a clocking scheme has been specified for the circuit. These tests are described next.
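The idea of replacing consecutive transitions by a single transition can be sketched on explicit sets. This is not the algorithm of Figure 4, which is not reproduced here; it is a simplified stand-in that adds shortcut transitions under a fixed input until nothing changes and then keeps only transitions between stable configurations. The triple encoding and the function names are assumptions, and for brevity every configuration in the example is treated as binary.

```python
def compose(r1, r2):
    """Under a fixed input i, chain x -> y in r1 with y -> z in r2."""
    by_source = {}
    for (i, y, z) in r2:
        by_source.setdefault((i, y), set()).add(z)
    return {(i, x, z) for (i, x, y) in r1 for z in by_source.get((i, y), ())}


def stable_unit_step(relation, is_binary=lambda config: True):
    """Keep only transitions from a stable configuration to the one it settles in."""
    stable = {(i, x) for (i, x, y) in relation if x == y and is_binary(x)}
    stable_configs = {x for (_, x) in stable}
    current = set(relation)
    while True:
        # Add a shortcut for every pair of consecutive transitions.
        nxt = current | compose(current, current)
        if nxt == current:
            break
        current = nxt
    return {(i, x, z) for (i, x, z) in current
            if x in stable_configs and (i, z) in stable}


# The fragment described for Figure 3(a): A is stable under i1; under i2 the
# circuit moves A -> B -> C -> D and D is stable.
r_delta = {("i1", "A", "A"),
           ("i2", "A", "B"), ("i2", "B", "C"), ("i2", "C", "D"),
           ("i2", "D", "D")}
r_s = stable_unit_step(r_delta)
```

For this fragment the result contains the direct transition from A to D on input i2, and the transient configurations B and C no longer appear.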
Identifying Combinational Circuits
A relation $R_s$ represents a combinational circuit if there is a unique, binary output value for every binary input applied to the primary inputs. If there exists an input configuration that can produce two or more distinct output configurations, then the output is not determined uniquely by the circuit inputs and must also depend on some signal that is not a circuit input. This indicates sequential behavior. For each output $o_j$, let $I_j(a, b)$ denote the set of input conditions for which the pair of Boolean variables encoding $o_j$ takes the value $(a, b)$. This is computed from the stable unit-step relation:
$$I_j(a, b) = S_{XY}\bigl(R_s(I, X, Y) \cdot (o_j^a \equiv a) \cdot (o_j^b \equiv b)\bigr)$$
We first check that the sets $I_j(0,0)$ and $I_j(1,1)$ are empty, i.e., that there is no input condition for which the output $o_j$ is non-binary. We then apply two checks for output $o_j$: the completeness check and the uniqueness check. We say that an output $o_j$ passes the completeness test [...]
The classification procedure makes no assumption regarding the circuit topology as it operates only on the stable unit-step relation. It is the pre-processor Tranalyze that handles cyclic combinational circuits by introducing unit-delay elements to break the feedback edges. It is an interesting problem to compare our approach with the condition stated in [6] for determining if a cyclic circuit is combinational.
When a relation $R_s$ fails the combinational test, the analysis may be repeated after specifying the clock signals, if any. The next section discusses how we handle sequential circuits.
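The combinational test can be sketched on an explicit stable unit-step relation. The precise definitions of the completeness and uniqueness checks are not available in this text, so the sketch assumes a natural reading: completeness means every input yields some binary output value, and uniqueness means no input yields both binary values. The triple encoding, the codes `H` and `L`, and the function names are likewise assumptions.

```python
H, L = (1, 0), (0, 1)   # hypothetical two-bit codes for the two binary values


def input_sets(stable_relation, output_of):
    """Build I_j(a, b): the inputs for which the output code can equal (a, b)."""
    sets = {(a, b): set() for a in (0, 1) for b in (0, 1)}
    for (i, x, y) in stable_relation:
        sets[output_of(y)].add(i)
    return sets


def is_combinational_output(stable_relation, output_of, all_inputs):
    I = input_sets(stable_relation, output_of)
    if I[(0, 0)] or I[(1, 1)]:
        return False                       # some input gives a non-binary output
    complete = (I[H] | I[L]) == set(all_inputs)   # assumed completeness check
    unique = not (I[H] & I[L])                    # assumed uniqueness check
    return complete and unique


def first_node(config):
    """The output is taken to be the first node of the configuration."""
    return config[0]


# Inverter: the output is H for input 0 and L for input 1.
inverter = {(0, (H,), (H,)), (0, (L,), (H,)),
            (1, (H,), (L,)), (1, (L,), (L,))}

# Latch with input (clk, d): follows d when clk = 1, holds its value when clk = 0.
latch = set()
for d in (0, 1):
    for q in (H, L):
        latch.add(((1, d), (q,), (H if d else L,)))
        latch.add(((0, d), (q,), (q,)))
latch_inputs = {(c, d) for c in (0, 1) for d in (0, 1)}
```

The inverter passes the test, whereas the latch fails the uniqueness check because the inputs with clk = 0 can produce either output value, which indicates sequential behavior.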
Identifying Synchronous Circuits
There are two categories of sequential circuits -synchronous and asynchronous. Synchronous operation is defined in terms of a set of distinguished "clock" signals. Hence it is necessary for the designer to identify the clock signals and specify their relationship to one another. When all circuit outputs are well-defined functions of signals that are themselves well-defined, we get a representation of the output that is equivalent to an RTL model. By using a constructive algorithm, we are assured that circuits for which an RTL model is derived do embody the extracted behavior.
The stable unit-step relation, $R_s(I, X, Y)$, for the circuit [...]
By examining the relations $A_i^{\phi_1}$ and $A_i^{\phi_2}$ we are able to determine if node $n_i$ is well-defined. The test for a node being well-defined consists of first checking whether the node represents a combinational function, failing which we check if the node represents level-sensitive behavior.
Combinational test: For a signal $y_i$ to be represented as a combinational function, its value must be independent of the clock configuration. Since the domains over which $A_i^{\phi_1}$ and $A_i^{\phi_2}$ are defined ($SBC_{\phi_1}$ and $SBC_{\phi_2}$ respectively) may differ, we check for equivalence only on the common part of the domains. Node $n_i$ is combinational if the following equality holds.
$$A_i^{\phi_1} \cap SBC_{\phi_1} \cap SBC_{\phi_2} = A_i^{\phi_2} \cap SBC_{\phi_1} \cap SBC_{\phi_2}$$
If this test is satisfied, we derive the logic function $f_i$ for this node. A relation that captures the node value at all configurations of interest is minimized with respect to the total set of stable binary configurations.
Level-sensitive test: When a node $n_i$ fails the combinational test we check if it is level-sensitive. In a level-sensitive latch, the output value is allowed to change during one clock phase and is held constant during the rest of the clock cycle. For node $n_i$ to be classified as level-sensitive, exactly one of $A_i^{\phi_1}$ and $A_i^{\phi_2}$ must exhibit this behavior. Suppose [...]
This implies that during phase1, there is no condition for which $y_i \neq x_i$. Alternately, it means that the value of $n_i$ does not change during phase1 and so it is inactive during phase1. This allows us to classify node $n_i$ as level-sensitive, active during phase2 (the other phase). The function $f_i$ that computes $y_i$ during phase2 is derived by minimizing $A_i^{\phi_2}$ with respect to the care set of stable binary configurations, $f_i = \min(A_i^{\phi_2}, SBC)$. A similar test, applied to $A_i^{\phi_2}$, can be used to see if node $n_i$ is level-sensitive, active during phase1. Note that if the node $n_i$ is inactive during both phases then there is no equivalent memory device that can model it and so the node is undefined.
Once all the components have been classified, we perform a structural check on the netlist to ensure that no output depends on a component that has been classified as undefined. When all outputs are independent of the undefined nodes, we substitute for each component the appropriate functional block (complex logic followed optionally by a level-sensitive latch of the appropriate clocking) to generate the RTL model.
Implementation and Results
The algorithm described in Section 3 has been implemented in a program called Smelt. Smelt invokes Tranalyze to perform the pre-processing of the transistor netlist to derive the unit-step relation. All the sets and relations are represented as BDDs and the computations in the algorithm are performed using the procedural interface to a BDD library. The result of the extraction is a circuit composed of logic gates and latches. The run-times reported in this section are on a SPARC20 with 512MB of RAM.
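The node tests used to identify synchronous circuits reduce to a few set operations, which the sketch below mimics with Python sets in place of BDDs. It is not Smelt's implementation. The encoding is an assumption: each per-phase relation is a set of triples (configuration, present node value, next node value), the two SBC arguments are sets of configurations, and the returned labels are ours.

```python
def restrict(a, domain):
    """Keep only the entries whose configuration lies in `domain`."""
    return {t for t in a if t[0] in domain}


def holds(a):
    """True if the node value never changes (y equals x in every entry)."""
    return all(x == y for (_, x, y) in a)


def classify_node(a1, a2, sbc1, sbc2):
    """Classify a node from its phase-1 and phase-2 relations."""
    common = sbc1 & sbc2
    if restrict(a1, common) == restrict(a2, common):
        return "combinational"
    if holds(a1) and not holds(a2):
        return "level-sensitive, active during phase2"
    if holds(a2) and not holds(a1):
        return "level-sensitive, active during phase1"
    return "undefined"


configs = {0, 1}   # here a configuration is just the data input d
values = (0, 1)

# Latch that holds during phase 1 and follows d during phase 2.
hold = {(d, q, q) for d in configs for q in values}
follow = {(d, q, d) for d in configs for q in values}

# Node computing NOT d regardless of the clock phase.
invert = {(d, q, 1 - d) for d in configs for q in values}
```

For example, `classify_node(hold, follow, configs, configs)` reports a latch active during phase2, `classify_node(invert, invert, configs, configs)` reports a combinational node, and a node that follows d in one phase and its complement in the other comes out as undefined.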
Characterizing a cell library
The first set of experiments we undertook was to see if Smelt is able to extract the functionality of the cells in the AT&T standard-cell library [7]. The cells have transistor netlists that have been extracted from their layout as well as logical descriptions provided by the library designer. Our goal was to extract the functionality from the transistor netlist. The diversity in the cell library provides a rich set of examples for the extraction algorithm. This is viewed as a check for both the abstraction algorithm and the implementation of the algorithm in Smelt.
Combinational cells in the cell library are implemented in a static CMOS style, with the PMOS network being the dual of the NMOS network. Using Smelt, we were able to correctly recover the functionality for all 207 combinational cells in the standard-cell library. A majority of the examples (169 cells) are extracted in less than a couple of seconds each; the total time for these examples is 200 seconds. For these small cells, Tranalyze produces descriptions that are acyclic and that have no unit-delay elements. Performing abstraction on such descriptions is trivial.
For the remaining 38 examples, the descriptions produced by Tranalyze had more than 9 unit-delay elements. These are introduced to model charge sharing among the transistors. In particular, for the cell AOI4444, Tranalyze produces a description containing 16 unit-delay elements and 235 2-input primitives (see [2] for a description of the primitives). Clearly, such gate-level models are unacceptable, both in terms of their structure and complexity. When we apply our abstraction procedure, the charge-sharing transients are removed in the computation of the stable unit-step relation, and we do indeed recover the correct functionality for these gates. However, due to the large number of BDD variables (each unit-delay element requires six BDD variables) the run times for these 38 examples totaled 54,000 seconds.
Sequential cells in the cell
Characterizing full-custom modules
The library-characterization experiment was a success, but it does not exploit the power of the extraction algorithm since the cell library design follows a rigid design discipline. Furthermore, a pattern-matching technique can easily be constructed to match the cells in a given library. We therefore tested the abstraction on custom-designed cells.
The 6-transistor XOR gate ([8], Figure 8.11) presents problems to many switch-level simulators. Tranalyze produces a description which has one unit-delay element. Smelt extracts the XOR functionality for this circuit. We also derive the correct functionality for transmission-gate based adder structures ([8], Figure 8.13). Cyclic combinational circuits presented in [6] were also correctly extracted by Smelt.
For large circuits it is impractical to perform extraction on the entire circuit. We exploit the design hierarchy to overcome this limitation. We have developed a heuristic to determine the modules that need to be characterized. Using this divide-and-conquer approach we have characterized parts of commercial full-custom designs. These include: PE which is one of the processing elements that make up an 8x8 array which performs motion-estimation in a video-encoder, POLUNIT which is composed of arithmetic functions and implements a leaky-bucket algorithm for rate control in Asynchronous Transfer Mode (ATM) traffic management, and CPONTROL which is the controller of a SPARC microprocessor core. Table 1 shows data on the full-custom designs described above, as well as the run-time. Even though these circuits are of moderate size, their functionality can be described in terms of a fairly small set of modules. The run-time includes all of the processing -reading in the designs, determining which modules to characterize, performing the abstraction and finally writing out an equivalent logic description in terms of logic gates and latches.
Conclusions
We have presented a technique for deriving register-transfer models from transistor netlists that is based on examining the switch-level relation. Before deriving RTL models we first apply tests to determine whether the circuit is combinational or synchronous. For these circuits an equivalent RTL model is derived based on the functionality that is captured by a switch-level simulator. To the best of our knowledge, tests for determining if a circuit has an RTL model have not been available in the past. Consequently, existing techniques could not guarantee the correctness of the extracted models. The technique described here is constructive, and the extracted circuits capture the behavior modeled by the switch-level relation. We emphasize that our contribution in this paper is the algorithm that classifies the switch-level relations and not the pre-processing required to generate the switch-level relation from the transistor netlist.
