This paper presents a parallel pattern compiled code logic simulator that can handle both the transport delay and the inertial delay of logic gates. It uses the Potential-Change Frame (PC-Frame), together with Inertial-Functions, to perform the event-canceling operation for gates, thus eliminating the conventional time wheel mechanism. As a result, it can adopt the parallel pattern strategy to increase the simulation speed. Furthermore, it is a compiled code simulator, which further improves its performance. Experimental results show that it is significantly faster than the conventional time wheel event-driven simulator. In addition, it is found that a significant percentage (27%) of the hazards predicted when only the transport delay is considered are eliminated once the inertial delay is taken into account.
Introduction
Recently there has been considerable interest in using compilation techniques to improve the performance of logic simulation [1-3]. Compilation techniques previously received little attention because the traditional levelized compiled code (LCC) simulator was based on the zero delay model, which is very unrealistic in its treatment of circuit delays. This situation changed after techniques were proposed to generate code for more complex timing models. For example, a threaded code technique was proposed to incorporate the unit delay model in the Tortle-c simulator [2], and a multi-delay parallel algorithm was developed that uses the potential change set to handle a multi-delay model [3]. All of these techniques are based on the simple transition-independent transport delay model. However, the transport delay model cannot describe the transient behavior of a circuit accurately. The simple CMOS NAND gate in Fig. 1 illustrates this. In Fig. 1-(a), the propagation delay of the CMOS NAND gate, denoted by d, is 0.2 ns. Under this simple delay model, any transition on input A or B at time t propagates to output C at time t+0.2 ns.
Hence, as shown in Fig. 1-(b), when input A has a rising transition at time t1 and input B has a falling transition at time t2, output C has a falling transition at time t1+0.2 ns and a rising transition at time t2+0.2 ns. However, when SPICE is used to simulate the transient behavior of the NAND gate with the same input stimuli, we find that the transitions on the input lines do not always propagate to the output line C. If the difference between t1 and t2 is not greater than a particular value, 0.2 ns in this example, the transitions disappear (Fig. 1-(c)). This value is the "inertial delay" of the gate, denoted by dI, which is an essential characteristic of the logic gate. It arises because switching a gate requires a certain amount of energy, which is determined by the size of the gate. Any input pulse whose duration is less than or equal to dI is automatically suppressed by the gate. Hence, if the inertial delays of gates are neglected during logic simulation, invalid results are obtained in predicting the transient behavior of a circuit. This in turn may invalidate analyses such as the number of switchings of a circuit, which is very important in determining the power consumption of the circuit [4], or the testing of delay faults. For a logic simulator to handle the inertial delay, the event-driven method incorporating a timing wheel is usually used.
This requires an event-canceling mechanism to implement the transition elimination, which makes the simulator very slow. This paper presents a compiled code logic simulator that uses an "Inertial-Function" technique to eliminate this problem.
Furthermore, the simulator incorporates the parallel pattern strategy to increase the simulation speed. As a result, the simulator runs about 300 times faster than the conventional simulator employing the timing wheel and 8 times faster than the conventional interpreted simulator.
Figure 1 (a) The example circuit of a NAND gate; (b) waveforms of the lines using the transport delay model; and (c) waveforms of the lines using the inertial delay model, in which the transitions on line C are suppressed.
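The difference between the two delay models in Fig. 1 can be illustrated with a small sketch. The Python code below is not part of the simulator described in this paper. The function names, the integer picosecond time base, and the simple rule of dropping both edges of any output pulse no wider than the inertial delay are assumptions made for illustration, and input events are assumed to occur at distinct times.

```python
def nand(a, b):
    """Two-input NAND on 0/1 values."""
    return 1 - (a & b)


def transport_waveform(init_a, init_b, input_events, d):
    """Output transitions of a NAND gate under a pure transport delay d.

    input_events is a list of (time, line, value) with line "A" or "B".
    Returns a list of (time, value) output transitions.
    """
    state = {"A": init_a, "B": init_b}
    out = nand(init_a, init_b)
    transitions = []
    for t, line, value in sorted(input_events):
        state[line] = value
        new_out = nand(state["A"], state["B"])
        if new_out != out:
            # Every output change is simply shifted by the transport delay.
            transitions.append((t + d, new_out))
            out = new_out
    return transitions


def inertial_filter(transitions, d_i):
    """Assumed simplification: remove pulses whose width is <= d_i."""
    kept = []
    for t, v in transitions:
        if kept and t - kept[-1][0] <= d_i:
            kept.pop()  # cancel the earlier edge; the current edge is dropped too
        else:
            kept.append((t, v))
    return kept


# Times in picoseconds: d = d_i = 200 ps (0.2 ns), A rises at t1, B falls at t2.
narrow = transport_waveform(0, 1, [(1000, "A", 1), (1100, "B", 0)], 200)
wide = transport_waveform(0, 1, [(1000, "A", 1), (1500, "B", 0)], 200)
print(narrow, inertial_filter(narrow, 200))
print(wide, inertial_filter(wide, 200))
```

With t2 - t1 = 100 ps the transport model reports a falling and a rising edge on C, while the filtered waveform has no transition at all. With t2 - t1 = 500 ps both edges survive.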
Logic Simulation Based on Inertial Delay Model
In the conventional timing wheel method for the transport delay model, the simulation proceeds in increasing time steps. That is, the events that occur at, for example, time t=1 are evaluated and their response events are inserted into the timing wheel. Then the events occurring at time t=2 are evaluated and their response events are again inserted into the timing wheel. This evaluation and insertion is repeated until no event remains in the timing wheel. For the timing wheel technique to handle the inertial delay, it requires a check-and-cancellation procedure, which makes the computation overhead high.
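To make the cost of the check-and-cancellation procedure concrete, the following Python sketch shows one way such an event queue could be organized. It is only an illustration under stated assumptions: a binary heap stands in for the timing wheel, the class and method names are hypothetical, and an output edge is cancelled only while it is still pending in the queue.

```python
import heapq


class TimeWheel:
    """Hypothetical event queue with inertial-delay cancellation."""

    def __init__(self):
        self.heap = []        # entries: (time, event_id, line, value)
        self.pending = {}     # line -> (event_id, time) of its latest pending event
        self.cancelled = set()
        self.next_id = 0

    def schedule(self, time, line, value, d_i):
        """Insert an event, cancelling a pending one that is too close in time."""
        prev = self.pending.get(line)
        if prev is not None and time - prev[1] <= d_i:
            # The two edges form a pulse no wider than d_i: drop both.
            self.cancelled.add(prev[0])
            del self.pending[line]
            return
        event_id = self.next_id
        self.next_id += 1
        heapq.heappush(self.heap, (time, event_id, line, value))
        self.pending[line] = (event_id, time)

    def pop(self):
        """Return the next live event as (time, line, value), or None."""
        while self.heap:
            time, event_id, line, value = heapq.heappop(self.heap)
            if event_id in self.cancelled:
                self.cancelled.discard(event_id)
                continue  # skip events that were cancelled after insertion
            if self.pending.get(line, (None,))[0] == event_id:
                del self.pending[line]
            return time, line, value
        return None


wheel = TimeWheel()
wheel.schedule(1200, "C", 0, 200)
wheel.schedule(1300, "C", 1, 200)   # 100 ps after the first edge: both vanish
print(wheel.pop())
```

Every insertion has to look up and possibly invalidate an earlier event, and every removal has to test for cancellation. This per-event bookkeeping is the overhead that the method proposed in this paper avoids.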
We propose a method that completely eliminates the timing wheel. We use an array to record the timing information of each line of the circuit and carry out the simulation level by level. For each line, we evaluate the timing array and check for transition elimination. These operations can be achieved with simple logic operations on a single timing array, which saves much computation overhead compared to the timing wheel method.
Our proposed simulator is a compiled-code simulator. An Inertial-Function is used to handle the transition elimination described above, and a potential change frame (PC-Frame), which is derived from the potential change set (PC-Set) [5], is used to save memory and simulation time. A PC-Frame records the times at which an event could possibly occur at a gate. It consists of the following fields: Time-Index, Logic-Value, Fanin-Index Array, and Inertial-Index. The Time-Index, GTI[k], records the time at which a transition could possibly occur at gate G in the kth PC-Frame; the Logic-Value, GLV[k], stores the logic value at the time given by the Time-Index; the Fanin-Index Array, GFI[k], stores which PC-Frames of the fanin gates of G are used to evaluate GLV[k]; and the Inertial-Index, GII[k], is a parameter related to the Inertial-Function.
Fig. 2 is an example of the PC-Frame of the output C of an AND gate with two inputs A and B. The PC-Frame has 5 time frames, i.e., 0, 1, 2, 3, 4, where the 0th frame is only for reference, and the gate could possibly have transitions at four time steps: step 5, step 6, step 7, and step 9, as listed in CTI of the PC-Frame. For this gate, input A has a Time-Index, ATI, of {0, 3, 4}, that is, it could possibly have transitions at time steps 3 and 4; and input B has a Time-Index, BTI, of {0, 2, 6}. The gate has a transport delay of d=3 units and an inertial delay of dI=2 units. The Time-Index, CTI, is obtained as follows. First, ATI and BTI are unioned to get CTI', which is {0, 2, 3, 4, 6}. Since the propagation delay of the gate, d, is equal to 3, CTI is obtained by adding 3 to each of the nonzero elements of CTI', giving {0, 5, 6, 7, 9}.
The Fanin-Index Array, CFI, is obtained as follows. Consider CFI[1][0], the entry of CFI for input A at time frame 1. Since CTI[1] - d = 2, for input A to cause output C to have a transition at time step 5, A should have a transition at time step 2. However, ATI[1] = 3, which is > 2, i.e., the earliest possible transition of A occurs at time step 3; hence only the logic value of time frame 0 of input A should be used, i.e., CFI[1][0] should be 0. For CFI[1][1], since BTI[2] = 6 ≥ 2 and BTI[1] = 2, the logic value of the first time frame of input B should be used, i.e., CFI[1][1] = 1. All the other elements of CFI can be obtained in a similar way, and they are listed in the figure.
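The construction of the Time-Index and the Fanin-Index Array in this example can be reproduced with a short sketch. The Python code below is an illustration rather than the simulator's own code. The function names are hypothetical, and the lookup rule (for each fanin, take the latest frame whose time does not exceed the gate's frame time minus d) is inferred from the two entries worked out above.

```python
from bisect import bisect_right


def time_index(fanin_time_indexes, d):
    """Union the fanin Time-Index lists and shift the nonzero times by d."""
    merged = sorted(set().union(*fanin_time_indexes))
    # Frame 0 is the reference frame and stays at time 0.
    return [0] + [t + d for t in merged if t != 0]


def fanin_index(gate_ti, fanin_time_indexes, d):
    """For each gate frame, pick the fanin frame in effect at time (t - d)."""
    rows = []
    for t in gate_ti:
        target = t - d
        # bisect_right - 1 is the last frame whose time is <= target;
        # the reference frame 0 is used when no such frame exists.
        rows.append([max(bisect_right(ti, target) - 1, 0)
                     for ti in fanin_time_indexes])
    return rows


a_ti = [0, 3, 4]   # ATI from the example
b_ti = [0, 2, 6]   # BTI from the example
c_ti = time_index([a_ti, b_ti], d=3)
c_fi = fanin_index(c_ti, [a_ti, b_ti], d=3)
print(c_ti)
print(c_fi[1])
```

The sketch yields CTI = [0, 5, 6, 7, 9] and CFI[1] = [0, 1], in agreement with the values derived in the text. The remaining rows of `c_fi` follow from the same assumed rule and should be checked against Fig. 2.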
The general steps to compute the PC-Frame are summarized as follows:
At the beginning, for a primary input G, GTI = {0,1}, GFI = {0,0}, and GII = {0,0}; for a general gate G having a transport delay d, an inertial delay dI, and M fanin gates I0, I1, ..., IM-1, its GTI is obtained by unioning the sets IjTI of every fanin gate Ij and by adding d to each element in GTI. Assuming that the number of elements in the set GTI is N and that the size of the Fanin-Index Array of each PC-Frame of gate G is M, which is the number of fanin gates of gate G, we use GFI
4. Assign n = n + 1, and go back to step 2.
As mentioned previously, the Inertial-Function is used to check for the existence of other transitions. For a gate G with an inertial delay dI to have a transition at time t, the transition may be influenced by other transitions at this gate in the interval [t-dI, t+dI]. Since the code of the transition elimination procedure is similar for PC-Frames with the same Inertial-Index, we employ subroutine calls to reduce the code size. Before generating the compiled code that performs the logic simulation, we create a set of subroutines that eliminate transitions for Inertial-Index values ranging from 1 to the maximum value over the gates in the circuit. The Logic-Value of each PC-Frame is first evaluated; then the transition elimination for each PC-Frame is executed according to the associated Inertial-Index.
This reduces the code size and the compilation time because only the set of inertial subroutines is added and compiled. In our simulator, the generated simulation code is assembly code.
Since all operations in the final compiled code are logical operations and comparisons, which are bit-wise instructions, the parallel pattern strategy is incorporated into the logic simulation to increase the simulation speed. The number of pattern pairs that can be simulated simultaneously depends on the word length of the machine.
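The parallel pattern idea can be sketched as follows. The Python code is an illustration under stated assumptions, not the generated assembly code. It assumes a 32-bit word, uses hypothetical helper names, and assumes that the decision whether neighbouring frames lie within the inertial delay is made at compile time from the Time-Index values, so that only the value comparison remains at run time.

```python
WORD_BITS = 32                      # assumed machine word length
MASK = (1 << WORD_BITS) - 1


def pack(bits):
    """Pack one logic value per pattern into a word (pattern i -> bit i)."""
    word = 0
    for i, b in enumerate(bits):
        word |= (b & 1) << i
    return word


def nand_word(a, b):
    """Evaluate a NAND gate for all packed patterns with one AND and one NOT."""
    return ~(a & b) & MASK


def suppress_pulse(prev, cur, nxt):
    """Hold the previous value wherever cur is a pulse that returns to prev.

    Intended for three consecutive frames already known to be closer
    together than the inertial delay.
    """
    pulse = (prev ^ cur) & ~(prev ^ nxt) & MASK
    return cur ^ pulse


a = pack([0, 0, 1, 1])
b = pack([0, 1, 0, 1])
print(bin(nand_word(a, b) & 0b1111))
print(bin(suppress_pulse(0b1111, 0b0101, 0b0111)))
```

In the second call, the pattern in bit 1 shows a pulse (1, 0, 1) and is restored to 1, while the pattern in bit 3 shows a genuine change (1, 0, 0) and is left at 0. If the word length is 32 bits, simulating 5120 pattern pairs takes 5120 / 32 = 160 passes through the compiled code.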
Experimental Results
The proposed parallel pattern compiled code simulator was written in C to run on a SUN SPARCClassic workstation with 96 MB of memory. To evaluate this simulator, we also implemented two other simulators: a conventional event-driven time wheel timing simulator and a software-interpreted timing simulator. In addition, to compare the simulation times for different timing models, all three simulators were also implemented with only the transport delay model, i.e., with the inertial delay of every gate set to zero. All the experiments were done on ISCAS benchmark circuits; for the ISCAS89 sequential benchmark circuits, only the combinational parts of the circuits were simulated. Each circuit was simulated with 5120 random pattern pairs, and the transport delay of each gate was assumed to be equal to its number of fanin gates.
The results are shown in Table 1, where column T.W. gives the CPU time in seconds for the timing wheel simulator, Int. for the interpreted simulator, and C.C. for the compiled code simulator. The columns T.W./C.C. and Int./C.C. are the corresponding ratios of these times. Table 1-(a) gives the results for the transport delay model simulation. The table shows that the compiled code simulator had far lower simulation times for all the circuits than the other two simulators. On average, the compiled code simulator ran 359 times faster than the timing wheel simulator and 12 times faster than the interpreted simulator. Table 1-(b) gives the results for the inertial delay model simulation, where the inertial delay of each gate was assumed to be equal to the transport delay of the gate. Similar results are seen in this table, i.e., the compiled code simulator ran much faster than the other two simulators. On average, it ran 339 times faster than the timing wheel simulator and 8 times faster than the interpreted simulator.
It should also be mentioned that a machine with 96 MB of memory could handle the largest of the benchmark circuits. Table 2 lists the average numbers of transitions per 32 random pattern pairs for simulating each circuit with only the transport delay model and with the inertial delay model. A significant percentage, i.e., 27%, of the transitions is eliminated when simulating with the inertial delay model.
Conclusion
In this paper, we have proposed a parallel pattern compiled code logic simulator for the inertial delay model.
The simulator uses the PC-Frame, together with an Inertial-Function calculation technique, to eliminate the time wheel that is usually used in conventional timing logic simulators. Because of this, the parallel pattern strategy can be used, which further enhances the simulation speed. Experimental results on the ISCAS benchmarks show that this simulator achieves a significant speed improvement over the timing wheel event-driven simulator and the interpreted simulator. In addition, it was found that a significant percentage (27%) of transient transitions were eliminated when the inertial delay model was used, compared with using only the transport delay model, to simulate the timing waveforms of the logic circuit.
Table 1-(a) The simulation times for these types of simulators for the transport delay model by applying 5120 random pattern pairs.
Table 1-(b) The simulation times for these types of simulators for the inertial delay model by
