INTRODUCTION
T HIS PAPER describes a program written to assist in the prototyping and simulation of dedicated VLSI signal processing systems. The focus has been on sampled analog systems such as switched-capacitor filters and A/D converters. The program is also capable of simulating the analog intent of digital signal processing structures such as FIR and IIR filters. A key goal has been to have digital logic included, so that circuits whose performance depends on an interaction of digital and analog circuits can be studied. Examples of such circuits would be a successive approximation A/D converter, an oversampled A/D converter, and a phase-locked loop using digital frequency division.
Programs have been available previously which allow the detailed simulation of switched-capacitor filters and A/D converters, for example, DIANA [1] , SCOP [2] and SWITCAP [3] . A good review article on computer aided analysis of switched-capacitor circuits has been presented by Lieu et al. [4] . The aim of this work is to focus at a higher system level than the work reported and to efficiently calculate the expected performance of larger systems incorporating sampled-data elements.
A key to the computational efficiency of logic simulation programs is that they only perform calculations when there is a change in logic level (an event). Programs such as SPLICE [5] extend event-driven simulation to add continuous analog capability, primarily for digital logic gate simulation at the transistor level to obtain better timing infor- The authors are with the Department of Electronics, Carleton University, Ottawa, Ont. K1S5B6, Canada.
IEEE Log Number 8717713.
mation. For continuously varying signals, this provides an increase in efficiency over a conventional analog simulation such as SPICE. However, it is easy to see that for sampled data, where signal levels change in discrete time steps, an event-driven algorithm will provide even more gain in efficiency. This paper uses the public domain program SPLICE (Version 1.7) as a vehicle to demonstrate the extension of an event-driven algorithm to model sampled-data circuits efficiently. The digital logic and continuous-time transistor-level features of SPLICE have also been maintained so.an arbitrary combination of the old and new features can appear within the same simulation.
II. CIRCUIT TYPES TO BE SIMULATED
The basic type of circuits which we wish to be able to simulate are shown in Fig. 1 . The example in Fig. l(a) is a switched-capacitor biquadratic filter, while that in Fig.  l(b) is a transversal (FIR) delay-line based filter in a feedback loop, making an overall IIR filter. The aim is to describe such circuits efficiently at a difference equation level. In the switched-capacitor filter of Fig. l(a) , for example, the goal is to describe the behavior of the integrator with output VX as a simple difference equation. Switched-capacitor circuits, which can be modeled by linear difference equations, can include the effects of finite gain in op amps and incomplete transients if defined by a fixed time constant [6] . Nonlinear effects such as op-amp slew rates could also be included in user-defined models, to be described later. However, in the present paper we are concentrating on the linear case. It would also be possilie to represent the circuit of Fig. l(a) in SPLICE (or SPICE) at the transistor level. This is expensive in computation time and can lead to stability problems because of high loop gain around the op-amp feedback paths. For simulating such circuits as components of much larger systems, a Z-domain (difference equation) based description is used which is much more efficient.
Consider now the delay-based digital filter in Fig. l(b) . The z-transform equation which describes this filter looks very similar to that for the switched-capacitor filter of Fig.  l(a) . The main difference between the two circuits is that the memory of the previous circuit output state in Fig. l In the same way, an actual hardware realization of the two circuits would have a very different sensitivityy to parameter values while achieving an equivalent transfer function. The switched-capacitor implementation will generally have much better sensitivityy behavior. In the simulation context, however, the computer word length is long enough to assume precise calculations, unless round-off or truncation errors are deliberately introduced to mode] a digital hardware realization. We now consider how to modify SPLICE to be able to add the capability of simulating examples like the above.
III. REVIEW OF SPLICE CONTINUOUS-TIME ALGORITHM Version 1.7 of SPLICE was chosen since it is public domain and relatively well documented. This version has a fixed time step of calculation, the minimum resolvable time (MRT). We have synchronized this fixed time step with some fraction of the shortest clock period that might occur on a mixed digital/sampled analog integrated circuit chip. This is a key point. Consider a switched-capacitor filter circuit running with 128-kHz clocks. These might ultimately be derived from a digital logic circuit clocked at 2.048 MHz, i.e., eight times faster. The event-driven nature of the calculation then means that both the high-frequency digital and the low-frequency sampled-data calculations can coexist very efficiently. An event-driven logic (and mixed-mode) simulator progresses through a scheduled time queue and the algorithm departs from its main "check next scheduled time step" loop only when a circuit event necessitates a calculation. In a sampled-data circuit, these events will only occur at widely separate intervals based on a multiple of the MRT. By exploiting dormancy in the algorithm, a high-frequency digital circuit or a continuous-time circuit can have as short a time step as necessary, as long as the sampled-data circuit period is expressed as an integral multiple of this time step.
We now must consider the changes necessary to the event-driven algorithm to achieve our purpose. Fig. 2(a) --- shows a continuous circuit with two floating capacitance nodes connected by a passive resistor. All nodes must have at least a minimal capacitance to ground, whose purpose in the algorithm is to integrate currents entering the node. This is a typical circuit which is simulated by the original SPLICE algorithm. The modified algorithm should be able to handle circuits such as Fig. 2 (b), which shows a typical switched-capacitor integrator with an input and output node. In the switched-capacitor integrator, the output node has memory of previous outputs, but additionally must have new output changes caused unilaterally by the input voltage V1. There is no inherent bilateralism as there is between two capacitive nodes connected by a passive resistor. Returning to Fig. 2(a) , the algorithm presently solves for the node voltage V~,for example, by integrating the current flow that has occurred since the previous simulation time step onto the (required) node capacitance to ground C2. This fundamentally casts the network equations into integral form. A Newton-Raphson iterative method is used in a converged Gauss-Seidel loop [5] . A time queue is set up as shown in Fig. 3 , and attached to this queue at "time now" (the present time step) is a linked list of structures associated with the nodes being calculated. Each of these node structures is the fan-out list of a given node, i.e., the table of nodes to which the given node is connected. This list grows dynamically as the algorithm iterates at a given time step (time now). The mechanism for adding a node is the occurrence of an "event," which for the analog case is a node voltage change greater than a certain threshold. This event is caused by an earlier calculation arising at another node in the linked list, where the fan-out list of the other node includes (causes a stimulus at) the node being discussed here. A given node might be added a number of times as the iteration commences. Fig.  3 shows nodes VI and Vz being re-added repeatedly to the linked list for further calculation since the algorithm has not yet converged. This process ends when the iterative correction AV for each node falls below a minimum threshold-(i.e., there are no more events), at which time the linked list stops growing horizontally and the calculation moves on to the next time step.
Even if VI is fixed, node Vz can be added to the list a number of times, since as a corrected Vz is calculated, the current in RI, which is (Vl -V2)/R1, changes. This necessitates a number of iterations before the algorithm finds all node voltages mutually consistent at time now.
We must distinguish between iterative progression, which is the sequence of recalculations at a time point as the algorithm moves horizontally in Fig. 3 , and progression in actual time, which is the sequence of time steps as the algorithm progresses down the time queue. The purpose of iteration is to bring voltages to mutual consistency with each other at a given time instant. Only when this iteration process is completed does the algorithm move to the next step in actual time. This does not mean that transients in actual time are resolved by the iterations, i.e., a capacitance such as Cz in Fig. 2 (a) can still be going through a voltage transient in actual time after the transient in the iteration has converged. If such is the case, the node V2 generates an event which then results in the node being added at the next time step when it is reached. That is, nodes changing in actual time have their fan-out list added at the beginning of the next time step when the current time step has completed.
This discussion highlights three important points which must be addressed in the sampled data case.
1) The iterative sequence of calculations ceases at a given time step when all nodes converge, i.e., settle within a certain predefine threshold of the actual mutually consistent set of node voltages. This implies that the algorithm does not provide exact calculations under the condition where there is repeated iteration. Bilateral elements such as RI in Fig. 2(a) cause such a condition because the two connected nodes must interact bilaterally at each time step. Fortunately many sampled-data circuits have inherent delays which break bilateral interacting links and therefore iterations may not occur beyond the first calculation. The result is a correct linear calculation of the node voltage the first time. Events which generate calculations will proceed unilaterally from some input stimulus point along a succession of nodes, ceasing when the final calculation is less than the predefine threshold at the end of the chain of nodes.
2) If repeated iterations do occur and the nodes are too tightly coupled, this can be the equivalent of positive feedback in the iterative progression even though the circuit may be stable in actual time. In simple terms, the iterative correction calculated for node Vz in Fig. 3 may grow as the node reappears successively in the iterative time linked list.
3) The analog algorithm as described so far only deals with time within one time step of simulation (present time). In the logic simulation case, arbitrary propagation delays cause nodes to be added at arbitrary time steps into the future (when there will be an event on an adjacent node). This complicates the digital algorithm. Fig. 2(b) , the switched-capacitor integrator example, shows that we are now going to have to deal with arbitrary delays in the sampled-data case also. This requires considerable modification of the continuous-time circuit algorithm.
IV. MODIFICATIONS FOR SAMPLED DATA
For sampled data, the major changes needed were in two areas: 1) in the core node processing subroutine, so as to handle the changed nature of the integrating node J?zin Fig. 2(b) ; and 2) in the scheduling algorithm, so as to reflect the need for delays in adding sampled-data events to the time queue.
The requirement for both changes mentioned is implied in Fig. 2(b) , where +1 and +2 are two-phase nonoverlapping clocks which have clock edges at times n -1, n -1/2, n, etc. The difference equations are v2, n = V2, n-I/2 -aRvl, n-1/2 v 2,n-1/2 = v2, n-1 where which can be combined over one clock period to give
Vz, n=v2,n_1-aRvl, n_l/2
giving the z-transform at the end of phase 2 as
Any sampled-data system can be seen as an interconnected array of such first-order difference equation components. For a simple switched-capacitor circuit such as in Fig. l(a) , two simultaneous difference equations of the form of (1) can be solved in closed form. However, our algorithm handles more complex circuits by realizing each first-order difference equation as a separate element. The basic element is the switched-capacitor integrator of Fig. 2(b) . For delay-line based components, as in Fig. l(b) , the local memory term V2,. _~in (1) will be absent. For coupling capacitors like Cq in Fig. l(a) , the term VI in (1) will reflect the difference in VI over the time period, i.e., the difference in two successive samples of VI rather than a single sample. With these qualifications, we can set up an algorithm to simulate any discrete-time linear sampled-data system in the time domain. An important point to note is that the virtual ground at the op-arnp negative input terminal is absent in the formulation here. A direct calculation is being done to add a correction to the node voltage V2 in Fig. 2(a) according to earlier sampled values of the node VI and similar inputs to the integrator. An iterative algorithm would not be able to realize a formulation including the virtual ground because the algorithm fails (becomes unstable) due to high gain around the op amp and feedback-capacitor closed loop.
The main node processing subroutine must be changed to add the capability to sample a given circuit node at an arbitrary time (the sample time), do a calculation such as multiply by a~in (l), and add a correction to another node voltage such as V2 in (1) (the action time). We have built two forms of the sampled-data model which do this. In the first form, the clock times are implicit. The user specifies the sample time and action time as part of an underlying periodic time scheme. This fits a context where the sampled-data clock scheme has fixed timing. The second form of the model has explicit clocks. The modlel determines the sample time and action time from the rising or falling edge of clocks supplied by the circuit. This fits a context where the clock rate varies dynamically in the circuit operation. The explicit clocks, however, have to be provided by circuit elements. This form is illustrated in the simulation example (Table III) CLKDAT model card.
We are now ready to discuss some points raised at the end of Section III.
1. Accuracy: The continuous-time SPLICE algorithm linearizes according to a Newton-Raphson iterative method. Thus, if the circuit elements being handled are linear, there will be no error in the basic calculation. Further, if there is only one iteration of a node at each time point, this can result in an exact calculation assuming that nodes are sufficiently decoupled. As an example, consider Fig. l(a) . Assume Vx is desired. Vx at the end of phase 2 is defined by the values of~and Vo. There will be no second iteration since there is delay in switched-capacitor CR2. This kind of situation is typical of switched-capacitor circuits, where instantaneous closed loops around two or more op amps are not favored for practical circuit reasons.
One complication is instantaneous feedback around am individual op amp, as caused by CR~for the specific switch clocking arrangement shown in Fig. l(a) The term in a3Vo,. represents instantaneous feedback around a single loop and will result in repeated iterations as shown in Fig. 3 . This could result in the precision of lrz being reduced as previously discussed because repeated iterations only allow accuracy within a predefine threshold.
The situation just described can be avoided by a simple manual algebraic redefinition, when accuracy is critical. Equation (3) can be rewritten by bringing all terms in l% to the left-hand side and then dividing through as shown:
Vo,n-1 + other terms vo, n=-
Formulating the difference equation in this way prevents repeated iterations in the algorithm. The program at present does not' have a means to take this choice automatically.
The possible occurrence of repeated iterations is fundamental in event-driven algorithms such as that used in SPLICE. There is no reporting of the occurrence of such iterations other than after program failure due to an excessive number. It would be useful, based on the prese:nt discussion, to enable -the user to call up-reports on the occurrence of iteration when numerical precision is the' goal. This has not yet been implemented; it is up to the user at present to recognize the occurrence of instanta- timing connection enables an instantaneous coupling around the loop. In such cases, we can see that instability arises for cases where C~z/Cz = C~d/Cl = 1. In Fig. 3 an instantaneous two op-amp loop will cause the sequence of fan-out table scheduling as shown, with nodes VI and Vz each continually being rescheduled by the other node as the iterative solution progresses. This case proceeds very similarly to the way SPLICE presently handles the circuit of Fig. 2(a) , where RI causes instantaneous coupling between nodes VI and V2 in the algorithm. For the switched-capacitor circuit, the case a~z = a~~=1, which corresponds to filter resonant frequency of 1/2 n of the clock frequency, will result in each succeeding iterative correction being equal to the last, i.e., this is the boundary of iterative instability. A value of a which is greater than 1 will guarantee instability. If there is a delay in the two op-amp loop which breaks the instantaneous feedback path, the case of a~2 = a~l >1 can cause instability in real time rather than iteratively. Rather than lock up at a single time point, the solution-will proceed in time with an oscillatory tendency, reversing at every time step. The circuit dissipation a~~will dampen this tendency. Fortunately, it is not common to design switched-capacitor circuits with a >1 because this results in resonant frequencies which are too high relative to the clock frequency for practical application.
V. ADDITIONAL FEATURES
Additional features which support a higher level functional block simulation capability have been added. These include a comparator, an output limited amplifier, and a unit delay element. The sampled-data integrator model can also be used as a delay by using a feature which shorts the output node to ground, once per period, so it has any cycle-to-cycle memory removed. This is done, for instance, in the Table III example, to achieve a sampled and hold (see model samhldl).
In addition to those mentioned above, we have added the capability of user-defined embedded models. The user can define his own functional model, which can be continuous, sampled data, or logical. Such a capability is usually provided in digital simulator contexts [7] but here is integrated with the event-driven analog algorithm as well. The purpose is to allow a higher level functional block to be defined for areas of the circuit/system where it is not necessary to descend to circuit details at a given stage of the design. This function can make use of any of the capabilities of the underlying language (Ratfor/Fortran here). The function is written in the underlying language, along with certain keywords which are directives to ; element name net list model type multl output, inputl, input2 multiply a preprocessor which prepares the function for compilation. This creates an object module which is then linked into a SPLICE executable object. Calls to the new function are then made in the simulation source deck in the same way as for normal SPLICE models. A single example to illustrate an electrical model is shown in Fig. 4 . This might be a desired functional block, embedded in a normal electrical circuit, which is to access the voltages on nodes "input 1" and "input 2," multiply them by an arbitrary parameter, add the offset and force the node "output" to the result. Table I shows the model definition and its invocation in a SPLICE source deck. The dot tokens are instructions to the preprocessor while the actual function is recognizable as normal Fortran code. The parameter PAR(1) is available as a parameter to the subroutine for this model, and is passed via the model card in the source deck as shown. Additional parameters can be added as PAR(2), PAR(3), etc. Any available Fortran functions may be used. The syntax shown here varies slightly from Fortran due to the language preprocessor (RATFOR) used in SPLICE 1.7.
The general logical model, LOGMOD, works similarly. Here the nodes being dealt with are logic nodes and the internal functions defined would be logical. These models are capable of storing data between time steps, as in the state of a binary counter for example. The genflg for the model must be asserted true for the model # to stay active for every time step of the simulation. This is to # ensure that a precise calculation is done to generate a clock edge. # That is, an integration is implied here, which should Only be dOne # once every simulation time etep. # GENFLG = .true. In effect, the sampling action in the first sample Fig. 5 . This is a frequency synthesizer, using a phase-locked and hold performs the multiplication which in a phase- Fig. 5 isat a fixed frequency F, which is the desired channel spacing of the synthesizer. When the loop is phase-locked, the switched-capacitor clocks will be synchronized with the reference sawtooth. A change in the divide number, i.e., a change in the desired synthesis frequency at the VCO output, will result in a PLL transient which may or may not be stably damped depending on the loop parameters. The parameters of the switched-capacitor loop filter (a first-order pole and zero) are critical in this regard. If the loop does become unstable, synchronization with the reference sawtooth is lost and the circuit behavior becomes chaotic.
The motive for the particular circuit shown was to consider a monolithic MOS silicon implementation of the synthesizer. A loop filter typically needs time constants in the second to millisecond range. Due to the typical silicon monolithic capacitor size limitation (approximately 50 pF), only a switched-capacitor resistor would provide the large resistance required. By synchronizing the switched-capacitor filter clocks with F,, itbecomes possible to decrease the clocks to a low enough frequency to obtain the large resistance. Table II shows the high-level analog model description for the combined VCO and the divide-by-N counter. Although the counter could be achieved by logic gates, it was more economical to incorporate it into the high-level analog model. The output pulse is analog and the edges of this pulse were used to derive the switched-capacitor clocks by manipulation of delay models.
The explicit clock version of the sampled-data model (the CLKDAT model) was used for the loop filter. The two sample-and-hold stages were achieved with the explicit clock version model also, using settings to override the integration of the sampled input. The line samhldl in Table  III , for example, uses the CLKDAT model with the outputs and inputs as defined in the commented line above. The sampling is defined by the leading edge of~1, as is the action time and the clearing time. The parameter actedge = O in the model card line chooses the leading rather than the trailing edge of @l.
Because the clocks are variable, a very short time step (MRT) was necessary to allow fine control of clock pulse timing. This time step causes an amplitude quantizaticm of the analog sampling levels because the real circuit would vary the width of pulses continuously, but we are only able to vary it in discrete steps. This does not interfere with interpretation of the results if the time steps are small enough.
In the high-level language description of the added VCO/counter model, there are various subtleties. For example, since the model is integrative in intent, it is necessary to ensure that a calculation is done only once every time step. This is done by testing a flag (keyword FRSTIM) which registers (internally to the new algorithm) whether this is the first time the algorithm has been entered for this model at this time step. Other similar internally related keywords are GENFLG and TIME. GENFLG, when set, ensures that the model is activated at every time step, i.e., does not lie dormant waiting for an input event. This is needed when the model must perform actions autonomously without initiation by an outside event, e.g., a signal generator or VCO. TIME allows the present time-step value to be read. LEDEG( n ) allows the sensing of a leading edge at the input n, and so on.
In the version presented here, only one copy (instantiation) of the added element is possible, since there is only one set of memory internal to the subroutine (e.g., REAL cntset). A later version allows multiple copies.
The added high-level model is accessed by the source deck in the same way as other models. Table HI shows the source deck description for the complete synthesizer. The simulation as shown requires 25 CPU seconds on a VAX 11-750. The particular purpose of the source deck shown was to investigate stability of the PLL. For this purpose, the frequency-synthesizer divide number was changed from 1000 to 1100 (PAR (1) and PAR (2) in ELMOD1). The change was initiated by the second input to the VCOCNTR high-level model, the timing source (TSRC) setcnt pulse, whose edge was sensed by LEDEDG(2) in ELMOD1 of Table II. The same simulation for a larger jump in the divide number, i.e., a larger change in the desired synthesized frequency, showed an odd behavior associated with the temporary loss of synchronization ("slipping a cog") because the phase at the input sample and hold had shifted from the zero crossing to beyond the peak of the reference sawtooth. After a period of chaotic behavior, the loop finally established synchronization again for the pmameters given. 
VII. CONCLUSIONS

