Abstract: A fundamental timing analysis problem in the verification and synthesis of interface logic circuitry is the determination of allowable time separations, or skews, between interface events, given timing constraints and circuit propagation delays. These skews are used to verify timing properties and determine allowable propagation delays for logic synthesis. This paper presents an algorithm that provides tighter skew bounds with better asymptotic running time than previous methods, and shows how to apply the method to synthesis tasks.
I Introduction
Temporal behavior of interface circuitry is frequently described using event-based representations that relate the occurrence times of events with timing constraints and propagation delays [1, 2, 3, 4, 5, 6]. In this paper, we present an efficient solution to a key problem in the verification and synthesis of interface glue logic, namely, the determination of tight bounds on the temporal separations between events. To verify a synthesized circuit, we must be able to check that the circuit's outputs will occur within the time interval required and expected by the circuit's environment. In synthesizing the circuit, we must be able to determine the amount of delay within which the logic may generate an interface event. This permits optimizing the logic to take advantage of the temporal characteristics of the interface. The basic subproblems of both these tasks can be phrased in terms of bounds on the skew between pairs of events.
Previous work on this problem has suffered from a combination of two deficiencies. First, existing verification algorithms are inefficient. The method in [1] relies on exponential search, while the method of [3] does not produce the tightest possible skew bounds and has a running time which depends intimately upon the time bounds of the constraints. Second, they have not been useful for the synthesis process because they yield very loose bounds in the presence of unknown delays, a common situation before a circuit is synthesized.
In this paper, we first present an interface timing specification model that unifies the concepts of timing constraint and propagation delay into a single constraint type. We then provide an efficient algorithm for solving systems of these constraints. The algorithm yields tight bounds even in the presence of unknown constraint bounds, and its worst case running time can be expressed independently of the initial constraint values. We conclude with a discussion of how the algorithm can be used in both verification and synthesis applications.
II Interface Timing Specification
Interface specifications consist of a sequence of events, which are transitions on signal wires. Such a specification can be viewed as a partial ordering of the events and the ways in which they can be spaced in time. Temporal relationships between these interface events are expressed with propagation delays and timing constraints. In this section, we explain the semantic difference between these two types of temporal constraints and present a model that expresses both of them in a unified form.
A An Interface Specification Example
Suppose we wish to synthesize a circuit to interface with an SRAM. We say that the SRAM is then the environment for our interface circuit. Figures 1 and 2 provide the interface specification for a simplified SRAM read operation: any circuit we synthesize to interface with the SRAM must adhere to the performance requirements in Figure 2, and may take advantage of the propagation delay information to meet any further timing constraints on its own performance. In this example, the appearance of valid data on the DATA OUT line is the result of propagation delays from both the lowering of the signal CS and the assertion of a valid address on the Address lines. Throughout the remainder of this paper, these three events will be referred to as DV, CS, and AV, respectively. Propagation delays, or delay constraints, such as these express structural dependencies between the inputs and outputs of both the interface circuitry and the environment. These constraints, here expressed as ranges of 0 to 20 time units from the lowering of CS and the appearance of a valid address, determine when valid data will first appear. The data appears at the maximum of $CS + t_{ACS}$ and $AV + t_{AA}$, where $t_{ACS}$ and $t_{AA}$ are within the 0 to 20 time unit delays listed for DV relative to AV and CS. Note that this event may actually occur outside the range specified by either input event's propagation delay taken alone. Therefore, we consider these constraints linked, or dependent on one another.
31st ACM/IEEE Design Automation Conference. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying it is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.
We can express these as a single inequality over a maximum; the upper bound for the SRAM example, for instance, is $DV \le \max(AV + 20,\ CS + 20)$.
The other constraint type, which we term timing constraints, comes in two flavors: requirements, which the environment imposes upon the circuit for proper interaction, and guarantees, which describe the operating environment independently of the underlying implementation. An example of the first type would be the minimum time constraint $t_{RC}$ on how long the address must remain valid. An example of the second would be an environment asserting that it will never change two signal values within a short interval of each other. Constraints of this type are independent of one another and specify the exact time range within which one event must occur relative to another. Performance requirements of the circuit can also be viewed as timing constraints, specifying that an output response must be seen within a particular interval. We can express these as bounds on the difference between two event times, for example $CS - AV \le 300$.
Previous work has used different models for temporal constraints that make more explicit distinctions between the two types of constraints. McMillan and Dill [3] use the terms Linear and Max constraints for timing and delay constraints, respectively. Vanbekbergen [4] has a more complete, though not largely useful, taxonomy that labels timing and delay constraints as type 1 and type 2, respectively. We find it more useful to translate both types into inequalities involving the Max operation. We can express both types of constraints as a system of inequalities of the following form:
$$x_i \le \max\{x_{j_1} + \delta_{j_1,i},\ \ldots,\ x_{j_m} + \delta_{j_m,i}\} \qquad (1)$$
Since timing constraints are independent, there is only one term in the Max expression, reducing Equation 1 to a simple arithmetic inequality.
Suppose that we are given an interface circuit for the SRAM of Figures 1 and 2 which meets the performance guarantees of Figure 3. Systems of these constraints can be abstracted as a constraint graph over interface events. We say a given set of constraints induces a graph whose nodes represent the events, and whose arcs, from $x_j$ to $x_i$ with label $\delta_{j,i}$, represent each of the terms $x_j + \delta_{j,i}$ in a constraint with $x_i$ on the left hand side of the inequality. The graph induced by the set of constraints given above is shown in Figure 4.
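As a minimal sketch of this abstraction, the Python fragment below builds the induced graph for the three SRAM constraints that are quoted later in Section III-B. The tuple layout and the name `induced_graph` are illustrative assumptions, not notation from the paper, and the full constraint set of Figure 4 may contain further arcs.

```python
from collections import defaultdict

# Each constraint is (target, terms), read as: target <= max(source + delta).
# A timing constraint has exactly one term; a delay constraint may have several.
CONSTRAINTS = [
    ("DV", [("AV", 20), ("CS", 20)]),  # DV <= max(AV + 20, CS + 20)
    ("CS", [("DV", -30)]),             # CS - DV <= -30
    ("CS", [("AV", 300)]),             # CS - AV <= 300
]

def induced_graph(constraints):
    """Return {source: [(target, delta), ...]} with one arc per term."""
    arcs = defaultdict(list)
    for target, terms in constraints:
        for source, delta in terms:
            arcs[source].append((target, delta))
    return dict(arcs)

graph = induced_graph(CONSTRAINTS)
for source, out in sorted(graph.items()):
    for target, delta in out:
        print(f"{source} -> {target} [{delta}]")
```

Storing one arc per term keeps delay constraints (several arcs into the same node from one inequality) and timing constraints (a single arc) in the same structure, which mirrors the unified form of Equation 1.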
III The Verification Problem
We can verify that a system's required performance constraints are met by determining that the maximum skews between all interface and environment events in the system meet all performance requirements of the system.
A Formal Problem Definition
We now state the verification problem more formally. Given $X = \{x_0, x_1, \ldots, x_{n-1}\}$, a set of occurrence times of events in the system, and $C$, a set of constraints $c_j$ of the form
$$c_j:\ x_i \le \max\{x_{j_1} + \delta_{j_1,i},\ \ldots,\ x_{j_m} + \delta_{j_m,i}\},$$
determine either a tight upper bound on the occurrence times of all variables $x_1, \ldots, x_{n-1}$ relative to $x_0 = 0$, or that the set of inequalities is inconsistent.
In practical applications, one would apply the verification algorithm to a fully synthesized combined circuit-environment specification with all performance requirements removed, and then check that the bounds given by the verification algorithm are no looser than any performance requirement. We do not remove propagation delays and performance guarantees since they determine how the circuit and its environment will react.
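The final check described here can be sketched as follows, assuming the verification algorithm has already produced upper bounds on pairwise skews. The function name, the dictionary layout, and the requirement of 15 on DV are hypothetical; the bounds of -10 and 20 correspond to the SRAM example discussed in Section III-B.

```python
def unmet_requirements(bounds, requirements):
    """bounds[(xi, xj)] is the computed tight upper bound on xi - xj.
    requirements is a list of (xi, xj, limit), each meaning xi - xj <= limit.
    Returns the requirements that the computed bounds do not guarantee."""
    unmet = []
    for xi, xj, limit in requirements:
        # A missing entry is treated as an unknown (infinite) skew.
        if bounds.get((xi, xj), float("inf")) > limit:
            unmet.append((xi, xj, limit))
    return unmet

computed = {("CS", "AV"): -10, ("DV", "AV"): 20}
required = [("CS", "AV", 300), ("DV", "AV", 15)]  # second limit is hypothetical
print(unmet_requirements(computed, required))
```

A requirement is met exactly when the computed bound is no looser than the required limit, so only the hypothetical limit of 15 is reported here.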
B Previous Work
Algorithms for determining the maximum inter-event timing separations have been proposed by Borriello [1] and McMillan and Dill [3]. The algorithm of [1] is exponential in the number of nodes with propagation delays and can quickly become too costly for large composed graphs. The implementation is straightforward and uses backtracking to determine which causal relationships determine the occurrence time of an event.
The algorithm given in [3] has two drawbacks: in many practically interesting cases, it provides infinite separation bounds between events with finite bounds; and its worst case running time depends not only upon $n$, the number of events in the system, but also upon the $\delta_{i,j}$'s, the bounds of the constraints. In this algorithm, initial infinite upper bounds on node separations are refined by successive applications of appropriate constraints from the input set. The problem with this approach, as noted in [3], is that the running time of the algorithm can depend on the values of the constraints, giving a worst case complexity of $O(n^3 \sum |\delta_{i,j}|)$. This behavior occurs precisely when there is a "negative cycle" in the graph with at least one arc of the cycle belonging to a propagation delay. When applied to the SRAM example of Figure 4, the number of times the algorithm of [3] (hereafter referred to as the MD algorithm) applies the constraints $DV \le \max(AV + 20,\ CS + 20)$ and $CS - DV \le -30$ is dependent upon the value of the 300 ns constraint from AV to CS. Increase the 300 ns constraint to 600 ns and the algorithm takes twice as long.
In addition, the limit of CS's maximum skew relative to AV as the 300 ns constraint is raised towards infinity is $-10$, indicating that the constraint is redundant. However, if the constraint is completely removed, the algorithm will give a final bound of $\infty$ for CS relative to AV. If we assume that all events must occur eventually, then an infinite bound simply indicates that we do not know the relationship between event occurrence times. In this case, an infinite maximum skew between the events is wrong: we know that they will occur and that CS must occur at least 10 ns before AV.
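Both effects can be reproduced with a deliberately simplified relaxation loop. The sketch below is not the MD algorithm itself; it is a plain fixed-point iteration over the three SRAM constraints named above, with all names (`naive_relax`, `sram`) chosen for illustration. It counts how many sweeps change a bound when the AV-to-CS limit is 300, when it is 600, and when it is absent.

```python
import math

def naive_relax(constraints, events, origin):
    """Apply every constraint repeatedly until no upper bound changes.
    Returns (bounds relative to origin, number of sweeps that changed a bound)."""
    bound = {e: math.inf for e in events}
    bound[origin] = 0
    sweeps = 0
    changed = True
    while changed:
        changed = False
        for target, terms in constraints:
            candidate = max(bound[src] + delta for src, delta in terms)
            if candidate < bound[target]:
                bound[target] = candidate
                changed = True
        if changed:
            sweeps += 1
    return bound, sweeps

def sram(limit=None):
    """SRAM constraints; limit is the optional bound in CS - AV <= limit."""
    cs = [("DV", [("AV", 20), ("CS", 20)]),  # DV <= max(AV + 20, CS + 20)
          ("CS", [("DV", -30)])]             # CS - DV <= -30
    if limit is not None:
        cs.append(("CS", [("AV", limit)]))
    return cs

EVENTS = ["AV", "CS", "DV"]
b300, s300 = naive_relax(sram(300), EVENTS, "AV")
b600, s600 = naive_relax(sram(600), EVENTS, "AV")
b_none, s_none = naive_relax(sram(None), EVENTS, "AV")
print(b300["CS"], s300, s600, b_none["CS"])
```

With this sweep order the bound on CS reaches -10 after 32 changing sweeps for the 300 ns limit and after 62 for the 600 ns limit, since each trip around the negative cycle lowers the bound by only 10. With the limit removed, the bounds never leave infinity, which is the second drawback described above.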
C An Improved Verification Algorithm
We now introduce our new "short circuiting" verification algorithm. If a "negative cycle" can be discovered, we can then predict how many times the constraints along that cycle can be re-applied. This information can be used to speed up the performance of the McMillan and Dill algorithm.
[Figure 5: Optimized Constraint Relaxation Algorithm]
Input: Event set X and constraint set C
Result: x_j contains tight upper bound on (x_j - x_0)
Set all bounds x_j, j ≠ 0, to symbolic quantity V. Set x_0 to 0.
Repeat:
    Repeat n times: "Update" subroutine:
        If a constraint c_i exists that can reduce the bound on an x_j, update x_j to reflect c_i and record c_i as the most recent to update x_j.
Since we assume that all events will eventually happen, it is correct to define the problem using the limit of the maximum skews as an initial bound on all maximum skews goes to infinity. This allows us to accurately handle cases such as that of Figure 4 with the redundant 300 ns constraint removed. If we define the dependency graph of the system to be the subgraph induced by those constraints which were used to provide the current bound on each node, then patterns of repeated constraint application appear as strongly connected components in this dependency graph. To calculate the limit of the maximum skews as an initial bound goes to infinity, we begin the algorithm by setting the maximum skews of all nodes in the graph to the symbolic constant V, with the exception of one node whose time is set to 0 to serve as the origin of the time measurement. We assume that V is a very large number, and so perform all calculations involving it symbolically. An intuitive description of the algorithm follows; pseudocode is given in Figure 5.
The short circuit algorithm cycles through the following four steps:
1. Pass through n rounds of the Update subroutine, where n is the number of events in the system. The Update subroutine applies to each event the constraint that most greatly reduces its bound. During this process, the dependency graph summarizing which constraint was used most recently to update each event's maximum skew is maintained. After n rounds, any current cyclic behavior will appear, since every cycle has at most n nodes on it.
2. Perform a strongly connected components analysis of the dependency graph. "Negative cycles" will appear as strongly connected components in this directed graph and may be any strongly connected component, not just simple cycles.
3. Among such components of size ≥ 2, find the topologically first ones. These indicate the constraint dependencies which may be profitably "short circuited".
4. For each of these components, find all constraints whose arcs have their tails outside the component (called exterior arcs). In Figure 6, the only such exterior arc is from AV to DV. When the constraint relaxation procedure is exhibiting cyclic behavior, it will continue to do so until one of the exterior arcs provides the actual bound on the node it points to. We discover which node will limit the cycle by comparing the current skew bounds of all nodes that such exterior arcs enter with the value they would have if the interior arcs (those arcs with tails inside the component) were to be removed. Whichever of these has the least difference between the current and exterior-provided skew bounds is chosen as the "winner", and we update that node's skew to match the incoming arc.
[Figure 6: Figure 4 with the redundant 300 ns constraint removed.]
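The component analysis and the choice of a winner can be sketched on the two-event cycle of the SRAM example. This is an illustration under stated assumptions, not the paper's implementation: bounds are written as pairs `(v, c)` standing for v·V + c, the state shown is the one reached after two update rounds starting from V, the component search is a simple reachability test suited to tiny graphs, and the "topologically first" selection is skipped because there is only one component. All function names are hypothetical.

```python
CONSTRAINTS = [
    ("DV", [("AV", 20), ("CS", 20)]),  # DV <= max(AV + 20, CS + 20)
    ("CS", [("DV", -30)]),             # CS - DV <= -30
]

def reachable(graph, start):
    """Nodes reachable from start along one or more arcs of graph."""
    seen, stack = set(), [start]
    while stack:
        for nxt in graph.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def cyclic_components(graph, nodes):
    """Strongly connected components containing at least two nodes."""
    reach = {n: reachable(graph, n) for n in nodes}
    comps = []
    for n in nodes:
        comp = {m for m in nodes if m in reach[n] and n in reach[m]}
        if len(comp) >= 2 and comp not in comps:
            comps.append(comp)
    return comps

def short_circuit(component, bound, constraints):
    """Among nodes entered by exterior arcs, pick the one whose exterior-provided
    bound is closest to its current bound; return (node, new bound)."""
    candidates = []
    for target, terms in constraints:
        exterior = [(s, d) for s, d in terms if s not in component]
        if target in component and exterior:
            v, c = max((bound[s][0], bound[s][1] + d) for s, d in exterior)
            gap = (bound[target][0] - v, bound[target][1] - c)
            candidates.append((gap, target, (v, c)))
    gap, node, new_bound = min(candidates)
    return node, new_bound

NODES = ["AV", "CS", "DV"]
# Dependency graph: CS last bounded DV, and DV last bounded CS.
DEPENDS = {"CS": {"DV"}, "DV": {"CS"}}
# Bounds as (v, c) = v*V + c: AV = 0, CS = V - 40, DV = V - 10.
BOUND = {"AV": (0, 0), "CS": (1, -40), "DV": (1, -10)}

component = cyclic_components(DEPENDS, NODES)[0]
print(short_circuit(component, BOUND, CONSTRAINTS))
```

Here the only exterior arc is AV to DV, so DV is the winner and its bound is set to 20; one further application of $CS - DV \le -30$ then yields the bound of -10 on CS.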
Note that the last step is where the symbolic value V becomes useful: a component may have all nodes with values containing a V term when all exterior arcs provide potential bounds not containing V. In such a case, the MD algorithm will erroneously calculate an infinite maximum skew for all nodes in the connected component. We assume that any value containing a V is larger than any value not containing V, and this allows us to short-circuit these components as well. Note that the use of V also allows us to apply ShortCircuit to systems that contain variables with true upper bounds of infinity. These variables will be precisely those whose final bound as given by the algorithm still includes a V term.
In Figure 7 we show the results of applying the short-circuit algorithm to the graph in Figure 4, without the redundant constraint $CS - AV \le 300$, since we can now handle an initial upper bound of infinity on $CS - AV$.
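One simple way to realize the ordering assumption on V is to store each bound as a pair and rely on lexicographic comparison. The representation below is an assumption made for illustration (the paper does not prescribe one): a bound `(v, c)` stands for v·V + c, and Python's tuple ordering makes every bound with a V term exceed every bound without one.

```python
V = (1, 0)      # symbolic initial bound: "very large"
ZERO = (0, 0)   # bound of the origin event

def add(value, delta):
    """Add a plain number delta to a symbolic bound (v, c) = v*V + c."""
    v, c = value
    return (v, c + delta)

def has_v(value):
    """True if the bound still contains a V term (a true upper bound of infinity)."""
    return value[0] != 0

def show(value):
    """Human-readable form, for example 'V - 30' or '20'."""
    v, c = value
    if v == 0:
        return str(c)
    sign = "+" if c >= 0 else "-"
    return "V" if c == 0 else f"V {sign} {abs(c)}"

# V - 30 is still larger than any finite constant, however big.
print(add(V, -30) > (0, 10**9))
# max(AV + 20, CS + 20) with AV = 0 and CS = V evaluates to V + 20.
print(show(max(add(ZERO, 20), add(V, 20))))
```

Any event whose final bound still satisfies `has_v` would be reported as having no finite upper bound relative to the origin.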
D Practical Results
Each of the $n$ update rounds takes time at most $|C|$, where $|C|$ is the number of terms $x_j + \delta_{j,i}$ in the constraint set. The topological information takes time at most $O(|C|)$ to calculate. We have unfortunately been unable to determine a tight bound on the number of short circuiting passes that must be made in the worst case. It is our intuition, however, that the number of required passes is polynomial, and we have been unable to generate any example that takes more than $P = O(|C|)$ such passes. The algorithm must be run once for each possible assignment of $x_0$, thus giving a bound of $n P (n|C| + |C|)$ to determine all $n^2$ maximum event separations in the worst case, which we feel is probably $n^6$. For practical problems, $|C|$ is $O(n)$, giving a likely bound of $n^4$. In contrast, the bound for the MD algorithm is $n^3 \sum |\delta|$ in the worst case and $n^2 \sum |\delta|$ for the practical case. We would expect that $n^2$ is much less than the sum of the $\delta$'s for practical problems. An absolute worst case on the number of passes required by our algorithm is $T$, where $T$ is the number of distinct rooted trees over the constraint set $C$. This bounds the number of different dependency graphs we will see during the short circuiting portion of the algorithm: it can be shown that with each pass, the portions of the dependency graph which topologically precede all strongly connected components of size greater than one must be distinct.
We have implemented the algorithm and run both practical examples [7, 3] and randomly generated larger examples built to look like practical examples (i.e., similar constraint sizes and constraint type ratio). In these cases no more than three short circuiting phases were required to find maximum skews relative to a single event. Running times were on the order of 20 seconds on a DECstation 5000 to find all $n^2$ maximum skews for a dense constraint graph with 80 nodes, which is much larger than we expect to see in practice.
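The arithmetic behind the two conjectured exponents can be spelled out as follows. The derivation assumes, as the text does, that $P = O(|C|)$ passes suffice; it further assumes that a dense constraint graph has $|C| = O(n^2)$ terms, which is not stated explicitly above.

```latex
\begin{align*}
n \cdot P \cdot \bigl(n|C| + |C|\bigr)
  &= O\bigl(n \cdot |C| \cdot n|C|\bigr)
   = O\bigl(n^2 |C|^2\bigr)
  && \text{using } P = O(|C|) \\
|C| = O(n^2) &\;\Longrightarrow\; O\bigl(n^2 \cdot n^4\bigr) = O(n^6)
  && \text{(dense worst case)} \\
|C| = O(n) &\;\Longrightarrow\; O\bigl(n^2 \cdot n^2\bigr) = O(n^4)
  && \text{(practical case)}
\end{align*}
```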
IV Applications to Synthesis
In this section, we define a synthesis problem over systems of propagation delays and performance constraints, and show why previous verification algorithms are inadequate to perform synthesis tasks. We use an example problem to give an intuition for the differences between the verification and synthesis problems, and provide a new constraint taxonomy to distinguish the two problems.
A A Taxonomy of Constraint Types
Given a collection of devices whose temporal behavior is fully specified and a fully synthesized interface circuit, the verification problem determines that the interface circuit meets the timing requirements of the components it interconnects. In contrast, as we synthesize interface logic, we would like all the timing constraints provided to guide our synthesis process. In particular, we wish to utilize the circuit's required timing constraints to determine allowable propagation delays for that circuit. To accomplish this, we partition the types of delays encountered during the synthesis procedure into the following two orthogonal categories:
Propagation Delays vs. Timing Constraints: Both circuit and environment may include structural timing information in the form of propagation delays. The environment may include timing requirements, which indicate the allowable time separations of inputs to the environment, and timing guarantees, which summarize its temporal behavior.
Constrainable vs. Unconstrainable ranges: Constrainable delays and performance measures indicate time ranges which may be further constricted, as needed, to create a consistent circuit-environment combination; unconstrainable ranges represent elements for which the circuit must function correctly for arbitrary delay behavior anywhere within the given range.
Figure 8 gives a summary of these categories, which are here more completely described from top to bottom.
Constrainable Propagation Delays:
Timing behavior for which logic is either not completely synthesized, in the case of interface circuitry, or for which timing behavior can be modified, in the case of the environment.
Unconstrainable Propagation Delays:
Timing behavior for which logic is already synthesized.
Constrainable Timing Guarantees:
Environment timing behavior that is modifiable, but for which no explicit structural information is provided.
Unconstrainable Timing Guarantees:
Unmodifiable environment behavior for which no explicit structural information is provided.
Constrainable Timing Requirements:
Performance requirements may always be over-met.
... requirements, and the synthesis problem consists of constraining all constrainable propagation delays and constrainable guarantees until they, in conjunction with their unconstrainable counterparts, meet all of the timing requirements.
The algorithms of [1] and [3] are both verification algorithms, meaning that given a set of constraints, they determine maximum bounds on the separation of signal events given that all constraints hold. In contrast, a synthesis algorithm must determine bounds on the time available for a circuit to generate an event to ensure that the desired constraints hold.
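One way to carry this taxonomy through a tool is to tag each constraint with its two orthogonal attributes. The sketch below is purely illustrative: the class and field names are hypothetical, and the classification of the two example constraints is assumed for the sake of the example rather than taken from the paper's figures.

```python
from dataclasses import dataclass
from enum import Enum

class Kind(Enum):
    PROPAGATION_DELAY = "propagation delay"
    TIMING_GUARANTEE = "timing guarantee"
    TIMING_REQUIREMENT = "timing requirement"

@dataclass(frozen=True)
class Constraint:
    target: str
    terms: tuple          # ((source, delta), ...): target <= max(source + delta)
    kind: Kind
    constrainable: bool   # may the range be narrowed during synthesis?

def partition(constraints):
    """Split into (constrainable, unconstrainable) lists, preserving order."""
    yes = [c for c in constraints if c.constrainable]
    no = [c for c in constraints if not c.constrainable]
    return yes, no

# Assumed classification: the SRAM's internal delay is already fixed,
# while a requirement may always be over-met and is therefore constrainable.
sram_delay = Constraint("DV", (("AV", 20), ("CS", 20)), Kind.PROPAGATION_DELAY, False)
cs_limit = Constraint("CS", (("AV", 300),), Kind.TIMING_REQUIREMENT, True)

constrainable, unconstrainable = partition([sram_delay, cs_limit])
print([c.target for c in constrainable], [c.target for c in unconstrainable])
```

Keeping the kind and the constrainable flag as separate fields reflects the orthogonality of the two categories: any kind of constraint can, in principle, carry either flag.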
B Solutions are Inherently Disjoint
Consider the simple system in Figure 9. The environment provides output A, and some time later expects two inputs, at B and C, from which it generates the output at D. If we must synthesize the logic that produces B and C, we have two problems:
1. They must be synthesized quickly enough after A occurs that the environment can produce D within the 100 ns maximum time from A.
2. At least one of them must produce output late enough to keep D from happening before its minimum 80 ns time bound from A.
Note that we need not cause both B and C to occur later in order to meet the minimum time constraint. If we synthesize logic to produce B from A within the delay range $\langle 30, 60 \rangle$, then we must synthesize C from A with delay exactly 50. Alternatively, if we generate C from A with delay range $\langle 20, 40 \rangle$, we can synthesize B from A anywhere in the range $\langle 60, 70 \rangle$. Note that these are two disjoint solutions.
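The disjointness can be checked numerically under one loud assumption: the environment delays of Figure 9 are not reproduced in the text, so the sketch below assumes D occurs at max(B + d_BD, C + d_CD) with d_BD in ⟨20, 30⟩ and d_CD in ⟨30, 50⟩. These values are hypothetical, chosen only because they are consistent with the two solutions quoted above; the 80 ns and 100 ns bounds on D come from the text.

```python
# Assumed environment delays (not given in the text).
D_BD = (20, 30)        # delay from B to D, assumed
D_CD = (30, 50)        # delay from C to D, assumed
D_WINDOW = (80, 100)   # D must occur between 80 and 100 ns after A

def d_range(b, c):
    """Earliest and latest time of D after A, given (min, max) ranges for B and C."""
    earliest = max(b[0] + D_BD[0], c[0] + D_CD[0])
    latest = max(b[1] + D_BD[1], c[1] + D_CD[1])
    return earliest, latest

def feasible(b, c):
    """True if D always falls inside its required window."""
    earliest, latest = d_range(b, c)
    return D_WINDOW[0] <= earliest and latest <= D_WINDOW[1]

print(feasible((30, 60), (50, 50)))  # first solution from the text
print(feasible((60, 70), (20, 40)))  # second solution from the text
print(feasible((30, 60), (20, 40)))  # loosest range of each: D can occur too early
```

Under these assumed delays each quoted solution is feasible on its own, but combining the loosest range for B with the loosest range for C is not, so no single pair of ranges covers both solutions.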
C Towards a Synthesis Algorithm
The differences between the verification and synthesis problems are fundamental: for a given system there is one correct answer to the verification problem, but perhaps none or many correct synthesis solutions. However, we note that the following steps can greatly improve the bounds returned by the verification algorithm.
1. Perform the verification algorithm only on the Unconstrainable constraints to get unconstrainable skews.
2. Perform the verification algorithm only on the Constrainable constraints to get constrainable skews.
3. Replace each unconstrainable skew between $x_i$ and $x_j$ with a "flipped" constraint between $x_i$ and $x_j$.
4. Create new synthesis constraints as possible from combinations of one constrainable skew and one of the "flipped" unconstrainable skews.
5. Run the algorithm on this last set of synthesis constraints.
The resulting bounds on skews between nodes connected by Constrainable Delay constraints provide tighter bounds on how those arcs may be synthesized than are obtainable from the verification algorithm alone. Bounds on Constrainable relationships are narrowed subject to Unconstrainable constraints taking on their worst case delays. This procedure essentially requires that all Timing Requirements hold no matter where an Unconstrainable Delay occurs within its allowed range. The bounds can then be used to guide an iterative heuristic synthesis procedure to determine final bounds on all Constrainable Delays and Constrainable Guarantees.
V Conclusions and Future Work
We have discussed the differences in verifying and synthesizing interface circuit systems specified with propagation delays, performance guarantees, and performance requirements, and provided a taxonomy of constraint types to express the range of desired behaviors. A new algorithm was presented for satisfying systems of constraints as arise in interface timing verification. This algorithm improves upon the previous work of McMillan and Dill [3] in two ways: it robustly handles infinite delay bounds, and its worst case running time is not dependent on the individual delay values of the constraints. We have proven the algorithm correct (the complete algorithm and proof of correctness can be found in [8]), and shown that it is practically applicable. In addition, we have shown how to modify the verification algorithm to more readily handle synthesis tasks.
Currently we are working on determining the verification algorithm's theoretical time performance bounds, as well as developing a full synthesis procedure.
