Generalized symbolic trajectory evaluation (GSTE) is a powerful, new method for formal verification that combines the industrially proven scalability and capacity of classical symbolic trajectory evaluation with the expressive power of temporal-logic model checking. GSTE was originally developed at Intel and has been used successfully on Intel's next-generation microprocessors. However, the supporting algorithms and tools for GSTE are still relatively immature.
INTRODUCTION
Generalized symbolic trajectory evaluation (GSTE) is a powerful, new method for formal design verification [24]. GSTE is based on classical symbolic trajectory evaluation [20], which has proven itself able to handle large, industrial designs and has been in active use at Compaq (now HP), IBM, Intel, and Motorola (e.g., [16, 14, 1, 6]). Classical symbolic trajectory evaluation, although efficient, is very limited in the types of properties that it can specify and verify.
GSTE extends classical symbolic trajectory evaluation to handle a full range of temporal properties (all ω-regular properties), giving it comparable expressive power to more established model-checking approaches [S, 17, 22, 12], while still maintaining the efficiency and capacity of classical symbolic trajectory evaluation. GSTE was originally developed at Intel and has been used successfully on Intel's next-generation microprocessors, where users reported superior efficiency and capacity for some challenging formal verification tasks [4].
However, a formal verification algorithm alone, no matter how automatic or efficient, does not constitute an entire verification flow.
In practice, numerous supporting algorithms and tools are needed to connect any given formal verification method to the overall verification effort. Especially needed for formal verification are the abilities to quickly debug and revise specifications before substantial effort is invested in the formal verification process, to generate counterexamples and diagnose the causes of bugs, to relate and compose smaller verification results to solve a larger verification problem, and to bridge between different verification methods (e.g., formal vs. simulation-based) and between different levels of abstraction (e.g., system-level vs. RTL). GSTE, being a very recent development, currently has less of this supporting infrastructure than older formal verification methods do. For example, the efficiency of GSTE model checking relies, in part, on the particular specification style used, but no work has been published connecting GSTE specifications to other verification methods.
Jin Yang, Strategic CAD Lab, Intel Corporation, jin.yang@intel.com
*Work done while visiting Intel SCL. Copyright 2003 ACM 1-58113-762-1/03/0011 ... $5.00.
An especially useful piece of methodological glue is the monitor circuit. A monitor is simply a small circuit that watches, without interfering, the system being verified and flags whether or not the system is obeying some user-specified correctness property. Implementing the monitor as a circuit (rather than in, for example, a formal specification language) allows the same monitor to be used at all levels of the design cycle and with both formal and informal verification tools. Extensive research has demonstrated the value of monitor circuits as the cornerstone of a practical verification methodology [3], as an enabler of hierarchical, compositional verification [11, 21, 10], and as a testbench generator for simulation [26]. Monitor circuits could even be synthesized into an emulation system to aid error observability and debugging.
This paper presents a linear-time, linear-size translation from the specifications used by GSTE into monitor circuits, thereby enabling new ways to integrate GSTE into the verification flow. Our generated monitor circuits handle fully general GSTE specifications if the simulator can perform a small amount of symbolic simulation; we also describe how GSTE specifications with some restrictions could be translated into monitor circuits suitable for fully scalar simulation or emulation. The immediate application for our translation is to allow quick, (symbolic-)simulation-based "sanity checks" of GSTE specifications before trying to apply model checking. In practice, considerable effort is often spent combating state-space explosion before a formal verification engine yields its first counterexamples, and we would like to avoid this effort until we have eliminated simple specification errors. If we can simulate our formal specifications, we can quickly catch many erroneous specifications before investing in trying (and failing) to formally prove them correct. We also envision the generated monitors connecting GSTE-based and monitor-based verification methodologies. In addition, our monitor construction is a building block for initial work on compositional verification with GSTE [9].
GSTE AND ASSERTION GRAPHS
GSTE is explained in detail in several sources (e.g., [24, 25, 23]). Here, we give only a brief overview of the specification style used by GSTE in order to make this paper self-contained. GSTE is a model-checking method, where the possible behaviors of the system being verified are considered to be the (usually infinite) set of all possible execution traces, and verification consists of checking that all of these traces obey the specification. The specification in GSTE is called an assertion graph, and is basically a special kind of automaton. One can think of the assertion graph as defining the set of execution traces that it accepts (i.e., the execution traces that obey the specification), so the verification problem is to check that the set of execution traces the system can produce is contained in the set of execution traces that the assertion graph accepts (i.e., GSTE model checking follows the language containment paradigm, as advocated by, for example, Cospan [12]).
Figure 1: GSTE Assertion Graph Example. The property specified is that a value written to a memory will be read correctly an arbitrary number of cycles later, subject to alignment and masking operations, provided it was not overwritten. Edges are labeled by an antecedent followed by a consequent. Every path through the assertion graph is a temporal assertion: if every antecedent along that path is satisfied in the system being verified at the corresponding clock cycle, then every consequent must be satisfied as well.
Figure 1 shows an example assertion graph, adapted from [24]. It was used in the verification of an industrial memory design, which reads and writes data with a variety of selection and alignment options. The property being verified is that, if data value D is written to address A, followed by an arbitrary number of clock cycles that don't overwrite the same address, followed by a read of the address, then the value returned is the value that was written, appropriately aligned and masked. The edge labels are of the form "antecedent / consequent", where the antecedents and consequents are simply combinational formulas over the state of the system at a given clock cycle. For example, the antecedent WRITE specifies that the value of the write-enable input we is high, that the address input addr is equal to some value A, and that the data input datawr is equal to some value D. The capital letters denoting values, like A, D, etc., are symbolic constants that can be equal to any value, making the verification result hold for all possible values of the symbolic constants. A path is a sequence of edges that starts from the initial vertex v0.
A terminal path is a path that ends with a terminal edge (indicated in the figure by a tick mark on the edge, e.g., the edge from v2 to v3). A path accepts an execution trace if at least one antecedent on that path fails (evaluates to false on the state of the system at that clock cycle) or if all antecedents and all consequents on the path succeed (evaluate to true on the corresponding clock cycle). Intuitively, a path is an if-then assertion: the antecedents say if the assertion is relevant; the consequents say what must hold in the case that the assertion is relevant. If an antecedent fails, the assertion is vacuously true; if all the antecedents are satisfied, then the consequents must be satisfied as well. The assertion graph as a whole accepts an execution trace if every terminal path in the assertion graph accepts that trace.² Intuitively, the assertion graph takes a potentially infinite set of assertions about the system and rolls them up into a graph; therefore, every trace must satisfy every assertion (vacuously or otherwise).
² GSTE theory actually includes four different kinds of acceptance: strong, in which all finite paths must be satisfied; terminal, described here, in which all paths that end at terminal edges must be satisfied; normal, in which all infinite-length paths must be satisfied; and fair, in which all fair (generalized Büchi fairness) paths must be satisfied. Since we are building monitor circuits that should work with simulation and synthesis, we are concentrating on terminal acceptance, which includes strong acceptance as a special case.
To someone familiar with formal verification theory, two characteristics of assertion graphs stand out: the antecedent/consequent labeling of edges, and the graph accepting based on acceptance for all paths. The antecedent/consequent style comes from classical symbolic trajectory evaluation [20] and is a natural way to specify temporal properties. For example, timing diagrams are typically interpreted this way (e.g., if some sequence of events happens, then some other events must happen) [2]. In addition, the GSTE model-checking algorithm exploits the explicit antecedent/consequent labeling to limit the search space during fixpoint computation, providing some efficiency gains and aiding user control of the model checker. The "for all paths" acceptance criterion makes assertion graphs a variety of ∀-automata [13], which are less familiar than the usual existential acceptance of non-deterministic automata (where a trace is accepted if there exists a corresponding path through the automaton), but the ∀ semantics provides both usability and efficiency benefits. The usability arises because an assertion graph defines a set of assertions, and one typically wants all assertions to be true; in contrast, usually with automata as specifications, the automaton directly defines a set of possible behaviors, so verification consists of determining if the system's behavior exists in the set provided by the specification. The theoretical basis for the efficiency is that a ∀-automaton is essentially pre-complemented, so model checking can bypass the expensive step of complementing a non-deterministic automaton. GSTE shares this theoretical efficiency advantage with other approaches that have used ∀-automata as specifications [13, 12, 2].
As an aside, we note that assertion graphs are a low-level specification style. On one hand, this low-level orientation gives the user precise, fine-grained control over the GSTE model-checking algorithm, making it easier to avoid blow-up when verifying large designs. On the other hand, in an overall verification flow, translating from a higher-level property specification language into assertion graphs might enhance ease-of-use. We are not advocating assertion graphs as the ultimate property specification formalism; rather, we accept that assertion graphs are an entry point to an industrially proven, practically efficient formal verification approach (GSTE model checking) and seek to provide methodological support for dealing with assertion graphs.
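The acceptance rule described in this section can be made concrete with a small reference checker. The following is only a sketch under stated assumptions: it enumerates terminal paths by brute force on a finite trace, the three-vertex graph is a simplified stand-in for Figure 1 (not the graph from the figure), the signals re and dataout are hypothetical, and the symbolic constants A and D are fixed to one concrete guess.

```python
from typing import Callable, Dict, Iterator, List, Tuple

State = Dict[str, int]            # signal name -> value at one clock cycle
Pred = Callable[[State], bool]    # combinational formula over one state
# (source vertex, destination vertex, antecedent, consequent, is_terminal)
Edge = Tuple[str, str, Pred, Pred, bool]


def path_accepts(path: List[Edge], trace: List[State]) -> bool:
    """A path accepts if some antecedent fails or if every consequent holds."""
    steps = list(zip(path, trace))
    if any(not ant(s) for (_, _, ant, _, _), s in steps):
        return True  # the assertion is vacuously true
    return all(cons(s) for (_, _, _, cons, _), s in steps)


def terminal_paths(edges: List[Edge], start: str, length: int) -> Iterator[List[Edge]]:
    """Yield every path of `length` edges from `start` whose last edge is terminal."""
    def extend(vertex: str, prefix: List[Edge]) -> Iterator[List[Edge]]:
        if len(prefix) == length:
            if prefix and prefix[-1][4]:
                yield prefix
            return
        for edge in edges:
            if edge[0] == vertex:
                yield from extend(edge[1], prefix + [edge])
    yield from extend(start, [])


def graph_accepts(edges: List[Edge], start: str, trace: List[State]) -> bool:
    """Terminal acceptance on a finite trace: every terminal path must accept."""
    return all(
        path_accepts(path, trace[:n])
        for n in range(1, len(trace) + 1)
        for path in terminal_paths(edges, start, n)
    )


# Hypothetical example: one concrete guess for the symbolic constants.
A, D = 1, 7
TRUE: Pred = lambda s: True


def st(we=0, re=0, addr=0, datawr=0, dataout=0) -> State:
    return dict(we=we, re=re, addr=addr, datawr=datawr, dataout=dataout)


EDGES: List[Edge] = [
    ("v0", "v1", lambda s: s["we"] == 1 and s["addr"] == A and s["datawr"] == D, TRUE, False),
    ("v1", "v1", lambda s: not (s["we"] == 1 and s["addr"] == A), TRUE, False),
    ("v1", "v2", lambda s: s["re"] == 1 and s["addr"] == A, lambda s: s["dataout"] == D, True),
]

good = [st(we=1, addr=1, datawr=7), st(), st(re=1, addr=1, dataout=7)]
bad = [st(we=1, addr=1, datawr=7), st(), st(re=1, addr=1, dataout=3)]
print(graph_accepts(EDGES, "v0", good), graph_accepts(EDGES, "v0", bad))  # True False
```

The enumeration grows exponentially with trace length, so it is useful only as a specification of what acceptance means, not as a practical checker.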
MONITOR CIRCUIT CONSTRUCTION
We now present how to construct a monitor circuit for an assertion graph. The construction runs in linear time and produces a circuit that is linear size relative to the size of the assertion graph. Our approach is inspired by very efficient methods for generating circuits directly from regular expressions [19, 18, 15]. The monitor circuit should watch the system being verified and check that the execution trace so far is legal. To check whether the execution trace is legal, the monitor must verify that all (terminal) paths in the assertion graph accept the execution trace. The intuition of our construction is as follows:
Imagine that the monitor circuit has an internal copy of the assertion graph. The monitor can use this copy to track all paths in the assertion graph by placing tokens on the edges. Each token indicates that there is a path that ends at that edge, on that clock cycle. The tokens move forward one edge at each clock cycle, and fork into multiple tokens when a vertex has multiple outgoing edges (corresponding to multiple continuations of the current path). Each token corresponds to a path, which must accept the execution trace so far.
For example, consider the assertion graph in Figure 1 after, say, 2 clock cycles. At that point, there would be 2 tokens: one on the edge from v1 to v2, corresponding to the path from v0 to v1 to v2; and another token on the self-loop on v1, corresponding to the path from v0 to v1 and looping to v1 again. After another clock cycle, the first token would move to the edge between v2 and v3, and the second token would fork into two tokens: one on edge v1-v2, corresponding to path v0-v1-v1-v2; the other on self-loop v1-v1, corresponding to path v0-v1-v1-v1.
The actual implementation follows this intuition closely. The structure of the generated circuit itself forms the "copy of the assertion graph". Latches at each edge are set or cleared to indicate the presence or absence of tokens. Vertices are basically fan-out stems, distributing incoming tokens to all outgoing edges. The challenge is to keep the number of tokens finite (and small) to make creating a circuit possible.
The key insight is that paths are almost memoryless. All paths that reach an edge at some point in time share the same future. The only difference between paths is three different kinds of pasts, which we dub "blessed", "happy", and "condemned":
Blessed. Blessed paths have had an antecedent fail already. These paths will always accept, from here to eternity, and need not be tracked by the circuit.
Happy. Happy paths have had all antecedents and all consequents succeed so far. These paths are currently accepting, but their continuations may or may not accept extensions of the current trace.
Condemned. Condemned paths have had all antecedents succeed, but at least one consequent has already failed. These paths do not accept (and therefore, the existence of any condemned paths means the current trace is rejected), but the continuations from these paths may eventually become blessed.
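The three kinds of pasts can be illustrated with a short sketch. The names Past, classify, step, and classify_incremental are hypothetical and do not come from the construction itself. The point of the sketch is that the kind of a path can be updated one clock cycle at a time from the previous kind alone, which is what "almost memoryless" means here.

```python
from enum import Enum
from typing import List


class Past(Enum):
    BLESSED = "blessed"      # some antecedent has already failed
    HAPPY = "happy"          # all antecedents and all consequents succeeded so far
    CONDEMNED = "condemned"  # all antecedents succeeded, some consequent failed


def classify(ants: List[bool], conss: List[bool]) -> Past:
    """Classify a path from the full history of its antecedent/consequent results."""
    if not all(ants):
        return Past.BLESSED
    if all(conss):
        return Past.HAPPY
    return Past.CONDEMNED


def step(past: Past, ant_ok: bool, cons_ok: bool) -> Past:
    """Update the kind using only the previous kind and the current cycle's results."""
    if past is Past.BLESSED or not ant_ok:
        return Past.BLESSED
    if past is Past.HAPPY and cons_ok:
        return Past.HAPPY
    return Past.CONDEMNED


def classify_incremental(ants: List[bool], conss: List[bool]) -> Past:
    past = Past.HAPPY  # the empty path: nothing has failed yet
    for ant_ok, cons_ok in zip(ants, conss):
        past = step(past, ant_ok, cons_ok)
    return past


print(classify_incremental([True, True, False], [True, False, True]))  # Past.BLESSED
```

A condemned path whose next antecedent fails becomes blessed, matching the remark that continuations of condemned paths may eventually become blessed.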
Because of the limited history information required, the circuit can merge all tokens of the same type that arrive at a given edge at the same time. Therefore, the number of latches required to track all the tokens is only two per edge: one to track if any happy paths are at this edge, and the other to track if any condemned paths are at this edge. The circuit structure perfectly matches the structure of the assertion graph, with sub-circuits corresponding to each edge and each vertex of the assertion graph. The circuit works by passing tokens from edge to edge, updating the kinds of tokens depending on whether the current antecedents and consequents succeed or fail. More formally, given an assertion graph G, we build a monitor circuit such that the set of traces that the assertion graph accepts (under terminal satisfiability) is the same as the set of input sequences that cause the monitor circuit's accept output to be high. We assume that the labels on the edges for the antecedents ant(e) and consequents cons(e) are given as combinational logic over the signals (e.g., the inputs and state variables) of the system being verified, as well as over the symbolic constant latches (defined below).
The monitor circuit has inputs that are driven by signals in the system being verified. In particular, each signal name that is mentioned in the assertion graph G is an input to the monitor circuit. There is one additional input, init. The monitor has one output, accept. For all the symbolic constants in G, the monitor circuit has latches that are initialized to non-deterministic values, and then hold the values indefinitely.
The monitor circuit is composed of sub-circuits, one for each vertex and each edge in G. These sub-circuits are wired together exactly as the graph G is connected, with two signals, happy and condemned, between sub-circuits. For example, if directed edge e ends at vertex v, then the happy_out and condemned_out outputs from the circuit for e connect to a happy_in and condemned_in input of the circuit for v.
The sub-circuit for a vertex v is combinational. It basically ORs all incoming happy_in signals and fans the result out to all of its happy_out outputs. Likewise, it ORs all of its incoming condemned_in signals and fans the result out on all of its condemned_out outputs. The circuit for the initial vertex v0 has an additional input, which is ORed in with the happy signals. This input is used to start the first token in the circuit at initialization, described below.
The sub-circuit for an edge e is more complicated (Figure 2). Build combinational circuits ant(e) and cons(e) that check the current value on the monitor's inputs against this edge's antecedent and consequent. By combining happy_in, condemned_in, ant(e), and cons(e), we can compute the correct values of happy_out and condemned_out, which will be delayed by one cycle in latches, as well as output signals alive(e) and reject(e), which will be used to determine whether the monitor circuit overall accepts or rejects. The signal alive(e) indicates if there exist any unblessed paths at this edge at this point in time. The signal reject(e) indicates if there exist any condemned paths at this edge at this point in time. Intuitively, alive(e) is just happy ∨ condemned, and reject(e) is just condemned. However, to allow the monitor to respond immediately to the current cycle's inputs (Mealy machine), some additional combinational logic is needed:
The global accept output of the monitor is based on all terminal edges. Basically, if an edge is alive, it must not be rejecting:
Initialization is accomplished by asserting the init signal. When init is asserted, the outputs of all the edge latches (happy_out and condemned_out) are forced to 0. (This happens combinationally because the construction creates a Mealy machine.) In addition, a 1 is asserted on the extra initialization input for the initial vertex v0. For certain kinds of assertion graphs (basically, those that are expected to hold in any state of the circuit, without forcing an initialization sequence as an antecedent), we can obtain additional simulation coverage for free by continuously inserting happy tokens at the initial vertex. Doing so checks all suffixes of a given trace in a single simulation run. To enable this functionality, our construction gives the user the option of having the initialization input to v0 always tied to 1 instead of being dependent on the init signal.
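To illustrate how the vertex sub-circuits, edge sub-circuits, accept output, and init signal fit together, the following is a minimal software sketch of the monitor at the cycle level. It is not the generated circuit: the update equations for happy_out, condemned_out, alive(e), and reject(e) are assumptions chosen to match the prose description, and the class names, the re and dataout signals, the simplified three-vertex graph, and the fixed values for A and D are all hypothetical.

```python
from typing import Callable, Dict, List

State = Dict[str, int]
Pred = Callable[[State], bool]


class Edge:
    def __init__(self, src: str, dst: str, ant: Pred, cons: Pred, terminal: bool = False):
        self.src, self.dst, self.ant, self.cons, self.terminal = src, dst, ant, cons, terminal


class Monitor:
    def __init__(self, edges: List[Edge], initial: str, always_start: bool = False):
        self.edges = edges
        self.initial = initial
        self.always_start = always_start      # option: tie the v0 start input to 1
        self.happy_q = [False] * len(edges)   # one "happy" latch per edge
        self.cond_q = [False] * len(edges)    # one "condemned" latch per edge

    def step(self, state: State, init: bool) -> bool:
        """Advance one clock cycle and return the accept output for this cycle."""
        # init combinationally forces every edge latch output to 0.
        happy_q = [q and not init for q in self.happy_q]
        cond_q = [q and not init for q in self.cond_q]
        # Vertex sub-circuits: OR the outputs of all incoming edges.
        happy_v: Dict[str, bool] = {}
        cond_v: Dict[str, bool] = {}
        for edge, h, c in zip(self.edges, happy_q, cond_q):
            happy_v[edge.dst] = happy_v.get(edge.dst, False) or h
            cond_v[edge.dst] = cond_v.get(edge.dst, False) or c
        # Extra start input on the initial vertex.
        start = init or self.always_start
        happy_v[self.initial] = happy_v.get(self.initial, False) or start

        accept = True
        next_happy, next_cond = [], []
        for edge in self.edges:
            happy_in = happy_v.get(edge.src, False)
            cond_in = cond_v.get(edge.src, False)
            ant_ok, cons_ok = bool(edge.ant(state)), bool(edge.cons(state))
            # Assumed update equations (a reading of the prose, not from the source):
            happy_out = happy_in and ant_ok and cons_ok
            cond_out = ant_ok and (cond_in or (happy_in and not cons_ok))
            alive = happy_out or cond_out   # an unblessed path is on this edge now
            reject = cond_out               # a condemned path is on this edge now
            if edge.terminal and alive and reject:
                accept = False
            next_happy.append(happy_out)
            next_cond.append(cond_out)
        self.happy_q, self.cond_q = next_happy, next_cond
        return accept


def run(monitor: Monitor, trace: List[State]) -> List[bool]:
    """Assert init on the first cycle only; collect accept for every cycle."""
    return [monitor.step(s, init=(i == 0)) for i, s in enumerate(trace)]


A, D = 1, 7  # hypothetical concrete guess for the symbolic constants
TRUE: Pred = lambda s: True


def st(we=0, re=0, addr=0, datawr=0, dataout=0) -> State:
    return dict(we=we, re=re, addr=addr, datawr=datawr, dataout=dataout)


EDGES = [
    Edge("v0", "v1", lambda s: s["we"] == 1 and s["addr"] == A and s["datawr"] == D, TRUE),
    Edge("v1", "v1", lambda s: not (s["we"] == 1 and s["addr"] == A), TRUE),
    Edge("v1", "v2", lambda s: s["re"] == 1 and s["addr"] == A, lambda s: s["dataout"] == D, True),
]
good = [st(we=1, addr=1, datawr=7), st(), st(re=1, addr=1, dataout=7)]
bad = [st(we=1, addr=1, datawr=7), st(), st(re=1, addr=1, dataout=3)]
print(run(Monitor(EDGES, "v0"), good))  # [True, True, True]
print(run(Monitor(EDGES, "v0"), bad))   # [True, True, False]
```

Passing always_start=True models the option of tying the initialization input of v0 to 1. In this sketch, a trace whose write occurs one cycle after init is then still checked, whereas with the default setting the single start token is blessed on the first cycle and the later read goes unmonitored.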
This completes the construction.
The overall construction is obviously linear-size and linear-time in the size of the assertion graph, because each portion of the assertion graph requires a constant amount of work to translate and produces a constant amount of circuitry. Some straightforward optimizations are possible. For example, the acceptance signal is really based on not having any condemned paths at a terminal edge, so the alive(e) and reject(e) signals can be simplified. Also, there is no need to track happy paths if there is also a condemned token on the same edge.
From a conventional simulation or emulation perspective, our method for handling symbolic constants is problematic. The issue is that symbolic constants are intrinsically symbolic, encoding slightly different assertions for every possible value. In our construction, the latches for the symbolic constants "guess" the values of the symbolic constants; if the guess is wrong, the assertion graph accepts vacuously and incorrectly. If the symbolic constants are simulated symbolically, however, the simulation is simultaneously guessing all possible values of the constant, giving the correct results. We have chosen this translation because it is simple and compact, and because our simulator is actually a symbolic simulator, so it can correctly simulate all possible values of the symbolic constants at the same time. Note that this is very lightweight symbolic simulation, because it is simply matching the symbolic constants to what occurs in the trace being monitored; all other aspects of the assertion graph are being simulated normally. It is far less expensive than full symbolic simulation or model checking.
If symbolic simulation is not possible (e.g., if the monitor is being compiled into an emulator), then an alternative construction could be to use a three-valued (0, 1, or X) encoding for the symbolic constant latches. The latches would start uninitialized (all Xs), and the first time the symbolic constant is used, the latch would get the specific value written to it. For example, in the assertion graph in Figure 1, when the WRITE happens, the A and D symbolic constants would record the values of the addr and datawr inputs, and these values would be checked by the later edges. The cost of this alternative is higher hardware complexity and some syntactic restrictions on the assertion graphs: antecedents involving symbolic constants would have to be convertible into assignments to those constants. Symbolic constants are typically used in this manner (intuitively, to remember values for comparison later), so the syntactic restrictions may not matter in practice. A better solution may be to make assignments to symbolic constants explicit; this would also allow creating monitors that are "retriggerable", e.g., when no longer needed to monitor one transaction, the same symbolic constants could be reused to monitor another transaction.
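The capture behavior of such a three-valued latch can be sketched as follows. This is only an illustration of the idea, not an implemented part of the construction: CaptureLatch and write_antecedent are hypothetical names, None stands in for the X value, and the antecedent shown is a simplified version of WRITE.

```python
from typing import Optional


class CaptureLatch:
    """Latch for one symbolic constant; None models the uninitialized value X."""

    def __init__(self) -> None:
        self.value: Optional[int] = None

    def match(self, observed: int) -> bool:
        """Evaluate `signal == CONSTANT`: the first use assigns, later uses compare."""
        if self.value is None:
            self.value = observed
            return True
        return self.value == observed

    def reset(self) -> None:
        """Return to X so the latch can be reused for another transaction."""
        self.value = None


def write_antecedent(we: int, addr: int, datawr: int,
                     a: CaptureLatch, d: CaptureLatch) -> bool:
    # Short-circuit evaluation: the latches capture only when we is high.
    return we == 1 and a.match(addr) and d.match(datawr)


a_latch, d_latch = CaptureLatch(), CaptureLatch()
print(write_antecedent(0, 3, 4, a_latch, d_latch))  # False, nothing captured
print(write_antecedent(1, 5, 9, a_latch, d_latch))  # True, A and D captured
print(a_latch.value, d_latch.value)                 # 5 9
```

The reset method corresponds to the "retriggerable" variant: once a transaction has been checked, the latch returns to X and can capture the values of a later transaction.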
EXPERIMENTAL RESULTS
We have implemented the above translation into Intel's Forte verification system³ and report empirical results measuring the sizes of generated circuits as well as the impact of the monitor circuits on simulation speed. Our system is implemented in an interpreted, functional programming language, which provides an excellent prototyping environment at some cost in raw performance. Accordingly, run times should be considered as a relative indication of performance, rather than as absolute speed measures. All experiments were run using an Intel Pentium 4 processor running at 2.8 GHz. Memory size was not a factor in our experiments, despite the small amount of symbolic simulation used to handle symbolic constants.
³ Forte is available for download at http://www.intel.com/software/products/opensource/tools1/verification/ but our new algorithms are not yet part of the standard distribution.
We have run three types of experiments: two using realistic, but somewhat idealized, assertion graphs that are easily scalable, and a third experiment using an actual industrial circuit. The first two experiments allow us to measure the run time and the size of the generated monitor circuit as a function of different kinds of scaling of the assertion graph. The third experiment lets us measure simulation slow-down due to our generated monitors.
FIFO
The first example is a scalable family of assertion graphs used to verify a FIFO buffer. The property specified is that the empty and full signals are being set properly, and that the enqueued data is not corrupted and comes out at the right time. In this example, the size of the assertion graph, measured in terms of vertices and edges, is easily scaled for different buffer depths. Indeed, the assertion graph itself is generated via a script. Table 1 shows the results as we scale the assertion graph size for different buffer depths. The size of the generated monitor circuits is clearly growing linearly in the size of the assertion graph. The generation time appears to be growing slightly faster than linearly, probably due to implementation overheads. In any case, both the monitor sizes and the times required to generate them are quite reasonable. Some simple logic optimization would likely further reduce the sizes of the monitors.
Memory
The next example is an assertion graph for verifying memories. The property being specified is that a read from a given address will return the most recent data value written to that address, similar to the assertion graph shown in Figure 1. In this example, the structure
Industrial Cache Circuit
The last example is a real, industrial circuit. The circuit is a 32K cache with non-trivial read/write logic. (The circuit has 403972 gates and 35157 latches in total.) The property being specified/checked is similar to that for the memory example: a read will return the value of the most recent write to that location. The assertion graph has 3 vertices and 3 edges, the generated monitor circuit has 9363 gates and 46 latches, and generation took 2.7 seconds. The point of this example is to measure the slowdown of our monitor circuit on simulation.
We generated a 25000 cycle random trace. At each cycle, the write probability was 50% and the read probability was 80%. All memory cells were the target of at least one write in the trace, and all reads were to addresses that had previously been written.
Using the simulator built into our verification environment, we could simulate this trace on the cache circuit alone in 1109.6 seconds. Adding the monitor to the simulation resulted in a run time of 1283.1 seconds, or roughly a 16% overhead. Since our simulator is a symbolic simulator, the simulation run with the monitor circuit covered all possible values of the symbolic constants in a single simulation run.
In a production environment, running on a multiprocessor, it should be possible to eliminate the overhead of the monitor circuit entirely. The information flow is one-way from the system being verified to the monitor, so the monitor could be simulated asynchronously on a second processor. As long as the monitor simulation can keep up with the system simulation, there would be no slowdown. Even without further optimizations, however, the relative simulation overhead of our generated monitor is minimal.
CONCLUSION AND FUTURE WORK
We have presented a procedure to construct monitor circuits for GSTE assertion graphs. The construction is highly efficient in theory, and experimental results confirm that monitor circuits for real industrial GSTE assertion graphs can be constructed in negligible time and impose minimal simulation overhead. The ability to build monitor circuits from formal assertions has numerous applications for tying formal verification into the overall verification flow.
The most obvious direction for future work is to improve the handling of symbolic constants under simulation, as described earlier. Our current construction, however, does work very well with symbolic simulation. Similarly, efficiency improvements are possible in the sub-circuits generated to check antecedents and consequents.
In this paper, our assumption has been that the motivation for using assertion graphs is to access the efficiency of GSTE model checking. The monitor construction, however, translates an assertion graph into a circuit, which allows using assertion graphs as specifications with other formal verification engines (e.g., symbolic model checking [7], bounded model checking [S], etc.). An intriguing exercise would be to explore for what types of problems each verification engine works best.
Our main direction for future work is to investigate compositional reasoning. Because of the capacity limitations of formal verification tools, being able to compose smaller verification results into larger conclusions is a critical part of a scalable formal verification flow. For example, we may wish to assume one property while trying to verify another (e.g., assume-guarantee reasoning, environment constraints). Because the GSTE model-checking algorithm verifies a relationship between a circuit and a specification, the ability to convert specifications into monitor circuits creates the possibility of using GSTE to verify relationships involving multiple specifications (e.g., assume one assertion graph while verifying another). We have preliminary results along these lines, showing that the construction presented here is a valuable building block toward efficient compositional verification with GSTE [9].
