• predefined assertion suites (such as AMBA and PCI assertion checker suites);
• source code semantic analysis (such as checking full_case and parallel_case); and
• formal verification of assertions with model checkers based on various technologies such as BDDs, SAT (Boolean satisfiability), symbolic simulation, automatic test-pattern generation, automatic abstraction, and lightweight theorem proving.
Several hotly debated assertion languages are vying for standardization. For example, the Accellera technical committee has chosen IBM's Sugar language (http://www.eda.org/vfv/), and Synopsys has announced OpenVera Assertions (http://www.eedesign.com/story/OEG20020415S0029).
This article by Shimizu and Dill continues a sequence of articles related to constraint-based verification. [1] [2] [3] These articles, and others, have shown that simple Boolean constraints (that is, assertion checkers) with small companion state machines could define a bus protocol, which could be easily model checked. This constraint-based modeling resulted in identifying bugs in the well-established PCI protocol. The current article explains how the same methodology can be used to check bus interface units and also measure simulation coverage of the constraints themselves. The fundamental computational algorithm of constraint-based verification is the method for automatically converting simple (but powerful) interface checkers into stimulus generators [3] for simulation-based verification, and into the environments necessary for formal verification. [1] Hence, constraint-based verification unites informal (that is, simulation-based) verification with formal verification, both supporting assertion-based verification.
Constraint-based verification establishes a new paradigm for verification. Constraints can be developed incrementally and inexpensively without a heavyweight test bench, and can help animate the design under verification (DUV) at the earliest opportunity. Constraints are simple to write, and designers can use them directly, well before turning over the design to a verification team. Constraints can then be "flipped" to become checkers at higher integration levels, whereas conventional simulation drivers are just discarded during full-chip or system-on-a-chip integration. Constraints therefore enable assume/guarantee reasoning; that is, assumptions about the environment of a DUV must be proven when the DUV is connected to its true environment. Constraint-based verification can be easily integrated into, and extend, a conventional verification flow.
ishing time to market, precious resources are allocated to more pressing design needs.
To counter these disincentives, we have developed a methodology that increases a specification's value beyond its role as interface documentation. If designers can use formal specifications in novel ways that enhance design productivity, they may be less reluctant to develop them. In this light, our methodology exploits formal specifications to automate simulation-based validation procedures that are now done manually. For register-transfer-level (RTL) designs simulated in software, designers can use the specification to directly generate the design's inputs, check its behavior, and monitor simulation coverage.
The problem
To verify a hardware description language (HDL) design module with software simulation, an engineer needs additional tools, as Figure 1 illustrates:
• Input generator. Logic is required to drive the design inputs. One way is to use random sequences. The problem with this approach is that because the inputs are not guaranteed to be correct (they might be garbage from the design's point of view), it is difficult to gauge the design's correctness. A less haphazard method is directed testing, in which input sequences are manually written, but this approach is time-consuming, and writing such sequences correctly is difficult.
• Output correctness checker. Logic to determine the correctness of the module's behavior is needed because manual scrutiny is usually too cumbersome. There are two types of correctness. When designers check design outputs for protocol violations, they focus on the interface for correctness. In contrast, when they check design behavior by observing actions at multiple interfaces, the correctness centers more on the design and less on a single interface. The output correctness checker described in this article can check only for the former, in which the checker's observations are contained within one interface.
• Coverage metric. Because complete verification coverage that tests all possible input sequences is not possible, some metric must quantify its progress. Such a metric lets the verification engineer know whether the design's functionalities have been thoroughly exercised and all interesting cases have been reached during the simulation.
Our approach
Our methodology is based on a unified framework approach in which the three aforementioned tools are generated from a single source specification, as Figure 2 shows. This unified framework is possible because all three tools are based on the interface protocol. The input generator generates input sequences based on what the interface protocol allows, the output correctness checker compares the output to what the protocol deems correct, and the coverage metric quantifies coverage by exploiting the fact that the protocol defines the set of all possible interface events. Thus, designers can transform the interface specification into the three tools using automated methods. Currently, each verification aid is written from scratch, requiring a tremendous amount of development time and effort. By eliminating this step, our methodology enhances productivity and shortens overall development time. Furthermore, a thoroughly debugged, solid specification invariably leads to correct input sequences, correctness-checking properties, and coverage metrics. The correctness of the core document guarantees the correctness of the derived tools. In contrast, with current methods each verification aid must be individually debugged. The advantages of only having to check the specification are most pronounced for standard interfaces where the correctness effort can be concentrated in the standards committee responsible for defining the interface and not duplicated among the many interface implementers. Furthermore, a change made to the protocol (a frequent occurrence in industry) requires only the associated change in the protocol specification because the verification aids can be regenerated from the revised document. Otherwise, the engineer would have to manually determine the effect of the change for each tool.
Generating the three tools
The output correctness checker is the most straightforward of the three tools to derive from the specification. The checker functions on the fly; during simulations, it flags an error as soon as the module violates the interface protocol. It checks for protocol conformance but does not check for the design-centric correctness described earlier. For example, a PCI checker will verify that a module obeys protocol rules such as trdy → devsel or prev(trdy ∧ stop) → stop. The style rules guarantee that the specification is executable, so the translation from it to an HDL checker requires minimal changes. Together with Hu, we describe the details of this translation in another article, [1] so we do not repeat them here. In this article, we focus on the other two tools.

The input generator produced by our method is dynamic and reactive; the generated inputs depend on the previous cycle outputs of the design under verification. In addition, these inputs always obey the protocol, and the generation is a one-pass process. On every clock cycle, the algorithm solves the constraints that the specification imposes on the inputs. The design cannot discern the difference between interacting with this setup and interacting with an actual HDL implementation of the environment.
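To make the on-the-fly checking concrete, the following Python sketch monitors the two PCI rules quoted above over a recorded signal trace. It is only an illustration: the trace format (one dictionary of active-high Boolean signals per cycle) and the names RULES and check_trace are assumptions made for this example, whereas the checker described in the article is generated as HDL from the specification.

```python
# Illustrative sketch only; the trace format and all names are hypothetical.
# A rule receives the previous-cycle signals (None on the first cycle) and the
# current-cycle signals, and returns True when the rule holds.

def trdy_implies_devsel(prev, cur):
    # trdy -> devsel
    return (not cur["trdy"]) or cur["devsel"]

def stop_stays_asserted(prev, cur):
    # prev(trdy AND stop) -> stop
    if prev is None:
        return True  # no history yet, so the rule is not activated
    return (not (prev["trdy"] and prev["stop"])) or cur["stop"]

RULES = {
    "trdy -> devsel": trdy_implies_devsel,
    "prev(trdy & stop) -> stop": stop_stays_asserted,
}

def check_trace(trace):
    """Return (cycle, rule name) for the first violation, or None if there is none."""
    prev = None
    for cycle, cur in enumerate(trace):
        for name, holds in RULES.items():
            if not holds(prev, cur):
                return cycle, name  # flag the error as soon as it occurs
        prev = cur
    return None
```

Calling check_trace on a list of per-cycle signal dictionaries returns the first offending cycle together with the violated rule, which mirrors the behavior of flagging an error as soon as the protocol is violated.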
Although input generation using constraint solvers is not by itself novel, our approach is the first to use and exploit a complete specification. Writing constraints on an ad hoc basis for the express purpose of generating signal sequences is common. However, few have succeeded in transforming an existing, complete specification into a generator for verifying the complex designs commonly found in industry.
Finally, we introduce a new simulation coverage metric and describe the automatic input biasing based on this metric. Although more experiments are needed to validate this metric's effectiveness in measuring coverage, its main advantage (currently) is that it is specification-based and saves time. Extra work is not needed to write out a metric or to pinpoint the interesting scenarios; they are gleaned mechanically from the specification document.
Methodology
The bedrock of our methodology is the specification style, which is language independent. We explain the style rules that determine the specification structure so that the specification can be used for input generation and tracking coverage. 
Specification style
The specification style can be applied to many specification languages, from SMV to Verilog. Best described as a way to structure and restrict a specification, the style has been used to formally specify the core subsets of the signal-level PCI [1] and Intel's Itanium processor bus protocols. [2] A structured specification has many benefits that a freeform one lacks. For example, although Yuan et al. have also developed an input generator, SimGen, [3] our generator is far more memory efficient than SimGen because our approach exploits the specification's structure. The structure lets only the relevant portions of the specification be extracted for signal generation. This results in dramatically smaller data structures and can allow input generation for large designs that previously could not be handled. For example, with the PCI design that we experimented on, a SimGen-like method would have required, in the worst case, 2^161 nodes in the data structure, whereas our method required only 2^15 nodes.

The specification uses multiple constraints to collectively define the signaling behavior at the interface. The constraints are short Boolean formulas that follow certain syntactic rules. The constraints are also independent of one another; rely on state variables for historic information; and when joined together by an AND operator, define exactly the correct behavior of the interface. This method is similar to using temporal logic for describing behavior. However, our methodology allows and requires only the most basic operators for writing the constraints, and eschews the more powerful operators such as "eventually event A will happen" or "it is always possible for event A to happen." Furthermore, many designers use temporal formulas to describe a system's isolated characteristics, but this methodology requires constructing a self-contained, complete specification for the interface.
Style rule 1. The first style rule requires that the constraints be written in the following form:

past expression → current expression
where "→" is the logical symbol for "implies". The antecedent is the expression to the left of the "→" symbol, and the consequent is the expression to the right of it. The allowed operators are AND, OR, NEGATION, and prev. The prev construct expresses the value of a signal (or the state of a state machine) one cycle before the current cycle. The constraints are written as an implication with the past expression (events from any state previous to the current state) as the antecedent and the current expression as the consequent. In essence, the past history, when it satisfies the antecedent expression, requires the current consequent expression to be true; otherwise, the constraint is not activated, and the interface signals do not have to obey the consequent in the current cycle. In this way, the activating logic and the constraining logic are separated. For example, the PCI protocol constraint, prev(trdy ∧ stop) → stop, means "if the trdy and stop signals were true in the previous cycle (the activating logic), then stop must be true in the current cycle (the constraining logic)."
This separation is key to memory-efficient signal generation. It identifies the relevant (that is, activated) constraints on a particular cycle so that only these constraints are used. The other constraints, because they are not activated, can be ignored for this particular cycle. Also, the separation allows the final constraint formulas to contain only the consequent halves. The activating half can be discarded because it is used only to determine the relevance of the constraint in a cycle. For these two reasons, the final formula is far smaller, and memory efficiency is greatly improved.
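The effect of this separation can be sketched in a few lines of Python. The representation below (constraints as triples of a name, an antecedent over the previous-cycle state, and a consequent over the current-cycle outputs) and the state variable frame_count are assumptions chosen for illustration, not the article's data structures.

```python
# Illustrative sketch only; representation and names are hypothetical.
# antecedent: previous-cycle state -> bool   (activating logic)
# consequent: current-cycle outputs -> bool  (constraining logic)
CONSTRAINTS = [
    ("stop_stays",
     lambda prev: prev["trdy"] and prev["stop"],
     lambda out: out["stop"]),
    ("irdy_deadline",
     lambda prev: prev["frame_count"] == 7 and not prev["irdy"],
     lambda out: out["irdy"]),
]

def activated(prev):
    """Keep only the consequents whose antecedents hold in the previous cycle."""
    return [(name, cons) for name, ante, cons in CONSTRAINTS if ante(prev)]

def satisfies(prev, out):
    """The formula to solve this cycle is the AND of the activated consequents."""
    return all(cons(out) for _, cons in activated(prev))
```

Because the antecedents are evaluated against known history before any solving happens, the formula handed to the solver mentions only current-cycle output variables, and inactive constraints contribute nothing to it.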
Style rule 2. The second style rule, separability, requires each constraint to constrain only one component's behavior. Thus, a single constraint should constrain only outputs from one component, and not its inputs or other components' outputs. Equivalently, because the constraining part is isolated from the activating part (due to the first style rule), the separability rule requires the consequent to contain only one component's outputs.
Because only the consequent halves are used when the signals are generated and the consequents contain only outputs from one component, the number of variables in the constraint formula is greatly reduced. Otherwise, the formula will contain internal state variables, input variables, or output variables from other components. Consider the PCI constraint, "master must raise irdy# within eight cycles of the assertion of frame#," which translates to "IF the agent is the master and it has been seven cycles since frame was asserted and irdy has not been asserted yet and irdy is not asserted in this cycle, THEN the output irdy must be true in the next cycle."
Without the technique, the constraint formula will have to contain masteris (a 1-bit state variable), frame8 (a 3-bit counter), irdy (a 1-bit state variable), and irdy_out (a 1-bit free variable whose value we will choose). With the technique, the constraint formula contains only irdy_out if the constraint is activated and nothing if it is not. This is why, for the PCI example, the separation technique reduced the number of Boolean variables in the data structure from 161 to 15 and the space complexity from 2^161 to 2^15.
Style rule 3. The third rule requires that the specification be free of a certain type of contradiction. This rule effectively guarantees that an output vector satisfying all the activated constraints always exists for a module as long as the output sequence so far has not violated the constraints.
There is a universal test that can verify this property for a specification. Using a model checker, the following computation tree logic (CTL) [4] property can be checked against the constraints, and any violations will pinpoint the dead state:
AG(all constraints have been true so far → EX(all constraints are true))
This rule is necessary to guarantee that a correct vector exists for every clock cycle when generating signals. If there were a contradiction, there would be no possible correct output for the module at a particular execution point.
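For a small specification, the intent of this check can be illustrated without a model checker. The Python sketch below is an explicit-state stand-in, not the CTL-based procedure itself: it explores only states reached through legal steps and reports any state from which no legal output vector exists. The toy encoding (a state is simply the previous cycle's output vector) and the function name find_dead_states are assumptions for this example.

```python
from itertools import product

def find_dead_states(variables, constraints, initial):
    """Return reachable states that have no legal successor (dead states).

    variables:   tuple of output names; a state is a tuple of their previous values
    constraints: list of (antecedent(prev_dict), consequent(cur_dict)) pairs
    initial:     starting state; only legal steps are followed from it
    """
    def legal_successors(state):
        prev = dict(zip(variables, state))
        # Only activated constraints restrict the current cycle.
        active = [cons for ante, cons in constraints if ante(prev)]
        result = []
        for values in product([False, True], repeat=len(variables)):
            cur = dict(zip(variables, values))
            if all(cons(cur) for cons in active):
                result.append(values)
        return result

    seen, frontier, dead = {initial}, [initial], []
    while frontier:
        state = frontier.pop()
        successors = legal_successors(state)
        if not successors:
            dead.append(state)  # constraints held so far, yet no next vector exists
        for nxt in successors:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return dead
```

With the single constraint prev(a) → b the function finds no dead state, whereas adding the contradictory constraint prev(a) → ¬b makes every state in which a was asserted a dead state.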
Deriving an input generator
After designers fully specify a protocol with the list of constraints, they can use these constraints directly to emulate the protocol in software. Two setups are possible. In one scenario, no implementations have been designed yet, but designers would like to see possible interface signal traces to understand the protocol. For this scenario, dummy agents, each representing a particular module at the interface, can be created automatically. By using the constraints, the dummy agents generate outputs in every cycle and demonstrate how the interface modules would interact using the protocol. This is particularly useful because it is hard to visualize how the protocol works just from a collection of constraints.
In the second scenario, an implementation for a module has been designed and correct inputs are needed to stimulate the design. The dummy agents (minus the one for the implemented module) can emulate the behavior of the other agents at the interface and act as the environment for the module. The pseudocode in Figure 3 outlines how dummy agents can be created from the structured specification to generate inputs for the design, as in Figure 4.
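A minimal Python sketch of this second scenario follows. It is not the pseudocode of Figure 3; it assumes that the constraints are already grouped by component, that each component's output names are known, and that exhaustive enumeration of a component's outputs is affordable (a stand-in for the BDD-based solving discussed later). The names make_generator, groups, and outputs are hypothetical.

```python
import random
from itertools import product

def make_generator(groups, duv, outputs, seed=0):
    """Build a per-cycle input generator from grouped constraints.

    groups:  component -> list of (antecedent(prev), consequent(cur_outputs))
    duv:     name of the component under verification (its group is dropped)
    outputs: component -> tuple of that component's output signal names
    """
    env = {comp: cons for comp, cons in groups.items() if comp != duv}
    rng = random.Random(seed)

    def step(prev):
        """prev holds all interface signals from the previous cycle."""
        inputs = {}
        for comp, constraints in env.items():
            # Keep the consequents of the constraints activated by the history.
            active = [cons for ante, cons in constraints if ante(prev)]
            names = outputs[comp]
            candidates = [dict(zip(names, values))
                          for values in product([False, True], repeat=len(names))]
            legal = [c for c in candidates if all(cons(c) for cons in active)]
            if not legal:
                raise RuntimeError(f"no legal output for {comp}")
            inputs.update(rng.choice(legal))  # unbiased choice among legal vectors
        return inputs

    return step
```

Each call to step takes the previous cycle's signals, including the design's own outputs, and returns one protocol-compliant vector for every environment component, which is what makes the generation reactive and one-pass.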
Biasing the inputs
There is a lot of interest in how to steer simulations through meaningful scenarios so that bugs can be found. Most designers acknowledge that the main challenges in simulation are determining the interesting scenarios and deciding how to lead the design to those scenarios. We explain how designers can use the specification to attain these two goals.
Coverage metric. As a first-order approximation of interesting scenarios, or corner cases, designers can use the antecedents of the constraints, because the implementation needs to comply with the constraint clause only when the antecedent clause is true. For example, consider the PCI constraint, "master must raise irdy within eight cycles of the assertion of frame." The antecedent is "the counter that starts counting from the assertion of frame has reached 7 and irdy still has not been asserted" and the consequent is "irdy is asserted." Unless this antecedent condition happens during the simulation, compliance with this constraint cannot be completely known. For a simulation run that has triggered only 10% of the antecedents, only 10% of the constraints have been checked for the implementation. In this sense, the number of antecedents fired during a simulation run is a rough coverage metric.
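The antecedent-based metric can be computed with very little machinery. The Python sketch below assumes a simulation trace stored as a list of per-cycle state dictionaries and a non-empty table of antecedent predicates; both the data layout and the name antecedent_coverage are assumptions for illustration.

```python
def antecedent_coverage(antecedents, trace):
    """Return (fraction of antecedents fired, names of antecedents never fired).

    antecedents: non-empty dict mapping constraint name -> predicate over one cycle
    trace:       list of per-cycle state dicts recorded during simulation
    """
    fired = set()
    # An antecedent that holds in cycle t constrains cycle t + 1, so the last
    # recorded cycle cannot activate a constraint that is then checked.
    for state in trace[:-1]:
        for name, antecedent in antecedents.items():
            if antecedent(state):
                fired.add(name)
    missed = set(antecedents) - fired
    return len(fired) / len(antecedents), missed
```

The set of missed names is as useful as the fraction itself, because it identifies which constraints have not yet been exercised against the implementation.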
One major drawback results from using this metric for coverage. The problem is integral to the general relationship between implementation and specification. To create an implementation, the designer chooses an action from the choices offered by the specification for every state in the state machine. As a result, the implementation will not cover the full range of behavior allowed by the specification. Thus, some of the antecedents in the specification will never be true, because the implementation precludes any paths to a state where the antecedent is true. Unless verification engineers are familiar with the implementation design, they cannot know whether an antecedent has been missed because of the lack of appropriate simulation vectors or because the antecedent is structurally impossible.

[Figure 3. Pseudocode fragment for creating dummy agents: with n interface components, there are n groups of constraints; the group whose constraints apply to the component under verification is removed because it will not be needed, leaving n - 1 groups.]

Engineers often apply biasing to input generation. If problematic states are caused by certain inputs being true often, the engineer programs the input generator to set the variable true n% of the time instead of the neutral 50%. For example, to verify how a component reacts to an environment that delays its response, env_response, the engineer can set the biasing so that the input, env_response, is true only 5% of the time. Designers should not use 0%, because it might cause the interface to deadlock. With prevailing methods, designers must provide the biasing numbers, requiring expert knowledge of the design, and must determine the biases by hand. In contrast, by targeting antecedents, designers can automatically derive interesting biases from the specification without knowing anything about the design. The algorithm works as follows:
1. Gather the constraints that specify the outputs of the component to be verified. The goal is to get as many antecedents of these constraints as possible to become true during the simulation runs.

Several interesting conclusions can be drawn regarding this algorithm. First, although effort was invested in determining optimal bias numbers exactly, biases that simply allowed a signal to be true (or false) often were sufficient. Empirically, interpreting "often" as 49 out of 50 times (98%) seems to work well. Second, an antecedent expression contains not only interface signal variables but also counter values and other variables that cannot be skewed directly. Just skewing the input variables in the antecedent is primary biasing; dependency analysis produces a more refined, secondary biasing. For example, many hard-to-reach cases are states in which a counter has reached a high value; using dependency analysis, we determined biases that allow a counter to increment frequently without resetting.
Implementing biasing. To find variable assignments that satisfy a Boolean constraint, algorithms often use a binary decision diagram (BDD). [5] A constraint can be transformed into a tree-like BDD in which each node corresponds to a variable in the constraint, as Figure 5a shows.
A node has two outgoing branches; the algorithm takes the THEN branch if the variable is set to true, and the ELSE branch if false. By traversing this tree, the algorithm will eventually reach one of the two leaf nodes. Terminal node 1 indicates that the choices of variable assignments along the path taken (a = true, b = false, c = false, …) satisfy the constraint, whereas node 0 indicates that the assignment does not. For example, in Figure 5a, the path (a = false, b = true, c = false, f = false, h = false) leads to node 1, so the assignment satisfies the constraint.
The biasing of the input variables occurs during the BDD traversal stage of the input generation. Yuan et al. introduced the basic technique, [3] and we modified it. After the algorithm builds the input formula BDD for a component, the algorithm traverses the structure according to the biases. If variable b is biased to be true 49 out of 50 times, the THEN (true) branch from it is taken 49 out of 50 times (as in Figure 5a). Likewise, if b is biased to be false 49 out of 50 times, the ELSE (false) branch will be taken with that probability.
The input generation algorithm has an extra step to accommodate the biasing. Often, for a single simulation run targeting a specific antecedent, only a few input variables are biased. The variables must be reordered so that the biased variables are at the top of the BDD, and their truth values are not determined by the other variables. In Figure 5b, variable c is intended to be true most of the time. However, because c is buried toward the bottom of the BDD, if {a = 0, b = 1} is chosen, c is forced to be false to satisfy the constraint. In contrast, if c is at the top of the BDD, the true branch can be taken as long as the other variables are set accordingly (for example, a = 1).
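The interaction between biasing and variable order can be reproduced with a small Python sketch. It is a stand-in under stated assumptions rather than the BDD algorithm itself: a brute-force satisfiability check replaces the BDD, variables are decided in a given order the way a traversal would decide them from the root down, and the constraint a ∨ ¬b ∨ ¬c is a hypothetical formula chosen to mimic the situation described for Figure 5b. The function names are likewise invented for this example.

```python
import random
from itertools import product

def satisfiable(constraint, order, partial):
    """True if some completion of the partial assignment satisfies the constraint."""
    free = [v for v in order if v not in partial]
    return any(constraint({**partial, **dict(zip(free, values))})
               for values in product([False, True], repeat=len(free)))

def biased_assignment(constraint, order, bias, rng):
    """Decide variables in `order`, as a top-down traversal would.

    bias[v] is the probability of choosing True for v (0.5 when absent).
    A value is taken only if the constraint remains satisfiable afterward.
    """
    partial = {}
    for var in order:
        feasible = [value for value in (False, True)
                    if satisfiable(constraint, order, {**partial, var: value})]
        if not feasible:
            raise ValueError("constraint is unsatisfiable")
        if len(feasible) == 2:
            value = rng.random() < bias.get(var, 0.5)  # free choice: apply the bias
        else:
            value = feasible[0]  # forced by the variables decided earlier
        partial[var] = value
    return partial

# Hypothetical constraint: if a is false and b is true, c must be false.
def example_constraint(s):
    return s["a"] or (not s["b"]) or (not s["c"])
```

Calling biased_assignment with the order ("c", "a", "b") lets the bias on c take effect on every draw, whereas the order ("a", "b", "c") occasionally forces c to false regardless of its bias, which is the reason for moving biased variables to the top.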
Experimental results
To demonstrate our verification methodology on a meaningful design, we chose the I/O component from the Stanford Flash design, [6] which is part of the multiprocessor project, for verification. The I/O unit, along with the rest of the project, had been extensively debugged, fabricated, and tested and is part of an operational system. We evaluated the methods on the PCI interface.
The design is described by 8,000 lines of Verilog and contains 283 variables, each ranging from 1 to 32 bits, a complexity that renders straightforward model checking unsuitable.
The setup
We used a formal PCI specification to constrain the inputs and check the outputs at the design's PCI interface. Using a compiler tool written in OCAML (http://caml.inria.fr), we generated, from the specification, a simulation checker that flags PCI protocol violations and an input generator that controls the design's PCI inputs. The I/O unit (the design under verification), checker, and input generator are connected and simulated together, and results are viewed using the Value Change Dump (VCD) file. We skewed the inputs with different biases for each simulation run to produce various extreme environments and stress the I/O unit.
Verification results
Using the 70 assertions provided by the interface specification, we found nine previously unreported bugs in the I/O unit. Most resulted from incorrect state machine design. For example, one bug manifested itself by violating the protocol constraint, "once trdy has been asserted, it must stay asserted until the completion of a data phase." Because of an incorrect path in the state machine, in some cases, the design would assert trdy and then, before the completion of the data phase, deassert trdy. This can deadlock the bus if the counterparty infinitely waits for the assertion of trdy. We easily corrected the bug by removing the problematic, and most likely unintended, path.
The setup makes the verification process far easier; the process of finding signal-level bugs is now nearly automated, so most of the effort can focus on reasoning about the bug once it is found.
Coverage results
Because the Flash PCI design is conservative and implements a small subset of the specification, basing the coverage metric on the component's specification was not especially useful. For example, the design initiates only single data-phase transactions, never multiple data-phase ones. Or, instead of having the flexibility to respond with any of the three termination modes, it always responds with the same mode. Thus, most of the antecedents remained false because the component never performed many of the actions allowed by the specification.
However, by using the metric to ensure that the environment is as flexible as possible, we found that the coverage proved to be far more powerful. The goal is to ensure that the design is compatible with any component that complies with the interface protocol. The design should be stimulated with the most general set of inputs, and so, using the antecedents from the constraints that specify the environment (in Figure 6, a0, a1, and so on) to determine biases was extremely fruitful; most of the design bugs were unearthed with these biases.
Performance results
Performance issues, such as speed and memory use, did not pose problems, so we were free to focus on generating interesting simulation inputs. However, to demonstrate the method's scalability for larger designs, we tabulated performance results. We ran the simulations on a four-processor Sun UltraSparc-II 296-MHz system with 1.28 Gbytes of main memory. The specification provided 63 constraints to model the environment. The BDDs used for signal generation were small; the peak number of nodes during simulation was 193, and the peak amount of memory used was 4 Mbytes. Furthermore, speed was only slightly sacrificed to achieve this space efficiency. Table 1 lists the execution times for different settings. With no constraint solving, where inputs are randomly set, the simulation takes 0.64 seconds for 12,000 simulator time steps. If we used the input generator, the execution time increased by 57% to 1.00 second. Such an increase in time was not debilitating, and the inputs were guaranteed to be correct.
