Abstract. Semiconductor companies have increasingly adopted a methodology that starts with a system-level design specification in C/C++/SystemC. This model is extensively simulated to ensure correct functionality and performance. Later, a Register Transfer Level (RTL) implementation is created in Verilog, either manually by a designer or automatically by a high-level synthesis tool. It is essential to check that the C and Verilog programs are consistent. In this paper, we present a two-step approach, embodied in two equivalence checking tools, VERIFOX and HW-CBMC, to validate designs at the software and RTL levels, respectively. VERIFOX is used for equivalence checking of an untimed software model in C against a high-level reference model in C. HW-CBMC verifies the equivalence of a Verilog RTL implementation against an untimed software model in C. To evaluate our tools, we applied them to a commercial floating-point arithmetic unit (FPU) from ARM and an open-source dual-path floating-point adder.
Introduction
One of the most important tasks in Electronic Design Automation (EDA) is to check whether the low-level implementation (RTL or gate-level) complies with the system-level specification. Figure 1 illustrates the role of equivalence checking (EC) in the design process. In this paper, we present a new EC tool, VERIFOX, that is used for equivalence checking of an untimed software (SW) model against a high-level reference model. Later, a Register Transfer Level (RTL) model is implemented, either manually by a hardware designer or automatically by a synthesis tool. To guarantee that the RTL is consistent with the SW model, we use an existing tool, HW-CBMC [15], to check the correctness of the synthesized hardware RTL against a SW model.
In this paper, we address the most general and thus most difficult variant of EC: the case where the high-level and the low-level design are substantially different. State-of-the-art tools, such as Hector [14] from Synopsys and SLEC from Calypto, rely on equivalence points [18], and hence they are ineffective in this scenario. We present an approach based on bounded analysis, embodied in the tools VERIFOX and HW-CBMC, that can handle arbitrary designs.
VERIFOX is used for equivalence checking of an untimed software model against a high-level reference model, and HW-CBMC is used for equivalence checking of the RTL implementation against a software model. EC is broadly classified into two categories: combinational equivalence checking (CEC) and sequential equivalence checking (SEC). CEC is used for a pair of models that are cycle accurate and have the same state-holding elements. SEC is used when the high-level model is not cycle accurate or has a substantially different set of state-holding elements [1, 11]. It is well known that EC of floating-point designs is difficult [12, 19]. There is therefore a need for automatic tools that formally validate floating-point designs at various stages of the synthesis flow, as illustrated by the right-hand flow of Figure 1.
Contributions:
In this paper, we sketch two significant equivalence-verification tools:
1. VERIFOX, a tool for equivalence checking of software models given as C programs.
We present a path-based symbolic execution tool, VERIFOX, for bounded equivalence checking of floating-point software implementations against an IEEE 754 compliant reference model. VERIFOX supports the C89 and C99 standards in the front-end, and SAT and SMT backends for constraint solving. VERIFOX is available at http://www.cprover.org/verifox.
VERIFOX: A tool for equivalence checking of C programs
VERIFOX is a path-based symbolic execution tool for equivalence checking of C programs. The tool architecture is shown on the left side of Figure 2. VERIFOX supports the C89 and C99 standards. The key feature is symbolic reasoning about equivalence between FP operations. To this end, VERIFOX implements a model of the core IEEE 754 arithmetic operations (single- and double-precision addition, subtraction, multiplication, and division), which can be used as reference designs for equivalence checking. VERIFOX therefore does not require external reference models for equivalence checking of floating-point designs, which significantly reduces the user's effort for equivalence checking at the software level. The reference model in VERIFOX is equivalent to the Softfloat model. VERIFOX also supports SAT and SMT backends for constraint solving.

Given a reference model, an implementation model in C, and a set of partition constraints, VERIFOX performs depth-first exploration of program paths with certain optimizations, such as eager infeasible path pruning and incremental constraint solving. This enables automatic decomposition of the verification state-space into subproblems, by input-space and/or state-space decomposition. The decomposition is done in tandem in both models, exploiting the structure present in the high-level model. The approach generates many but simpler SAT/SMT queries, similar to the technique followed in KLEE [4]. The main focus of our technique is to pass only those verification conditions to the underlying solver for which the corresponding path conditions are feasible. Figure 3 shows three feasible path constraints corresponding to the three paths in the program on the left. In contrast, the last column of Figure 3 shows the monolithic path constraint generated by HW-CBMC.
Incremental solving in VERIFOX. VERIFOX can be run in two different modes: partial incremental and full incremental. In partial incremental mode, only one solver instance is maintained while going down a single path. So when making a feasibility check from one branch b1 to another branch b2 along a single path, only the program segment from b1 to b2 is encoded as a constraint and added to the existing solver instance. Internal solver states and the information that the solver gathers during the search remain valid as long as all the queries that are posed to the solver in succession are monotonically stronger. If the solver solves a formula φ, then posing φ ∧ ψ as a query to the same solver instance allows it to reuse the knowledge it has already acquired, because any assignment that falsifies φ also falsifies φ ∧ ψ. Thus the solver need not revisit the assignments that it has already ruled out. This speeds up the feasibility check of the symbolic state at b2, given that the feasibility check at b1 succeeded. A new solver instance is used to explore a different path after the current path is detected as infeasible.
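The monotonicity argument above can be illustrated with a toy sketch. It is not a SAT solver and not VERIFOX code: the "solver" below simply enumerates the 256 values of one 8-bit variable, and the names toy_solver, solver_add, phi, psi and chi are hypothetical. It only shows that once an assignment has been ruled out by φ, it never needs to be examined again when ψ is conjoined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for an incremental solver over a single 8-bit variable x.
   alive[v] stays true while the assignment x = v satisfies every
   constraint added so far. Illustration only (hypothetical names). */
typedef bool (*constraint_fn)(uint8_t x);

typedef struct {
    bool alive[256];
    unsigned evaluations;   /* constraint evaluations performed so far */
} toy_solver;

void solver_init(toy_solver *s) {
    assert(s != 0);
    for (int v = 0; v < 256; v++) s->alive[v] = true;
    s->evaluations = 0;
}

/* Conjoin one more constraint with the formula held by the solver.
   Returns true if the strengthened formula is still satisfiable. */
bool solver_add(toy_solver *s, constraint_fn c) {
    bool sat = false;
    for (int v = 0; v < 256; v++) {
        if (!s->alive[v]) continue;      /* already ruled out: skip */
        s->evaluations++;
        if (c((uint8_t)v)) sat = true;
        else s->alive[v] = false;        /* falsifies the conjunction */
    }
    return sat;
}

/* Example constraints: phi, then phi AND psi, then phi AND psi AND chi. */
bool phi(uint8_t x) { return x < 16; }
bool psi(uint8_t x) { return (x & 1u) == 0; }
bool chi(uint8_t x) { return x > 200; }
```

Adding phi, psi and chi in succession to one toy_solver instance evaluates each later constraint only on the assignments that survived the earlier ones, which mirrors the reuse of solver knowledge along a single path.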
In full incremental mode, only one solver instance is maintained throughout the whole symbolic execution.

The top flow in Figure 2 uses synthesis to obtain either a bit-level or a word-level netlist from Verilog RTL. The bottom flow illustrates the translation of the C program into static single assignment (SSA) form [9]. These two flows meet only at the solver. Thus, HW-CBMC generates a monolithic formula from the C and RTL description, which is then checked with SAT/SMT solvers. HW-CBMC provides specific handshake primitives such as set_inputs() and next_timeframe() that direct the tool to set the inputs to the hardware signals and advance the clock, respectively. The details of HW-CBMC are available online.
Experimental Results
In this section, we report experimental results for equivalence checking of difficult floating-point designs. All our experiments were performed on an Intel® Xeon® machine with 3.07 GHz clock speed and 48 GB RAM. All times reported are in seconds. MiniSAT-2.2.0 [10] was used as the underlying SAT solver with VERIFOX 0.1 and HW-CBMC 5.4. The timeout for all our experiments was set to 2 hours.
Proprietary Floating-point Arithmetic Core: We verified parts of a floating-point arithmetic unit (FPU) of a next-generation ARM® GPU. The FP core is primarily composed of single- and double-precision ADD, SUB, FMA and TBL functional units, the register files, and interface logic. The pipelined computation unit implements FP operations on a 128-bit data-path. In this paper, we verified the single-precision addition (FP-ADD), rounding (FP-ROUND), minimum (FP-MIN) and maximum (FP-MAX) operations. The FP-ADD unit can perform two operations in parallel by using two 64-bit adders over multiple pipeline stages. Each 64-bit unit can also perform operations with smaller bit widths. The FPU decodes the incoming instruction, applies the input modifiers and provides properly modified input data to the respective sub-unit. The implementation is around 38000 LOC, generating tens of thousands of gates. We obtained the SW model (in C) and the Verilog RTL model of the FPU core from ARM. (Due to the proprietary nature of the FPU design, we cannot share the commercial ARM IP.)
Open-source Dual-path Floating-point Adder: We have developed both a C and a Verilog implementation of an IEEE-754 32-bit single-precision dual-path floating-point adder/subtractor. This floating-point design includes various modules for packing, unpacking, normalizing, rounding and handling of infinite, normal, subnormal, zero and NaN (Not-a-Number) cases. We distribute the C and RTL implementation of the dual-path FP adder at http://www.cprover.org/verifox.
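As a rough illustration of the unpacking and case handling mentioned above, the following sketch splits an IEEE 754 single-precision word into its sign, exponent and fraction fields and assigns it to one of the five classes. It is not taken from the distributed adder; the names fp_fields, fp_unpack, fp_classify_bits and float_bits are hypothetical, and only the standard binary32 field layout (1 sign bit, 8 exponent bits, 23 fraction bits) is assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names; only the IEEE 754 binary32 layout is assumed. */
typedef enum { CLS_ZERO, CLS_SUBNORMAL, CLS_NORMAL, CLS_INF, CLS_NAN } fp_class;

typedef struct {
    uint32_t sign;      /* 1 bit   */
    uint32_t exponent;  /* 8 bits, biased */
    uint32_t fraction;  /* 23 bits */
} fp_fields;

/* Split a 32-bit word into sign, biased exponent and fraction. */
fp_fields fp_unpack(uint32_t w) {
    fp_fields f;
    f.sign = w >> 31;
    f.exponent = (w >> 23) & 0xFFu;
    f.fraction = w & 0x7FFFFFu;
    return f;
}

/* Classify according to the exponent and fraction fields. */
fp_class fp_classify_bits(uint32_t w) {
    fp_fields f = fp_unpack(w);
    if (f.exponent == 0xFFu)
        return f.fraction == 0 ? CLS_INF : CLS_NAN;
    if (f.exponent == 0)
        return f.fraction == 0 ? CLS_ZERO : CLS_SUBNORMAL;
    return CLS_NORMAL;
}

/* Reinterpret a float as its bit pattern without aliasing problems. */
uint32_t float_bits(float x) {
    uint32_t w;
    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&w, &x, sizeof w);
    return w;
}
```

A classification of this kind is also what a case split over NaN, infinity, zero, subnormal and normal inputs would be built on.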
Reference Model:
The IEEE 754 compliant floating-point implementations in VERIFOX are used as the golden reference model for equivalence checking at the software level. For equivalence checking at the RTL phase, we used the untimed software model from ARM as the reference model, as shown on the right side of Figure 1.
Miters for Equivalence Checking:
A miter circuit [3] is built from two given circuits A and B as follows: identical inputs are fed into A and B, and the outputs of A and B are compared using a comparator. For equivalence checking at the software level, one of the circuits is a SW program and the other is a high-level reference model. For the RTL phase, one of the circuits is a SW program treated as the reference model and the other is an RTL implementation.
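The miter construction can be sketched in plain C on a deliberately small example. The code below is an assumption-laden illustration, not one of the miters used in our experiments: the two "circuits" are hypothetical 8-bit saturating adders (sat_add_ref, sat_add_impl, plus a deliberately faulty sat_add_buggy), and exhaustive enumeration of all input pairs stands in for the symbolic inputs that a tool such as VERIFOX or HW-CBMC would use.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t (*add_fn)(uint8_t, uint8_t);

/* Circuit A (reference): saturating add computed in a wider type. */
uint8_t sat_add_ref(uint8_t a, uint8_t b) {
    unsigned s = (unsigned)a + (unsigned)b;
    return (uint8_t)(s > 255u ? 255u : s);
}

/* Circuit B (implementation): detects overflow through wrap-around. */
uint8_t sat_add_impl(uint8_t a, uint8_t b) {
    uint8_t s = (uint8_t)(a + b);
    return s < a ? 255 : s;
}

/* Deliberately faulty variant that forgets to saturate. */
uint8_t sat_add_buggy(uint8_t a, uint8_t b) {
    return (uint8_t)(a + b);
}

/* Miter: feed identical inputs to both circuits and compare outputs.
   Returns the number of input pairs on which the outputs differ. */
unsigned miter_mismatches(add_fn circuit_a, add_fn circuit_b) {
    assert(circuit_a != 0 && circuit_b != 0);
    unsigned mismatches = 0;
    for (unsigned a = 0; a < 256; a++)
        for (unsigned b = 0; b < 256; b++)
            if (circuit_a((uint8_t)a, (uint8_t)b) !=
                circuit_b((uint8_t)a, (uint8_t)b))
                mismatches++;
    return mismatches;
}
```

A result of zero from miter_mismatches(sat_add_ref, sat_add_impl) corresponds to a passing equivalence check; the faulty variant yields a non-zero count, each mismatch being a counterexample.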
Case-splitting for Equivalence Checking: Case-splitting is a common practice to scale up formal verification [12, 14, 19] and is often performed by user-specified assumptions. The __CPROVER_assume(c) statement instructs HW-CBMC and VERIFOX to restrict the analysis to only those paths satisfying a given condition c. For example, we can limit the analysis to those paths that are exercised by inputs where the rounding mode is nearest-even (RNE) and both input numbers are NaNs by adding the following line:
    __CPROVER_assume(roundingMode==RNE && uf_nan && ug_nan);

Table 1) is detected by HW-CBMC in the RTL implementation of the ARM FPU when checked against the SW model of the ARM FPU for the case when both input numbers are NaN. This happens mostly due to bugs in the high-level synthesis tool or during manual translation of the SW model to RTL. VERIFOX and HW-CBMC are able to detect bugs in the SW and RTL models of these designs, respectively, thereby emphasizing the need for equivalence checking to validate the synthesis process during the EDA flow. Further, we investigated the reason for the higher verification times for subnormal numbers compared to normal numbers, infinities, NaNs and zeros. This is attributed to the higher number of paths in the subnormal case compared to the infinity, NaN and zero cases. Closest to our floating-point symbolic execution technique in VERIFOX is the tool KLEE-FP [8]. We could not, however, run KLEE-FP on the software models because the front-end of KLEE-FP failed to parse the ARM models.
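To make the effect of the case-splitting assumption from this section concrete, the sketch below emulates it in plain C. Everything here is an assumption for illustration: the ASSUME macro is a hypothetical stand-in that merely abandons the current input (it is not the tool primitive), the encoding of the rounding modes is invented, and the way uf_nan and ug_nan are derived from the raw single-precision words is only one plausible definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical encoding of rounding modes; only RNE appears in the text. */
enum rounding_mode { RNE, RTZ, RUP, RDN };

/* NaN in binary32: exponent field all ones and a non-zero fraction. */
static bool is_nan32(uint32_t w) {
    return ((w >> 23) & 0xFFu) == 0xFFu && (w & 0x7FFFFFu) != 0;
}

/* Plain-C stand-in for an assumption: inputs that violate the condition
   are simply dropped from the analysis. This is NOT the tool primitive. */
#define ASSUME(c) do { if (!(c)) return false; } while (0)

/* Returns true iff the inputs fall into the case "RNE, both inputs NaN",
   i.e. iff the analysis would continue past the assumption. */
bool in_case_rne_both_nan(enum rounding_mode roundingMode,
                          uint32_t f, uint32_t g) {
    bool uf_nan = is_nan32(f);
    bool ug_nan = is_nan32(g);
    ASSUME(roundingMode == RNE && uf_nan && ug_nan);
    /* Past this point the condition is guaranteed to hold; the miter
       comparison for this case would be placed here. */
    assert(uf_nan && ug_nan);
    return true;
}
```

One such harness per case (NaN, infinity, zero, subnormal, normal, and per rounding mode) gives the decomposition into simpler sub-proofs that case-splitting aims for.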
Related work
The concept of symbolic execution [4, 7, 13] is prevalent in the software domain for automated test generation as well as bug finding. Tools such as Dart [13], Klee [4], EXE [5] and Cloud9 [16] employ such a technique for efficient test case generation and bug finding. By contrast, we used path-wise symbolic execution for equivalence checking of software models against a reference model. A user-provided assumption specifies certain testability criteria that render the majority of the design logic irrelevant [12, 14, 19], thus giving rise to a large number of infeasible paths in the design. Conventional SAT-based bounded model checking [2, 6, 15] cannot exploit this infeasibility because these techniques create a monolithic formula by unrolling the entire transition system up to a given bound, which is then passed to a SAT/SMT solver. These tools perform case-splitting at the level of the solver through the effect of constant propagation. Optimizations such as eager path pruning combined with incremental encoding enable VERIFOX to address this limitation.
Concluding Remarks
In this paper we presented VERIFOX, our path-based symbolic execution tool, which is used for equivalence checking of arbitrary software models in C. The key feature of VERIFOX is symbolic reasoning on the equivalence between floating-point operations.
To this end, VERIFOX implements a model of the core IEEE 754 arithmetic operations, which can be used as reference models. Further, to validate the synthesis of RTL from the software model, we used our existing tool, HW-CBMC, for equivalence checking of RTL designs against the software model used as reference. We successfully demonstrated the utility of our equivalence checking tool chain, VERIFOX and HW-CBMC, on a large commercial FPU core from ARM and a dual-path FP adder. Experience suggests that the synthesis of software models to RTL is often error prone, which emphasizes the need for automated equivalence checking tools at various stages of the EDA flow. In the future, we plan to investigate various path exploration strategies and path-merging techniques in VERIFOX to further scale equivalence checking to complex data- and control-intensive designs.
A Appendix
This appendix provides simple, illustrative examples of the use of VERIFOX and HW-CBMC, as well as further technical details. Figure 4 demonstrates the working of VERIFOX as a property verifier in the absence of a reference model. Note that equivalence checking is a special case of property verification where the property is replaced by a reference model. Hence, VERIFOX can be configured as a property verifier or as an equivalence checker.
Worked Example of VERIFOX
Let us consider the software model shown in column 1 of Figure 4. The program implements a high-level power management strategy to orchestrate various modules, such as the core and the memory. Depending on the interrupt status (env), the power mode (mode) and the power-gated logic (power_gated), the call to core or memory is made. These units are complex implementations of a processor core or a memory unit.
State-of-the-art verification tools may not be able to verify the whole system due to resource limitations. Therefore, it is common practice to write additional constraints, also known as assumptions, that exercise only a fragment of the entire state-space. A verification engine can use these assumptions to partition the state-space, thus decomposing a hard proof into simpler sub-proofs. Column 2 presents the result of property-driven slicing on the input program. This step is purely syntactic, meaning that we perform a backward dependency analysis [17] starting from the property, which preserves only those program fragments that are relevant to the given property. The sliced program is then passed to the symbolic execution engine, which performs eager infeasibility-based path pruning. The result of infeasible path pruning based on the assumption is shown in column 3. This step is semantic because VERIFOX determines the feasibility of paths in the sliced program in an eager fashion with respect to the user-provided assumptions using satisfiability queries.
An important point to note here is that the number of path constraints after slicing and infeasible path pruning is significantly smaller than in the initial program. Additionally, these per-path constraints are much easier to solve than a monolithic formula generated by a BMC-style symbolic execution tool.
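The eager pruning step described in this worked example can be sketched as a depth-first search that checks feasibility at every branch. The code below is only an illustration under stated assumptions: the three branch conditions over env, mode and power_gated are hypothetical (they are not the program of Figure 4), the input domains are tiny, and a brute-force search over all inputs stands in for the satisfiability query that VERIFOX would pose to a SAT/SMT solver.

```c
#include <assert.h>
#include <stdbool.h>

#define DEPTH 3   /* number of branches along every path */

typedef struct { int env, mode, power_gated; } input;
typedef bool (*assumption_fn)(input);

/* Hypothetical branch conditions, evaluated in this order on each path. */
static bool branch_cond(int i, input in) {
    switch (i) {
    case 0:  return in.env == 1;
    case 1:  return in.mode >= 2;
    default: return in.power_gated == 1;
    }
}

/* Stand-in for a solver query: is there an input that satisfies the
   assumption and follows the first len branch decisions? */
static bool feasible(assumption_fn assume, const bool *decision, int len) {
    assert(len <= DEPTH);
    for (int env = 0; env < 2; env++)
        for (int mode = 0; mode < 4; mode++)
            for (int pg = 0; pg < 2; pg++) {
                input in = { env, mode, pg };
                if (!assume(in)) continue;
                bool ok = true;
                for (int i = 0; i < len && ok; i++)
                    ok = (branch_cond(i, in) == decision[i]);
                if (ok) return true;
            }
    return false;
}

/* Depth-first exploration; an infeasible prefix is pruned immediately,
   so none of its extensions is ever visited. */
static int explore(assumption_fn assume, bool *decision, int depth) {
    if (depth == DEPTH) return 1;            /* one complete feasible path */
    int paths = 0;
    for (int taken = 0; taken < 2; taken++) {
        decision[depth] = (taken != 0);
        if (feasible(assume, decision, depth + 1))
            paths += explore(assume, decision, depth + 1);
    }
    return paths;
}

int count_feasible_paths(assumption_fn assume) {
    bool decision[DEPTH];
    return explore(assume, decision, 0);
}

/* Two example assumptions: none, and a restrictive one. */
bool no_assumption(input in)  { (void)in; return true; }
bool active_ungated(input in) { return in.mode == 3 && in.power_gated == 0; }
```

With no_assumption all eight paths of this toy program are feasible, whereas active_ungated leaves only the two paths that differ in the env branch; the remaining subtrees are discarded as soon as their prefix becomes infeasible.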
Command to run VERIFOX. Below are the commands to run VERIFOX in partial or full incremental mode. When VERIFOX is used as an equivalence checker, the input file is usually a miter in C, which must include both the reference model and the implementation model. However, in the absence of a reference model, one can write assertions inside the software model to configure VERIFOX as a property verifier. The command line switch --unwind is used to specify the unwind depth for the software model. To use the SMT backend with VERIFOX, the command line switch is --smt2, followed by the name of the SMT solver, for example --z3. Note that the SMT solver must be installed in the system. The switch --help shows the available command line options for using VERIFOX.

Worked Example of HW-CBMC

Figure 5 demonstrates the working of HW-CBMC as a C-RTL equivalence checker. Columns 1-3 present a C model of an up-counter, an RTL model of the same device, and a miter that feeds the same input to the C and RTL models and asserts equivalence of their outputs. HW-CBMC can be configured in bit-level or word-level mode. In bit-level mode, the input models are synthesized to And-Inverter Graphs (AIG) and then passed to the SAT solver. In word-level mode, the input models are synthesized into an intermediate word-level format, which is then dispatched to a word-level SMT solver.
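The shape of such a C-RTL miter for an up-counter can be sketched without the tool. The code below is a self-contained mock, not the miter of Figure 5: the RTL is imitated by a C structure with one register, the functions sim_reset, sim_set_inputs and sim_next_timeframe are hypothetical simulation stand-ins for the handshake with the hardware (they are not the primitives provided by HW-CBMC), and exhaustive enumeration of all enable sequences up to a bound replaces the symbolic inputs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Untimed C model: counter value after applying n enable bits. */
uint8_t c_upcounter(const bool *enable, int n) {
    uint8_t count = 0;
    for (int i = 0; i < n; i++)
        if (enable[i]) count++;
    return count;
}

/* Mock of the RTL: one input and one 8-bit register that is updated
   on every clock edge (hypothetical stand-in for the Verilog module). */
typedef struct { bool en; uint8_t count; } rtl_state;
static rtl_state rtl;

static void sim_reset(void)          { rtl.en = false; rtl.count = 0; }
static void sim_set_inputs(bool en)  { rtl.en = en; }
static void sim_next_timeframe(void) { if (rtl.en) rtl.count++; }

/* Bounded miter: for every enable sequence of length bound, drive the
   mock RTL cycle by cycle and compare its register with the C model.
   Returns true iff the two agree after every cycle. */
bool upcounter_miter(int bound) {
    assert(bound >= 0 && bound <= 16);
    for (uint32_t seq = 0; seq < (1u << bound); seq++) {
        bool enable[16];
        sim_reset();
        for (int t = 0; t < bound; t++) {
            enable[t] = ((seq >> t) & 1u) != 0;
            sim_set_inputs(enable[t]);     /* drive the hardware input */
            sim_next_timeframe();          /* advance the clock        */
            if (rtl.count != c_upcounter(enable, t + 1))
                return false;              /* counterexample found     */
        }
    }
    return true;
}
```

The bound passed to upcounter_miter plays the role of the unwinding depth: equivalence is established only for executions of at most that many clock cycles.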
Command to run HW-CBMC. Shown below are the commands to configure HW-CBMC in bit-level or word-level mode. The first command, using --gen-interface, is used to generate the interface for the hardware modules automatically. These interface signals are required to construct the miter as shown in column 3 of Figure 5.

Fig. 5. Example of equivalence checking using HW-CBMC

The <VERILOG-FILE-NAME> can be specified as a (.v) or (.sv) file, where (.v) is the extension for Verilog files and (.sv) is the extension for SystemVerilog files. We assume that the <MITER-FILE-NAME> includes the reference model in C and implements the miter. Note that HW-CBMC expects the reference model and the miter implementation to be C programs. The command line switch --aig instructs the tool to operate in bit-level mode. Without this option, the default operating mode in HW-CBMC is word-level mode. The switches --bound and --unwind are used to specify the unwind depth for the hardware and software transition systems, respectively. The switch --module specifies the name of the top-level module in the Verilog design file. HW-CBMC also provides an option, --vcd, to dump counterexamples in Value Change Dump (vcd) format in case of assertion failure; these can be analyzed for debugging purposes using a waveform viewer such as gtkwave. The switch --help shows the available command line options for using HW-CBMC.
Monolithic and Path-wise Approach to Equivalence Checking
We investigated the structure of the ARM FPU and dual-path adder examples discussed in the paper to analyze the effect on runtimes of the monolithic and path-based equivalence checking approaches followed by HW-CBMC and VERIFOX, respectively.
We observe that the pipelined implementation of the ARM FPU forces VERIFOX to traverse deep into a particular path and then backtrack to a much higher level in the symbolic tree due to infeasibility of the current path. This causes VERIFOX to throw away several path fragments that were considered feasible while going deep into the path, only to be discovered as infeasible much later. This wastes significant computation time in VERIFOX. In contrast, the dual-path adder contains a state machine that implements separate cases for the addition of different types of numbers. This allows VERIFOX to perform an early infeasibility check and prune most of the irrelevant logic upfront in the symbolic execution phase using assumptions. The monolithic constraint generated by HW-CBMC for the dual-path FP adder, however, was extremely difficult to solve. In this way, our experiments give some insight into how the path-based symbolic execution in VERIFOX and the monolithic BMC-based approach in HW-CBMC are sensitive to the structure of the original floating-point design.
Synthesizable Constructs in HW-CBMC
Our Verilog front-end in HW-CBMC supports the IEEE 1364.1 2005 Verilog standards. This includes the entire synthesizable fragment of Verilog. The detailed list of synthesizable Verilog constructs supported by our Verilog front-end is available on our website at www.cprover.org/ebmc/manual/verilog_language_features.shtml. Figure 6 shows an example miter for checking equivalence of a 64-bit floating-point adder at the software level and at the RTL phase using VERIFOX and HW-CBMC, respectively.
Miter Construction for Equivalence Checking
For the miter in VERIFOX, we provide the same floating-point numbers as inputs to the reference design (built inside VERIFOX) and an externally provided untimed SW implementation (in C). We then set the rounding mode of the reference model and the SW implementation accordingly. Subsequently, the results of addition from the reference model (sum_ref) and the SW implementation (sum_impl) are checked for equivalence using the function call assert(compareFloat(sum_ref, sum_impl));.
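The definition of compareFloat is not given here, so the following is only one plausible sketch of such a comparator for the 64-bit case. It assumes that two results count as equivalent when their bit patterns are identical, and that any two NaNs are treated as equivalent regardless of payload; both choices, as well as the helper names double_bits and double_from_bits, are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a double as its 64-bit pattern and back. */
uint64_t double_bits(double x) {
    uint64_t w;
    assert(sizeof(double) == sizeof(uint64_t));
    memcpy(&w, &x, sizeof w);
    return w;
}

double double_from_bits(uint64_t w) {
    double x;
    memcpy(&x, &w, sizeof x);
    return x;
}

/* NaN in binary64: exponent field all ones and a non-zero fraction. */
static bool is_nan64(uint64_t w) {
    return ((w >> 52) & 0x7FFu) == 0x7FFu &&
           (w & 0xFFFFFFFFFFFFFULL) != 0;
}

/* Hypothetical comparator: bit-exact equality, except that any two
   NaNs are considered equivalent. Note that +0.0 and -0.0 differ. */
bool compareFloat(double sum_ref, double sum_impl) {
    uint64_t r = double_bits(sum_ref);
    uint64_t i = double_bits(sum_impl);
    if (is_nan64(r) && is_nan64(i)) return true;
    return r == i;
}
```

Comparing bit patterns rather than using the C operator == is a deliberate choice in this sketch: == would report NaN results as unequal and would not distinguish the sign of zero.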
