Abstract-Simulation-based unit-level hardware verification is intended for dynamic checking of hardware designs against their specifications. There are different ways to develop specifications and to check design correctness, but it is still difficult to diagnose anything more than incorrect data on one or another design output. The proposed approach not only detects erroneous design behavior but also explains the incorrectness on the basis of the observed reactions, using a special mechanism driven by a list of explanatory rules.
I. INTRODUCTION
Taking up to 80% of the total verification effort [1], verification of HDL designs remains very important. We expect verification labor costs to decrease with more convenient and substantial diagnostic information. The most complicated problem underlying all approaches to hardware verification is how to represent the specification in a machine-readable form that is both convenient to develop and useful for verification. Typically, specifications are represented by means of temporal assertions (as in SystemVerilog in general and in the Universal Verification Methodology (UVM) [2] in particular), by implicit contracts in the form of pre- and post-conditions applied to each operation and micro-operation [3], or by executable models. Assertions suffer from a certain incompleteness: each assertion covers one property or another, and its violation shows only which property failed, without any hint as to why. To guess the cause in this case, a somewhat higher-level representation of the specification would be needed. Implicit specification by means of contracts shows which micro-operation does not work, but such information is still difficult to interpret, because the interpretation requires a lower-level representation of the specification.
The executable specification can be considered the most useful for error explanation. To be most appropriate, the specification should imitate the logical architecture of the HDL design, and the test system should have mechanisms for explaining simulation results. Exactly such mechanisms, based on executable specifications, are implemented in the proposed approach to test system development, as shown later.
The rest of the paper is organized as follows. Section II introduces the method of specification and test system development. Section III describes the work of the reaction checker. Section IV presents the theory underlying the explanatory mechanism. Section V briefly describes the implementation of the approach in the C++ library named C++TESK Testing ToolKit [4]. Section VI says a few words about the application of the approach. Section VII concludes the paper.
II. SPECIFICATION AND TEST SYSTEM ARCHITECTURE
A typical test system for unit-level simulation-based hardware verification includes the following three parts: a stimulus generator, a reaction checker, and the design under verification (DUV) connected to the test system via a special adapter. The proposed approach follows the same tradition but formulates the properties of the test system components more strictly. Let us briefly consider all the parts of an ordinary test system developed according to the approach (see Figure 1) and then review reference model development more thoroughly.
It should be noted that fully colored elements in Figure 1 are derived from the supporting library (C++TESK), half-colored elements are developed manually for each DUV on the basis of the supporting library, and white elements are developed fully manually.
The test oracle is the core of the test system. In fact, the test oracle works as a typical reaction checker: it receives the stimulus flow from the stimulus generator, receives implementation reactions (DUV reactions) wrapped into messages, and compares them with model reactions produced by the reference model. Each message consists of a number of fields carrying data. Only messages whose fields have the same data types are comparable. The test oracle includes a replacement for the reference model, called the reference model environment. The environment consists of a list of operations and functional dependencies between data on output and input interfaces. The operation description is based on an extension of the external reference model with timing properties.
The other parts of the test oracle are the reaction matcher and the diagnostics subsystem. The reaction sequence produced by the reference model is processed by the reaction matcher. It consists of processes, each of which handles reactions on one particular reference model interface. Each reference model reaction is bound to a particular output interface, so all the reactions are partitioned by model interface. A reaction arbiter is defined for each output model interface. This component orders model reactions as follows.
When a model reaction is received by the reaction matcher, a special process waiting for the corresponding implementation reaction is started. When an implementation reaction is found, the process asks the reaction arbiter of the interface whether it may catch the reaction. The reaction arbiter contains a list of model reactions that are registered at the interface where the arbiter is defined and not yet matched to implementation reactions. When the matching process asks the arbiter whether catching is possible, the arbiter checks the list and, according to a reaction selection strategy (e.g., FIFO, LIFO, data matching), permits or forbids the matching process to catch the implementation reaction.
This is how reactions are arbitrated on each output interface. If catching is allowed, the model reaction is deleted from the arbiter's reaction list, and the pair of model and implementation reactions is sent to the diagnostics subsystem. If catching is forbidden, the matching process returns to looking for the next implementation reaction. If the timeout for waiting for the implementation reaction expires (the timeout can be set for each interface separately), the model reaction is sent to the diagnostics subsystem alone, without an implementation reaction, marked as a missing reaction. Besides the processes launched by model reactions to look for implementation reactions, the test system launches special processes named listeners, one for each interface. Each listener is bound to a particular interface and works as follows.
It runs an infinite loop: it receives an implementation reaction, builds a message from the reaction data, and, at the cycle after the implementation reaction has been completely received by the test system, checks whether the reaction has been matched to the corresponding model reaction. If the matching has happened, the listener returns to its first state and starts looking for the next implementation reaction. If the listener finds that the implementation reaction has not been taken by any model reaction, it places the implementation reaction into a special buffer and waits for the implementation reaction timeout, offering the buffered reaction to subsequent model reactions for matching. If the timeout expires, the reaction is sent to the diagnostics subsystem alone, without a model reaction, marked as an unexpected reaction.
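The arbitration step described above can be illustrated with a minimal Python sketch. It is not the C++TESK API: the class name `ReactionArbiter`, the method names, and the strategy labels `"fifo"`, `"lifo"`, and `"data"` are hypothetical, and reactions are reduced to plain comparable values.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ReactionArbiter:
    """Keeps the model reactions of one output interface that are not yet matched.

    Hypothetical illustration only; names and strategy labels are assumptions.
    """
    strategy: str = "fifo"  # "fifo", "lifo" or "data"
    pending: deque = field(default_factory=deque)

    def register(self, model_reaction):
        """Remember a model reaction that waits for an implementation reaction."""
        self.pending.append(model_reaction)

    def may_catch(self, model_reaction, impl_reaction):
        """Permit or forbid the matching process of model_reaction to take impl_reaction."""
        if model_reaction not in self.pending:
            return False
        if self.strategy == "fifo":
            allowed = self.pending[0] == model_reaction   # oldest reaction goes first
        elif self.strategy == "lifo":
            allowed = self.pending[-1] == model_reaction  # newest reaction goes first
        else:
            allowed = model_reaction == impl_reaction     # data matching
        if allowed:
            # A caught model reaction leaves the arbiter's list.
            self.pending.remove(model_reaction)
        return allowed
```

With the FIFO strategy, a matching process that holds the second registered reaction is refused until the first one has been caught; with data matching, the order of registration does not matter.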
III. REACTION CHECKER ALGORITHM
The work of the reaction checker can be described by an algorithm that clearly shows its ability to catch all visible DUV defects. Some introduction is useful before presenting the algorithm. This section gives two definitions, the algorithm, and a theorem about the reaction checker.
All the input and output signals of the DUV (implementation) are grouped into input and output interfaces. The sets of input and output interfaces of the reference model (specification) match those of the implementation ($In$ and $Out$). The alphabets of stimuli and reactions of the implementation and the specification also match each other ($X$ and $Y$). The sets of implementation states ($S_{impl}$) and specification states ($S_{spec}$) may differ in general, but the initial states of the implementation and the specification are distinguished ($s_{impl_0} \in S_{impl}$ and $s_{spec_0} \in S_{spec}$).
The stimuli applied during testing to an input interface $in \in In$ are elements of the sequence $\bar{X}^{in} = \langle (x_i, t_i) \rangle_{i=1}^{n}$, where $x_i \in X$ is a single stimulus and $t_i \in \mathbb{N}_0$ is the time mark of its application ($t_i < t_{i+1}$, $i = 1, \ldots, n-1$). The set of stimuli sequences applied during testing to the input interfaces is denoted as $\bar{X} = \langle \bar{X}^{in_1}, \ldots, \bar{X}^{in_n} \rangle$ and called the stimuli sequence. Stimuli sequence admissibility is defined by the set $Dom$ of admissible stimuli sequences.
The implementation, answering the stimuli sequence $\bar{X}$, produces the reactions $\bar{Y}^{out}_{impl}(\bar{X}) = \langle (y_i, t_i) \rangle_{i=1}^{m}$ and sends them to the output interface $out \in Out$, where $y_i \in Y$ is a single reaction and $t_i \in \mathbb{N}_0$ is the time of its sending ($t_i < t_{i+1}$, $i = 1, \ldots, m-1$). The set of reaction sequences emitted by the implementation to all interfaces is denoted as $\bar{Y}_{impl} = \langle \bar{Y}^{out_1}_{impl}, \ldots, \bar{Y}^{out_K}_{impl} \rangle$ and called the implementation reaction sequence.
The specification, answering the stimuli sequence $\bar{X}$, produces the reactions $\bar{Y}^{out}_{spec}(\bar{X}) = \langle (y_i, t_i) \rangle_{i=1}^{k}$ and sends them to the output interface $out \in Out$, where $y_i \in Y$ is a single reaction and $t_i \in \mathbb{N}_0$ is the time of its sending ($t_i \le t_{i+1}$, $i = 1, \ldots, k-1$). The set of reaction sequences emitted by the specification to all interfaces is denoted as $\bar{Y}_{spec} = \langle \bar{Y}^{out_1}_{spec}, \ldots, \bar{Y}^{out_K}_{spec} \rangle$ and called the specification reaction sequence.
Let each output interface $out \in Out$ be equipped with a reaction production timeout $\Delta t_{out} \in \mathbb{N}_0$. Let also each finite stimuli sequence result in a finite reaction sequence. Let us denote a single element of a reaction sequence $\bar{Y}$ as $(y_i, t_i)$.
The operation of removing an element from a reaction sequence, $\bar{Y} \setminus (y, t)$, is defined as follows: if the element being removed is absent from the sequence, the result is the original sequence; if the element is in the sequence, its first occurrence is removed. The sequence length is denoted as $m = |\bar{Y}|$.
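Definition 1: The implementation is said to correspond to the specification if $\forall out \in Out$ and $\forall \bar{X} \in Dom$ the equality $|\bar{Y}^{out}_{impl}(\bar{X})| = |\bar{Y}^{out}_{spec}(\bar{X})| = m_{out}$ is satisfied and there is a rearrangement $\pi_{out}$ of the set $\{1, \ldots, m_{out}\}$ such that $\forall i \in \{1, \ldots, m_{out}\}$ the condition $y_i = y_j \wedge t_j \le t_i \le t_j + \Delta t_{out}$ is satisfied, where $j = \pi_{out}(i)$, $(y_i, t_i)$ is an element of the implementation sequence, and $(y_j, t_j)$ is an element of the specification sequence.
Definition 2: The implementation behavior is said to have an observable failure if the implementation does not correspond to the specification, in other words, if $\exists \bar{X} \in Dom$ and $\exists out \in Out$ such that either $|\bar{Y}^{out}_{impl}(\bar{X})| \neq |\bar{Y}^{out}_{spec}(\bar{X})|$, or for each rearrangement $\pi_{out}$ of the set $\{1, \ldots, m_{out}\}$ $\exists i \in \{1, \ldots, m_{out}\}$ for which $y_i \neq y_j \vee t_i < t_j \vee t_i > t_j + \Delta t_{out}$, where $j = \pi_{out}(i)$.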
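As an illustration of Definition 1 for a single output interface, the following Python sketch checks by brute force whether a suitable rearrangement exists. It is only a sketch under assumptions: reactions are modeled as `(data, time)` tuples, the function name `corresponds` is hypothetical, and the time condition is read as "the implementation reaction may lag the specification reaction by at most the interface timeout".

```python
from itertools import permutations


def corresponds(impl, spec, timeout):
    """Brute-force check of the correspondence condition on one interface.

    impl, spec: lists of (data, time) tuples; timeout: the interface timeout.
    Exponential in the sequence length, so suitable only for small examples.
    """
    if len(impl) != len(spec):
        return False
    indices = range(len(spec))
    for pi in permutations(indices):
        # pi[i] plays the role of j = pi_out(i) in the definition.
        if all(
            impl[i][0] == spec[j][0]
            and spec[j][1] <= impl[i][1] <= spec[j][1] + timeout
            for i, j in zip(indices, pi)
        ):
            return True
    return False
```

Two empty sequences correspond to each other, because the only rearrangement of the empty set trivially satisfies the condition; an observable failure in the sense of Definition 2 is the case where the function returns `False`.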
Lemma 1: If the reaction sequences $\bar{Y}_{impl}$ and $\bar{Y}_{spec}$ are finite and $|\bar{Y}_{impl}| \neq |\bar{Y}_{spec}|$, then the test oracle returns a negative verdict.
Proof: Suppose the main cycle of the algorithm does not find a failure. In this case the number of elements in the sequence $\bar{Y}^{*}_{spec}$ (which at the first step was equal to the number of elements in $\bar{Y}_{spec}$) is decreased by the number of elements that the sequence $\bar{Y}_{impl}$ contains. If $|\bar{Y}_{impl}| > |\bar{Y}_{spec}|$, then at some step the test oracle algorithm cannot find a reaction from $\bar{Y}_{spec}$ corresponding to the reaction from $\bar{Y}_{impl}$ currently being processed. In this case the test oracle finishes with a negative verdict. We supposed that such a situation cannot occur, so $|\bar{Y}_{impl}| < |\bar{Y}_{spec}|$. In this case $|\bar{Y}^{*}_{spec}| = |\bar{Y}_{spec}| - |\bar{Y}_{impl}| > 0$, and the oracle finishes with a negative verdict due to the condition "if $\bar{Y}^{*}_{spec} \neq \emptyset$ then return(false)", which is checked after the main cycle has finished.
Theorem 1: A test oracle working according to the proposed algorithm allows constructing significant tests (that is, the oracle is not mistaken when it reports a defect).
Proof: The case $|\bar{Y}_{impl}| \neq |\bar{Y}_{spec}|$, meaning that the numbers of implementation and specification reactions differ, is erroneous according to Definition 2. It was considered in the lemma, where it was shown that the test oracle does return a negative verdict in this case.
Let us consider the case $|\bar{Y}_{impl}| = |\bar{Y}_{spec}| = 0$. Here the main cycle of the test oracle is not executed, the condition "if $\bar{Y}^{*}_{spec} \neq \emptyset$ then return(false)" is not satisfied either, and the test oracle returns a positive verdict (true). The case of empty sequences is correct according to Definition 2. Following the induction principle, let us suppose that for the case $|\bar{Y}_{impl}| = |\bar{Y}_{spec}| = n$ the test oracle returns the verdict correctly. Let us prove that the same holds if the numbers of elements in the sequences are equal to $n + 1$. According to Definition 2, a defect is found if for each rearrangement $\pi$ of the set $\{1, \ldots, n\}$ $\exists i \in \{1, \ldots, n\}$ such that $y_i \neq y_j \vee t_i < t_j \vee t_i > t_j + \Delta t_{out}$ is satisfied, where $j = \pi(i)$.
Let us remove the last elements of the sequences to obtain sequences with $n$ elements, for which the test oracle works correctly. Let us consider the two removed reactions. A negative verdict can be returned only in two cases: the first one is if Y
Therefore, the test oracle returns a negative verdict only when there is an erroneous reaction in a finite reaction sequence.
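The structure of the oracle that the proofs above rely on (a main cycle over implementation reactions, a working copy of the specification sequence, and a final emptiness check) can be sketched in Python as follows. This is a reconstruction from the proof text, not the authors' algorithm listing: the function name `oracle_verdict` and the first-fit choice of the specification reaction are assumptions, and reactions are again `(data, time)` tuples on one interface.

```python
def oracle_verdict(impl, spec, timeout):
    """Return True for a positive verdict and False for a negative one."""
    remaining = list(spec)  # working copy, plays the role of Y*_spec
    for data, time in impl:  # main cycle over implementation reactions
        match = next(
            (s for s in remaining
             if s[0] == data and s[1] <= time <= s[1] + timeout),
            None,
        )
        if match is None:
            return False  # no specification reaction fits this one
        remaining.remove(match)  # removes the first occurrence only
    # Leftover specification reactions were never produced by the implementation.
    return not remaining
```

For two empty sequences the loop body is never entered and the final check yields a positive verdict, which agrees with the base case of the induction.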
IV. DIAGNOSTICS SUBSYSTEM
Let the reaction checker use two sets of reactions: the set of specification reactions $R_{spec}$ and the set of implementation reactions $R_{impl}$. Each specification reaction consists of four elements: $r_{spec} = (data, iface, time_{min}, time_{max})$. Each implementation reaction includes only three elements: $r_{impl} = (data, iface, time)$. Note that $time_{min}$ and $time_{max}$ bound the interval in which the specification reaction is valid, while $time$ is a single time mark: the generation of an implementation reaction always has a concrete time mark.
The reaction checker has already attempted to match each reaction from $R_{spec}$ with a reaction from $R_{impl}$, making a reaction pair. If there is no corresponding reaction for a specification or an implementation reaction, the reaction checker produces a pseudo reaction pair containing only that one reaction. Each reaction pair is assigned a situation type from the list: normal, missing, unexpected, incorrect.
For given reactions $r_{spec} \in R_{spec}$ and $r_{impl} \in R_{impl}$, these types can be described as in Table I. Remember that each reaction can belong to only one pair at a time.
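The reaction tuples and the four pair types can be pictured with the following Python sketch. The class and function names are hypothetical, and the exact criteria separating a normal pair from an incorrect one are an assumption (equal data, equal interface, and a time mark inside the validity interval); the authoritative definitions are those of Table I.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SpecReaction:
    data: tuple
    iface: str
    time_min: int
    time_max: int


@dataclass(frozen=True)
class ImplReaction:
    data: tuple
    iface: str
    time: int


def pair_type(spec: Optional[SpecReaction], impl: Optional[ImplReaction]) -> str:
    """Classify a reaction pair; None stands for the absent part of a pseudo pair."""
    if spec is None and impl is None:
        raise ValueError("a reaction pair needs at least one reaction")
    if impl is None:
        return "missing"
    if spec is None:
        return "unexpected"
    # Assumed criteria for a normal pair; see the hedge in the text above.
    same_data = spec.data == impl.data
    same_iface = spec.iface == impl.iface
    in_window = spec.time_min <= impl.time <= spec.time_max
    return "normal" if same_data and same_iface and in_window else "incorrect"
```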
The diagnostics subsystem has its own interpretation of reaction pair types (see Table II). In fact, the subsystem translates the original reaction pairs received from the reaction checker into a new representation. This process can be described as $M \Rightarrow M^{*}$, where $M = \{(r_{spec}, r_{impl}, type)_i\}$ is a set of reaction pairs marked with a type from the list above, and $M^{*} = \{(r_{spec}, r_{impl}, type^{*})_i\}$ is a similar set of reaction pairs with a different label system. It should be noted that there might be different $M^{*}$ depending on the algorithm of its creation (accounting for the original order, the strategy of reaction pair selection for recombination, etc.). This question is discussed after the so-called transformation rules are presented.
Having made the reaction pair set, the reaction matcher sends it to the diagnostics subsystem, which processes the pairs and provides verification engineers with an explanation of the problems that occurred during verification. The diagnostics subsystem is based on a special algorithm that consecutively applies a set of so-called rules, each of which transforms the reaction pairs. Rules of the first type decrease the number of pairs: having found pairs with corresponding implementation and specification reactions, they collapse them and write diagnostic information into the log file. Rules of the second type recombine reaction pairs so that rules of the first type can be applied better. Rules of the third type use a distance function to find similar reactions and recombine the reaction pairs for better readability. The distance function can be implemented in three possible ways. First, it may count the number of equal data fields in two given messages. Second, the Hamming distance may be used, since one can compare not only the fields but also the bits of data carried by the fields. The measure of closeness between two given reactions is denoted as $C(r_{spec}, r_{impl})$.
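The two distance functions named above can be sketched in Python as follows. The function names are hypothetical, messages are modeled as tuples of non-negative integer fields, and the field-based variant counts differing fields (rather than equal ones) so that a smaller value means closer reactions, which is an assumption made to match the "less than" comparisons in the fuzzy rules.

```python
def field_distance(spec_fields, impl_fields):
    """Number of fields whose values differ between two messages."""
    if len(spec_fields) != len(impl_fields):
        raise ValueError("only messages with the same fields are comparable")
    return sum(1 for s, i in zip(spec_fields, impl_fields) if s != i)


def hamming_distance(spec_fields, impl_fields):
    """Number of differing bits, summed over all fields."""
    if len(spec_fields) != len(impl_fields):
        raise ValueError("only messages with the same fields are comparable")
    # XOR leaves a 1 in every bit position where the two values differ.
    return sum(bin(s ^ i).count("1") for s, i in zip(spec_fields, impl_fields))
```

The field-based measure treats a field as wrong regardless of how many bits differ, while the Hamming distance distinguishes a single flipped bit from a completely different value.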
Each rule consists of one or several pairs of reactions. In the cases of missing or unexpected reactions, one of the pair elements is undefined and called null. Each reaction pair is assigned to a model interface. The left part of a rule shows the initial state, and the right part (after the arrow) shows the result of applying the rule. If a rule is applied to several reaction pairs, they are separated by commas. Now, let us review the twelve rules that we found.
Rule 2: If there is a normal reaction pair $(a_{spec}, a_{impl})$ : $data_{a_{spec}} = data_{a_{impl}}$, it should be collapsed. $(a_{spec}, a_{impl}) \Rightarrow (null, null)$.
Rule 4: If there is a missing reaction pair and an unexpected reaction pair $(a_{spec}, null), (null, a_{impl})$ : $data_{a_{spec}} = data_{a_{impl}}$, they should be united into one reaction pair. $\{(a_{spec}, null), (null, a_{impl})\} \Rightarrow \{(a_{spec}, a_{impl})\}$.
Rule 5: If there is a missing reaction pair and an incorrect reaction pair $(a_{spec}, null), (b_{spec}, a_{impl})$ : $data_{a_{spec}} = data_{a_{impl}}$, these reaction pairs should be regrouped. $\{(a_{spec}, null), (b_{spec}, a_{impl})\} \Rightarrow \{(a_{spec}, a_{impl}), (b_{spec}, null)\}$.
Rule 6: If there is an unexpected reaction pair and an incorrect reaction pair $(null, a_{impl}), (a_{spec}, b_{impl})$ : $data_{a_{spec}} = data_{a_{impl}}$, these reaction pairs should be regrouped. $\{(null, a_{impl}), (a_{spec}, b_{impl})\} \Rightarrow \{(a_{spec}, a_{impl}), (null, b_{impl})\}$.
Rules 1-7 allow finding the closest reaction pairs. Their implementation is shown in Algorithms 2 and 4.
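A possible way to apply the exact-match rules quoted above (Rules 2, 4, 5, and 6) is sketched below in Python. It is not a transcription of the authors' algorithms: the function names are hypothetical, a reaction pair is reduced to a `(spec_data, impl_data)` tuple with `None` standing for null, and Rules 5 and 6 are assumed to leave the unmatched reaction behind as a missing or unexpected pair.

```python
def _next_step(pairs):
    """Find one applicable rule; return (rule, consumed, produced) or None."""
    for p in pairs:
        # Rule 2: a pair whose parts carry identical data is collapsed.
        if p[0] is not None and p[0] == p[1]:
            return "rule 2", [p], []
    for p in pairs:
        for q in pairs:
            # Look for spec data in p that equals impl data in q.
            if p[0] is None or p[0] != q[1]:
                continue
            if p[1] is not None and q[0] is not None:
                continue  # two incorrect pairs: not covered by this sketch
            if p[1] is None and q[0] is None:
                rule = "rule 4"
            elif p[1] is None:
                rule = "rule 5"
            else:
                rule = "rule 6"
            produced = [(p[0], q[1])]
            leftover = (q[0], p[1])
            if leftover != (None, None):
                produced.append(leftover)
            return rule, [p, q], produced
    return None


def apply_normal_rules(pairs):
    """Apply the rules until none fits; return (remaining pairs, log of steps)."""
    pairs = list(pairs)
    log = []
    while True:
        step = _next_step(pairs)
        if step is None:
            return pairs, log
        rule, consumed, produced = step
        for p in consumed:
            pairs.remove(p)
        pairs.extend(produced)
        log.append((rule, consumed, produced))
```

Every regrouping step produces a pair with identical data that is collapsed at the next step, so the total number of reactions decreases and the loop terminates. The returned log records which rule consumed and produced which pairs.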
When the rules from the list of normal rules have been applied, the sets $R_{spec}$ and $R_{impl}$ do not contain any not-yet-collapsed reactions with identical data. At this point the diagnostics subsystem proceeds to the stage of fuzzy rules (see Algorithms 3 and 5).
, where $c$ is the selected distance function and the value of $c$ is the best among other fuzzy rules, these reaction pairs should be regrouped.
Rule 9: If there are two reaction pairs $\{(a_{spec}, null), (null, a_{impl})\}$ and the value of the selected distance function $c(a_{spec}, a_{impl})$ is the best among other fuzzy rules, these reaction pairs should be regrouped. $\{(a_{spec}, null), (null, a_{impl})\} \Rightarrow \{(a_{spec}, a_{impl})\}$.
Rule 10: If there are two reaction pairs $\{(a_{spec}, null), (b_{spec}, a_{impl})\}$ : $c(a_{spec}, a_{impl}) < c(b_{spec}, a_{impl})$, where $c$ is the selected distance function and the value of $c$ is the best among other fuzzy rules, these reaction pairs should be regrouped. $\{(a_{spec}, null), (b_{spec}, a_{impl})\} \Rightarrow \{(a_{spec}, a_{impl}), (b_{spec}, null)\}$.
Rule 11: If there are two reaction pairs $\{(null, a_{impl}), (a_{spec}, b_{impl})\}$ : $c(a_{spec}, a_{impl}) < c(a_{spec}, b_{impl})$, where $c$ is the selected distance function and the value of $c$ is the best among other fuzzy rules, these reaction pairs should be regrouped. $\{(null, a_{impl}), (a_{spec}, b_{impl})\} \Rightarrow \{(a_{spec}, a_{impl}), (null, b_{impl})\}$.
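The selection of the best fuzzy regrouping can be illustrated with the Python sketch below. It covers only the three patterns quoted above (missing with unexpected, missing with incorrect, unexpected with incorrect) and makes several assumptions: the function name is hypothetical, pairs are `(spec_data, impl_data)` tuples with `None` for null, a smaller distance is better, and the regrouped leftover becomes a missing or unexpected pair.

```python
def best_fuzzy_step(pairs, distance):
    """Return (cost, rule, consumed, produced) for the closest regrouping, or None."""
    best = None
    for p in pairs:
        for q in pairs:
            if p[0] is None or q[1] is None:
                continue  # need a spec part from p and an impl part from q
            cost = distance(p[0], q[1])
            if p[1] is None and q[0] is None:
                # Missing pair plus unexpected pair: unite them.
                rule, produced = "rule 9", [(p[0], q[1])]
            elif p[1] is None and cost < distance(q[0], q[1]):
                # Missing pair is closer to q's impl part than q's own spec part.
                rule, produced = "rule 10", [(p[0], q[1]), (q[0], None)]
            elif q[0] is None and cost < distance(p[0], p[1]):
                # Unexpected reaction is closer to p's spec part than p's own impl part.
                rule, produced = "rule 11", [(p[0], q[1]), (None, p[1])]
            else:
                continue
            if best is None or cost < best[0]:
                best = (cost, rule, [p, q], produced)
    return best
```

A driver loop would apply the returned step, recompute the candidates, and stop when `None` is returned; that loop is omitted here.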
When all the metrics of the fuzzy rules have been measured and all the most suitable rules have been applied, it is time for the last rule.
Rule 12: If there is a reaction pair $(a_{spec}, a_{impl})$ with both specification and implementation parts, it should be collapsed.
$(a_{spec}, a_{impl}) \Rightarrow (null, null)$.
The last rule transforms all the incorrect reaction pairs so that diagnostics can be shown for the whole list of reaction pairs. Typically, the history of transformations is traced after the application of each rule, so it is possible to reconstruct the parents of a given reaction pair and all the rules it has undergone. We regard such a reconstruction of the rule application trace as the diagnostic information.
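One simple way to keep such a trace is to let every derived reaction pair remember the rule that produced it and its parent pairs. The Python sketch below is a hypothetical illustration of this idea; the names `TracedPair`, `derive`, and `history` do not come from C++TESK.

```python
from dataclasses import dataclass, field


@dataclass
class TracedPair:
    spec: object
    impl: object
    rule: str = "initial"  # rule that produced this pair
    parents: list = field(default_factory=list)


def derive(rule, parents, spec, impl):
    """Create the result of applying `rule` to the given parent pairs."""
    return TracedPair(spec, impl, rule, list(parents))


def history(pair):
    """Rules applied on the way to `pair`, oldest first (depth-first over parents)."""
    steps = []
    for parent in pair.parents:
        steps.extend(history(parent))
    if pair.rule != "initial":
        steps.append(pair.rule)
    return steps
```

For example, uniting a missing pair and an unexpected pair by Rule 4 and then collapsing the result by Rule 2 gives a pair whose history lists these two rules in that order.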
V. IMPLEMENTATION
The proposed approach to test system development, reference model construction, reaction correctness checking, and diagnostics has been implemented in the open-source library C++TESK Testing ToolKit [4] developed by ISPRAS. The library is written in C++ to be convenient for verification engineers. It contains macros that enable engineers to develop all the parts of a test system that have to be written by hand. Some parts, such as the diagnostics subsystem algorithm, are hidden inside the tool.
The results of the diagnostics are shown each time verification is over. At present they look like tables listing all the errors found and the results of rule application: the new reaction pair sets and the way they were obtained.
VI. RESULTS
The C++TESK Testing ToolKit, including the diagnostics subsystem, has been used in a number of industrial microprocessor development projects in Russia. The approach as a whole is aimed at unit-level verification, and at this level it can compete with the widely used UVM mentioned in the introduction.
This can be illustrated by the following fact. Typically, we started verification by means of C++TESK when the whole system had already been verified by UVM-like approaches. In spite of the power of UVM, it does not include means of direct test sequence generation, which C++TESK does, or means of quick analysis of verification results such as the diagnostics subsystem.
The results of applying different approaches depend on the qualification of the engineers and their familiarity with the approach. On this point, we should say that our toolkit was used by people not closely involved in its development, and despite this, they managed to find exactly those bugs we have already mentioned.
VII. CONCLUSION
The proposed approach to simulation-based unit-level hardware verification solves, in a certain sense, the task of dynamic checking of hardware designs against their specifications. It includes both means of specification development and a diagnostics subsystem that explains incorrect behavior on the basis of a special mechanism using formally represented specifications and a list of explanatory rules.
The approach has been used in a number of projects and has shown its ability to find defects and to help verification engineers correct them by means of diagnostic information.
Our future research is connected with more convenient representation of diagnostics results by means of wave diagrams and with localization of the problems found in the source code.
