Errors in network components can have disastrous effects, so it is important that all aspects of the design are correct. We describe our experiences formally verifying an implementation of an Asynchronous Transfer Mode (ATM) network switching fabric using the HOL90 theorem proving system. The design has been fabricated and is in use in the Cambridge Fairisle Network. It was designed and implemented with no consideration for formal specification or verification. This case study gives an indication of the difficulties in formally verifying real designs. We discuss the time spent on the verification, which was comparable to the time spent designing and testing the fabric. We also describe the problems encountered and the errors discovered.
Introduction
Communication networks are rapidly becoming all-pervasive. Systems are increasingly being networked in the local area with applications using non-local services. In the wide area, telecommunications companies are turning to digital networks. As networks become all-pervasive, the consequences of errors in the design or implementation of network components become increasingly important. This is especially so if networks are used in safety-critical applications where communication problems could cause loss of life. For example, a telephone network problem can contribute to loss of life if the emergency services cannot be contacted. Errors could cause the network to deadlock, particular links to crash, the service to be degraded to an unacceptable level, or even the whole network to crash. Network problems affect a wide range of users and applications and can cause whole systems or companies to grind to a halt [16, 17].
Asynchronous Transfer Mode (ATM) is a relatively new technology that is being adopted by the computer and telecommunication industries in local and wide area networks in response to changing communication demands. It is likely to be the most important transfer mode of the foreseeable future. It is being touted as a technology that can be used "everywhere": in wide-area, metropolitan area, local area and even desk area networks [14]. ATM systems could become high-volume products for which dependability is paramount. It is thus an important application for verification research.
We describe our experiences in formally verifying an ATM network component. This work is part of a larger project to investigate the formal verification of the Fairisle ATM network [13]. It is an experimental network, designed and built at the University of Cambridge Computer Laboratory. The network carries real user data and thus provides a realistic case study for the investigation of the formal verification of ATM network hardware. The network component we have verified is the Fairisle 4 by 4 switching fabric, which forms the heart of the Fairisle switch. It performs the actual switching of data cells from input ports to output ports and arbitrates cell clashes. The formal verification work has been performed on completed implementations. This is generally considered to be harder than integrating the formal specification and verification into the design process. The problem was exacerbated further since there was little documentation. A significant amount of reverse engineering was required. The behavioural specifications were largely deduced by examining the implementation.
We produced formal descriptions of both the implementation and its behaviour. We then used formal logic to rigorously prove that the behaviour prescribed by the description of the implementation satisfies the specified behaviour. In contrast to validation using non-exhaustive testing, the results hold for all valid sets of inputs, not just for some small subset. Formal verification corresponds in this sense to exhaustive testing. However, exhaustive testing is infeasible for all but very small designs due to the large number of possible input values. Formal verification is feasible because of the use of mathematics (such as induction) to consider the behaviour resulting from ranges of input values together.
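To give a feel for why exhaustive testing does not scale, the following minimal sketch counts the input sequences a tester would have to cover. It assumes, purely for illustration, that only the four byte-wide data inputs of the fabric (32 bits per clock cycle) are varied; the function name and the chosen cycle counts are hypothetical and are not taken from the design.

```python
def input_sequences(bits_per_cycle: int, cycles: int) -> int:
    """Number of distinct input sequences over the given number of clock cycles."""
    return 2 ** (bits_per_cycle * cycles)


# Assumption: 4 byte-wide data inputs, i.e. 32 input bits per clock cycle.
for cycles in (1, 2, 8):
    print(cycles, "cycle(s):", input_sequences(32, cycles), "input sequences")
```

Even a single clock cycle already gives over four thousand million cases, and the count grows exponentially with the number of cycles considered.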
The proofs described were carried out using the HOL90 theorem prover: a Standard ML implementation of the HOL system [6] . It is a machine implementation of a classical higher-order logic. It provides mechanical assistance to the proof process, ensuring mistakes are not made. The system will only call something a theorem if it has been rigorously proved.
There has been much work in the area of formal hardware verification, most notably in the area of microprocessor verification [1, 11]. There has been some previous work on the formal verification of network components. For example, Herbert initially used LCF-LSM and later HOL to formally verify an ECL chip: a local area network interface used as part of the Cambridge Fast Ring [9, 8]. It is of a similar complexity to our switching fabric, though the ECL proof took much longer to perform. This is an indication of the increased maturity of both HOL and hardware verification in general. Melham also used HOL to verify the T-ring: a very simple ring communication network that was designed as a formal verification case study [15]. Josephs et al. produced a hand proof that switching elements similar to those proposed for the INMOS Transputer could be connected together in a regular way to implement a router of arbitrary size [10].
We do not go into details of the formal specifications and proofs here. We concentrate on our experiences performing the verification. The time taken was comparable to the time originally spent designing, implementing and informally testing the fabric. No errors were found in the fabricated implementation. This was unsurprising as the fabric had been in service for some time prior to the start of the formal verification work. Undocumented "features" were found, however, and many errors were discovered in the formal specifications written for the verification. This was also unsurprising since documentation of the design and its implementation was sparse.
The full details of the specifications are given in a 100 page literate document [3] derived from the HOL source files using the HOL mweb tool. An overview is given separately [4].
The Fairisle Switching Fabric
The Fairisle switch consists of three types of component: input port controllers, output port controllers and a switching fabric. The port controllers process incoming or outgoing cells of data. They manage the routeing tables, setting up and breaking down virtual circuits. Each is connected to either an input or output link of the switch, and to the fabric. The fabric switches cells from input port controllers to output ones. This is illustrated in Figure 1. The fabric is the place where cells contend for bandwidth. If different port controllers inject cells destined for the same output port controller into the fabric at the same time, then only one will succeed. The others will be rejected and must retry later. The fabric consists of a series of identical switching elements connected in a regular array. The simplest fabric consists of a single element. It is this fabric that we have formally verified. We have not verified the port controllers.
The port controllers append to each cell a routeing tag. The fabric uses this to determine which outgoing transmission link the cell should be transmitted on. It also includes one bit of priority information which is used by the fabric when arbitrating clashes. Arbitration takes place in two stages. Firstly, high priority cells are given precedence over low priority ones. Of the remaining cells, the choice is made on a round-robin basis. The input port controllers are informed of whether their cell was successful using acknowledgement lines. The fabric sends a negative acknowledgement to the unsuccessful input ports, but passes the acknowledgement from the requested output port to the successful input ports. This means the output port controllers may reject cells even if they successfully pass through the fabric.
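The two-stage arbitration can be sketched as follows. This is a minimal illustration under stated assumptions, not the fabric's actual logic: the request encoding, the function name `arbitrate`, and the representation of the round-robin state as "the input port that last won each output" are all hypothetical, and acknowledgements from the output ports are ignored.

```python
from typing import Dict, List, Optional, Tuple

NUM_PORTS = 4

# Hypothetical encoding: (requested output port, high priority?) or None if idle.
Request = Optional[Tuple[int, bool]]


def arbitrate(requests: List[Request], last_winner: List[int]) -> Dict[int, int]:
    """Return {output port: winning input port} for one frame.

    last_winner[o] is the input port that last won output o; it is updated
    in place. This state encoding is an assumption made for the sketch.
    """
    grants: Dict[int, int] = {}
    for out in range(NUM_PORTS):
        contenders = [i for i, r in enumerate(requests)
                      if r is not None and r[0] == out]
        if not contenders:
            continue
        # Stage 1: high priority cells take precedence over low priority ones.
        high = [i for i in contenders if requests[i][1]]
        pool = high if high else contenders
        # Stage 2: round-robin, starting just after the previous winner.
        start = last_winner[out] + 1
        winner = min(pool, key=lambda i: (i - start) % NUM_PORTS)
        grants[out] = winner
        last_winner[out] = winner
    return grants


state = [0, 0, 1, 0]
frame = [(2, False), (2, True), (2, True), None]
print(arbitrate(frame, state))  # inputs 1 and 2 are high priority for output 2
print(arbitrate(frame, state))  # the other high priority input wins next time
```

Keeping the round-robin state per output port is one plausible design choice; the source only states that the state records the last successful inputs.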
The port controllers and fabric all use the same clock so bytes are read in on each link synchronously. They also use a higher level cell frame clock, the frame start signal. It ensures that the port controllers inject data cells into the fabric synchronously so that the routeing bytes arrive at the same time. The behaviour of the fabric is cyclic. In each cycle or frame, the fabric waits for cells to arrive, reads them in, processes them, sends successful ones to the appropriate output ports and sends acknowledgements. It then waits for the next round of cells to arrive. The boundaries of separate frames are determined by the frame start signal. Whenever it goes high, a new frame commences. When cells are not being offered by the input ports, they must inject zeros into at least the first bit of each byte. This is the active bit of the cell header. When a new frame starts, the fabric watches the active bit of each input port. The cells from all the input ports start when the active bit of any one of them goes high. The fabric does not know when this will happen. However, all the input port controllers must start sending cells at the same time within the frame, since any which have not set the active bit at the header time are assumed not to be transmitting cells for the whole of the frame. If no input port raises the active bit throughout the frame then the frame is inactive and no cells are processed. Otherwise it is active.
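The distinction between active and inactive frames can be sketched as a simple pass over a sampled trace. This is only an illustration: the trace encoding and the name `classify_frames` are hypothetical, the frame start signal is assumed to be high for exactly one clock cycle per frame, and active bits sampled in the frame start cycle itself are ignored.

```python
from typing import List, Tuple

# Hypothetical trace entry: (frame start, active bit of each input port).
Cycle = Tuple[bool, List[bool]]


def classify_frames(trace: List[Cycle]) -> List[str]:
    """Label each frame in the trace as 'active' or 'inactive'."""
    labels: List[str] = []
    for frame_start, active_bits in trace:
        if frame_start:
            labels.append("inactive")   # a new frame commences
        elif labels and any(active_bits):
            labels[-1] = "active"       # some input port raised its active bit
    return labels


idle = [False] * 4
trace = [
    (True, idle),
    (False, [False, True, False, False]),  # port 1 starts a cell
    (False, idle),
    (True, idle),
    (False, idle),                         # nothing arrives in this frame
]
print(classify_frames(trace))
```

Cycles before the first frame start are skipped, since they do not belong to any complete frame.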
The Implementation
The 4 by 4 fabric is implemented on a 4200 gate-equivalent Xilinx programmable gate array. It was designed using a Hardware Description Language: Qudos HDL [5]. This is a simple HDL that allows the structure of hardware to be specified. It does not allow behaviour to be specified directly.
To formally verify the fabric, we needed a structural description of the implementation in a language with a formally defined semantics. Without this no formal reasoning about the behaviour of the circuit is possible. Unfortunately, no formal semantics exists for Qudos HDL. As we intended to perform the verification using the HOL theorem proving system, we ultimately needed a semantics in higher-order logic. We therefore manually translated the original descriptions to HOL-HDL. This is a subset of the HOL logic with the flavour of a hardware description language. It allows descriptions to be given which are very similar to descriptions in Qudos HDL. The translation could have been done completely mechanically, involving only the changing of syntax. However, for several reasons we decided to make changes to the description. It was not intended that these changes alter the design, only the description of it. Both descriptions should describe the same collection of logic gates. To be completely certain that changes to the design were not inadvertently made, the netlists from the two descriptions could be compared. Two kinds of changes were made: adding extra layers to the hierarchy and simplifying the description using features of HOL-HDL which were not available in Qudos HDL.
The designers of the switch expressed a desire for the new description to make use of features of HOL-HDL that were not available in Qudos HDL. The most notable was that Qudos HDL cannot describe words of words. The data input and output of the fabric consist of 4 byte-wide lines. They are thus best described as a word of length 4 with each field a word of length 8. This could be done in HOL-HDL, but in Qudos HDL it had to be described as 32 individual signals. The Qudos HDL description was thus more unwieldy than necessary. Where the use of words allowed descriptions to be simplified, this was done. HOL-HDL also has a facility for describing generic hardware such as an n-bit adder implemented in terms of n 1-bit adders. The value of n can be a variable in the description. This is not possible in Qudos HDL. Such generic descriptions were used to replace several separate Qudos HDL descriptions. The changes made the descriptions clearer and made formal reasoning about the design more tractable.
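The idea of a generic description can be sketched in software terms. The following is an illustrative model only, assuming a ripple-carry arrangement and bit lists stored least significant bit first; neither assumption comes from the fabric design, and the function names are hypothetical.

```python
from typing import List, Tuple


def full_adder(a: int, b: int, carry_in: int) -> Tuple[int, int]:
    """A 1-bit adder: returns (sum bit, carry out)."""
    total = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return total, carry_out


def n_bit_adder(a: List[int], b: List[int], carry_in: int = 0) -> Tuple[List[int], int]:
    """An n-bit adder built from n 1-bit adders, where n = len(a) = len(b).

    Bits are stored least significant first (an assumption of this sketch).
    """
    if len(a) != len(b):
        raise ValueError("operands must have the same width")
    sums: List[int] = []
    carry = carry_in
    for bit_a, bit_b in zip(a, b):
        s, carry = full_adder(bit_a, bit_b, carry)
        sums.append(s)
    return sums, carry


# 3 + 1 = 4 on three bits, least significant bit first.
print(n_bit_adder([1, 1, 0], [1, 0, 0]))
```

The width n never appears as a constant: one definition covers every size, which is the property the generic HOL-HDL descriptions exploit.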
Many of the modules in the original description were large and encapsulated several distinct tasks. We therefore added several levels of hierarchy that were not used in the original Qudos description. This made the description clearer. As the behavioural specifications were largely obtained by reverse engineering the HDL, this was very useful. It also facilitated the formal verification. For example, in the original description the top level module described the fabric in terms of input and output buffers, latches, a header decoder, priority filter, timing unit, arbiter, dataswitch and acknowledgement unit. We grouped all but the latches and buffers into a single unit at the top level. It was then subdivided at the next level into the arbitration unit, acknowledgement unit and dataswitch, with the arbitration unit further subdivided into a header decoder, priority filter, timing unit and arbiter. The resulting hierarchy is shown in Figure 2.
Both Qudos HDL and HOL-HDL have facilities for duplicating components. The different copies of a duplicated component can be wired in more general ways using HOL-HDL, however. In particular, arithmetic can be used to specify which bit of a word is connected to an input or output of a component. For example, we can specify that for all i, the 2i-th bit of an output is connected to the i-th bit of a subcomponent. This means that the duplication construct could be used in the HOL-HDL version whereas the copies had to be written out in full in the Qudos version.
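The arithmetic wiring described above can be pictured with a small sketch. It is illustrative only: the function name `wire_even_bits` is hypothetical, and unconnected output bits are represented by `None`.

```python
from typing import List, Optional


def wire_even_bits(sub_bits: List[int], width: int) -> List[Optional[int]]:
    """Connect bit i of a subcomponent to bit 2*i of a `width`-bit output.

    Output bits that are not driven by this wiring rule are left as None.
    """
    output: List[Optional[int]] = [None] * width
    for i, bit in enumerate(sub_bits):
        output[2 * i] = bit
    return output


print(wire_even_bits([1, 0, 1], 6))
```

A single rule indexed by i replaces one hand-written connection per copy.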
To illustrate the kinds of changes made, we will consider the Qudos HDL and HOL-HDL structural descriptions of a multiplexor component of the dataswitch, DMUX4T2. We first give the Qudos HDL definition. The description starts with declarations. The first line states that we are defining the module DMUX4T2 and that it has two inputs: d, which is 4 bits long (with bit positions numbered from 0 to 3), and x, which is one bit. It has one two-bit output, dOut. The second line declares a local variable, xBar. We then have the description of the layout, enclosed by BEGIN and END. The first statement is a dummy statement that provides information about the way the design should be mapped onto a Xilinx gate array. It provides no semantic information. The next statement describes a Xilinx inverter, XiINV. It has input x and output xBar. This particular inverter is given the name InvX, which is used by the simulator. There then follow two AND-OR logic gates, AO. They each produce one bit of the dOut output, using differing bits from the d, x and xBar signals. They are given the array name B, each being one entry. The individual bit positions are given in square brackets.
DEF
This description can be mimicked in HOL-HDL.
Definitions of modules take a single pair argument. The inputs form the first part of this pair and the outputs the second. Multiple inputs and outputs are grouped into a tuple in the appropriate part of the pair. We omit the information about the word lengths of the inputs and outputs. This is specified when the module is used (and in the correctness theorem), which is more flexible. The local variable xBar is introduced using the LOCAL construct. The three components are then given, separated by the ∧ operator. The dummy definition of the Qudos HDL is omitted, as are the names of the components. They provide no semantic information and are not needed to perform formal verification. The bit positions are indicated using the function SBIT, and the inputs and outputs have the structure described above, but otherwise the component descriptions are the same.
The unwieldiness of the HOL-HDL syntax could be overcome using parser and pretty-printer support.
In HOL-HDL, we can do better than the above. d can be thought of as being two signals each two bits wide. The input x chooses either the first bits of each of these signals or the second. We cannot describe this in Qudos HDL as structured signals are not supported, but we can in HOL-HDL.
Now d is a two-level signal. The low level bits are accessed using two calls to SBIT. The use of the two-level signal makes the description closer to the designer's mental model. Furthermore, whilst making the structural specification look superficially more complex, it simplifies the behavioural specification. This makes that specification easier to understand and also simplifies the verification task. We can now simplify the structural specification further. The two AO gates perform identical functions; the only difference is in the bit positions. We can therefore introduce the duplication binder FOR.
It introduces an index variable i, to range over the different values of the bit positions, from zero up to, but not including, the value after the keyword TO.
    DMUX4T2((d,x),dOut) =
      LOCAL xBar.
        XiINV(x,xBar) ∧
        FOR i :: TO 2 .
          AO((SBIT 0 (SBIT i d), xBar, SBIT 1 (SBIT i d), x), SBIT i dOut)
The body of the duplication binder FOR describes the replicated components in terms of the index, here i.
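The structure of DMUX4T2 is small enough to model and check exhaustively in software. The sketch below is an illustration under stated assumptions, not the HOL proof: the AO gate is assumed to compute (a AND b) OR (c AND d), the behavioural function is a guess at the intended multiplexing behaviour based on the prose above, and all Python names are hypothetical. Note how `range(2)` mirrors the FOR binder: it runs from zero up to, but not including, 2.

```python
from itertools import product
from typing import List


def xi_inv(x: int) -> int:
    """Inverter."""
    return 1 - x


def ao(a: int, b: int, c: int, d: int) -> int:
    """AND-OR gate, assumed here to compute (a AND b) OR (c AND d)."""
    return (a & b) | (c & d)


def dmux4t2_structural(d: List[List[int]], x: int) -> List[int]:
    """Model of the structure: d[i][j] plays the role of SBIT j (SBIT i d)."""
    x_bar = xi_inv(x)
    # FOR i :: TO 2 corresponds to i = 0, 1.
    return [ao(d[i][0], x_bar, d[i][1], x) for i in range(2)]


def dmux4t2_behavioural(d: List[List[int]], x: int) -> List[int]:
    """Assumed behaviour: x selects the first or second bit of each 2-bit signal."""
    return [d[i][x] for i in range(2)]


def agree_everywhere() -> bool:
    """Compare the two models on all 32 input combinations."""
    for d00, d01, d10, d11, x in product((0, 1), repeat=5):
        d = [[d00, d01], [d10, d11]]
        if dmux4t2_structural(d, x) != dmux4t2_behavioural(d, x):
            return False
    return True


print(agree_everywhere())
```

Exhaustive comparison is feasible here only because the module has five input bits; for the fabric as a whole, proof takes the place of enumeration.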
The Formal Behavioural Specification
To formally verify a hardware design, a formal behavioural specification is needed. Ideally the formal specification would be written first and the device implemented from it. The fabric had been designed and implemented without any form of formal behavioural specification being created. There was only sketchy, informal documentation. Therefore the formal specification was reverse-engineered from the HDL description. This was a very time-consuming process and the early versions of the specification contained many errors. These are discussed later. To a large extent the formal verification process was used to aid the reverse engineering. Initially an "educated guess" at the specification was made. As the proof progressed it was gradually corrected as the implementation was analysed and a better understanding of the design was obtained.
In addition to a behavioural specification of the fabric as a whole, a behavioural specification of each module was also needed. These were produced in the same way in a bottom-up manner. The behaviour of the lower level modules was first determined. These were then used as aids to determining the behaviour of upper level modules and ultimately the whole design. Auxiliary definitions were shared between the different module specifications.
Behavioural specifications of all modules were written before any proofs were carried out. This was because when the project started some tools were not available in the version of HOL being used. They were available by the time the specifications were finished. It would have been better if each module had been verified as it was specified. This would have prevented errors in the specifications of the lower modules from propagating up to higher ones. Also, because several weeks elapsed between the specifications being written and being used, the details of why the specifications were correct had to be partially rediscovered.
The formal behavioural specification of the fabric is a formal description of the timing diagrams of the output signals. It describes the expected values of the signals at each time instance. The specification is a relation on the inputs and outputs of the fabric and on the state which records the last successful inputs. As the behaviour of the fabric is cyclic, the formal specification is based on the frame cycles. The behaviour of each output signal over the period of a frame is described in terms of the state and values on the inputs during that frame. Frames can either be inactive or active. During an inactive frame no cells arrive, so no arbitration need be performed. During an active one at least one input port injects a cell into the fabric. An inactive frame could be thought of as just a degenerate case of an active one. However, the specification is cleaner if the two cases are treated separately. For inactive frames, the behaviour over the whole frame can be treated uniformly. For an active frame, the behaviour is typically split into two parts: that up to some fixed time after the active signal arrives; and that from this point until the end of the frame. Thus there are several cases to consider in the formal proof for each output signal. The main aspect of the functional behaviour of the fabric is the arbitration process. We specified this process as a function on the cell headers. Given a set of headers, the function describes the arbitration decision that will be made. The new state and the values of the acknowledgement and data signals to each output port are defined in terms of this function.
Time taken
As we mentioned earlier, the specifications for each of the 43 modules (both behavioural and structural) were written prior to any proof. This took between one and two man-months. No detailed breakdown of this time has been kept, however. Much of the time was spent attempting to understand the design. The structural specifications were adapted directly from the Qudos HDL. The behavioural specifications were more difficult. The specifier had no previous knowledge of the design or of the Fairisle Network. Documentation was minimal. There was a good English overview of the intended function of the fabric. This also outlined the function of the major components. Whilst it gave a good introduction, it was not sufficient to construct an unambiguous behavioural specification of all the modules. The behavioural specifications were instead constructed by analysing the HDL. This was very time-consuming.
A consequence is that the formal specification describes what the implementation does. This may not be what the designer intended it to do. Errors in the design may have become "features" of the specification. However, since the fabric was designed and in use prior to the formal specification and verification being carried out, this was unavoidable. It can be alleviated to some extent if the designers thoroughly examine the top level specification. Ideally, formal specifications should be written by the designers of the module. This would also aid the design process.
The proof of each module in the design was independent of the implementations of all other modules, including sub-modules of the module in question. Only the specifications of sub-modules were used. The separate proofs were then combined to give proofs for the modules' actual implementations in terms of the implementations of their component parts.
Approximately two man-months were spent performing the verification. Of this, one week was spent proving general purpose theorems about machine words and signals. These theorems will be of use in future projects. Approximately 3 weeks were spent verifying the upper modules of the arbitration unit, and a further week was spent on the top two modules of the switch. 3-4 days were spent combining the correctness theorems of the 43 modules to give a single correctness theorem for the whole circuit. The remaining time of just over two weeks was spent proving the correctness theorems for the 36 lower level units. These proofs were automated using tactics which were developed throughout the verification, though normally additional human effort was required to finish the proofs. The breakdown is shown in Figure 3, which gives the cumulative time taken as each module was verified. The time spent proving general word theorems mentioned above is not included. On the whole, the simpler modules were verified first. The latches and buffers were particularly easy as they all had very similar structures. The proofs of the upper-level modules were generally more time-consuming for several reasons: there were several intervals to consider; they gave the behaviour of several outputs; and those behaviours were defined in terms of complex notions. Tactics developed for the early proofs were used in the later proofs. This speeded up the more difficult ones. The verifier had not previously performed a hardware verification and was unfamiliar with the word library, though was a competent HOL user. Proof times were consequently reduced as experience was gained.
[Figure 3: cumulative time taken against hardware units verified]
Much of the effort was expended in understanding informally why the implementation was correct. This was hampered to some extent by the delay between writing the specifications for a module and doing the proof. This meant that much time was wasted re-understanding the specifications and working out how the implementations actually worked. It would have been much better to perform the proof of a module when it was specified, had this been possible.
Errors in the specifications slowed progress. When an error was present in a module, some time was spent attempting to prove theorems that were untrue. When this was discovered, the error then had to be located. The manner in which the proofs failed usually gave a strong indication as to the location of the error. However, it was not always immediately clear whether the error was in the behavioural or structural specification of the module, or because a component module's specification was too weak. Determining which was the case involved examining the specifications of other modules as well as the faulty one. The specification then needed to be corrected and the proof completed. The main errors were in the modules DMUX2B4CA11 and ARBITRATION, where there are corresponding plateaus in the graph. The original behavioural specification for the latter contained several minor errors and some more serious ones. It was largely rewritten during the verification. We discuss the errors found in the next section.
After the proof had been successfully completed the behavioural specification was reviewed. Major changes had been made to it during the course of the verification. Consequently, some aspects of the specification were overly complex. The specification was therefore simplified and the proof redone. The changes involved the major reworking of the proofs for the timing module, the upper level modules of the arbitration unit, and the upper level modules of the fabric itself. Due to the modular nature of the design and proof, only the modules which were affected needed to be reverified. It took a total of one month to complete the original proofs of these modules. It took less than three days to modify the specifications and redo the proofs. Redoing the proof was quicker for several reasons. The proofs were split into a series of lemmas. Many lemmas were not affected by the changes and so their proofs did not need to be redone. The new specifications did not contain errors. The reason why the design was correct was well understood because at an abstract level it was unchanged from the original. Most proofs that did need to be changed only needed to be changed in small ways. Thus the scripts could be rerun with only a few modifications. Some proofs were simplified by the changes.
No detailed record of the time spent designing and testing the fabric was kept. The design evolved from earlier designs, and several different designs were produced at the same time, making it difficult to accurately estimate the time scale involved. The designer estimated that, had it been designed from scratch, the initial design time would have been in the order of several months. The time spent testing would have been in the order of several weeks. However, errors were discovered after the testing process had been completed when the fabric was in use. Thus the time spent to formally specify and verify the design was not unreasonable. Had it been performed as an integral part of the design, it is unlikely that it would have unduly slowed the design cycle. Furthermore, it is likely that the formal specification and verification would have been much quicker if done as the fabric was designed since much of the time was spent attempting to understand the design. Had formal verification been applied to the ancestors of the design, the formal verification could have tracked the changes made, with a minimal amount of time spent adapting the proof for the new generation. Similarly the proof could have been quickly adapted to the other versions of the fabric that were designed at the same time.
Errors Found During Formal Verification
In this section we give an overview of the errors discovered during the formal verification process in the various descriptions of the switching fabric. We outline the various kinds of errors which occurred. In the more interesting cases, we give an indication of why they occurred and how they were discovered. Our aim is to give a flavour of the wide range of errors that can occur in formal specifications as well as in implementations and how formal verification can help in their detection. All errors found during the course of the verification were corrected and the proofs completed successfully.
The Implementation
No errors were discovered in the actual implementation of the switching fabric. This is not surprising, since the fabric has been in use for some time. Also, as noted above, the behavioural specification was written by examining the implementation. It is therefore possible that discrepancies between the implementation and the designer's intended behaviour have become "features" of the behavioural specification.
The Structural Specification
Several errors were found in the HOL structural specification. These were introduced in the translation from Qudos HDL, due to the introduction of multi-level words, etc. For some errors this was due to the specifier misunderstanding the original description. In several places the wrong number of copies of a unit was specified. For example, in DMUX4T2, (FOR i :: TO 1 . ...) was originally used to replicate the AO module. This created only one copy rather than the required two. This was because Qudos HDL takes an initial index and final index rather than the number of copies. This could have been avoided if a duplication construct had been defined in HOL to mirror the HDL more closely. This had originally been intended. The length was eventually used since this made generic specifications simpler. In other modules, where a piece of hardware was duplicated using the length of one of the signals, the length of the wrong signal was used.
In several modules the sizes of the local signals were not specified. This information was needed in the proof.
In FAB4B4, 2 bytes of a signal were selected when actually it should have been 2 bits from each byte.
In DMUX2B4CA11 and ARBITER, two signals were incorrectly wired. This was discovered because the subgoal ([T, F] = [F, T]) was generated in the proof attempt. One side of this equality originated from the behavioural specification and one from the structural specification. This illustrates how the discovery of an error can give a strong indication of its cause. It was clear from the proof attempt that two signals had been swapped, and also which signals they were from the context of the subgoal. It was not immediately clear in which specification they had been swapped.
The English commentary
An English commentary was given with the behavioural specification of each module. Such a commentary is useful as it gives a brief, if not precise, overview of the behaviour of the module. This helps the verifier keep a mental picture of the purpose of each module. It would also be of use to engineers who are not familiar with the formal notation. An error was discovered in the English commentary of the PRIORITYDECODER. It mistakenly stated that the priority decoder outputs one word per output port. It actually returns one word per input port. The formal specification was correct. The confusion arose because other components output this format. The error was discovered during the formal verification because the commentaries were being used to construct informal arguments to guide the formal proof.
The Behavioural Specifications
Many errors were found in the behavioural specifications of the modules. Most involved the incorrect specification of word lengths. Such errors were generally easy to detect and correct. The specification of the NOR gate was correct only for 2 input gates, but was used for gates of varying sizes. When originally specified it was thought that only 2 inputs would be used. This error was noted in the first proof in which an incorrect NOR gate occurred. One of the primitive word operators, WSEG, which selects a segment from a word, was incorrectly used. This was because the specifier had misunderstood its definition, assuming it took the end points of the segment as arguments, when it took one end point and a length. Again this was spotted in the first proof where it was used.
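The WSEG misunderstanding is an instance of a common confusion between "(length, start)" and "(start, end)" conventions. The sketch below models the two readings on a Python list indexed by bit position. It is not the HOL definition: the argument order, the bit numbering and both function names are assumptions made for illustration.

```python
from typing import List


def wseg(length: int, start: int, word: List[int]) -> List[int]:
    """Select `length` bits of `word` beginning at bit position `start`."""
    return word[start:start + length]


def wseg_misread(low: int, high: int, word: List[int]) -> List[int]:
    """The mistaken reading: both arguments taken as end points (inclusive)."""
    return word[low:high + 1]


# Each element is labelled with its own bit position to make the result visible.
word = list(range(8))
print(wseg(2, 4, word))          # two bits starting at position 4
print(wseg_misread(2, 4, word))  # positions 2 to 4, a different segment
```

The two readings disagree in both the position and the length of the selected segment, which is why such an error surfaces quickly as a word-length mismatch in a proof.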
The timing of several modules was incorrectly specified. For example, in the specification for the module TIMING, an event was stated to occur at the same time as the frame start signal when it actually occurred on the subsequent cycle. Such errors were normally easy to detect and correct, because goals of the form (ts = ts + 1) were obtained. In fact, in the specifications for the upper modules, the timing was purposely worked out in this way. Educated guesses were made about when events occurred and these were used in the specification. The actual values were then discovered during the proof and corrections made. This was easier than attempting to work out the timing by hand.
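The idea of guessing a time and then correcting it can be imitated with a simple simulation. This is a different and much weaker technique than the proof-based approach described above, and everything in the sketch is hypothetical: the frame length of 8 cycles, the single register, and the names delay, frame_start and find_offset are assumptions chosen for illustration.

```python
def delay(signal):
    """A unit-delay register: the output at time t is the input at time t - 1."""
    return lambda t: signal(t - 1) if t > 0 else False

def frame_start(t):
    """Hypothetical frame start pulse, asserted every 8 cycles."""
    return t % 8 == 0

# Hypothetical implementation: the event is the frame start after one register.
event = delay(frame_start)

def find_offset(reference, observed, horizon=64, max_offset=4):
    """Return the smallest d such that observed(t + d) == reference(t)
    for every t below the horizon, or None if no such d is found."""
    for d in range(max_offset + 1):
        if all(observed(t + d) == reference(t) for t in range(horizon)):
            return d
    return None

offset = find_offset(frame_start, event)
print("event follows frame start by", offset, "cycle(s)")
```

The initial guess of a zero offset fails at the first cycle, which is the simulation counterpart of an unprovable goal of the form (ts = ts + 1). The search then settles on an offset of one cycle.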
It was specified that the two bits of one of the signals to the dataswitch were sampled at the same time. In fact the implementation samples them at different times.
When initially writing the specifications it was assumed that the same definition of a frame between successive frame start signals could be used for all modules. However, the frame start signal is passed to all modules with no delay, whereas other signals suffer delays at various points in the circuit. In particular, the active signal is delayed at several points. This means that the definition of a frame must vary between modules to account for the different relative times of the frame start and the active signal arriving at a module.
In several modules, the structure of the signals output was confused in a similar way to the error in the English commentary mentioned earlier.
Some of the specifications contained redundant information that either was not required for the proof, or was essentially stated twice within the specification. Whilst this did not affect the proofs for the modules in question, having unduly complex specifications would have complicated the proofs of the upper levels.
The specification of the arbiter did not give its behaviour on the last cycle of a frame in which no cells arrived. This was discovered because a subgoal had to be proved about the value of the grant signal at this time, but no information was available. The initial specifications for the upper modules in the hierarchy did not consider inactive frames as a special case. It was believed that inactive frames were covered by the behaviour given. However, this was not so. Consequently an extra case needed to be added.
In several places the expression SBIT k, which selects the k-th bit of a signal, was used when what was needed was ($= k o BNVAL), which converts the word to a number and then tests whether it is equal to k. This arose due to confusion over the form in which the data was being stored on the outputs of those modules.
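The difference between the two expressions is easy to see on a concrete word. The sketch below is illustrative only: it works on a single sampled word rather than a time-varying signal, it assumes the most significant bit comes first, and the Python names sbit, bnval and equals_k are stand-ins chosen for this example rather than the HOL definitions.

```python
def bnval(word):
    """Numeric value of a word given as a list of bits, most significant first."""
    value = 0
    for bit in word:
        value = 2 * value + int(bit)
    return value

def sbit(k, word):
    """Bit k of the word, counting bit 0 as the least significant."""
    if not 0 <= k < len(word):
        raise IndexError("bit index outside the word")
    return bool(word[len(word) - 1 - k])

def equals_k(k, word):
    """The intended test: the word, read as a number, is equal to k."""
    return bnval(word) == k

word = [True, False]  # binary encoding of the number 2

for k in range(len(word)):
    print(k, sbit(k, word), equals_k(k, word))
```

For this word, selecting bit 1 gives true although the word does not encode the number 1, while the word does encode 2, an index that is not even a valid bit position. The two tests coincide only if the data is stored with one bit per value rather than as a binary number.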
The Correctness Statement
An additional assumption needed to be added to the correctness statements of some modules. It was known about by the designers, though it did not appear in the informal documentation. It concerned the effect of the active signal arriving close to the frame start signal. It had initially been thought that the fabric would function correctly irrespective of when the active signal arrived. This was not so. An assumption was therefore needed that the active signal did not arrive at an inopportune moment. This assumption appeared in various forms for various modules as well as in the full correctness statement for the fabric. The fabric has no control over when these signals arrive as they are determined by the external environment. The design of the port controllers must ensure that the assumption is upheld. The assumption could have been included in the specification of the modules concerned rather than being explicit in the correctness theorem. This would have made little difference to the proofs.
Conclusions
We have demonstrated that a fully machine-checked formal verification of a real piece of communications hardware can be conducted in a time scale comparable to that required for its original design, implementation and informal testing. This was despite the formal specification and verification only starting after the design had been implemented, the documentation being sketchy, and the verifier having no previous knowledge of the design or its application. Whilst no errors were found in the implementation, problems were found in the original versions of the formal specifications. These were corrected and the verification completed. We also discovered and formally documented assumptions about the environment which must hold for the switching fabric to operate correctly. Whilst this information was known by the designers, it was not documented. This could have led to errors if the fabric was used in a way other than that originally intended.
We have a rigorous description of the behaviour of the fabric, and of all its constituent modules. Having formally verified the implementation against it, we can have a high degree of confidence that the fabric does have that behaviour, provided it was correctly fabricated from the HDL description. This will be of use if changes are made to the design, and when interfacing the fabric to other components of the switch. The proofs can quickly be modified for other designs of the switching element since correctness results for the modules can be reused if the modules are reused in the other designs [2].
We verified all modules down to the gate level. This was done to illustrate the feasibility of such a complete formal verification. However, the hierarchical methodology allows a more pragmatic approach to be taken if desired. Modules that are simple enough to be exhaustively simulated, or for which there is already a high degree of confidence for other reasons, do not need to be formally verified. They can be taken as being basic modules, as was done with the specifications of the logic gates. Their formal specifications are then assumed to be correct. The formal verification of the rest can be carried out as normal. This allows more effort to be expended on the verification of modules where errors are thought most likely to occur. However, it should be remembered that not only must the implementation of the module be correct, but the formal specification must be an accurate description of it.
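Exhaustive simulation of a small module can be sketched as follows, using the NOR gate as the example. The code is an illustration under stated assumptions rather than part of the verification: the gate model nor_gate and both specifications are hypothetical Python stand-ins, and the check covers only the arities that are explicitly enumerated.

```python
from itertools import product

def nor_spec_2(inputs):
    """A specification that is only meaningful for exactly two inputs."""
    a, b = inputs  # unpacking fails for any other number of inputs
    return not (a or b)

def nor_spec_n(inputs):
    """A specification for any number of inputs."""
    return not any(inputs)

def nor_gate(inputs):
    """Hypothetical gate model: the output is high only when every input is low."""
    return all(not x for x in inputs)

def exhaustively_agree(spec, impl, arity):
    """Simulate every input combination of the given arity and compare outputs."""
    return all(spec(v) == impl(v)
               for v in product([False, True], repeat=arity))

print(exhaustively_agree(nor_spec_2, nor_gate, 2))
print(all(exhaustively_agree(nor_spec_n, nor_gate, n) for n in range(1, 6)))
```

The two-input specification agrees with the gate model when simulated at arity 2 but cannot even be applied at arity 3, which is the simulation analogue of a specification being accurate for one use of a module and not for another.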
Many errors were found in the original formal specifications. This highlights that, just like implementations, specifications are hard to get right. The main reasons for there being so many errors are that the specifier was not originally familiar with the designs being specified and that very little informal documentation was available. The specifications would probably have contained fewer errors if they had been written by the designers during the design process, or if the designers had produced informal documentation for each module. Errors corresponding to those found in the behavioural specification could just as easily have been in the structural specification. The formal verification would have found such errors in the same way. The exercise does illustrate how well formal verification can discover errors and ensure that their correction does not introduce new errors, whether they are in the implementation or specification. Many errors concerned the sizes of words. These might have been discovered earlier if the sizes of the words could have been included in the type information, that is, with dependent typing. Then some of the errors might have been discovered during type-checking.
Some systems, such as VERITAS [7], support dependent types.
HOL-HDL could in principle be used in future projects instead of Qudos HDL to give the original structural descriptions of hardware. However, before this is sensible, more tools are required, for example, for simulating the HOL-HDL descriptions and for obtaining netlists from them. A quick way this could be achieved would be to write a translator from HOL-HDL to Qudos HDL. The Qudos tools could then be used. As the two languages are so similar, this would be relatively easy. The additional information needed, such as node names, could be provided using dummy definitions which semantically did not use the extra information.
