In this paper, we develop a formal framework to widen the scope of retargetable compilation. The goal is achieved by the unification of architectural models for both the processor architecture and the ASIC architecture. This framework enables the unified treatment of code generation and behavioral synthesis, and is being used in our experimental codesign environment to drive system-on-a-chip synthesis from an object oriented language.
Introduction
For billion-transistor chip design, synthesis, reuse, and exploration are three important vehicles to help reduce design effort and improve system performance. In this setting, the conventional wisdom of the Y chart [1], which defines synthesis as the generation of structure, or architecture, from behavior, can be refined as the generation of a low-level representation from the behavior in order to configure a reused architecture. This view is readily understood for programmable devices, where software compilation is the generation of an instruction stream to "configure" the reused processors.
On closer inspection, behavioral synthesis can be viewed as the generation of a finite state machine to "configure" the data path, where the data path itself can be parameterized in terms of the numbers and types of the units available. When the system design starts from a uniform specification in a programming language, the entire system synthesis task can then be best viewed as the familiar retargetable compilation.
It is thus necessary to establish the architectural model before any synthesis task can be carried out. Note that the purpose of an architectural model is neither to collect all the information one needs to build the actual hardware, for example the detailed netlist, nor to repeat all the information one can find in a traditional architecture manual. Instead, the architectural model should serve as guidance to the corresponding synthesis tools.
CODES '99 Rome Italy
Copyright ACM 1999 1-58113-132-1/99/05 ... $5.00
In this paper, we demonstrate the first step toward our goal of developing a formal architectural model for system-on-a-chip by a unified treatment of its most important components: the instruction set architecture (ISA) for the processor and the finite state machine with a datapath (FSMD) architecture [14] for the ASIC. The rest of the paper is organized as follows. Section 2 reviews the related works and highlights our contributions. Section 3 discusses the behavioral models that are relevant to the subsequent discussion of architecture models. Section 4 discusses the instruction set architecture. Section 5 discusses the FSMD architecture. Section 6 describes the algorithms which unite the FSMD and ISA architectures. Due to space limitations, detailed illustrations and algorithms are omitted; interested readers are referred to [15].
Related Works and Contributions
Our approach, as presented in this paper, is unique in the following aspects:
• Completeness: unlike the previous works discussed in this section, which focus on the ISA architecture, our model also covers the FSMD architecture. Our future work will extend this model to include the communication architecture, with a unified treatment of local communications, that is, the calling conventions, as well as system-wide communications.
• Uniformness: our model unifies the apparently different FSMD and ISA architectures, which effectively helps to unify the software compiler and the behavioral synthesizer. In fact, under the retargetable compiler infrastructure of our experimental codesign platform, the behavioral synthesizer appears as just another target in the backend.
• Formality: our model is the first to formally define the essential elements of the architecture as well as their relationships. Without a formal model, the architecture specifier tends to be overwhelmed by the language syntax and the amount of information one has to capture in a typical architecture, and a clean interface between the architectural model and the synthesis tool is difficult to define.
Despite its uniqueness, our work is in many ways inspired by previous works. The concept of implicitly representing the instruction set of a processor using a structural description was first proposed in MIMOLA [9] and detailed in [10]. Expression [13] also uses the same concept to reduce the size of the processor specification. Our work differs in that we extend this concept to the FSMD architecture, and our structural specification is abstract, parameterized and partially constrained. Our model for ILP is an abstraction of the one in MDES [8].
Behavioral Model
As discussed before, it is important for the architectural model to associate each architecture resource with the behavior piece that it can implement. We define our model of a behavior piece in terms of trees.
ISA Architecture
An instruction set architecture is characterized by its instructions, its storages, its instruction level parallelism, and its communication schemes such as calling conventions. Definition 3 gives our formal model of an ISA architecture, where m is the size of the instruction word; n is the number of pipeline stages; S is the set of stores; I is the set of instructions; SC is the spatial constraint; TC is the temporal constraint; and CA is the communication architecture.
To simplify the model, we assume that the size of the instruction word is constant. For example, for a typical 32-bit RISC processor, m is 32; for a four-issue VLIW processor, m may be 128; for a four-issue superscalar processor, m is 32. The irregularity of CISC processors causes some problems, but it can be easily handled in the implementation. The model for the communication architecture COMM, however, will not be covered in this paper.
Stores
Much like the addressing modes one can find in an architecture manual, the stores (Definition 4) model processor storage resources such as register files and memories. The information of interest is the set of cells that a store contains (a cell is defined in Definition 5) and the way they can be accessed and allocated.
A cell (Definition 5) is characterized by s, the store to which it belongs, and n, the number representing its offset in the store.
A store is called a finite store if the number of its cells |C| is finite; otherwise it is called an infinite store. Register files are usually finite stores since they contain a fixed number of cells: the registers. The immediate stores fall into the category of infinite stores, since their cells are created "on demand". Although memory stores are physically finite, it is conceptually simpler to consider them infinite. Note that the fields base and offset make sense only in the memory store. The allocation of cells in each store is characterized by its allocation state space. The allocation state space of a register file can be modeled as a bit vector, where each bit corresponds to a register of the smallest granularity. The allocation state space of a memory store is usually modeled after the alignment status of the currently available memory location.
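The bit-vector view of a register file's allocation state can be made concrete with a small sketch. This is only an illustration under stated assumptions, not the paper's definition: the class name `RegisterFile`, the mapping of a data type to a cell count (`width`), and the alignment rule are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RegisterFile:
    """Finite store whose allocation state is a bit vector (hypothetical model)."""
    name: str
    num_cells: int   # number of smallest-granularity registers
    state: int = 0   # bit i set means cell i is already allocated

    def allocate(self, width: int = 1) -> int:
        """Allocate `width` adjacent cells at a width-aligned position.

        `width` stands in for the data type (an assumption): a double-width
        value needs two adjacent smallest-granularity registers.
        Returns the number of the first allocated cell.
        """
        mask = (1 << width) - 1
        for cell in range(0, self.num_cells - width + 1, width):
            if self.state & (mask << cell) == 0:
                self.state |= mask << cell  # update the allocation state
                return cell
        raise RuntimeError(f"store {self.name} has no free cell of width {width}")


rf = RegisterFile("R", num_cells=8)
first = rf.allocate()         # single-width value
pair = rf.allocate(width=2)   # double-width value, needs an aligned pair
```

Here the allocation state is a plain integer used as a bit vector; a memory store would instead track an alignment status, which this sketch does not attempt to model.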
Given the current allocation state and a data type, the allocation function can allocate a new cell by updating the allocation state and outputting the allocated cell number.
Instructions
The instruction (Definition 6) models processor computational resources. The information of concern is its semantics and binary encoding. Modeled as a behavioral pattern, the instruction semantics helps to identify the behavior piece that the instruction can implement. When a behavior piece is identified to be implemented by a particular instruction, storage allocation has to be performed to determine the destination, which is essentially a cell within the store corresponding to the left hand side of the pattern; and the set of sources, which are cells within the stores corresponding to the operands of the pattern.
The instruction encoding is modeled by the opcode, which is a bit pattern of size m; the dest, which is a pair of fields (Definition 7) for the base and offset of the instruction destination respectively; and srcs, which are field pairs for the bases and offsets of the instruction sources.
Constraints
The temporal constraint models temporal parallelism between the instructions. Modern microprocessors are always pipelined, and hence allow the interleaved execution of instructions. However, the possibility of interleaving is limited by various forms of dependency between instructions: the flow dependency (read after write), the anti dependency (write after read), and the output dependency (write after write). The situation is further complicated by the processor's capability of bypassing. To make things even worse, some processors allow bypassing within the same functional unit, but not across units. Our model of the temporal constraint maps any pair of instructions i1, i2 into a triple of numbers TC(i1, i2) = (d_flow, d_anti, d_output), where each number indicates the minimum number of cycles by which i2 should be scheduled after i1 for flow, anti and output dependency respectively.
The spatial constraint models spatial parallelism between the instructions. A VLIW architecture, for example, contains multiple functional units.
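As an illustration of how a scheduler might consult the temporal constraint, the sketch below stores TC as a table from ordered instruction pairs to the triple of minimum distances for flow, anti and output dependency. The instruction names, the cycle counts and the helper `min_distance` are hypothetical and chosen only for the example.

```python
from typing import Dict, Iterable, Tuple

# TC maps an ordered instruction pair (i1, i2) to (d_flow, d_anti, d_output).
# The entries below are made-up values for a hypothetical pipeline.
TC: Dict[Tuple[str, str], Tuple[int, int, int]] = {
    ("load", "add"): (2, 0, 1),
    ("add", "add"): (1, 0, 1),
}


def min_distance(first: str, second: str, deps: Iterable[str],
                 tc: Dict[Tuple[str, str], Tuple[int, int, int]]) -> int:
    """Minimum number of cycles `second` must be scheduled after `first`.

    `deps` lists the kinds of dependency that actually exist between the
    two instructions: any of "flow", "anti", "output".
    """
    d_flow, d_anti, d_output = tc[(first, second)]
    delay = {"flow": d_flow, "anti": d_anti, "output": d_output}
    # The tightest bound among the dependencies present wins; no dependency
    # means the pair imposes no temporal constraint.
    return max((delay[d] for d in deps), default=0)


flow_gap = min_distance("load", "add", ["flow"], TC)
mixed_gap = min_distance("load", "add", ["anti", "output"], TC)
free_gap = min_distance("add", "add", [], TC)
```

Effects such as bypassing within or across functional units would show up only as different numbers in the table, which is what keeps this lookup independent of the target.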
FSMD Architecture
An ASIC is typically implemented in the FSMD architecture, which consists of a control path and a data path. The control path implements a finite state machine which generates a set of control signals, called the control word, at every clock cycle. The data path performs the computational tasks specified by the control signals by transforming data values in its storages.
Our formal model of the FSMD is given in Definition 8, which specifies the control path implementation style, the set of units and buses in the data path, as well as their interconnection. The implementation style can be random logic based, PLA based, or microcode based. A unit is characterized by P, a set of ports; F, the set of control fields; s, its associated store; and OP, a set of operations.
A control field (Definition 7) is characterized by its offset and width in the control word. While the width is a fixed value, the offset is determined only when the associated unit is instantiated in an FSMD. The operations that a unit can perform are characterized by the behavior patterns and the corresponding control configurations. Note that the nonterminal set of the behavior pattern is limited to the ports and store of the unit. The control configuration is characterized by a set of control fields and the corresponding methods to compute the control values, which are specified either directly by integer numbers or by stores. For those specified as stores, the control values are taken to be the cell numbers resulting from storage allocation.
Definition 10 An operation of unit u is a member of the set Operation = BP(O × T, u.P ∪ {u.s}) × 2^(u.F × (int ∪ store)), that is, a behavior pattern over the ports and store of u paired with a set of control fields, each with an integer or a store from which its control value is computed.
In our model, the notion of a bus goes beyond the physical wires. Buses are used to indicate the possible data transfers between the units. A shared bus does imply one physical interconnection between the set of units, but the corresponding steering units such as bus drivers and keepers can be automatically generated. On the other hand, an on-demand bus implies as many point-to-point connections as needed by the behavior.
The FSMD model defined in Section 5 has an apparently different structure from the ISA model defined in Section 4. To perform the mapping of behaviors to these two architectures, the code generator (for the ISA) and the behavioral synthesizer (for the FSMD) have to go through similar procedures such as control/dataflow analysis, target independent optimizations, instruction selection (binding), scheduling and emission. Unifying these two architectural models can help unify these procedures, and consequently merge the different tools into one. Such unification can greatly reduce the development effort and simplify the user interface of the synthesis tool.
This goal is achieved by deriving an ISA model (m, n, S, I, IE, SC, TC, CA) from the FSMD model (c, U, B, PMF, CA).
In the derived ISA model, the instruction word is viewed as the control word of the FSMD model. Hence m, the size of the instruction word, can be easily computed by summing up the widths of all the control fields, which has the side effect of determining the offset of each control field of the FSMD units [15].
The set of stores S can be simply computed by enumerating all the stores associated with the FSMD units [15].
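The two derivations above, the size of the instruction word (m in Definition 3) and the set of stores S, can be sketched in a few lines. This is a minimal sketch under assumptions: the classes `ControlField` and `Unit`, the packing of fields in declaration order, and the example units are hypothetical and are not taken from [15].

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ControlField:
    name: str
    width: int
    offset: Optional[int] = None  # unknown until the unit is instantiated


@dataclass
class Unit:
    name: str
    fields: List[ControlField]
    store: Optional[str] = None   # name of the associated store, if any


def derive_word_and_stores(units: List[Unit]) -> Tuple[int, List[str]]:
    """Return (m, S): the control-word width and the enumerated stores.

    As a side effect, every control field receives its offset. Fields are
    packed in declaration order, which is an assumption of this sketch.
    """
    offset = 0
    stores: List[str] = []
    for unit in units:
        for fld in unit.fields:
            fld.offset = offset   # offset = sum of the widths seen so far
            offset += fld.width
        if unit.store is not None and unit.store not in stores:
            stores.append(unit.store)
    return offset, stores


alu = Unit("alu", [ControlField("op", 3), ControlField("src_sel", 2)])
regs = Unit("regs", [ControlField("raddr", 4), ControlField("waddr", 4),
                     ControlField("we", 1)], store="RF")
m, S = derive_word_and_stores([alu, regs])
```

With these example units the control word is 14 bits wide and the only store is RF; the write-enable field ends up at offset 13.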
The remaining problem is the derivation of the set of instructions as well as the spatial constraints. The problem can be solved by first deriving the partial instruction sets associated with the value holders, that is, the buses and the stores of the FSMD. Intuitively, a partial instruction stands for a storage-to-storage or storage-to-bus operation. In other words, a partial instruction associated with a value holder has a behavior pattern lhs ← rhs which maintains the following invariant: lhs must be equal to the value holder itself, and each operand of rhs must be a store. The partial instruction is also associated with encoding information as well as its resource usage, which is essentially a set of buses or units. The partial instruction set of each value holder can be derived following a topological order, that is, if value holder a is used in an operation of a unit to compute a value in value holder b, then the partial instruction set of a is computed first [15].
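The topological ordering of value holders can be sketched with Python's standard `graphlib` module. The data path below is hypothetical, and the sketch makes one assumption that the text does not state explicitly: a store appearing as an operand is treated as a terminal (since the operands of rhs must be stores), so only bus operands create ordering dependencies. The actual algorithm is given in [15].

```python
from graphlib import TopologicalSorter
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical data path: two stores and two buses.
STORES = {"RF", "ACC"}

# Each entry is (value holder written, value holders read) for one operation.
OPERATIONS = [
    ("busA", ["RF"]),
    ("busB", ["RF"]),
    ("ACC", ["busA", "busB"]),
    ("RF", ["ACC"]),
]


def derivation_order(operations: Iterable[Tuple[str, List[str]]],
                     stores: Set[str]) -> List[str]:
    """Order value holders so that each one follows the buses it reads."""
    deps: Dict[str, Set[str]] = {}
    for target, sources in operations:
        deps.setdefault(target, set())
        for src in sources:
            if src not in stores:      # store operands are terminals (assumption)
                deps[target].add(src)
    # TopologicalSorter takes a mapping from node to its predecessors.
    return list(TopologicalSorter(deps).static_order())


order = derivation_order(OPERATIONS, STORES)
```

In this example the partial instruction sets of busA and busB are derived before that of ACC, whose right-hand sides can then be expressed purely in terms of stores.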
Definition 12
Each instruction can be mapped to a partial instruction. Given a set of instructions and its mapping to the partial instructions, the spatial constraints of the instruction set can be computed by checking whether there are resource conflicts between any pair of instructions [15].
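The pairwise resource check can be sketched as follows. The instruction names and their resource sets are hypothetical, and representing the spatial constraint as a set of conflicting pairs is an assumption of this sketch rather than the paper's definition.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Set

# Resource usage (buses and units) of the partial instruction that each
# instruction maps to. All names are made up for the example.
RESOURCES: Dict[str, Set[str]] = {
    "add": {"busA", "busB", "alu"},
    "mul": {"busA", "busB", "mult"},
    "load": {"busC", "mem"},
}


def spatial_conflicts(resources: Dict[str, Set[str]]) -> Set[FrozenSet[str]]:
    """Return the unordered instruction pairs that share at least one resource."""
    return {
        frozenset((a, b))
        for a, b in combinations(sorted(resources), 2)
        if resources[a] & resources[b]   # non-empty intersection is a conflict
    }


conflicts = spatial_conflicts(RESOURCES)
```

Here add and mul conflict because both drive busA and busB, while load can be issued in parallel with either of them.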
The instruction set can be derived by enumerating the partial instruction sets associated with the stores of the FSMD. Since the FSMD does not involve pipelined control, it is easy to conclude that the number of pipeline stages n = 1 and TC = 0.
Conclusion
We have presented the formal models for the ISA and FSMD architectures. These formal models can serve as the basis of architecture description for a retargetable compiler environment, and represent the first step towards unifying code generation and behavioral synthesis.
