architectures in future embedded-system designs will bring many advantages in terms of performance, cost, reliability, power consumption, and system size. However, to fully benefit from those advantages, designers must fine-tune the SOC architecture to suit application-specific characteristics and requirements. An application-specific multiprocessor SOC architecture (ASMSA) constitutes an ideal hardware platform since, in theory, it can be configured to fit the application's needs exactly. Such architectures allow many customizations. One of the most important design decisions is the topology and protocols chosen for communication between processors, memories, and peripherals.
shows a possible ASMSA design flow. The Colif intermediate design model makes this flow independent of the input language. Of course, we must provide translators to and from the selected specification and simulation languages. The design flow starts with a heterogeneous specification and uses a gradual refinement process to generate an ASMSA implementation. The heterogeneous specification can mix different abstraction levels and design languages. The design methodology must include rules that enable the derivation of an executable model from the heterogeneous specification. This executable model might span multiple design environments because the heterogeneous specification can use multiple design languages and different abstraction levels. In that case, the designer must validate the specification using the concurrent execution of multiple simulators, that is, using a cosimulation model.
The goal of this design flow is to generate an ASMSA description ready for low-level hardware synthesis and software compilation. Thus, the hardware and software interface-synthesis tools must share some modeling information. Moreover, the same design structure has different interpretations as implemented in software or hardware.
One advantage of using a unique intermediate model is that all the tools read and write to the same model as different aspects of the design are refined. In addition, this approach increases the design flow's flexibility and modularity. It is an effective design flow for ASMSA because it lets designers use a divide-and-conquer approach for complex designs and focus on the important customizations. This flow helps designers make progressive refinements. The ability to mix abstraction levels in Colif lets them develop parts of the design at different paces or reuse legacy blocks described at low abstraction levels. Simulation of the generated executable model provides feedback.
Multilevel representation for ASMSA design
Colif's key aspects are the concepts it uses to realize a modular system specification, its object model, and its semantics at four abstraction levels of communication refinement.
Modular system specification
Colif uses three basic concepts to specify a modular system: module, port, and net. Colif represents a system as an ensemble of hierarchical modules interconnected by a communication network composed of hierarchical nets and ports. This syntactical representation does not depend on the abstraction level used to describe the system. Consequently, Colif can use a uniform syntax to represent heterogeneous systems, that is, those described at multiple abstraction levels or in different specification languages. As Figure 2 shows, each module is defined by its interface, which consists of a set of ports, and its content. The module content is a netlist of other module instances or a composition of tasks. Modules are represented as white boxes if they contain instances, as black boxes if their content is not known or irrelevant, and as labeled circles if the modules are leaves in the hierarchy. A leaf module contains a known behavior; for simplicity, we call leaf modules tasks.
Colif object model
The Colif object model has two parts: declarative (declarations) and instantiative (instances). The declarative part represents objects that define a reusable template with generic property definitions (PARAM_DEF) and default values for some properties (PARAMETER). The instantiative part represents objects used to build a tree of instances.
As a general principle, each basic concept in Colif (MODULE, PORT, and NET) is split into two parts: an interface (ENTITY) and the content (CONTENT). ENTITY has a TYPE that holds user-definable properties. CONTENT holds a reference to an internally or externally described behavior and/or a list of DECL objects. The latter case implies that the CONTENT object is hierarchical and that it has objects situated at a lower hierarchical level. An INSTANCE is a specialization of an object's definition; it inherits all object properties, but each INSTANCE can have its own set of specific properties. An INSTANCE holds only a reference to the reused object and an independent, user-definable name. Colif classes are polymorphic in that their semantics change according to the abstraction level considered.
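The split between declarations and instances can be illustrated with a minimal Python sketch. The names used here (`ModuleDecl`, `ModuleInstance`, `param_defaults`, `overrides`, `resolve`) are hypothetical and are not part of Colif; the sketch only assumes that an instance holds a reference to a reused declaration, inherits its default properties, and may carry its own specific properties.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ModuleDecl:
    """Reusable template: an interface (port names) plus default property values."""
    name: str
    ports: list[str] = field(default_factory=list)
    param_defaults: dict[str, object] = field(default_factory=dict)


@dataclass
class ModuleInstance:
    """Instance: its own name, a reference to the declaration, and local overrides."""
    name: str
    decl: ModuleDecl
    overrides: dict[str, object] = field(default_factory=dict)

    def resolve(self, key: str) -> object:
        # Instance-specific properties win; otherwise use the template default.
        if key in self.overrides:
            return self.overrides[key]
        return self.decl.param_defaults[key]


# Hypothetical example: two instances reuse one declaration.
fifo = ModuleDecl("fifo", ports=["in", "out"], param_defaults={"depth": 16, "width": 8})
small = ModuleInstance("fifo_a", fifo)
wide = ModuleInstance("fifo_b", fifo, overrides={"width": 32})
```

Both instances share the same declaration object, so a change to the template's defaults is visible through every instance that does not override the property.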
The MODULE class is a template for defining structure or storing behavioral descriptions. When defining structure, MODULE_CONTENT holds module and net declarations. If the MODULE is a leaf in the structural hierarchy, its content holds a list of behaviors, that is, algorithmic descriptions of the module's functionality written in high-level languages or intermediate formats.
The PORT class defines the interface between components inside the module and the external nets.
To allow independent refinement of a module's internal structure from the external world and truly encapsulate the module's content, the ports are classified into two categories: internal (specific to the module) and external (specific to the communication channel). Thus, modules and channels described at different abstraction levels can be mixed within the same description. This fundamental concept allows the separation of behavior and communication. For instance, suppose a module is described at a given abstraction level, and the communication network is described at another. In that case, a hierarchical PORT_INSTANCE class composed of internal and external ports acts as an adapter between the content and the external environment.
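As a rough illustration of a hierarchical port acting as an adapter, the following Python sketch lets the internal side of a port work with integers while the external side carries fixed-width bit strings. The class `HierarchicalPort` and its members are hypothetical names chosen for this sketch; the conversion shown is only one assumed example of what such an adapter might do.

```python
class HierarchicalPort:
    """Adapter between a module's internal view and the external net's view."""

    def __init__(self, width: int):
        self.width = width
        # Values as the external net would see them (bit strings of fixed width).
        self.external_wire: list[str] = []

    def internal_send(self, value: int) -> None:
        # Internal port: the module content sends a plain integer.
        if not 0 <= value < (1 << self.width):
            raise ValueError(f"{value} does not fit in {self.width} bits")
        # External port: the net receives a fixed-width bit-vector encoding.
        self.external_wire.append(format(value, f"0{self.width}b"))


port = HierarchicalPort(width=8)
port.internal_send(5)
```

The module content never sees the bit-level encoding, so the net can be refined to another abstraction level by changing only the adapter.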
The NET class represents the communication media between a set of different port instances, and its behavior can be described at multiple abstraction levels. The net concept plays an essential role in Colif. Depending on the abstraction level used, a net can hide a very complex behavior. As explained in the next section, Colif uses a generalized net concept.
Colif's flexibility resides mainly in the hierarchical structure of modules and the generalized port and net. Flexibility is by far the most important feature for the design of distributed multiprocessor SOCs and was one of our main goals in Colif's design. Of course, for this flexibility, we pay the price of a lack of formal analysis, which requires well-defined execution semantics. Our choice is justified by the nature of the problem we want to solve: the synthesis and design of complex heterogeneous embedded systems using multiprocessor SOC architectures. Usually, designers decompose the specification of such a system into multiple parts that they design in a modular flow. In that case, they can apply well-defined execution semantics and formal analysis to each subsystem and use Colif to model the entire SOC architecture.
Application-Specific SOC Multiprocessors

IEEE Design & Test of Computers
Table 1 summarizes the semantics associated with Colif's communication channels, or generalized nets, at the various abstraction levels. The service, message, and driver levels collectively constitute what is usually called the system level. Three features characterize communication channels: medium, data type, and behavior. The medium is the infrastructure for carrying data of a certain type while performing actions that transform this data. The behavior is the transformations performed on the data.
The service level is the highest abstraction level with specified communication semantics. At this level, the design representation uses a combination of requests and services, and tasks can request services from one another. This model completely abstracts out the underlying protocols, connection topologies, and essential timing issues. The service level can support several time models based on the concurrency structure and local time capabilities of the tasks themselves. The Common Object Request Broker Architecture (Corba) [4] is a good example of the request-service model in the software domain. In Corba, programs or libraries register their services through descriptions in an interface definition language, and one or more object request brokers route communication between a mutual request-service pair.
At the message level, hierarchical modules communicate through active channels that encapsulate all low-level protocol details.
Active channels can abstract data and/or protocol conversions. Access ports provide methods such as send or receive of data; thus, tasks can use a high-level interface to access the active channels. Data can be generic, and its size is not necessarily predetermined, so communication time is nonzero and nondeterministic. The message level is a simple model, but changing the underlying semantics and channel behavior enables it to describe diverse communication schemes. Refining active channels into logical interconnections normally requires that a module describing the channel behavior act as a communication controller.
An example of a message-level design model is the Specification and Description Language (SDL) [5]. SDL uses channels with the basic send or receive primitives and infinite queuing. Some projects have used this level for functional specification for synthesis, but they impose restrictions on the specification and communication [6]. The popular remote procedure call (RPC) technique is often used to implement this level of communication [7].
A driver-level design uses logical connections that exchange fixed, enumerated data types (such as integers and real numbers). The basic channel behavior is logical transmission according to a fixed protocol. Communication time is nonzero but predictable because the data size and structure and the transmission protocol are well known. Typical driver-level communication abstractions are master-slave buses and rendezvous or point-to-point communications based on first-in, first-out (FIFO) protocols.
At the register-transfer (RT) level, a design represents the system as a set of interconnected modules that communicate through physical wires and exchange data with fixed bit-vector representations. The basic protocol is physical transmission of values on physical ports, and the only transformation performed is a resolution function to handle transmission conflicts (for example, bus OR/AND). Designers must describe address decoding and interrupt management in detail. Setting values on physical wires produces an immediate reaction with respect to values and time.
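The resolution function mentioned for the RT level can be sketched in Python for the bus OR/AND case. The function names `resolve_wired_or` and `resolve_wired_and` and the use of integers as bit vectors are assumptions made for this illustration only.

```python
from functools import reduce


def resolve_wired_or(drivers, width):
    """Combine all values driven onto the bus with a bitwise OR."""
    mask = (1 << width) - 1
    # With no driver, a wired-OR bus is assumed to read as all zeros.
    return reduce(lambda a, b: a | b, drivers, 0) & mask


def resolve_wired_and(drivers, width):
    """Combine all values driven onto the bus with a bitwise AND."""
    mask = (1 << width) - 1
    # With no driver, a wired-AND bus is assumed to read as all ones.
    return reduce(lambda a, b: a & b, drivers, mask) & mask


# Two modules drive a 4-bit bus in the same cycle.
or_value = resolve_wired_or([0b0011, 0b0101], width=4)
and_value = resolve_wired_and([0b0011, 0b0101], width=4)
```

In this sketch the resolved value is the only transformation applied to the data, which matches the description of RT-level nets as plain physical wires.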
Most commercial synthesis tools are still at the RT abstraction level. They offer sophisticated ways of describing the inner content of processes. The communication model, however, is still bound by restrictions and a lack of abstractions.
Design modeling with Colif
The following example illustrates the use of Colif at different abstraction levels. The application consists of four tasks passing tokens (an abstract data structure) to their neighbors, and a counter task. Two tokens circulate in opposite directions among the four tasks, forming a ring. Each task's behavior can be described in different high-level languages, such as C/C++, SDL, or VHDL. The specification starts at the highest abstraction level, the service level, and is refined until it attains the RT level. The design flow targets heterogeneous multiprocessor architectures, that is, those comprising multiple processor types such as microprocessor units, microcontroller units, and digital signal processors; third-party intellectual-property (IP) components; on-chip memories; and hardware accelerators such as application-specific ICs and field-programmable gate array blocks. The main challenge is to design an efficient communication mechanism among these heterogeneous hardware blocks.
As we move to lower abstraction levels, we can perform many possible refinements. These refinements fall into three categories: structural, behavioral, and communication. Structural refinements modify the design's hierarchical structure. Hardware-software partitioning, allocation, and binding belong to this category.
Behavioral refinements modify task descriptions. For instance, I/O primitives can be refined to comply with interface constraints, and task contents can be modified to comply with the model semantics at different abstraction levels. For instance, when moving from the driver to the RT level, we must schedule computations executed on hardware blocks into clock cycles and adapt the software code to the processor that will execute it. This may include adding specific system calls to the embedded operating system. Finally, communication refinements modify the topology of the communication infrastructure composed of ports and nets.
Service-level design model
The token ring's service-level specification, shown in Figure 4, uses an abstract network with dynamic routing between service-request pairs. The network is accessed by outgoing request ports and connects those requests to service ports. The network keeps information about the services offered by each task, the type of parameters required, and the routing paths. In Figure 4, service port P5 indicates to the network that task T5 offers a service called Count, and port P2 indicates that task T2 offers a GetToken2 service that takes an object of abstract type tokendir as a parameter.
Tasks ask for a service by its name, using the available request ports. The pseudocode for T2_token_task() shows calls to request methods of request port P2 to transmit the token to task T3 when task T2 has the token that circulates clockwise. The pseudocode for T5.service(Count) shows that an internal counter is incremented every time the Count service is called. The pseudocode for T2.service(GetToken2, tokendir) shows that internal flag dtoken is set every time the GetToken2 service is called with a direct parameter. These tasks loop forever, but they relinquish control when they execute wait() and when they call a port I/O method. Different service-resolution policies can be implemented and stored in an abstract-network behavior library.
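The request-service behavior described above might be sketched in Python as follows. This is not the pseudocode from Figure 4: the classes `AbstractNetwork`, `T5Counter`, and `T2Token`, the handler names, and the string value "direct" used for the tokendir parameter are all assumptions made for illustration.

```python
class AbstractNetwork:
    """Keeps a table of offered services and routes requests by service name."""

    def __init__(self):
        self._services = {}

    def register(self, name, handler):
        # A service port announces that a task offers the named service.
        self._services[name] = handler

    def request(self, name, *args):
        # A request port asks for a service by name; routing is resolved here.
        if name not in self._services:
            raise LookupError(f"no task offers service {name!r}")
        return self._services[name](*args)


class T5Counter:
    def __init__(self, net):
        self.count = 0
        net.register("Count", self.on_count)

    def on_count(self):
        # The internal counter is incremented on every Count request.
        self.count += 1


class T2Token:
    def __init__(self, net):
        self.dtoken = False
        net.register("GetToken2", self.on_get_token)

    def on_get_token(self, tokendir):
        # The flag is set when the service is called with the "direct" value.
        if tokendir == "direct":
            self.dtoken = True


net = AbstractNetwork()
t5 = T5Counter(net)
t2 = T2Token(net)
net.request("GetToken2", "direct")
net.request("Count")
```

A different service-resolution policy would only change `AbstractNetwork.request`, leaving the tasks untouched.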
The objective of communication refinement at the service level is to replace the abstract network by an explicit network composed of channels. Furthermore, designers can determine the kind of high-level communication protocol (for example, infinite FIFO, broadcast, or mailbox) associated with each channel.
Message-level design model
Figure 5 shows the token ring described at the message level, where communication flows through point-to-point active channels between tasks. Tasks T1 and T2 go into the same module, as do T3 and T4. Ports provide high-level send and receive procedure calls to access the network. Active channels can process protocol conversions on abstract data types. In the figure, T5_counter_task() calls a receive method on its port P1 to read a Boolean value; if this value is true, the internal counter is incremented. T2_token_task() calls the send method of its port P2 to transmit a Boolean true value to task T5 when it receives a new token from T1 or T3 on its port P1. Net behaviors are described in an external library for any high-level protocol used in active channels.
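The send and receive interface of an active channel can be sketched in Python, using the counter task as an example. The class `ActiveChannel` and the function `t5_counter_step` are hypothetical names; the sketch assumes unbounded FIFO buffering and returns `None` on an empty channel, whereas a real model would suspend the task instead.

```python
from collections import deque


class ActiveChannel:
    """Point-to-point channel that hides all protocol details behind send/receive."""

    def __init__(self):
        self._queue = deque()  # unbounded buffering, assumed for this sketch

    def send(self, value):
        self._queue.append(value)

    def receive(self):
        # Returns None when nothing is pending (a simplification of blocking).
        return self._queue.popleft() if self._queue else None


def t5_counter_step(port, counter):
    """One iteration of the counter task: read a Boolean, count it if true."""
    value = port.receive()
    return counter + 1 if value is True else counter


channel = ActiveChannel()
channel.send(True)   # T2 signals that it received a new token
channel.send(False)
counter = 0
counter = t5_counter_step(channel, counter)
counter = t5_counter_step(channel, counter)
```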
When refining communication from the message to the driver level, designers must implement protocols. Consequently, they refine abstract channels to a specific implementation, fixing most of the details such as FIFO size, blocking or nonblocking read and write, and so forth.
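The kind of detail fixed during this refinement can be illustrated with a bounded FIFO built on Python's standard `queue` module. The class `FifoChannel` and its parameters `depth` and `blocking` are assumptions made for this sketch, not names from Colif.

```python
import queue


class FifoChannel:
    """FIFO channel whose size and blocking behavior are fixed at construction."""

    def __init__(self, depth, blocking):
        self._q = queue.Queue(maxsize=depth)
        self._blocking = blocking

    def write(self, value):
        # Nonblocking mode reports failure instead of waiting for free space.
        try:
            self._q.put(value, block=self._blocking)
            return True
        except queue.Full:
            return False

    def read(self):
        # Nonblocking mode returns None instead of waiting for data.
        try:
            return self._q.get(block=self._blocking)
        except queue.Empty:
            return None


fifo = FifoChannel(depth=2, blocking=False)
results = [fifo.write(1), fifo.write(2), fifo.write(3)]
```

With `blocking=True`, the same calls would wait until space or data becomes available, which is the other option a designer might choose at this step.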
Driver-level design model
At the driver level, each module and task at the highest hierarchical level is mapped onto an embedded processor or hardware block in the final architecture. In Figure 6, for example, Module 1 was mapped to a software module, so tasks T1 and T2 go on the same processor, as do tasks T3 and T4. T1_token_task() in Figure 6, for instance, gets a token from T4 using a FIFO protocol and sends this token to T2_token_task() using a handshake protocol over the net connected to its port P1. T5_counter_task() gets the values from its ports without any specific high-level protocol, as if they were hardware ports.
If the design contains software modules that must be executed by a specific CPU, the communication refinement may include the use of an operating system to manage I/O and communication between parallel tasks.
RT-level design model
Figure 7 shows the RT-level architecture, which maps Module 1 onto an MC68000 processor running an embedded operating system. Module 2 was mapped to an ARM7 processor, and task T5 is implemented by a third-party counter component (IP). Hardware interfaces are introduced between the embedded processors and the IP to implement point-to-point interconnections. Software tasks execute I/O operations through operating system calls. In Figure 7, T1_token_task() calls the P3_IN_FIFO_16() system routine to get a token from task T4 using a FIFO protocol and calls P1_OUT_HS_24() to pass the token to T2 using a handshake protocol. This is a cycle-accurate model; that is, nets represent physical buses transmitting bit vectors of known sizes. In this example, the last number in the system-routine names indicates the size of the bit vector for their parameters, and the first parameter is the I/O address associated with each task port in the host processor.
The RT-level model produced by this design flow is ready for low-level synthesis. Designers can compile the software tasks and the synthesized embedded operating system using processor-specific software-development environments. They can use hardware synthesis tools to generate hardware interfaces and blocks from the synthesizable description.
Implementation details
Colif's permanent storage format is the Extensible Markup Language (XML) [8]. XML facilitates tool integration and design data exchange between synthesis environments. Furthermore, XML files are stored as simple text files, which can be read and manipulated with simple text editors. We could have used XML and its associated data structures directly to implement Colif, but, for efficiency, we decided to write a specialized parser and develop in-memory data structures to handle very large designs. Figure 8a shows a Colif file extract. Line 2 shows that Colif uses the Middle-ML grammar (see Figure 8b), an XML dialect we created, with which users can describe arbitrarily complex object-data models. Line 5 starts the definition of a new type called MODULE and states that it is structured as a set of named fields. Line 9 says that there is a field named entity, line 10 says that this field is a reference to another data structure, and line 11 says that the referred structure's type is MODULE_ENTITY.
We are working on an extension to SystemC, called Vadel (Virtual Architecture Description Language), which incorporates all the Colif concepts. We have developed a tool that translates a Vadel specification annotated with synthesis parameters into Colif. Another tool adds abstraction-level and protocol adapters to the Colif model and generates a cosimulation model [9]. Other tools automatically generate the hardware interfaces between processors [10] and
