Abstract-We propose a hardware/software cosimulation environment using an RTL simulator with a software language interface. The proposed simulation environment introduces the "OS interface (OSIF)," which invokes system calls in the OS on the simulation platform to execute application software. The OSIF consists of a data adaptation facility and function correspondence management, which allow it to cooperate with the OS of the simulation platform. We show the results of experiments with an R3000-compatible processor model. This environment verified our processor model with SPEC benchmarks that require various operating system services. For example, with the lisp interpreter program li, our detailed RTL description for the core part of the R3000 was simulated in less than 20 hours on a 109 MIPS workstation.
I. INTRODUCTION
For detailed verification of a micro-processor design, execution of application software with operating system (OS) functions such as system call transactions and accesses to external devices is indispensable. Since designers want to try several types of processor architectures, the verification environment needs to be available as quickly as possible. Thus, simultaneous simulation of hardware and software in processor design is crucial.
There is a trade-off between accuracy and simulation time in the verification environment. When the hardware is modeled in a higher-level description, the software is easily applied to the model and simulation time is short, but the verification may be inaccurate. On the other hand, although the simulation will be accurate when the hardware is modeled in detail, it could be too difficult to execute the application software with the model, or execution could be extremely slow.
In order to solve this problem, in most current design styles software simulation uses hardware models which are different from detailed hardware designs. Application software is executed with a trace-level or pipeline-level simulator. Since such simulation environments treat idealized hardware, the processor performance they report might not be realistic. For development of circuit-level hardware designs, hardware description language (HDL) simulation is used. At this level of simulation, hardware/software coverification with HDL simulation is not easy, because the software must be executed with system calls and accesses to external devices. In addition, verification speed on the HDL simulator is at most 100 patterns per second. Thus, executing application software such as SPEC benchmarks [1] requires a great amount of simulation time for emulating full operating system codes on the HDL processor model.
Several simulation methods have been proposed for efficient verification and quick software development [2, 3, 4, 5, 6]. Since most of these simulation environments are aimed at developing embedded systems, they are unsuitable for developing general micro-processors. For example, the Virtual Processor model [7] divides a processor model into two abstractions. One is a target hardware model to verify, and the other is a trace-level software CPU model. This transformation from HDL to a software CPU model makes it possible to implement a simulator for application software execution. Although this approach is suitable for software development, the hardware model is too simple to detect several types of flaws such as hazards.
In this paper we propose a hardware/software cosimulation environment attained through the use of an RTL simulator with a software language interface. To execute the application software, we introduce the "OS interface (OSIF)," which invokes system calls in the OS on the simulation platform. Designers can omit those parts of HDL descriptions which are dispensable in the early stages of the target processor design. The OSIF consists of proxy functions for the omitted parts of the HDL descriptions. It also has transform functions that translate several hardware-dependent parameters, such as address space, from the target processor into the simulation platform, and vice versa. Whereas the Virtual Processor model abstracts hardware parts, our simulation environment substitutes OSIF functions, including the entire simulation of system calls for application software, for a large part of the hardware model. Thus, our environment achieves high-speed simulation for general purpose processor design.
The simulation environment, on which application programs are executed, consists of the following components:
1. Description of the target processor core written in RTL HDL.
2. Operating system of the simulation platform through the OSIF.
3. Proxy functions that equate to abridged functions on the HDL description.
Since our simulation environment combines detailed simulation for the core part of the processor with rapid simulation using proxy software, it greatly decreases simulation time.
Experimental results obtained with an R3000-compatible processor model are presented to demonstrate the validity of this approach. This environment successfully verifies the effectiveness of our processor model in executing SPEC benchmark programs which require various operating system services. As an example, with the lisp interpreter program li, our detailed RTL description for the core part of the R3000 was simulated in less than 20 hours on a 109 MIPS workstation. Since a detailed RTL description for the whole R3000 takes weeks to simulate, our simulation environment greatly shortens verification of the target processor core in the early design stages.
II. BACKGROUND TO THE HARDWARE/SOFTWARE CO-SIMULATION WITH AN HDL SIMULATOR
In this section, we describe the difficulties in executing application programs on the HDL simulator.
A. Executing an application program with a general HDL simulator
In the circuit design stage, a processor model is written in a hardware description language (HDL) such as Verilog or VHDL.
However, there are two main problems in simulating a new processor with an HDL simulator which does not specialize in processor modeling:
It takes several weeks merely to boot up the operating system.
Manipulation of external devices: The time scale of an HDL simulator is about one million times slower than the speed of the external devices.
B. System-level behavior verification
Processor performance must be evaluated not only through the behavior of the processor itself but also through system-wide behavior.
In a conventional simulation environment, system-level behavior is evaluated and verified by the HDL processor model and bus-level simulation models of external devices, because the simulator uses only an HDL description.
1. In general, bus models of the external devices are provided for existing processor families.
2. The scope of bus-level architecture is limited to that of existing external device models.
3. These external devices will be outdated by the time the processor is put on the market.
For these reasons, external devices must be developed at the same time as the processor in a conventional development sequence.
C. Process of debugging system programs with the HDL processor model
Since a new processor's system software is modified from other software, the core of the debugging process is repeated execution around the updated code, even if the processor being designed has wholly new specifications. Therefore, the two main problems of the HDL simulation environment (cf. Section II-A) also have a very serious effect on the software debugging process. Furthermore, state control is a characteristic problem of the debugging process:
The RTL processor model is too strict and complicated to set the status safely through the debugger.

III. THE HARDWARE/SOFTWARE CO-SIMULATION METHOD
In this section, we describe the design flow of a processor model through the use of a hardware/software cosimulation environment.
A. The process of gradually designing the processor description
We propose a co-simulation environment which manages partially implemented HDL processor models.
The missing regions of these HDL processor models are supplemented by software-based proxy functions (S/W-PF). Therefore, our simulation environment can handle both behavior-level HDL descriptions and detailed gate-level HDL descriptions. In the early stage of processor design, designers investigate many kinds of architectures which are partially written in HDL. In this stage, other areas of the processor model such as the FPU or bus interfaces are substituted by S/W-PFs (Fig. 2(a)).
As the processor design progresses, the HDL description is enlarged and becomes increasingly detailed (Fig. 2(b)). Finally, the fixed area of the processor design is replaced with a high-speed simulation tool (Fig. 2(c)). The functional capacity of S/W-PFs ranges from the scale of an H/W functional block to that of a software procedure which includes the execution of system calls.
There are several ways to incorporate S/W-PFs in the system. As an example, these functions can be connected through the TEXTIO feature of the hardware description language. In our environment, the HDL simulator's "software language interface" is adopted because of its speed and simplicity of installation.
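The substitution of S/W-PFs for blocks that are not yet written in HDL can be illustrated with a minimal sketch. It is not the implementation used in this environment: the names `ProcessorModel`, `attach`, and `fpu_proxy` are hypothetical, and a plain Python callable stands in for a function reached through the simulator's software language interface.

```python
from typing import Callable, Dict

# Hypothetical unit signature: (operation name, operand a, operand b) -> result.
UnitFn = Callable[[str, float, float], float]


def fpu_proxy(op: str, a: float, b: float) -> float:
    """Software proxy standing in for an FPU block not yet written in HDL."""
    if op == "add":
        return a + b
    if op == "mul":
        return a * b
    raise ValueError(f"unsupported FPU operation: {op}")


class ProcessorModel:
    """Toy stand-in for a partially implemented processor description."""

    def __init__(self) -> None:
        self.units: Dict[str, UnitFn] = {}

    def attach(self, name: str, fn: UnitFn) -> None:
        # Attaching under an existing name replaces the earlier proxy, as when
        # a detailed description of the block becomes available later on.
        self.units[name] = fn

    def execute(self, unit: str, op: str, a: float, b: float) -> float:
        return self.units[unit](op, a, b)


model = ProcessorModel()
model.attach("fpu", fpu_proxy)
result = model.execute("fpu", "mul", 1.5, 4.0)
```

Because the rest of the model only calls `execute`, a proxy can later be swapped for a more detailed unit without changing the callers, which is the property the gradual design flow relies on.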
C. Application programs executed by processor model
The object code executed with the processor model is almost the same as the object code for the final product of the processor.
If a compiler for the processor is completed, we can use it to build the object code.
Our simulation environment supports the cooperative and simultaneous development of the compiler and the processor.
IV. IMPLEMENTATION OF THE SIMULATION ENVIRONMENT INCLUDING OPERATING SYSTEM EXECUTION
We describe the implementation of an "OS interface (OSIF)" as an S/W-PF which executes the "system calls" of the UNIX operating system in their entirety. Our environment enables trace data of the application program execution to be obtained because the OSIF solves the two major simulation problems described in Section II-A.
As seen in Fig. 3, the rough structure of our environment comprises an H/W-S/W interface (HSIF) within the processor descriptions and the OSIF within the HDL simulator. These two interfaces are described in detail in the following sections.
A.2 Instruction decoding
The simulator executes a "system call" triggered by the "syscall" instruction which is included in the target application program.
When these syscall instructions are decoded at the ID stage of the processor description, the HSIF calls the OSIF during the MEM stage (Fig. 3).
In a real processor, instruction streams are flushed after the end of a system call, because the processor executes a "return from exception" instruction. The HSIF simulates this process by executing a "jump to the next instruction after syscall" instruction.
B. OS interface
When the OSIF is called by the HSIF, the OSIF invokes system calls of the design platform as a replacement for the simulation of the OS by the HDL simulator. The OSIF has the following functions which enable it to call the platform's OS.
B.2 Data conversion
The address of memory objects which are created by the simulator is assigned by the platform's OS, so the address spaces of these objects are different from the address space of the processor model. Therefore, our environment makes address conversion between these two environments possible (see Fig. 6).
Fig. 6. Address conversion
Data conversion functions are given on the OSIF, except when using a type-convert function provided by the simulator, such as "std_logic to integer."
At this time, only two items need translating, because the data structure of the simulation platform is similar to those of the processor models. These items are:
Additional translation, such as byte endian, can be defined according to the architecture of the processor model.
C. Application programs executed by the processor model
Application programs which are executed by the processor model are made by the compiler for the new processor, except for the trigger function of the system call which uses the syscall instruction.
In our environment, the processor model simulates the execution of user-level codes, which include functions of the C-language standard library. Since these user-level codes are independent of the system-level architecture, such as the access method of the external I/O devices, the new processor's compiler is checked and debugged simultaneously from the early stage of processor design.
The trigger function, which includes syscall instructions, is replaced with an exclusive library in order to notify the HSIF of the point of the system call. Using this design, we applied the OS interface to the simulation of benchmark programs in SPECint.
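The system call path described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' HSIF or OSIF code: the call number `SYS_GETPID`, the use of register 2 for both the call number and the result, and the names `CpuState`, `OSIF_TABLE`, and `hsif_handle_syscall` are all hypothetical. The 4-byte step reflects the fixed 32-bit instruction width of an R3000-class processor.

```python
import os
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical system call number; a real target ABI defines its own table.
SYS_GETPID = 1


@dataclass
class CpuState:
    pc: int = 0
    regs: List[int] = field(default_factory=lambda: [0] * 32)


# OSIF side: map target system call numbers to services of the host OS.
OSIF_TABLE: Dict[int, Callable[[CpuState], int]] = {
    SYS_GETPID: lambda cpu: os.getpid(),
}


def hsif_handle_syscall(cpu: CpuState) -> None:
    """Invoked when a decoded syscall instruction reaches the MEM stage."""
    number = cpu.regs[2]                 # assumed: call number held in register 2
    handler = OSIF_TABLE[number]         # KeyError means no proxy is registered
    cpu.regs[2] = handler(cpu) & 0xFFFFFFFF  # assumed: 32-bit result in register 2
    cpu.pc += 4                          # continue at the instruction after syscall


cpu = CpuState(pc=0x00400000)
cpu.regs[2] = SYS_GETPID
hsif_handle_syscall(cpu)
```

The processor model never executes kernel code in this sketch: the host OS performs the service, and the model resumes at the instruction following the syscall.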
B. Verification flow
System calls used in target application programs are executed by the OS of the simulation platform through the OS interface. Other instructions including library functions are executed by a processor model which is verified by the HDL simulator. Fig. 8 shows the execution flow of a function "printf" as a typical pattern of the library function's calling sequence. In real systems, the printf library function analyzes its arguments and constructs an output string at the user level, and then outputs the composed string with a "write" system call.
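A minimal sketch of how a "write" request could be served by the host OS after address conversion is given below. It rests on assumptions that are not taken from the paper: the base address `TARGET_BASE`, the 4096-byte `pool`, and the function names `to_pool_offset` and `osif_write` are hypothetical, and a pipe stands in for the screen so that the output can be inspected.

```python
import os

# Hypothetical layout: the target's data region starts at TARGET_BASE and is
# backed by a byte pool that lives in the simulator's own address space.
TARGET_BASE = 0x10000000
pool = bytearray(4096)


def to_pool_offset(target_addr: int, length: int) -> int:
    """Convert a target address range into an offset in the host-side pool."""
    offset = target_addr - TARGET_BASE
    if offset < 0 or offset + length > len(pool):
        raise ValueError(f"target address 0x{target_addr:08x} is outside the pool")
    return offset


def osif_write(host_fd: int, target_addr: int, length: int) -> int:
    """Proxy for the target's write system call, served by the host OS."""
    offset = to_pool_offset(target_addr, length)
    return os.write(host_fd, bytes(pool[offset:offset + length]))


# Usage: the string composed by printf at user level sits in target memory.
message = b"hello\n"
pool[0x20:0x20 + len(message)] = message
read_fd, write_fd = os.pipe()
written = osif_write(write_fd, TARGET_BASE + 0x20, len(message))
os.close(write_fd)
captured = os.read(read_fd, 64)
os.close(read_fd)
```

Only the buffer address needs converting here; the file descriptor and the length pass through unchanged, which matches the observation that few items need translation when the two environments have similar data structures.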
Also, in our simulation environment, the user-level instructions of printf are executed in the HDL processor model, and therefore their trace patterns are produced by the HDL simulator. Data objects which need translation (cf. Section IV-B.2) are interpreted one by one in the simulation and stored into the OSIF's memory pool. The write system call used in printf is executed by the simulation platform machine's OS and displays the string on the screen of the simulation environment.
Fig. 9 shows the simulation process of an "mmap" system call. The mmap system call establishes a mapping between a process's address space and an object, such as the contents of a regular file or shared memory pages. When the application program accesses the mapped region, the kernel interrupts and delivers the contents of the object to the application program. In the simulation environment, the simulation platform OS also interrupts when the target application program accesses the mapped region. However, the actual access to the memory is executed by the OSIF. The OSIF runs under the simulator's address space, and the processor model is set to a safe status by the HSIF before it calls the OSIF. Thus, no contradiction is generated in spite of the unexpected interruption.
Thus, with the OSIF, we achieve a system-level simulation environment which includes operating system functions and manipulates facilities of external devices. In addition, our simulation environment establishes detailed simulation for the core part of the processor in conjunction with rapid simulation by proxy software, and therefore a great decrease in the simulation time is achieved.
Experimental results obtained with an R3000-compatible processor model were presented. Our simulation environment successfully verifies the effectiveness of our processor model in executing SPEC benchmark programs which require various operating system services. As an example, with the lisp interpreter program li, our detailed RTL description for the core part of the R3000 was simulated in less than 20 hours on a 109 MIPS workstation.
