Abstract-This paper proposes a system-level SOC verification method based on a hardware accelerator. A memory mapping is designed around the storage characteristics of the accelerator, and the whole testbench is placed directly in the accelerator, which gives a quick migration to the accelerator and an optimized memory area. To make the verification comprehensive, complex and realistic, the system feature description is extracted from different application scenarios, and the test cases are derived from the paths of data flow and control flow. In addition, system debugging is simplified by controlling the accelerated run with trigger-driven state transition diagrams. The RTL design of a DSP chip is verified with the proposed method, and the experimental results show an immense acceleration effect and high accuracy in locating bugs. Finally, the DSP chip is implemented in a 0.18 µm CMOS process and works properly.
I. INTRODUCTION
Verification is essential throughout the IC design flow. As IC technology develops and market demand changes, SOCs keep growing in scale and complexity, so the complexity and workload of verification grow exponentially. SOC system-level simulation and verification have therefore become a bottleneck for project progress, accounting for about 70% to 80% of the entire chip development cycle [1]-[3]. Detecting design errors quickly, ensuring the correctness of the design and shortening the development cycle are thus essential to reduce development cost and time to market.
For SOC verification, functional verification is both the most important and the most time-consuming task. For large-scale, complex SOCs in particular, system-level functional verification is a challenge. Methods based on software simulation and on FPGA prototyping cannot simultaneously satisfy the flexibility and the speed that large-scale design verification requires [4]. The design under verification in this paper is a DSP used for digital signal processing and system control. Based on a verification platform built on Mentor Graphics Veloce Quattro and on its storage characteristics, this paper achieves a rapid migration of the verification environment to the accelerator and optimizes the mapped area of the memory models while meeting the timing requirements of the design. It also accelerates system-level verification, with an immense acceleration effect, using the critical data-flow and control-flow paths of different SOC application scenarios. In the debugging phase, flexible triggers are proposed to control system debugging and to improve the ability to locate and correct errors quickly.
The paper is structured as follows. Section II describes the structure of the SOC system. Section III gives the structure of the hardware accelerator platform. Section IV presents the SOC system-level verification method based on the hardware accelerator, and Section V presents the final experimental results. Section VI concludes the paper.
II. SOC SYSTEM ARCHITECTURE
The SOC system architecture is shown in Fig. 1. The chip is a single-chip processor for digital signal processing and system control. Its functional modules include a 16-bit reduced instruction set processor (RISC CPU), data memory and its management controller (Data Memory, DMA Control), a clock control module (RTC, Timer), a watchdog module (Watchdog), an interrupt management module (INTC), and a peripheral interface module (UART0, UART1, SPI0, SPI1, GPIO, IIC). Data is exchanged among the functional modules over the data bus, which consists of a low-speed APB bus, a local high-speed AXI bus and a bridge that converts between the two bus protocols. The main application fields of the chip are digital signal processing and system control in medical electronics, automotive electronics and structural monitoring. Its advantages are an efficient custom instruction set, high integration, low cost and low power consumption. The chip and the test circuit board are shown in Fig. 2.
III. HARDWARE ACCELERATOR PLATFORM
The hardware accelerator used in this paper is the Veloce Quattro platform from Mentor Graphics [5]. It consists of software and hardware. The software is mainly responsible for managing and maintaining the hardware and for communicating with the hardware acceleration board. The hardware is mainly composed of a dedicated FPGA array. The maximum capacity of a single board is 8,000,000 gates. Its performance parameters are given in Table I. The PC terminal is the user's interface to the system and is used to control the emulation process of the accelerator.
The PC terminal can communicate with the Linux workstation in two ways: BATCH mode and GUI mode. In both modes the PC terminal sends control commands to the Linux workstation and receives information from it. The Linux workstation controls the entire emulation process by interacting with Veloce through a high-speed cable. In the compile phase, it compiles the RTL code to generate netlist files and configuration files that the internal logic of Veloce can recognize. In the emulation phase, it downloads the test vectors, memory contents and so on to the accelerator.
IV. THE SYSTEM LEVEL VERIFICATION OF SOC
A. The Partition of Software and Hardware for SOC Verification
The emulation time of a hardware accelerator is mainly determined by three parts: 1) the time $T_h$ required by the part running in the hardware emulator, which depends on its running speed; 2) the time $T_s$ required by the part running in the software simulator, which depends on its running speed; 3) the time $T_t$ for signal and data transmission between the software simulator and the hardware emulator. The total time $T$ of hardware accelerator emulation is therefore:
$$T = T_h + T_s + T_t$$
The hardware accelerator board is based on an FPGA array, and its running speed ranges from tens of kHz to several MHz [6], [7]. Compared with $T$, the magnitude of $T_h$ is very small and negligible, so the emulation time mainly depends on $T_s$ and $T_t$. It is noted in [8] that the time spent on interaction between the software and the hardware of the accelerator platform has a great influence on the whole system emulation time. The same conclusion is reached in [9], [10]: the communication overhead between the software simulator and the hardware emulator is becoming a critical bottleneck. It is therefore necessary to reduce the data interaction between software and hardware as far as possible in order to shorten the emulation time. If the software tasks can be reduced, or transferred to the hardware, which runs much faster, both the software simulation time $T_s$ and the transmission time $T_t$ are reduced, and so the total emulation time $T$ decreases.
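The time model above (total time as the sum of hardware time, software time and transfer time) can be illustrated with a minimal Python sketch. The function names and all numeric values are assumptions chosen for illustration; they are not measurements from this paper.

```python
def total_emulation_time(t_h, t_s, t_t):
    """Total emulation time T = T_h + T_s + T_t (all values in seconds)."""
    return t_h + t_s + t_t


def speedup(time_before, time_after):
    """Ratio of the two total times; values above 1 mean the second setup is faster."""
    return time_before / time_after


# Hypothetical numbers, for illustration only.
# Co-emulation: testbench in the software simulator, design in the accelerator.
co_sim = total_emulation_time(t_h=2.0, t_s=600.0, t_t=300.0)

# Testbench moved into the accelerator: T_s and T_t vanish,
# while T_h grows slightly because more logic runs in hardware.
in_hw = total_emulation_time(t_h=2.5, t_s=0.0, t_t=0.0)

print(f"co-emulation: {co_sim:.1f} s, all-in-hardware: {in_hw:.1f} s")
print(f"speedup: {speedup(co_sim, in_hw):.1f}x")
```

With these assumed values the total is dominated by the software and transfer terms, which is why removing them matters far more than the small increase in the hardware term.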
The design is verified with a software/hardware partition that follows the principle above. To reduce the emulation time, the testbench is placed directly in the hardware accelerator, so that the system emulation runs entirely in hardware. The RISC core of the design uses the custom instruction set of the ZSP400. An assembler was developed for this instruction set that automatically converts assembly instructions into a binary stream. In the verification phase, different application scenarios and paths of data flow and control flow are selected as the verification environment. The test programs are written in assembly language at a high transaction level and converted by the assembler into binary data, which is loaded into the accelerator as the test stimulus. Because the whole testbench resides in the hardware accelerator, there is no time overhead from software simulation or from data interaction between hardware and software. The results are stored in RAM and compared with the reference data stored in a separate ROM. The software/hardware partition is illustrated in Fig. 4.
B. SOC System-Level Verification Scenario Analysis
The SOC has a wide range of application scenarios, so system-level functional verification is relatively complex and time-consuming. To make the verification comprehensive while meeting a tight project schedule, the scenario analysis of SOC system-level verification becomes increasingly important. In this paper, the application scenarios are partitioned according to the design characteristics and the project requirements, and a system-level verification method based on the paths of data flow and control flow is proposed. Four paths of data flow and control flow, based on different application scenarios, are shown in Fig. 5.
① RISC CPU--AXI high-speed bus--Protocol conversion bridge--APB low-speed bus--Peripheral Interface UART0, UART1, SPI1, GPIO, IIC. This path is used to read and write data or transmit signals to the low-speed peripherals under the control of the processor core.
② RISC CPU--AXI high-speed bus--SPI0. This path is used to read and write data or transmit signals through SPI0 under the control of the processor core.
③ RISC CPU--AXI high-speed bus--Data Memory. This path is used to read and write data or transmit signals to the on-chip memory under the control of the processor core.
④ Data Memory or SPI0--AXI high-speed bus--Data transmission channel DMA--APB low-speed bus--Peripheral Interface UART0, UART1, SPI1, GPIO, IIC. This path is used to transfer large, continuous blocks of data between the Data Memory, or an external SPI Flash memory connected to SPI0, and the low-speed peripheral interfaces UART0, UART1, SPI1, GPIO and IIC.
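One way to turn the four paths above into a concrete list of test scenarios is to expand every source and every endpoint of each path. The sketch below is an assumption about how such a list could be generated; the data structure and function name are hypothetical, and only the block names are taken from the paths listed above.

```python
LOW_SPEED_PERIPHERALS = ["UART0", "UART1", "SPI1", "GPIO", "IIC"]

# path id -> (possible sources, intermediate hops, possible endpoints)
PATH_TEMPLATES = {
    1: (["RISC CPU"],
        ["AXI high-speed bus", "Protocol conversion bridge", "APB low-speed bus"],
        LOW_SPEED_PERIPHERALS),
    2: (["RISC CPU"], ["AXI high-speed bus"], ["SPI0"]),
    3: (["RISC CPU"], ["AXI high-speed bus"], ["Data Memory"]),
    4: (["Data Memory", "SPI0"],
        ["AXI high-speed bus", "DMA", "APB low-speed bus"],
        LOW_SPEED_PERIPHERALS),
}


def expand_paths(templates):
    """Return (path_id, [block, block, ...]) for every source/endpoint combination."""
    scenarios = []
    for path_id, (sources, hops, endpoints) in sorted(templates.items()):
        for src in sources:
            for dst in endpoints:
                scenarios.append((path_id, [src, *hops, dst]))
    return scenarios


for path_id, blocks in expand_paths(PATH_TEMPLATES):
    print(f"path {path_id}: " + " -> ".join(blocks))
```

Each expanded entry names one source, the buses it crosses and one endpoint, so a test case can be written (or checked off) per entry.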
C. Memory Mapping and Optimization
With the hardware/software partition described above, the whole testbench is placed in the accelerator. To improve the emulation speed, many memories are needed to store the design data and the test stimulus, so designing and optimizing the memory models correctly is very important. Three questions must be considered when designing a memory model: 1) whether it meets the timing and size requirements of the design; 2) whether it can be synthesized; 3) how to optimize the memory area.
For the first and second questions, the function of each memory in the design must be analyzed carefully, including its read and write timing, storage size and so on. On that basis, the memory model should use a synthesizable structure that takes the synthesis constraints of the hardware accelerator into account. For the third question, the internal memory mapping rules and characteristics of the accelerator must also be studied. Appropriate threshold values and options should then be set according to the design requirements, so as to optimize both area and function. The DROM model is shown in Fig. 6.
Figure 6. DROM model: module DROM1024X16M8 (CLK, CEN, DOUT, AD)
D. System Debugging Control
To make system debugging more flexible and to control the emulation process, trigger state machines driven by state transition conditions are designed. Defining them involves setting the state names, the start and end points of the data to be traced, the state transition conditions, the trigger points of the states, the stop points of the system and so on. The state machines can start and stop the system emulation and control which emulation data is traced. They can also be used to judge whether the expected states have been reached and to check the correctness of the design when those states are reached. Some of the triggers used in the system are illustrated in Fig. 7.
V. EXPERIMENTAL RESULTS
For data path ①, a running-light program is used as the test case. The RISC CPU writes data to the GPIO peripheral and drives the LEDs so that they light up in sequence. The step interval of the running lights is set mainly by an internal loop counter in the RISC; the larger the interval, the longer the emulation time. For emulation on the hardware accelerator, the running-light program, written at a high transaction level, is converted into test stimulus by the assembler described above and loaded into the DROM of the RISC I_MEM module. The trigger designed above is then added and the emulation is started. The waveform obtained at the end of the emulation is given in Fig. 8.
For data path ②, a bubble sort program is used as the test case; it sorts 5000 random 16-bit numbers into ascending order. When the hardware accelerator emulation starts, the 5000 random numbers, in hexadecimal, must be loaded into SRAM in addition to the test stimulus and the trigger. At the end of the emulation, part of the waveform and the trigger point are shown in Fig. 9 and Fig. 10 respectively.
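The trigger state machines of Section IV-D, which bound the traced window in runs such as the two above, can be modelled in software as a small condition-driven state machine. The sketch below is only an illustration of the idea: the class, the state names and the signal names (`gpio_wr`, `gpio_data`) are hypothetical and are not the trigger syntax of the accelerator tool.

```python
class TriggerFSM:
    """Condition-driven trigger: traces samples in one state, halts in another."""

    def __init__(self, transitions, start, capture, stop):
        self.transitions = transitions  # state -> [(condition(signals), next_state), ...]
        self.state = start
        self.capture = capture          # state in which samples are recorded
        self.stop = stop                # state that halts the run
        self.trace = []

    def step(self, cycle, signals):
        """Advance one clock cycle; return False once the stop state is reached."""
        for condition, next_state in self.transitions.get(self.state, []):
            if condition(signals):
                self.state = next_state
                break  # first matching transition wins
        if self.state == self.capture:
            self.trace.append((cycle, dict(signals)))
        return self.state != self.stop


def run(fsm, stimulus):
    """Feed per-cycle signal samples to the trigger; return the stop cycle or None."""
    for cycle, signals in enumerate(stimulus):
        if not fsm.step(cycle, signals):
            return cycle
    return None


# Hypothetical trigger: start tracing on the first GPIO write,
# stop when the last LED pattern (0x80) is written.
transitions = {
    "IDLE": [(lambda s: s["gpio_wr"] == 1, "TRACE")],
    "TRACE": [(lambda s: s["gpio_data"] == 0x80, "STOP")],
}
stimulus = [
    {"gpio_wr": 0, "gpio_data": 0x00},
    {"gpio_wr": 1, "gpio_data": 0x01},
    {"gpio_wr": 1, "gpio_data": 0x02},
    {"gpio_wr": 1, "gpio_data": 0x80},
    {"gpio_wr": 1, "gpio_data": 0x01},
]

fsm = TriggerFSM(transitions, start="IDLE", capture="TRACE", stop="STOP")
stop_cycle = run(fsm, stimulus)
print("stopped at cycle", stop_cycle, "after tracing", len(fsm.trace), "samples")
```

Keeping the start, trace and stop conditions in one table makes it easy to narrow the traced window around a suspected bug without touching the design.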
Table II shows that the system-level SOC verification method described in this paper achieves an immense acceleration effect. As illustrated in Figs. 8, 9 and 10, the trigger technique, driven by state machine transitions, performs well in controlling system emulation and in tracking bugs.
VI. CONCLUSION
System-level verification is critical in the SOC design process, since the individual IP cores have typically already passed verification on their own. To handle complicated applications and varied functional features, SOC system-level verification is divided into comprehensive test scenarios according to the paths of data flow and control flow. The software/hardware partition principle, with verification running entirely in hardware, gives an immense acceleration effect. Trigger-driven state transition diagrams are adopted in system debugging to control the emulation process, which improves the flexibility of debugging and the accuracy of tracking on the emulator. A DSP chip verified with this method has been taped out successfully in a 0.18 µm CMOS process.
ACKNOWLEDGMENT
This work is supported by the Science-Technology Program of Shenzhen, China (Grant No. JCYJ20120614150044545). The authors thank Teng Wang and Yin-Hui Wang of the IMS lab of SZPKU for their discussion and advice on correcting errors in the compile and emulation phases.
