A data-acquisition (DAQ) system based on the SHARC digital signal processor (DSP) has been developed. Either one or two DSPs are mounted on each VME board, WS2126. VME-to-VME data transfer uses SHARC links (measured transfer rate of 33.7 MB/s), routed to the front panel of the WS2126 by I/O piggy pack modules, WS9002. The system has been designed for readout of 81,920 FADC (flash analog-to-digital converter) channels at a maximum trigger rate of 500 Hz. It is realized in a compact setup of four FADC readout VME crates and one master-control VME crate (for event building and run control). The system was tested for readout of the BELLE silicon vertex detector (SVD).
I. Introduction
The DAQ system for the BELLE silicon vertex detector (SVD) [1] consists of 81,920 readout channels. For a high-luminosity collider experiment like BELLE, the processing time for readout is an important issue. Simulation studies for the BELLE SVD showed the necessity of processing large data packages of 5 kbytes within short periods of 200 μs [1]. Such high bandwidth requirements can only be handled by fast processors like DSPs.
Although several other solutions were available for constructing the DAQ system, the SHARC DSP (ADSP-21062 with a 40 MHz system clock) [2] was chosen as the core processor for the readout operation. The SHARC, with a high floating-point peak performance of 120 MFLOPS, has been used in R&D work in various experiments [3, 4, 5, 6, 7].
Compared to other DSPs, the outstanding feature of the SHARC is the external availability of six link ports (SHARC link), operated with the DSP system clock at 40 MHz. With the help of the links, we could realize a multiprocessor system.
For the DAQ system, a SHARC VME cluster, WS2126 [8], developed by Wiese GmbH, Germany, was chosen. The SHARC links of the mounted SHARCs can be made available by optionally installing SHARC I/O piggy packs, WS9002 [9], on this board. We have developed a distributed, DSP-based, compact DAQ system with several WS2126s that makes full use of the SHARC link facility.
In section II, the functionality of the WS2126 is discussed. In section III, an example of a DAQ system with the WS2126 is introduced, and some performance results are discussed in section IV. Finally, in section V we summarize our system and its performance.
II. SHARC DSP VME Cluster, WS2126
Figure 1 shows a block diagram of the WS2126. The board can mount up to six SHARCs. Each SHARC can run independently on the same board. In our case, we used boards with either one or two SHARCs. The WS2126 can operate in both VME master and slave modes. The VME master access is executed by a memory access of a SHARC, initiated by firmware installed in a CPLD, Altera MAX9000 [10], on the board. The firmware for this CPLD can be loaded via a JTAG connector on the board with an Altera ByteBlaster [10].
In the memory space of a SHARC, there is a VME access window. Once a SHARC has set a base address of the VME memory space in a register (master control register) on the board, it can access the VME memory space by (a) CPU-driven or (b) DMA-driven cycles. In addition, the board supports VME block transfer (BLT).
A SHARC has six link ports, each with a maximum transfer speed of 40 MB/s (catalog value; the measured value is 33.7 MB/s). The link supports single-word and DMA transfer modes, which are selected by setting up registers on the WS2126 and the SHARC. Both modes can be used in normal link transfer.
A program that runs on the SHARCs of a WS2126 is written almost entirely in C and partially in assembler (e.g. for the wait task and the VME bus control task). The compiler is distributed by Analog Devices, Inc. and works on IBM PCs with Windows 95/NT and on Sun workstations with SunOS/Solaris. We used the version for Sun workstations with Solaris. In order to download a program to a SHARC via the VME bus, we wrote a downloader program for a Unix system (Solaris 2.5) running on a VME Sparc board, Force CPU-7V [11].
SHARC executable code is downloaded by performing VME writes into a host port register of a SHARC on the board. The execution of the DSP code starts automatically after the first 256 words have been written. Basic communication other than data acquisition (e.g. printing error messages) between the SHARC and the CPU-7V is performed via the SHARC message registers (8 × 32 bit), which are visible from either side. With these message registers, the SHARC program can be put into a halt state, in which it waits for a specific bit to be set in one of the message registers before execution resumes.
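The halt-and-resume handshake over the message registers can be illustrated with a small simulation. This is only a sketch: the class `MessageRegisters`, the function `wait_for_bit`, and the chosen register index and bit number are hypothetical names and values introduced here for illustration, not the actual WS2126 or SHARC register interface.

```python
NUM_REGS = 8           # the text quotes eight 32-bit message registers
MASK32 = 0xFFFFFFFF


class MessageRegisters:
    """Toy stand-in for the shared message registers (not the real hardware map)."""

    def __init__(self):
        self.regs = [0] * NUM_REGS

    def write(self, index, value):
        # Keep values within 32 bits, as a hardware register would.
        self.regs[index] = value & MASK32

    def read(self, index):
        return self.regs[index]


def wait_for_bit(regs, index, bit, poll, max_polls=1000):
    """Spin until the given bit is set in register `index`.

    `poll` is a callback standing in for whatever the other side does
    between two reads. Returns the number of polls that were needed.
    """
    for n in range(max_polls):
        if regs.read(index) & (1 << bit):
            return n
        poll()
    raise TimeoutError("bit was never set")
```

In this picture the DSP side would call `wait_for_bit` while the host side, here represented by the `poll` callback, eventually writes the agreed bit to release it.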
III. DAQ System with DSP-based VME Modules
A DAQ system with WS2126s for the BELLE SVD is shown in Fig. 2. Four WS2126s with one SHARC each were installed in each FADC crate, and two WS2126s with two SHARCs each in the control crate. Each WS2126 in an FADC crate runs independently of those in other crates. SHARC link connections between the WS2126s in the control crate and one WS2126 in each FADC crate were established for communication and data transfer. The link cables used were 3 m long. Communication between the board and the CPU-7V in both crates was done by polling on the CPU-7V.
In the control crate, there is a timing distributor module (TDM) [12] for receiving trigger signals from the central DAQ (CDAQ) system. The trigger signal is distributed via a trigger timing module (TTM) [1] in each FADC crate to the FADC modules [1], which were designed and fabricated for the BELLE SVD. An FADC board sends a busy signal to the TDM to inhibit the next readout sequence during the current readout operation.
For the data acquisition, we have two run modes: global-run mode (data transfer to the CDAQ system) and local-run mode (data transfer to local disk storage or tape drives).
Both modes can be operated in the same setup without changing the hardware configuration.
The data flow in an FADC crate is shown in Fig. 3. The CPU-7V is the bus arbiter on the VME bus. When the WS2126 starts to access the VME bus, the CPU-7V automatically becomes a VME slave, and the WS2126 can act as bus master. Each signal from the silicon detectors is converted into digital information in the FADC modules. A WS2126 checks whether the FADC data are ready by polling the FADC status registers. When all the registers indicate that data are ready to be read, the data on each FADC module are gathered via the VME bus and packed into formatted data by the WS2126. For stability checks, the WS2126 has a task that calculates parity and checksum bits; this facility was not activated in the actual data taking. In the global-run mode, a polling process on the CPU-7V repeatedly checks whether an event is prepared for further transmission to the CDAQ. After the WS2126 sets a bit in the slave memory space of the CPU-7V, the CPU-7V reads the data. Subsequently the data on the CPU-7V are transferred to the CDAQ system by a CDAQ event-builder transmitter S-bus card (CEBTX) [12]. In this mode, the FADC crates operate in parallel.
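The poll, gather, and pack sequence described above can be sketched as a short simulation. Everything below is an illustrative assumption: `FakeFadc`, `read_event`, the event layout (word count, payload, parity word, checksum word), and the parity and checksum definitions are hypothetical and are not taken from the actual SVD readout code or data format.

```python
from functools import reduce


class FakeFadc:
    """Hypothetical FADC module: a status flag plus a buffer of 32-bit words."""

    def __init__(self, words, ready_after):
        self.words = list(words)
        self._polls_left = ready_after

    def status_ready(self):
        # Each status read counts as one poll; data become ready after a few polls.
        if self._polls_left > 0:
            self._polls_left -= 1
            return False
        return True


def checksum32(words):
    """Sum of all words modulo 2**32 (an assumed checksum definition)."""
    return sum(words) & 0xFFFFFFFF


def parity32(words):
    """Bitwise XOR of all words (an assumed parity definition)."""
    return reduce(lambda a, b: a ^ b, words, 0)


def read_event(modules, max_polls=1000):
    """Poll until every module reports ready, then gather and pack the data."""
    for _ in range(max_polls):
        # Read every status register on each pass (list, so no short-circuit).
        if all([m.status_ready() for m in modules]):
            break
    else:
        raise TimeoutError("FADC data not ready")
    payload = [w for m in modules for w in m.words]
    # Assumed event format: word count, payload, parity word, checksum word.
    return [len(payload)] + payload + [parity32(payload), checksum32(payload)]
```

A receiver could recompute the two trailer words from the payload and compare them with the transmitted ones, which is the kind of consistency check the stability task is described as providing.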
In the local-run mode, the data on each WS2126 are transferred to the WS2126s in the local control crate via SHARC links, as shown in Fig. 4. The data in each SHARC in the control crate are sent to a CPU-7V via the VME bus by DMA transfer. Subsequently the CPU-7V stores the data on magnetic tape or disk storage.
IV. System Test Results of the DAQ System
A. Basic performance
Figure 5 compares the transfer rates of a WS2126 and a CPU-5V [11] over a VME bus for various transfer modes. The figure plots transfer rates as a function of the data length. These prototype measurements were performed using CPU-5Vs, whereas the final system was deployed with CPU-7Vs (the design-system VME CPU board); the two CPUs differ only in CPU performance, not in VME transfer rate. For small data transfers (below 1 kbyte), the transfer rate with the WS2126 is higher than that with the CPU-5V in both transfer modes. The overheads of the DMA transfer of the WS2126 and the CPU-5V are 4.6 μs and about 100 μs, respectively. The DMA transfer mode was used to read data from the FADC boards to the WS2126. The measured transfer rates of a SHARC link in core-process (single-word) mode and DMA mode were about 9.7 MB/s and 33.7 MB/s, respectively. During the tests, link cables 3 m long were used. The overheads of core process and DMA were 1.0 μs and 1.9 μs, respectively. Although the nominal transfer rate of the SHARC link is 40 MB/s, the measured transfer rate in DMA mode is lower. We attribute this degradation to the overhead of the link transfer test program.
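To see how a fixed overhead and a sustained rate combine, the following sketch uses a simple linear model, time = overhead + length / rate. The model itself is an assumption made here for illustration, not the authors' analysis; it takes the quoted link overheads as microseconds and treats the quoted rates as asymptotic values. The function names are hypothetical.

```python
def transfer_time_us(n_bytes, rate_mb_per_s, overhead_us):
    """Assumed linear model: fixed setup overhead plus bytes / asymptotic rate.

    1 MB/s equals 1 byte per microsecond, so the rate can be used directly.
    """
    return overhead_us + n_bytes / rate_mb_per_s


def crossover_bytes(rate_a, over_a, rate_b, over_b):
    """Data length at which modes a and b take equal time under the linear model."""
    return (over_b - over_a) / (1.0 / rate_a - 1.0 / rate_b)


CORE = (9.7, 1.0)    # (MB/s, us) single-word transfer, values quoted in the text
DMA = (33.7, 1.9)    # (MB/s, us) DMA transfer, values quoted in the text

# Length above which DMA is faster than single-word transfer in this model.
n_equal = crossover_bytes(*CORE, *DMA)
```

Under these assumptions the two link modes break even at roughly a dozen bytes, so DMA would be the faster choice for anything beyond a few 32-bit words.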
A stability test of the link transfer was done with two WS2126s. In this test, data were transferred from one WS2126 to another at the maximum trigger rate that the link could sustain. For example, for 1 kbyte data transfers in DMA mode at 33.7 MB/s, this corresponds to a trigger rate of 34 kHz. The test was done for several data volumes between 100 bytes and 1 kbyte. At each point, parity and checksum were checked for a period of about two days. None of the tests showed a parity or checksum error.
Following this confirmation of stability, the parity and checksum error check facility was not activated in the actual data taking with cosmic rays.
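The quoted maximum trigger rate follows from dividing the link rate by the event size. A minimal sketch of that arithmetic is given below; the function name is hypothetical, 1 kbyte is taken as 1000 bytes, and the optional per-transfer overhead argument is an added assumption that the quoted figure does not appear to include.

```python
def max_trigger_rate_hz(event_bytes, rate_mb_per_s, overhead_us=0.0):
    """Highest event rate the link can sustain if transfers run back to back."""
    time_us = overhead_us + event_bytes / rate_mb_per_s  # 1 MB/s == 1 byte/us
    return 1.0e6 / time_us


rate_1k = max_trigger_rate_hz(1000, 33.7)    # 1 kbyte events, DMA mode
rate_100 = max_trigger_rate_hz(100, 33.7)    # 100 byte events, DMA mode
```

With these inputs the 1 kbyte case gives about 33.7 kHz, which rounds to the 34 kHz stated in the text. Including a fixed overhead of a few microseconds per transfer would lower the sustainable rate somewhat.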
B. System test results
To examine the performance of the detector and its DAQ system, we performed a system test with cosmic rays. Trigger signals were formed by the coincidence of scintillation trigger counters placed above and below the detector. The trigger rate in this test was about 1/60 Hz.
One of the goals of the system test was the synchronous data routing and collection from four SHARCs in FADC crates to two SHARCs in a control crate via SHARC links. Through this system test, we checked the consistency of data with the distributed DAQ system. The synchronous data transfer between four SHARCs in the DAQ crates and SHARCs in the control crate is a key for this system.
Cosmic ray data were taken mainly to check the synchronization of all four FADC crates. Cosmic ray tracks were consistently reconstructed in three-dimensional space from the merged data in the local control crate. The synchronization over the whole system was thus validated.
V. Summary
A DSP-based parallel-processing compact DAQ system has been constructed. To minimize processing time in the readout operation, a WS2126 was selected as a core processor board in the DAQ system.
The VME performance of the board was better than that of a CPU-7V with Solaris 2.5. Compared with the CPU-7V, the WS2126 achieved a sufficiently fast VME transfer rate for short (below 1 kbyte) data transfers.
The SHARC link is an essential facility in the multimode DAQ system. We have observed the bandwidth of this link to be 9.7 MB/s and 33.7 MB/s for single word transfer and DMA transfer, respectively.
A SHARC DAQ prototype system test with data taking of cosmic rays was successful and showed error free operation of the SHARC links.
Through the system test for the SVD DAQ system, we confirmed that the system provides sufficient performance for use as the real DAQ system.
VI. Acknowledgment

