For the LHD project, a long-pulse plasma experiment of one-hour duration is planned. In this quasi-steady-state operation, the data acquisition system must continuously transfer diagnostic data from the digitizer front-end and display them in real time. The CompactPCI standard is adopted to replace the conventional CAMAC digitizers in LHD because it provides good functionality for real-time data streaming as well as connectivity with modern PC technology. The digitizer scheme, the interface to the host computer, the adoption of data compression, and the downstream applications are discussed in detail for the design and implementation of this new real-time data streaming system for LHD plasma diagnostics.
Introduction
In the field of fusion technology research, several works on real-time data acquisition have been reported, especially from large or steady-state tokamak devices [1, 2]. In tokamaks, it is very important to detect plasma disruptions and to study the mechanism of this transition phenomenon by analyzing the signals around them [3, 4]. This kind of system behavior is often called event-driven data acquisition [5, 6]. Although the recent enormous progress in semiconductor technologies, such as computer memories and CCD array sensors, strongly promotes an increase in raw data, such sporadic operation effectively helps to reduce the data size.
As the plasma itself has characteristic fluctuation frequencies, plasma diagnostics in fusion experiments generally require a definite sampling rate. For instance, plasma magneto-hydro-dynamic (MHD) oscillations can often be observed in the range from a few kHz to MHz. These rates are usually much higher than the typical plant control or monitoring rates of Hz to kHz. The VMEbus system running on a real-time OS is very popular in factory automation (FA); however, its data acquisition rate is insufficient for plasma physics measurements [7].
Conventional data acquisition systems are usually called batch-processing systems, and they have often used CAMAC digitizers in short-pulse plasma experiments [8]. In the post-processing procedures after the end of each discharge, the data acquisition, storage, and visualization are executed sequentially for the data produced within the digitizers. In long-pulse experiments whose durations are up to one hour, however, such a post-processing mechanism will be ineffective because no diagnostic data can be seen throughout the discharge. Therefore, real-time data acquisition and simultaneous visualization will be indispensable for long-pulse experiments.
On the other hand, the recent growth in information technology has enabled the non-stop generation, transfer, and restoration of high-bandwidth data within a definite delay. This is known as data streaming technology. The future plan for the newest fusion devices using superconducting magnets, such as LHD and Wendelstein 7-X [9], is to hold a quasi-steady-state plasma for more than ten minutes. In these circumstances, the data acquisition system also has to run in real time so that it can display the transient behavior while the plasma discharge is in progress. Thus, the acquisition of streamed data will be required quite soon.
Design Requirements
In general, MHD fluctuation diagnostics place the most demanding requirement on the A/D converter specification of the new digitizer system. For instance, a sampling rate of at least 500 kHz is required for the LHD electron cyclotron emission measurement, even if the plasma duration becomes longer. Nowadays, for 2- or 3-dimensional spatial distribution measurements, a typical diagnostic often has over 100 digitizer channels. As a result, the necessary condition for a new digitizer front-end (DFE) is that it should have 100 ADC channels with a sampling rate of 500 kHz in one chassis. At 2 bytes per sample, this means that each DFE will produce a continuous 100 MB/s data stream.
The basic specification for one DFE chassis can be summarized as follows. The ADC resolution is assumed to be 12-16 bits, that is, 2 bytes per sample.
(1) The data transfer rate should be above 100 MB/s in non-stop continuous operation.
(2) One chassis should contain about 100 digitizer channels.
(3) The link between the DFE and the host computer should be electrically isolated and extendable over a distance of about 500 m by using optical fibers.
(4) High connectivity with the PC/EWS host computer is required.
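The 100 MB/s figure follows directly from the channel count, sampling rate, and sample size stated above. A minimal sketch of that arithmetic is given below; the function name and the use of decimal megabytes are assumptions made for illustration, and the one-hour total simply extends the same rate over the planned discharge duration.

```python
def dfe_stream_rate(channels: int, sample_rate_hz: int, bytes_per_sample: int) -> int:
    """Aggregate data rate of one DFE chassis in bytes per second."""
    return channels * sample_rate_hz * bytes_per_sample


MB = 1_000_000  # decimal megabyte (assumed convention)

# Values taken from the design requirements: 100 channels, 500 kHz, 2 bytes/sample.
rate_bytes_per_s = dfe_stream_rate(channels=100, sample_rate_hz=500_000, bytes_per_sample=2)
rate_mb_per_s = rate_bytes_per_s / MB

# Raw data volume for a one-hour discharge, in decimal gigabytes.
one_hour_gb = rate_bytes_per_s * 3600 / 1e9

print(f"stream rate: {rate_mb_per_s:.0f} MB/s, one-hour volume: {one_hour_gb:.0f} GB")
```

Under these assumptions a single chassis produces 360 GB of raw data in one hour, which is why both the link bandwidth and the storage rate are treated as hard requirements.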
Although the CAMAC standard has served as the established digitizer system in conventional plasma experiments, it has no functionality for real-time data transfer, and its maximum bandwidth of 3 MB/s is now quite insufficient. Among present PC and EWS technologies, the PCI bus is considered the most established standard bus that can satisfy the 100 MB/s bandwidth requirement.
As the PCI bus was designed just for PC extension slots, it has no wide front panel on which to mount multiple coaxial connectors for analogue signal inputs, nor does it provide electrical insulation from the PC main body, which contains many noise sources. It also has a logical limit of 8 slots or devices on one PCI bus. The CompactPCI standard, designed for modular front-ends, resolves such restrictions while keeping PCI compatibility. A detailed discussion of its suitability is given in the following section.
Scheme for Digitizer Front-end
The DFE design can be classified primarily into one of two categories. In one, the whole DFE chassis is made to be a computer, like a VMEbus system, and acts autonomously to provide functionality through a computer network such as Ethernet. In the other, the DFE is made to be a peripheral device of a computer, as CAMAC and SCSI devices are. It is connected locally to the computer's peripheral interface and functions under the control of the host computer. The former has the disadvantage that the software development cost would be doubled, because both the DFE client and server sides must be developed, although it also offers wide flexibility in that the DFE behavior can be modified in the software that governs the CPU or I/O modules. Its use of network communication is another shortcoming, because it results in higher overheads and higher CPU load than general I/O data transfer. As a peripheral device, the DFE behaves more mechanically, and the host computer only has to provide its device driver code.
From another viewpoint, the computer-type DFE would have a tendency to become obsolete as computer technology progresses. It is very significant, on the other hand, that the device-type DFE can survive longer because it is just connected through some standard I/F to the host computers and thus can be almost independent from them. The CAMAC equipment, which was launched in 1970 and has survived until now, is a good example of the latter-type peripheral device.
For the design of the new DFE scheme, therefore, it is preferable that the DFE system be a peripheral device. Because the LHD diagnostics have to continue using a number of DFE systems for about ten years, the longer lifetime and the smaller burden of software maintenance are quite significant.
Data Transferring Interface
To examine the optimal link media between the peripheral DFE and the host computer, in Table 1 we enumerate possible candidates that are presently popular, reliable, or capable of 100 MB/s data transfer.
In addition to the 100 MB/s bandwidth and the peripheral device structure, another essential condition is long-distance extension and electrical isolation by using optical fibers. Considering the availability of commercial products, only SCSI optical transceivers and FibreChannel are available at present. The SCSI interface, however, was not originally designed for long-distance transfer, and the extension becomes more difficult at the higher rates of Ultra2Wide (80 MB/s) or Ultra160 (160 MB/s). We finally concluded that the FibreChannel standard is the only candidate capable of 100 MB/s data transfer. Even so, when the system overheads are considered, a single link will be insufficient for our maximum non-stop stream of 100 MB/s.
Now consider that one CompactPCI subrack should contain 100 digitizer channels. Because CompactPCI modules have the same size as the Eurocard standard 3U or 6U, each module can carry at most 6-8 connectors due to the front panel size. Another limitation is the maximum of 8 devices on one PCI bus. Popular commercial CompactPCI chassis are usually equipped with (8-1)*2=14 slots by using a PCI-to-PCI bridge in the backplane. In this case, there can be a maximum of 8*14=112 channels on two PCI buses, and their data production rate will be 112 MB/s. It is therefore realistic to transfer a 100 MB/s data stream from these two PCI buses if we apply one FibreChannel I/F to each of them. Fig. 1 shows a schematic view of this CompactPCI DFE system.
For the popular 32-bit PCI bus, which has 132 MB/s bandwidth, a continuous 100 MB/s data transfer is very demanding when the I/O overheads are considered. Thus, we need to adopt the 64-bit PCI bus that is usually applied in Gigabit Ethernet and FibreChannel I/F cards. The 66 MHz 64-bit PCI bus with 528 MB/s bandwidth would be preferable; however, it imposes more restrictions, especially on the CompactPCI system bus controller.
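The slot, channel, and bus bandwidth figures quoted above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the constant and function names are assumptions, the per-channel rate of 1 MB/s comes from 500 kHz at 2 bytes per sample, and the nominal PCI clocks are taken as 33 MHz and 66 MHz so that the peak values match the 132, 264, and 528 MB/s figures used in the text.

```python
def pci_peak_mb_per_s(bus_width_bits: int, clock_mhz: int) -> int:
    """Theoretical peak PCI bandwidth in MB/s: one bus-width transfer per clock."""
    return bus_width_bits // 8 * clock_mhz


MAX_DEVICES_PER_BUS = 8    # logical PCI limit stated in the text
BUS_SEGMENTS = 2           # two PCI buses joined by a PCI-to-PCI bridge
CONNECTORS_PER_MODULE = 8  # upper end of the 6-8 connectors per module
CHANNEL_RATE_MB_PER_S = 1  # 500 kHz * 2 bytes/sample

# One device position per segment is unavailable for digitizers, as in (8-1)*2.
slots = (MAX_DEVICES_PER_BUS - 1) * BUS_SEGMENTS
channels = CONNECTORS_PER_MODULE * slots
chassis_rate = channels * CHANNEL_RATE_MB_PER_S

# With one FibreChannel I/F per PCI bus, each link carries half of the stream.
per_link_rate = chassis_rate // BUS_SEGMENTS

pci_32bit_33mhz = pci_peak_mb_per_s(32, 33)
pci_64bit_33mhz = pci_peak_mb_per_s(64, 33)
pci_64bit_66mhz = pci_peak_mb_per_s(64, 66)

print(slots, channels, chassis_rate, per_link_rate)
print(pci_32bit_33mhz, pci_64bit_33mhz, pci_64bit_66mhz)
```

Splitting the stream over two links leaves each FibreChannel I/F at roughly half of its nominal 100 MB/s, which is the headroom that absorbs the system overheads.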
By using data compression, it might be possible to reduce the data traffic and so realize a high-performance data streaming system. We can envisage the following options for where compression and decompression could be introduced. In the following discussion, all the compression methods are assumed to maintain complete data reversibility, that is, to be lossless.
(1) Hardware compression just after the ADC. Pre-programmed DSPs or custom-made compression chips are wired directly to the A/D converter outputs on every module or channel. This avoids consuming CPU resources and also reduces the bus traffic on the CompactPCI backplane. For further flexibility, a CPU might be used instead of the DSP; however, an estimate of the processing time-steps would then be indispensable.
(2) I/O stream or packet compression. This is similar to the method often used in PPP packet communication for dial-up remote access. Packet compression and decompression are done automatically at the two ends, namely the DFE bus controller and the host computer. It can be implemented in either hardware or software. In the former case, a custom-made compression/decompression chip is made and installed on the I/F cards; in the latter case, the CPUs of the host and the DFE controller perform the calculations. The latter needs a more rigorous examination of the processing speed and memory consumption.
(3) No compression for data streaming. This is also a possible choice. The software compression/decompression method will result in a heavy calculation load. Hardware compression, on the other hand, cannot achieve a good compression ratio within the limited calculation time.
(4) Compression just before archiving into storage. Compression is introduced only to reduce the occupied size in data storage. It must still be fast enough to follow the streaming data rate from the DFE, but the technical flexibility in its implementation might be wider. A definite estimate of the working memory and the calculation speed will be necessary.
Compression might improve the effective data transfer rate; however, the double pre- and post-processing before and after the data transfer would impose too heavy a calculation overhead on the total throughput. The software development burden at both ends would also be doubled, which means a less flexible system for future modifications. As a preliminary conclusion of the above discussion, it has been decided not to adopt any compression method in the new real-time streaming system except for data archiving.
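The "estimate of the calculation speed" called for above can be approached by timing a lossless compressor on representative data. The sketch below is only an illustration under stated assumptions: zlib (DEFLATE) stands in for whatever lossless method would actually be chosen, the synthetic 12-bit waveform is hypothetical and not LHD data, and the measured throughput depends entirely on the machine it runs on.

```python
import array
import math
import random
import time
import zlib


def synthetic_samples(n: int, seed: int = 0) -> bytes:
    """Hypothetical 12-bit ADC samples stored in 16-bit words (sine plus noise)."""
    rng = random.Random(seed)
    values = array.array("H")  # unsigned 16-bit words
    for i in range(n):
        v = 2048 + 1500 * math.sin(2 * math.pi * i / 100) + rng.gauss(0, 5)
        values.append(max(0, min(4095, int(v))))
    return values.tobytes()


def measure(raw: bytes, level: int = 1):
    """Compress once; return (compressed bytes, compression ratio, MB/s of input)."""
    start = time.perf_counter()
    packed = zlib.compress(raw, level)
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero reading
    return packed, len(raw) / len(packed), len(raw) / elapsed / 1e6


raw = synthetic_samples(200_000)
packed, ratio, mb_per_s = measure(raw)
restored = zlib.decompress(packed)  # lossless: must reproduce the input exactly

print(f"ratio {ratio:.2f}, throughput {mb_per_s:.1f} MB/s, lossless: {restored == raw}")
```

Comparing the measured throughput with the 100 MB/s stream rate gives a first indication of whether software compression could keep up at the archiving stage, where the text leaves compression as an option.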
Downstream Applications
The streaming data transferred from the DFE to the host computer should be processed by these two mechanisms:
(1) Display: not the time-evolving waveforms of raw data but time slices of highly transformed spatial profiles, etc.
(2) Storage: preferably with some data reduction or compression, especially for the massive-sized storage system.
A performance comparison among several storage media is listed in Table 2. It clearly shows that only the magnetic hard disk (HDD) can easily reach the 100 MB/s effective transfer rate by adopting a RAID (redundant array of independent disks) configuration. Other media, such as magnetic tapes (MT) or magneto-optical disks (MO), cannot achieve the demanded I/O rate even when they apply both an array configuration and data compression.
Visualization of the real-time data stream will be more complicated. Although the original sampling rate is 500 kHz, it is meaningless for human perception to plot such a high-frequency raw waveform against time in real time. What gives useful insight is a real-time display of the slower plasma behavior, drawn from highly analyzed and reconstructed spatial profiles or shapes. As a typical target for the real-time display of the streamed data, the time-evolving pictures should be refreshed at close to the video frame rate of 30 Hz. The time margin for the data transfer and the picture reconstruction is then limited to each frame interval of 33 ms.
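The 33 ms frame interval fixes how much data each display update has to digest. The sketch below works out that budget from the figures in the text and then shows one possible reduction step, averaging each channel over the frame to obtain a single spatial-profile slice. The averaging is an assumption chosen for illustration; the actual reconstruction would be specific to each diagnostic, and all names here are hypothetical.

```python
SAMPLE_RATE_HZ = 500_000          # per-channel sampling rate from the text
FRAME_RATE_HZ = 30                # target display refresh rate
STREAM_RATE_BYTES_PER_S = 100_000_000  # one DFE chassis

frame_interval_ms = 1000 / FRAME_RATE_HZ
samples_per_channel_per_frame = SAMPLE_RATE_HZ // FRAME_RATE_HZ
bytes_per_frame = STREAM_RATE_BYTES_PER_S // FRAME_RATE_HZ


def frame_profile(block):
    """Reduce one frame of raw data to one value per channel.

    block: list of per-channel sample lists collected during one frame.
    Returns the per-channel mean, i.e. one time slice of a spatial profile.
    """
    return [sum(channel) / len(channel) for channel in block]


# Tiny usage example with two channels and three samples each.
example = frame_profile([[1, 2, 3], [4, 4, 4]])

print(f"{frame_interval_ms:.1f} ms per frame, "
      f"{samples_per_channel_per_frame} samples/channel, "
      f"{bytes_per_frame / 1e6:.2f} MB per frame")
```

Each frame therefore covers about 16,666 samples per channel and a little over 3.3 MB per chassis, all of which must be transferred and reduced within the same 33 ms.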
The real-time display for each diagnostic is expected to be quite different from the others, and the calculation routines also vary greatly. The implementation design of the real-time visualization, therefore, has to settle on a display image by studying the situation of every diagnostic. Moreover, the calculation routine provided by the diagnostician in charge must be examined for its speed, because its elapsed time will depend closely on the data types.
Summary
In this study, the CompactPCI standard has been selected to replace the conventional CAMAC or VMEbus front-end. This is because CompactPCI has logically and electrically the same specification as the PCI bus, which is the de facto standard PC extension bus, and its 132 or 264 MB/s bandwidth is fast enough for the real-time transfer of massive data. The investigation for a new design of the CompactPCI digitizer system has shown that it is capable of forming the streaming digitizer front-end. The CompactPCI DFE will be operated as an ordinary PC peripheral device or as an extension of the PCI bus of the PC main body.