Digital pulse processing (DPP) systems are known to perform better than analog ones for neutron and/or gamma-ray pulses. DPP can synthesize almost any pulse response shape without the signal degradation that occurs in a complex analog path. Measuring techniques involving detectors/spectrometers for fusion diagnostics rely on real-time algorithms, implemented in field programmable gate arrays (FPGAs), for pulse height analysis, pulse shape discrimination, and pileup rejection of digitized pulses, which reduce data throughput and support monitoring and control. This article describes a data acquisition system for real-time pulse analysis that is based on ATCA and contains a 6 GFLOPS ix86-based control unit and a number of transient recorder (TR) modules interconnected through PCI Express links. Each TR module features (i) eight channels of 12 bit resolution with accuracy equal to or higher than 10 bits, (ii) a sampling rate of 200 Msamples/s, reaching 400 Msamples/s in an interleaved architecture, (iii) 2 or 4 Gbytes of local memory, and (iv) two FPGAs able to perform real-time processing algorithms.
I. INTRODUCTION
Thermonuclear fusion products such as neutrons and gamma rays have a dedicated set of diagnostics based on detectors sensitive to both neutrons and gamma rays (for instance, NE213 liquid scintillators): (i) neutron spectrometers, (ii) gamma spectrometers, and (iii) neutron profile monitors, which sort out neutron and gamma-ray emission profiles.1 These plasma diagnostics usually rely on the statistical algorithmic analysis of the digital equivalent of the collected pulses: pulse height [pulse height analysis (PHA)], pulse shape [pulse shape discrimination (PSD)], and pileup rejection (PUR).2,3
This article describes a data acquisition system for real-time pulse analysis in which real-time PHA, PSD, and PUR algorithms are implemented in field programmable gate arrays (FPGAs). The PHA algorithm is based on a trapezoidal shaped finite impulse response (FIR) filter, with variable gap and peaking time, applied directly to the acquired data, which reduces noise and the sensitivity to signal rise time variations.4 The PSD algorithm is based on the fact that detectors sensitive to both neutrons and gamma rays present a two component response: (i) a fast pulse response similar for neutrons and gamma rays and (ii) a slower decaying tail that differentiates neutron induced events from gamma-ray induced events. For events with the same fast pulse response (same amplitude), a bigger tail is expected for a neutron event than for a gamma-ray event.5 The PUR algorithm compares the expected length of the trapezoid with the one obtained.
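As an illustration of the trapezoidal shaping idea, the following is a minimal sketch in Python, assuming a direct (non-recursive) form: the difference of two moving averages of equal length separated by a gap. The function name `trapezoidal_filter` and the parameters `peaking` and `gap` (both in samples) are hypothetical, and an FPGA implementation would more likely use a recursive form. For a step-like input, the output rises over `peaking` samples, stays flat at the step height, and falls again, so the peak estimates the pulse height.

```python
def trapezoidal_filter(samples, peaking, gap):
    """Direct-form trapezoidal shaper (illustrative sketch, not the authors' code).

    Output n is the mean of the most recent `peaking` samples minus the mean of
    an equally long window that ends `gap` samples before the recent window starts.
    Samples before the start of the record are treated as zero.
    """
    out = []
    for n in range(len(samples)):
        # Leading (most recent) window: indices n-peaking+1 .. n
        lead = sum(samples[max(0, n - peaking + 1): n + 1])
        # Lagging window: ends at n-peaking-gap and spans `peaking` samples
        lag_end = n - peaking - gap
        lag_start = lag_end - peaking + 1
        lag = sum(samples[max(0, lag_start): max(0, lag_end + 1)])
        out.append((lead - lag) / peaking)
    return out


# A step of height 100 (e.g., an integrated pulse) shaped with a
# 4-sample peaking time and a 2-sample gap.
step = [0] * 10 + [100] * 40
shaped = trapezoidal_filter(step, peaking=4, gap=2)
pulse_height = max(shaped)
```

Here the peak of `shaped` recovers the step height, and the output returns to zero once both windows lie on the flat part of the step; lengthening `peaking` averages over more samples at the cost of a wider trapezoid.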
The data acquisition system has several transient recorder modules with autotrigger functionality to digitize and store the input pulses in a digital format. The accuracy of the stored digital analog of the pulses is important for getting the best results on later analysis.
II. DESCRIPTION OF THE DATA ACQUISITION SYSTEM
The data acquisition platform described in the following sections is engineered to (i) implement digital pulse processing (DPP) functions such as PHA and PSD for data reduction and real-time rough assessment of plasma parameters, (ii) target the specificities of the experiment diagnostics, (iii) have a high number of channels per board and consequently a lower cost per channel, (iv) have an architecture designed for upgradeability and scalability, and (v) be ready to be integrated into the Joint European Torus (JET) control and data acquisition system (CODAS) and the real-time network (RTN).
A. System architecture
The system hardware architecture, based on the PICMG 3.0 Advanced Telecommunications Computing Architecture6 (ATCA) standard, is composed of a board and a 14 slot shelf (subrack) sharing a common backplane with interconnections based on a full mesh of serial gigabit communication links. Each slot is interconnected to all others through ×4 PCI Express (PCIe) links.7 The controller, a low-cost ATX motherboard (providing an ix86 family processor with a processing power of around 6 GFLOP/s) mounted on an ATCA carrier module, is connected to the ATCA backplane through a ×16 PCIe link (3.2 Gbytes/s) and is capable of processing the data from four ATCA cards, each one streaming data at a throughput of [...]. To comply with the low latencies of data transfer in this setup, a real-time operating system must be used. Currently two options are available: VxWorks and Linux with real-time application interface (RTAI) extensions ported to this specific hardware platform. The unit is interfaced to the control and data acquisition system through a 100 Mbit/1 Gbit Ethernet port. A PCIe to advanced switching interconnect or asynchronous transfer mode interface will be provided to connect the system to the JET RTN for real-time monitoring or control purposes.
B. ATCA transient recorder module
The block diagram of the ATCA transient recorder module is depicted in Fig. 1. The ATCA module form factor is large enough to accommodate eight acquisition channels per module. The module contains eight 12 bit resolution analog-to-digital converters (ADCs), chosen to cope with the expected signal-to-noise ratio of the input pulses, providing eight analog channels at 200 megasamples per second (MSPS) or, by using pairs of digitizing channels shifted in phase by 180°, four channels at 400 MSPS for accurate pulse shaping. Circuitry for using four ADCs to provide one channel at 800 MSPS is also included, although the effective number of bits is difficult to estimate.8 The analog inputs are designed to achieve a 3 dB bandwidth higher than 400 MHz [input characteristics are based on the AD9430 (Ref. 9)/THS4509 (Ref. 10) front-end combination].
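The two-ADC interleaving can be pictured with a short Python sketch, assuming ideal converters with no gain, offset, or timing mismatch. The function name `interleave` is hypothetical. The second ADC samples half a clock period after the first, so merging the two 200 MSPS streams alternately yields one 400 MSPS stream.

```python
def interleave(adc_a, adc_b):
    """Merge two equal-length sample streams into one at twice the rate.

    Assumes adc_b samples the same input half a sampling period (180 degrees)
    after adc_a, and that both converters are ideal (illustrative sketch only).
    """
    if len(adc_a) != len(adc_b):
        raise ValueError("streams must have equal length")
    merged = []
    for a, b in zip(adc_a, adc_b):
        merged.extend((a, b))
    return merged


# A ramp sampled every 5 ns by each ADC (200 MSPS), offset by 2.5 ns.
adc_a = [0, 2, 4, 6]   # samples taken at 0, 5, 10, 15 ns
adc_b = [1, 3, 5, 7]   # samples taken at 2.5, 7.5, 12.5, 17.5 ns
stream_400 = interleave(adc_a, adc_b)
```

In a real interleaved system the mismatch between the converters limits the effective resolution, which this sketch deliberately ignores.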
The module contains two Xilinx® FPGAs from the Virtex-4 family (XC4VFX60-10FF1152), each controlling a 1 Gbyte memory bank and four acquisition channels while performing the following main tasks: (i) data storage management, (ii) 4 × 2.5 GHz PCIe communications interface, (iii) data processing (PHA, PSD, and PUR), (iv) management of complex triggering modes, and (v) a local timing unit for tagging the pulse times with a resolution up to 1.25 ns and a time span of at least one day (see Fig. 2). The memory is implemented with double data rate (DDR2) synchronous dynamic random access memory (SDRAM), providing up to 2 Gbytes/module (upgradeable to 4 Gbytes). Only raw data (arrays of sampled pulses) are stored in memory. A continuous stream of processed data will be sent to the ATCA controller for further processing and subsequent use for monitoring and control.
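The time-tagging requirement fixes a minimum counter width, which can be checked with a few lines of Python. This is a back-of-the-envelope sketch assuming a free-running binary tick counter; the actual width used in the timing unit is not stated in the text, and the variable names are illustrative.

```python
TICK_PS = 1250                 # 1.25 ns resolution, expressed in picoseconds
DAY_PS = 86_400 * 10**12       # one day, expressed in picoseconds

# Number of distinct ticks that must be representable to span one day.
ticks_per_day = DAY_PS // TICK_PS

# Smallest counter width (in bits) whose range covers ticks_per_day values.
counter_bits = (ticks_per_day - 1).bit_length()

# Time span actually covered by a counter of that width, in hours.
span_hours = (2**counter_bits) * TICK_PS / 10**12 / 3600
```

Under this assumption a 46 bit counter is the smallest that satisfies both figures, covering a little over 24 hours before wrapping.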
For a pulse width of 256 samples, the maximum sustained hit rate per channel for raw data is 780 kHz in the eight channel operating mode or 1.56 MHz in the four channel operating mode.
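These figures are consistent with dividing the sampling rate by the pulse length, that is, with pulses arriving back to back. The short Python check below makes that arithmetic explicit; that the quoted rates were derived this way is an assumption, and the function name `max_hit_rate` is hypothetical.

```python
def max_hit_rate(sample_rate_hz, pulse_len_samples):
    """Upper bound on pulses per second when each pulse occupies a full window."""
    return sample_rate_hz / pulse_len_samples


rate_8ch = max_hit_rate(200e6, 256)   # eight channel mode, 200 MSPS per channel
rate_4ch = max_hit_rate(400e6, 256)   # four channel mode, 400 MSPS per channel
```

The results, 781.25 kHz and 1.5625 MHz, round to the quoted 780 kHz and 1.56 MHz.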
III. PULSE PROCESSING
Digital pulse processing can occur both in the FPGA and in the ix86 processor. The former excels at performing simple, high-throughput parallel operations directly on the high-rate sampled data, and the latter supports more complex algorithms at moderate rates. For instance, the pulse energy is obtained with a digital filter implemented in the FPGA, and the output is then passed on to the processor in order to build an energy histogram. FPGA programming is accomplished using the Xilinx ISE and Embedded Development Kit toolset for the data flow circuitry and SystemC for the DPP part.
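The processor-side half of this split can be sketched in Python as a simple fixed-width binning of the energies delivered by the FPGA. The function name `energy_histogram`, the bin count, and the energy range are all hypothetical choices for illustration; the text does not describe the histogramming code.

```python
def energy_histogram(energies, n_bins, e_max):
    """Count pulse energies into n_bins equal-width bins covering [0, e_max).

    Values outside the range are ignored (an assumption of this sketch).
    """
    counts = [0] * n_bins
    width = e_max / n_bins
    for e in energies:
        if 0 <= e < e_max:
            # Clamp guards against rounding pushing a value into bin n_bins.
            index = min(int(e / width), n_bins - 1)
            counts[index] += 1
    return counts


# Energies as they might arrive from the FPGA filter, in arbitrary units.
hist = energy_histogram([0.5, 1.5, 1.7, 3.9, 4.0, -1.0], n_bins=4, e_max=4.0)
```

Because each event only increments one counter, this step runs comfortably at the moderate event rates left after the FPGA has reduced the raw sample stream.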
Each ADC channel is directly connected to a pretrigger circular buffer and to a trigger detection digital level comparator with a configurable threshold level. Once a trigger is detected, trigger detection is disabled for a configurable pulse length (e.g., 256 or 512 samples). Every time a trigger occurs, the ADC data n is integrated from a configurable number of pretrigger samples until the end of the pulse length while the maximum amplitude, A_n, is being calculated. At the same time a trapezoidal filter is applied to the output of the pulse integrator, S_n. The pulse energy E_n is given by the peak of the FIR output, which depends on the configurable gap G and length L variables. Simultaneously, the width of the filter output is resolved by another comparator and counter; if the width is greater than a predefined value, a pileup has occurred and a pileup counter is initialized. The PSD algorithm relies on the fact that, within the same experiment, the pulses induced by neutron and gamma-ray events have the same distribution: calibration curves previously resolved for several gamma/neutron distributions define the corresponding slope values for each distribution and are entered into a lookup table. The discrimination energy E_D for a specific calculated A_n is obtained and compared with the weighted pulse energy E^W_n. If E_D is smaller, a neutron induced event has occurred; if it is greater, a gamma-ray event has occurred. Figure 3 depicts the above mentioned FPGA data processing algorithms.
FIG. 3. FPGA data processing algorithms resolve (i) PHA, (ii) PUR, and (iii) PSD.
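The processing chain described above can be approximated in software by the following Python sketch. It is a sequential model of what the FPGA does in parallel, and every name in it (`detect_pulses`, `is_pileup`, `classify`, and their parameters) is hypothetical. Three assumptions are made: the integration window runs from `pretrigger` samples before the trigger to `pulse_len` samples after it; the lookup table is indexed directly by the integer maximum amplitude (a 12 bit ADC code); and a tie between the discrimination energy and the weighted pulse energy is treated as a gamma-ray event, a case the text does not specify.

```python
def detect_pulses(samples, threshold, pulse_len, pretrigger):
    """Level trigger with hold-off.

    Returns one (start_index, max_amplitude, integral) tuple per pulse.
    """
    pulses = []
    n = 0
    while n < len(samples):
        if samples[n] > threshold:
            start = max(0, n - pretrigger)
            stop = min(len(samples), n + pulse_len)
            window = samples[start:stop]
            pulses.append((start, max(window), sum(window)))
            n = stop  # trigger detection stays disabled for pulse_len samples
        else:
            n += 1
    return pulses


def is_pileup(filter_output, level, expected_width):
    """Flag pileup when the shaped pulse stays above `level` for too long."""
    width = sum(1 for value in filter_output if value > level)
    return width > expected_width


def classify(max_amplitude, weighted_energy, lookup):
    """PSD decision: discrimination energy below the weighted energy -> neutron."""
    discrimination_energy = lookup[max_amplitude]
    return "neutron" if discrimination_energy < weighted_energy else "gamma"


# Two pulses separated by baseline; threshold 20, 8-sample pulses, 2 pretrigger samples.
record = [0] * 5 + [10, 50, 30, 10] + [0] * 20 + [60] + [0] * 5
pulses = detect_pulses(record, threshold=20, pulse_len=8, pretrigger=2)

# Made-up calibration table with one entry per 12 bit amplitude code.
table = [2 * code for code in range(4096)]
kind = classify(50, weighted_energy=120, lookup=table)
```

In this example the first pulse is captured with two pretrigger samples and the second is truncated by the end of the record. The calibration table here is invented purely to exercise the comparison; in the system described it would hold values derived from the measured calibration curves.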
IV. DISCUSSION
The Virtex-4 XtremeDSP feature provides higher performance multiplication and arithmetic capabilities specifically designed to enhance the use of FIR filters in FPGA-based DSP, increasing flexibility and performance as well as allowing very high throughput owing to the real-time stream processing capability.
Literature describing PSD methods5 asserts that the methods are well defined for pulse energies above 0.5 MeV. Discrimination of lower energy pulses is expected with this data acquisition system.
ACKNOWLEDGMENTS
This work has been carried out in the frame of the Contract of Association between the European Atomic Energy Community and Instituto Superior Técnico (IST) and of the Contract of Associated Laboratory between Fundação para a Ciência e Tecnologia (FCT) and IST. The content of the publication is the sole responsibility of the authors and it does not necessarily represent the views of the Commission of the European Union or FCT or their services.
