A novel two-dimensional multichannel scaler is described, providing an inexpensive method for recording double modulated time-of-flight spectra generated by a 16-element multi-detector array. The design uses a hardware first stage implemented with programmable logic and a software second stage, running on a digital signal processing board. The hardware stage can count at a peak event rate of 20 MHz with a time resolution (minimum bin width) of 50 ns. The maximum sustained event rate is determined by the speed of the software processing. The flexibility of the design makes it suitable for a wide range of applications in which counts are recorded as a function of two variables.
Introduction
Multichannel scaling (MCS) is an established experimental technique whereby, after an initial stimulus, the frequency of events is measured as a function of time. Typically the number of events is recorded in a series of bins, each corresponding to a discrete time interval. MCS functionality is provided by a number of commercially available devices. In this paper we describe two extensions of the multichannel scaler: first, to record events as a function of time with respect to two independent stimuli; and second, to record events generated simultaneously by a number of detectors. An example requiring this additional functionality is provided by future inelastic helium atom scattering experiments, for which time-of-flight (TOF) measurements using a double chopper modulated beam together with a 16-anode multi-detector array are required [1, 2].
Recording in two dimensions, defined by times $t_1$ and $t_2$, gives data acquisition requirements qualitatively different from those of a one-dimensional MCS. For a two-dimensional spectrum, many more bins are used ($N^2$ instead of $N$, where typically $N \sim 1000$), giving a much lower rate of events per bin and much more frequent advances to consecutive bins. We record events as binary numbers which encode the values of $t_1$ and $t_2$, so that a histogram can be reconstructed at a later stage. Within this scheme, recording data from multiple element detector arrays can be achieved in a straightforward way by using a second binary number to encode the detector channel of an event. The binary numbers representing the detector channel and times $t_1$ and $t_2$ are concatenated to give a complete description of the event.
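The concatenation of detector channel and the two time values into a single event word can be illustrated with a minimal sketch. The 4-bit channel field follows from the 16 detector channels; the 12-bit width of each time field, the field order and the function names `pack_event` and `unpack_event` are assumptions made for illustration only.

```python
CHANNEL_BITS = 4   # 16 detector channels
TIME_BITS = 12     # assumed width of each time field (illustrative)


def pack_event(channel, t1, t2):
    """Concatenate channel, t1 and t2 into one integer event word."""
    if not 0 <= channel < (1 << CHANNEL_BITS):
        raise ValueError("channel out of range")
    if not (0 <= t1 < (1 << TIME_BITS) and 0 <= t2 < (1 << TIME_BITS)):
        raise ValueError("time bin out of range")
    # Assumed field order: channel in the top bits, then t1, then t2.
    return (channel << (2 * TIME_BITS)) | (t1 << TIME_BITS) | t2


def unpack_event(word):
    """Split an event word back into (channel, t1, t2)."""
    mask = (1 << TIME_BITS) - 1
    return word >> (2 * TIME_BITS), (word >> TIME_BITS) & mask, word & mask
```

For example, `unpack_event(pack_event(5, 100, 2000))` recovers the triple `(5, 100, 2000)`, so the histogram can be rebuilt from the stored words at a later stage.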
The object of the present design is to record data generated by a 16-element multi-detector array, with count rates up to 1 MHz and time resolution better than 1 µs in $t_1$ and $t_2$. In meeting these specifications, the device is suitable for a wide range of counting applications. The design, implementation and testing of a data acquisition system based on the principles outlined above are described in the remaining parts of this article.
An overview of the 2D multichannel scaler
[...] signal conditioning unit, here a 16-channel preamplifier and discriminator (LeCroy model 2735PC) giving digital outputs (DATA). Time resolution is provided by two sets of externally generated START and ADVANCE signals, in our application the output of chopper control circuits. START signals are given at times $t_1 = 0$ and $t_2 = 0$, whereas ADVANCE signals are produced at a multiple of the chopper rotation frequency.
The 2D MCS unit consists of two main blocks (figure 1). High-speed digital processing is provided by the custom hardware, which takes DATA, START and ADVANCE pulses as its input and outputs a binary number encoding each event. The two-dimensional histogram is reconstructed in the memory of a digital signal processor (DSP) board housed in a PC expansion slot. For each event, the corresponding binary number is transferred to the DSP. The binary number is then interpreted as a memory address and the event is recorded by incrementing the memory contents at that address. Overall control of the experiment is provided by the host PC. Control of the DSP board and transfer of data use a memory-mapped location in the address space of the PC. At the end of an experiment, the histogrammed data array is transferred from memory on the processor board to the PC for storage and analysis.
The processor boards connect to the PC via the ISA bus; initial testing used a TMS320-C50 evaluation module (Texas Instruments), which provides a highly cost-effective implementation and performance that is adequate for many applications. Later development used a QPC/C40B board (Loughborough Sound Instruments). The QPC/C40B board houses up to four TIM40 modules, each containing a TMS320-C40 processor (Texas Instruments) with 8 Mb EDRAM. Communication between the hardware and the TMS320-C40 processor is via a high-speed parallel communication port. The use of DSP boards provides a convenient means of housing a dedicated CPU in a PC expansion slot and it is also possible to take advantage of efficient DSP instructions to perform initial data processing.
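The histogramming step performed on the DSP, in which each event word is treated as a memory address whose contents are incremented, can be sketched as follows. This is an illustration in Python rather than the DSP code itself; the function name `histogram_events` and the use of a plain list as the memory array are assumptions.

```python
def histogram_events(words, n_addresses):
    """Accumulate a histogram by using each event word as an array index."""
    counts = [0] * n_addresses          # stands in for the DSP memory
    for word in words:
        counts[word] += 1               # one increment per recorded event
    return counts
```

With the event stream `[3, 1, 3, 0, 3]` and four addresses, the resulting array is `[1, 1, 0, 3]`. Because each event costs only one indexed increment, the sustained rate is set by how quickly the processor can fetch a word and update memory.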
Custom hardware
The main hardware components are shown in figure 2. They consist of two programmable logic devices (AMD MACH445) and three 9-bit wide 256 deep FIFO memories (AMD Am7200). PLD 1, shown in the upper left-hand corner, encodes the data channel into a 4-bit binary number and provides the FIFO read and write control logic. PLD 2 is an implementation of two high-speed counters. Both data (D0-15) and START/ADVANCE inputs are synchronized to the clock (CLK). Two innovative aspects of the design are the use of a priority encoder to queue simultaneous data pulses on multiple channels and the implementation of counters using pseudo-random binary sequence generators, described below in more detail.
The priority encoder
Figure 3 shows the logic implemented using PLD 1, which priority encodes the incoming data and synchronizes to the local clock. Data pulses (D0-15) set the D flip-flop outputs AD0-15. On the rising clock edge after a data pulse, AD0-15 are synchronized to the clock giving QD0-15, the inputs of the priority encoder. The output of the priority encoder (DP0-3) is a 4-bit binary number indicating the highest numbered input (QD0-15) in a HIGH state. DP0-3 are used to reset the corresponding input D flip-flop on the falling clock edge. DP0-3 are transferred to the FIFO inputs (LDP0-3) on the second rising clock edge after the data pulse. This delay of two clock cycles results in the data channel outputs (LDP0-3 on PLD 1, see figure 2) and counter outputs (QA0-15 and QB0-15 in figure 5) being valid simultaneously at the FIFO inputs. Figure 4 shows a timing diagram illustrating the operation of the priority encoder for the case of two simultaneous data pulses. Data pulses D1 and D0 arrive simultaneously, but are processed in consecutive clock cycles, D1 first and D0 second.
The FIFO read and write control logic, in the lower part of figure 3, is also implemented in PLD 1. The write pulse (FIFOWCLK) causes the FIFO to store the data channel (LDP0-3) and the counter outputs (QA0-11 and QB0-11). The write control logic asserts the ERROR signal high if the FIFO overflows. The FIFO read control logic generates read pulses (FIFORCLK) to control data transfer from the FIFO to the processor board, asserting the EMPTY signal low if there are no data to be read. NMR (an active-low reset) and DATA REQUEST are signals generated by the DSP board. DATA REQUEST is low while data are being read by the DSP board, with the rising edge triggering the next FIFO read operation. NFF and NEF are active-low signals generated by the FIFO to indicate full and empty conditions, respectively.
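The queuing behaviour of the priority encoder, in which simultaneous pulses are served one per clock period with the highest numbered channel first, can be modelled in a few lines. This is a behavioural sketch only: it ignores the two-clock synchronization latency and the FIFO, and the names `priority_encode` and `drain` are hypothetical.

```python
def priority_encode(pending):
    """Return the highest numbered set bit of the pending mask, or None."""
    if pending == 0:
        return None
    return pending.bit_length() - 1


def drain(pulses_per_clock):
    """Serve one pending channel per clock period.

    pulses_per_clock is a list of sets; entry k holds the channels that
    pulse during clock period k. Returns a list of (clock, channel) pairs.
    """
    pending = 0                      # one flip-flop bit per channel
    served = []
    clock = 0
    n = len(pulses_per_clock)
    while clock < n or pending:
        if clock < n:
            for ch in pulses_per_clock[clock]:
                pending |= 1 << ch   # data pulse sets the flip-flop
        ch = priority_encode(pending)
        if ch is not None:
            served.append((clock, ch))
            pending &= ~(1 << ch)    # encoder output resets that flip-flop
        clock += 1
    return served
```

Calling `drain([{0, 1}])` gives `[(0, 1), (1, 0)]`: the pulse on D1 is served in the first clock period and the pulse on D0 in the next, matching the case described for figure 4.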
Pseudo-random sequence counters
Linear shift register feedback circuits (described in [3, 4]) are used to generate pseudo-random binary sequences (PRBS), providing a simple and efficient implementation of a synchronous counter. PRBS generators have recently been used to construct a high-speed 128-channel counter in an independently developed application [5]. An n-bit pseudo-random sequence cycles through $2^n - 1$ output values before repeating. Using the shift clock as the counter input updates the counter outputs with a single ripple delay. Storing the data in pseudo-random order is not a significant disadvantage because re-ordering of the data is easily implemented in software. Figure 5 shows the logical structure of PLD 2, which contains the two PRBS counters. The ADVANCE and START signals are synchronized with the clock (CLK), incorporating a delay of two clock periods. The data pulses are subject to an identical delay (see section 3.1), so the timing pulses and event count data remain precisely synchronized. The length of the PRBS generated is determined by the location of the taps used in the feedback network: sequences are selectable by use of dip switches controlling inputs SELA and SELB in figure 5.
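The use of a PRBS generator as a counter, and the software re-ordering of its output, can be sketched as follows. The tap positions used here are standard textbook maximal-length choices and are an assumption; the taps actually selected by SELA and SELB are not stated in the text. The function names are likewise hypothetical.

```python
def prbs_states(n_bits, taps, seed=1):
    """List the register states of a Fibonacci shift register with feedback.

    taps are numbered 1..n_bits, with tap n_bits being the output bit.
    The list starts at seed and stops just before the sequence repeats.
    """
    state = seed
    states = []
    while True:
        states.append(state)
        feedback = 0
        for t in taps:
            feedback ^= (state >> (n_bits - t)) & 1   # XOR of tapped bits
        state = (state >> 1) | (feedback << (n_bits - 1))
        if state == seed:
            return states


def decode_table(states):
    """Map each pseudo-random register value to its ordinal count."""
    return {value: count for count, value in enumerate(states)}
```

For a 4-bit register with taps `(4, 3)` the sequence visits 15 states, consistent with the $2^n - 1$ length quoted above. Building `decode_table(prbs_states(4, (4, 3)))` once then allows each stored counter value to be converted to an ordinary bin number by a single lookup.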
Testing and evaluation
Correct operation of the 2D multichannel scaler hardware and software was verified using both static and dynamic testing. Outputs were checked for a range of input signals at low clock rates, confirming correct operation of the logic. High-speed testing was also performed to ensure that propagation delays and other dynamical effects did not cause errors. A TOF data simulator was constructed, which produced a realistic single-channel output of random data pulses with variable data rates and variable time delays between gating by the two choppers [2]. Analysis of the simulated double-modulation TOF spectrum demonstrated that the system operated correctly. A slight bias is necessarily introduced by the priority encoder scheme, whereby, for high count rates, events on lower priority channels tend to be delayed by the queuing process. This effect was investigated by computer simulation of the circuit with random event times at equal average rates on all 16 data channels. The bias is both small and predictable. If we select an average delay in the lowest priority channel of 0.5 clock periods as the value that adds negligible uncertainty to the time quantization, then the results show that data rates up to 0.33 per clock period are possible (6.6 MHz with the current clock rate).
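A Monte Carlo estimate of the queuing bias can be sketched along the following lines. This is not the simulation used for the results quoted above; it is a simplified model that assumes at most one pulse per channel per clock period, merges a pulse arriving on a channel that is still pending, and measures delay in whole clock periods. The function name `mean_queue_delay` and its parameters are hypothetical.

```python
import random


def mean_queue_delay(total_rate, n_clocks=50_000, n_channels=16, seed=0):
    """Estimate the mean queuing delay per channel, in clock periods.

    total_rate is the average number of events per clock period summed
    over all channels, shared equally between the channels.
    """
    rng = random.Random(seed)
    p = total_rate / n_channels          # pulse probability per channel
    arrival = [None] * n_channels        # clock at which a pulse is pending
    delay_sum = [0] * n_channels
    served = [0] * n_channels
    for clock in range(n_clocks):
        for ch in range(n_channels):
            # A pulse on an already pending channel is merged (assumption).
            if arrival[ch] is None and rng.random() < p:
                arrival[ch] = clock
        # Serve the highest numbered pending channel, one per clock period.
        for ch in range(n_channels - 1, -1, -1):
            if arrival[ch] is not None:
                delay_sum[ch] += clock - arrival[ch]
                served[ch] += 1
                arrival[ch] = None
                break
    return [delay_sum[c] / served[c] if served[c] else 0.0
            for c in range(n_channels)]
```

In this model the highest priority channel is never delayed, while the mean delay grows towards the lowest priority channel as `total_rate` increases. Running `mean_queue_delay(0.33)` gives an indication of the delay at the data rate quoted above, although the numbers depend on the simplifying assumptions listed.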
Using the low-cost TMS320-C50 DSP, count rates of up to 1 MHz were recorded. Speed was limited by the software part of the processing. Higher count rates are possible with the faster TMS320-C40 processor. The approach enhances flexibility, making the design suitable for a wide range of applications in two-dimensional spectrometry.
