Abstract: The University of Washington developed a Firewire based data acquisition system for the MiCES small animal PET scanner.
I. Introduction
We are designing a second generation data acquisition system to support several positron emission tomograph (PET) designs being developed at the University of Washington. It is based on our experience with the original MiCES electronics concepts [1]. However, the new system will be more compact and able to be used with PET system inserts for magnetic resonance imaging (MRI) scanners. The new electronics are also being designed to support both our continuous detector development efforts (cMiCES) and our discrete crystal depth-of-interaction detector designs (dMiCES) [2] [3] [4].
Many groups have exploited FPGAs for pulse processing in scanner electronics (e.g., [6] [7] [8] [9]).
In our own laboratory, FPGAs are routinely used for pulse integration [1, 12], and we are developing implementations for statistical event localization [2] and timing [5]. With the continued advances in FPGA capacity and speed, including the ability to include embedded processors tailored to the needs of specific applications (e.g., the NIOS II soft core processor for Altera devices), we have elected to put as much as possible within the FPGA for this revision of our Firewire based data acquisition system. Others have utilized Firewire for a wide variety of data acquisition applications. One of the earliest in medical imaging was the work of Rillbert, Stenstrom et al. [10] [11] in developing a modular system for SPECT and PET between 1998 and 2000. The system we have developed shares many of the goals of these earlier systems, but is able to handle more data channels and takes advantage of the higher transfer rates made possible by advances in the Firewire standard and in FPGAs.
II. System design and implementation
The system is built around a multi-purpose board design, the acquisition node board (ANB). This board contains the analog-to-digital converters (ADC) for the detector signals, the Firewire transceiver integrated circuit (IC), various communication and control lines, and the FPGA with additional external memories. Since board designs are a major undertaking, we have decided to develop the ANB to support a variety of functions, taking on several different roles in the scanner system depending on the code loaded into the FPGA. In this way, one major board design is needed rather than three or four different designs.
To use the system with specific detector designs, we also implement a detector adaptor board (DAB) for each of our detectors. This board adapts the basic detector signals to the acquisition node board. The basic layout of a DAB is shown in Figure 1. Our current DAB designs allow for photosensors of any size from single elements to 20x20 arrays of sensors.
The current 1394b Firewire standard supports a bus bandwidth of 800 megabits/second. Best utilization of the bus is achieved using data packets of 4096 bytes, so the FPGA includes large buffers to allow collection of data into 4096 byte packets for transmission and continues collecting data while a data packet is being transmitted. This is the same basic scheme as used for the original MiCES system except that the data packet size is larger. Since it is a serial bus, each event is time-stamped so that the host computer can rebin the data into coincidence pairs. Normally, we acquire all data in list mode and rebin the data into coincidence pairs with final timing and energy windows after acquisition is complete. Depending on the detector system being supported, event positioning is done either in the FPGA (cMiCES) [12] or during the image reconstruction process (dMiCES) [2].
For the cMiCE modules, each photosensor is connected to the ANB directly (after being converted to a differential signal). For the dMiCE system, the DAB performs row/column summing and uses 40 differential pairs to send the data to the ANB. In both cases, the DAB provides a timing signal. For cMiCE the timing signal comes from a common anode, while for dMiCE it is derived from the bias voltage supply side of a silicon photomultiplier array (a MAPD array from Zecotek Photonics). To provide support for configurable devices on the DAB (e.g., variable gain amplifiers, signal switching or summing options, and control of application specific integrated circuits, ASICs), we can support a variety of serial control buses.
In our current designs, we use the Serial Peripheral Interface (SPI) bus with up to six lines for device selection.
The basic ANB function is to digitize the detector signals, determine pulse timing and perform pulse integration, and then send the data to the host computer via the Firewire bus. The basic components of an ANB are shown in Figure 2 and the board input/output connections are depicted in Figure 3. Supporting the central FPGA is a configuration device that shares a 1 gigabit Flash RAM with the FPGA. The Flash RAM contains the FPGA code that is loaded at power on. Once the FPGA is loaded, the NIOS II processor proceeds to load its operating system and application code into its memory (SDRAM) from the Flash RAM. The NIOS II processor then initializes any tables needed in the FPGA SDRAM (e.g., the lookup tables for the position estimator for the cMiCE version) and enters its basic runtime loop. The board also includes a 1394b transceiver integrated circuit along with a dual port optical interface to interconnect the boards over the 1394b bus. The various command serial buses (e.g., the SPI buses) shown in Figures 1 and 3 are all implemented with code in the FPGA. Four switches and eight light emitting diodes (LED) are included for local diagnostic purposes.
Some restrictions for any design using a serial bus for sending data to the host are data bandwidth and the number of nodes that can be supported. In our current designs, the maximum number of sensor channels is 64 plus an additional channel for timing. For the 64 sensor channels, we are using serial ADCs to reduce the number of I/O pins required and to make the board layout easier. In selecting the serial ADCs, we needed to take into account the maximum serial input rates of the FPGAs selected for the design (Altera Stratix III) as well as power consumption and cost. For the current board, we are using Texas Instruments 8 channel devices running at 65 MHz.
While we have obtained good timing for signals sampled at 65 MHz with our FPGA timing algorithm [5] targeted for our pre-clinical imaging systems, we have also included an additional input channel to allow faster timing. One approach for the timing with our FPGA algorithms would be to sum all of the input channels within the FPGA and then determine the timing. That approach has potential complications with small signal delays between the channels that can decrease the timing accuracy. Since all of our current detector designs include options for a timing channel signal, we have included one parallel ADC (speed selected based on the application). Our current board design is using a 10-bit, 300 MHz ADC for the timing channel.
As mentioned above, we are basing the design around the Altera Stratix III series of FPGAs. These devices are both fast and power efficient. They also have common packages (device size and pinouts) so that with the same printed circuit boards, several different versions of the FPGA can be chosen to balance between cost and capacity of the device. For our applications, we need enough dedicated phase-locked loop (PLL) modules in the FPGA to support our serial ADCs and the memories. The number needed led us to select the Stratix III EP3SL200 FPGA for our current boards. We are also using the Altera NIOS II soft embedded processor (implemented by code in the FPGA) for local control and support of the Firewire and command bus interfaces. All of our FPGA code is written in the industry standard Verilog language. By using Verilog modules and a C library from Mindready Solutions Inc. (Saint-Laurent, QC, Canada), we are able to implement the Firewire 1394b interface totally within the FPGA with the exception of the basic 1394b transceiver integrated circuit.
The FPGA code we have developed for the original system for pulse integration, time stamping, coincidence control, and local tuning/testing of the detector modules was all written in Verilog and is easily transferable to our Phase II system. We are extending that code to include pulse timing based on the algorithm presented at the IEEE NSS/MIC meeting in 2007 [5]. We are also selecting components that can function within the environment of an MRI (initially a 3 Tesla system).
As in the original MiCES scanner, the system assumes that there is a coincidence controller that imposes a coarse coincidence to reduce the amount of data transferred to the host via the Firewire bus. For the Phase II implementation, this function is provided by an ANB with different firmware and no ADCs in order to reuse the board design; the ADC inputs are replaced by event strobes from the detector nodes.
The basic Firewire bus can support up to 64 nodes. In our case, one is the host computer and one is the coincidence controller, so we can support up to 62 detector nodes. If more than 62 are needed, we can either go to more than one Firewire bus or use a master/slave option. We are building the master/slave option into our Phase II system (Figure 4). The master/slave option allows an ANB to function as a local master (no ADCs) and take data via high speed serial links from up to 4 detector nodes. In principle, this layered approach could be extended to yet more nodes by replacing the ANB detector nodes in Figure 3 with another set of local masters. However, for our purposes, one layer is sufficient. The direct serial links utilize the high speed serial tools built into the FPGAs.
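The node budget in the preceding paragraph can be worked through with a short calculation. This is only an illustrative sketch: the constants are the figures quoted in the text, while the function name and the assumption that directly attached detector nodes and local masters can be mixed freely on one bus are hypothetical.

```python
FIREWIRE_MAX_NODES = 64       # bus limit as quoted in the text
RESERVED_NODES = 2            # host computer + coincidence controller
SLAVES_PER_LOCAL_MASTER = 4   # detector nodes served by one local master


def detector_capacity(local_masters: int = 0) -> int:
    """Detector nodes reachable on one bus.

    `local_masters` bus slots are taken by ANBs acting as local masters
    (no ADCs); the remaining slots hold directly attached detector nodes.
    Mixing both kinds on one bus is an assumption of this sketch.
    """
    slots = FIREWIRE_MAX_NODES - RESERVED_NODES
    if not 0 <= local_masters <= slots:
        raise ValueError("local_masters must be between 0 and %d" % slots)
    direct_nodes = slots - local_masters
    return direct_nodes + local_masters * SLAVES_PER_LOCAL_MASTER


if __name__ == "__main__":
    for n in (0, 10, 62):
        print(n, "local masters ->", detector_capacity(n), "detector nodes")
```

With no local masters the function returns the 62 detector nodes stated above; filling every available slot with a local master gives the upper bound for a single layer.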
The ANB configured as the coincidence controller functions as a master controller for the scanner. Commands from the host are sent via the Firewire bus (eliminating the serial link between the host and the master controller in the original MiCES scanner system). As in the original MiCES system, each Firewire node is configured with two logical units -one for control and one for data.
In this way, asynchronous command/response exchanges between the nodes and the host can be performed while large data blocks are being transferred.
The host data acquisition software developed for MiCES will support the Phase II electronics with essentially no modifications needed (it already has code in place for all control and command functions to be sent via Firewire). This code acquires list mode data from each node on the Firewire bus. It automatically configures at startup to determine how many nodes are active and what version of firmware is installed on the nodes. The major difference for the Phase II system is the use of the 1394b standard, which increases the basic bus bandwidth from 400 megabits per second (mbps) to 800 mbps. As a consequence, the most efficient use of the bus is obtained with a data packet size of 4096 bytes instead of the 2048 used in the original MiCES system. In its most basic mode of operation, the acquisition code simply stores the data blocks from each node into a data file. A subsequent software tool reprocesses the list mode data files into coincidence pairs based on the time stamp data in the list data and a configuration file for the specific geometry of the system the data is from.
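The offline rebinning step described above can be illustrated with a minimal sketch. It is not the MiCES reprocessing tool: the record layout, the names, the units, and the simple adjacent-event pairing rule are all assumptions made for illustration, and the sketch leaves out timing corrections, randoms (delayed) windows, block map decoding, and the handling of multiple coincidences.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Single(NamedTuple):
    """One list-mode event (hypothetical record layout)."""
    node_id: int     # acquisition node that recorded the event
    time_ps: float   # absolute event time in picoseconds
    energy: float    # integrated pulse energy, arbitrary units


def find_coincidences(
    singles: Iterable[Single],
    window_ps: float,
    energy_range: Tuple[float, float],
) -> List[Tuple[Single, Single]]:
    """Pair time-adjacent events from different nodes within `window_ps`."""
    low, high = energy_range
    # Apply the energy window, then merge all nodes into one time-ordered list.
    accepted = sorted(
        (s for s in singles if low <= s.energy <= high),
        key=lambda s: s.time_ps,
    )
    pairs = []
    i = 0
    while i < len(accepted) - 1:
        first, second = accepted[i], accepted[i + 1]
        close_in_time = second.time_ps - first.time_ps <= window_ps
        if close_in_time and first.node_id != second.node_id:
            pairs.append((first, second))
            i += 2   # both events are consumed by this pair
        else:
            i += 1
    return pairs


if __name__ == "__main__":
    events = [
        Single(1, 1000.0, 511.0),
        Single(2, 3000.0, 500.0),
        Single(3, 50000.0, 511.0),
        Single(1, 90000.0, 100.0),   # rejected by the energy window
        Single(4, 91000.0, 511.0),
    ]
    for a, b in find_coincidences(events, 5000.0, (350.0, 650.0)):
        print(a.node_id, b.node_id, b.time_ps - a.time_ps)
```

Because the pairing is done after acquisition, the timing and energy windows are ordinary parameters and can be changed without re-acquiring data.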
During this list processing, timing corrections, prompt and randoms timing windows, block map decoding, and energy windows are normally applied. The data are then normally reconstructed using a 3D list mode algorithm. The data format depends on the detector module type being supported.
The basic structure is shown in Table I. We have changed it from the original MiCES structure to allow easy support of the various detector modules.
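Two of the encodings described in this section lend themselves to a short sketch: the two-part timing data (a time stamp from a scaler driven at 62.5 MHz in the MiCES scanner, plus a fine count out of 256 intervals per clock period) and the 16-bit dMiCES sensor entry (top 5 bits for the row/column position, lower 11 bits for the integrated ADC value). The function names are hypothetical, and the mapping from a row or column to its 5-bit code is not specified here, so it is treated as an opaque integer.

```python
SCALER_CLOCK_HZ = 62.5e6     # time stamp scaler clock quoted for MiCES
FINE_STEPS_PER_TICK = 256    # timing intervals between scaler clock events

POSITION_BITS = 5            # top bits of a dMiCES sensor entry
ADC_BITS = 11                # lower bits: integrated ADC value
POSITION_MASK = (1 << POSITION_BITS) - 1
ADC_MASK = (1 << ADC_BITS) - 1


def event_time_ps(time_stamp: int, fine_count: int) -> float:
    """Combine the coarse time stamp and the fine count into picoseconds."""
    if not 0 <= fine_count < FINE_STEPS_PER_TICK:
        raise ValueError("fine_count out of range")
    tick_ps = 1e12 / SCALER_CLOCK_HZ   # 16000 ps per scaler clock period
    return (time_stamp + fine_count / FINE_STEPS_PER_TICK) * tick_ps


def pack_sensor_entry(position_code: int, adc_value: int) -> int:
    """Pack a 5-bit position code and an 11-bit ADC value into 16 bits."""
    if not 0 <= position_code <= POSITION_MASK:
        raise ValueError("position_code does not fit in 5 bits")
    if not 0 <= adc_value <= ADC_MASK:
        raise ValueError("adc_value does not fit in 11 bits")
    return (position_code << ADC_BITS) | adc_value


def unpack_sensor_entry(word: int) -> tuple:
    """Split a 16-bit sensor entry into (position_code, adc_value)."""
    return (word >> ADC_BITS) & POSITION_MASK, word & ADC_MASK


if __name__ == "__main__":
    print(event_time_ps(1, 128))
    print(unpack_sensor_entry(pack_sensor_entry(19, 1234)))
```

With these figures one fine timing interval corresponds to 16 ns / 256 = 62.5 ps.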
Table I: Basic data format scheme for the Phase II electronics. The number of sensor entries varies depending on the detector being supported and the amount of data processing done in the FPGA.
Note that the MiCES acquisition software is generally unaffected since it is storing the data blocks directly into list mode buffers. The first nibble is a data start flag (and byte sex flag) set to a value of 3. The second nibble is for encoding the type of data format (e.g., how many sensors are included in the block). Each ANB that is connected to the 1394b bus has a node ID that is set by jumpers on the board (0..62). When an ANB is serving as a local master, the slave acquisition board that generated the event is encoded in the detector ID field. Each ANB also has a scaler for the single event rates, and the most significant 16 bits are included in the data block.
The timing data is split into two components. The first is the time stamp, which is the value of the time scaler at the moment an event is detected. This value is driven by a scaler that is adjusted to the requirements of a given application. For example, in the MiCES scanner, it is driven by a 62.5 MHz clock. The second component of the timing data is the result of our timing algorithm expressed as the number of timing intervals since the last scaler clock event. We usually set the number of timing intervals between clock events to 256.
The final entries in the data block depend on both the detector module being supported and the amount of data processing being done in the FPGA. For example, if we are supporting the original MiCES module, we digitize the 12 outputs from the 6+6 position sensitive photomultiplier and then do the weighted sum of the outputs in the FPGA (replacing the analog summing board of the original MiCES system [1]). In that case, there are four sensor data entries. For the cMiCES system, if we wish to look at the raw data we have 64 sensor entries. However, if we are running in the event processing mode, the FPGA does the position estimation and the data block includes four entries (the x-y position of the event, the depth-of-interaction estimate, and the energy). For the dMiCES detectors [2], the normal mode will be to perform row/column summing and for the FPGA to pick the 6 largest signals (each crystal pair will produce 3 signals) and put them in the sensor fields with the top 5 bits encoding the row/column position and the lower 11 bits the integrated ADC value.
Obviously, this format allows a wide variety of representations of the detector data depending on how much processing is done within the FPGA and the number of data sensors used in the detector module.
Another major difference between the original MiCES electronics and our new design is the elimination of a central clock that is distributed to the different modules. Instead, we are utilizing fast enough clocks for the scalers in the FPGAs that the small amount of time skew between the time stamp scalers can easily be corrected for or is not significant. There will be a master sync command to assure that the various scalers start together and to allow re-syncing during long runs if needed.
Another difference is that the serial command bus is used for more functions than in the original MiCES scanner. It is implemented as an SPI bus using a basic store and forward approach for sending commands to the various modules. Commands include systemwide commands (e.g., master stop, master start, etc.) and node specific commands (such as status, calibrate, select specific detector elements, etc.). Some of these commands can also be sent via the 1394b bus from the master, but we have implemented a rich set on the serial command bus for diagnostic as well as startup configuration purposes. Some systemwide commands such as the master start command (start all the time stamp scalers and pulse integration functions) require adjustments for propagation delays as the command is sent around the scanner. Given the potential complexity with local masters (when we are servicing more than 62 detector modules), we are developing a start sequence that allows the various devices to automatically configure themselves and include the required corrections for delays in the serial command propagation by "knowing" where they are in relation to the coincidence/master controller.
III. Discussion
An updated version of our MiCES electronics is needed to support our newer detector module designs. The system described here builds on our experiences with our first Firewire based system and the requirements of our new detector designs. Since we can reuse both our host acquisition software and the extensive Verilog modules we have developed for the original system, the time to implement this new system will be considerably reduced.
Part of our goal is to develop a system that supports our current needs but can also be used within an MRI system. Thus, part of our design is the selection of components that can stand the magnetic and RF fields, as well as selection of clock frequencies that will minimize interference with the MRI RF system. We also have focused on the problem of large cable plants and making it easy to use a PET insert within an MRI system. Based on the preferences of our local research MRI group, the design of our new electronics will allow us to build a PET insert for an MRI scanner with only an optical cable, two power cables, one ground cable, and cooling air in/out hoses. Only the fiber optic cable will need to penetrate the MRI room shield to connect to our data acquisition computer.
The goal of developing a single basic board design to take on at least three different roles in our systems (coincidence controller, acquisition node, local master) was due to the costs of board development. The number of ADC channels and other I/O requirements coupled with issues of signal propagation matching has led us to use a board design of at least 20 layers. Further, such a flexible design goal requires a considerable effort in the development of the design documents to assure that the boards can support all the functions and possible additional requirements envisioned for full scanner development. For example, as development efforts for our dMiCE detector module design have progressed with silicon photomultiplier (SiPM) arrays, the necessity for developing an ASIC for the row/column summing operation became apparent. Further, that device also needs to be able to connect single elements in the SiPM array to the data electronics for amplifier gain adjustments and general diagnostics. Those requirements led to the need for a full serial control bus to the detector adaptor boards to assure that those with ASICs have a method to control the ASIC functions.
Since we also wish to operate the electronics within a magnetic resonance imaging system for one of our applications, we elected to have all of the connections between the host computer and the electronics be via fiber optic. The 1394b standard supports optical fiber, and we have adopted optical fiber for all of the 1394b connections between the ANBs and an optical-to-copper 1394b converter at the host computer (the boards do have pads to support the more conventional copper 1394b cable connectors if the application does not need the fiber optic connection scheme). All system wide commands from the host computer to the Phase II electronics are then sent over the 1394b bus to the coincidence/master controller. As in the original MiCES scheme, during data acquisition each of the 1394b nodes acts as an independent data source, each assigned to its own list data buffer by the host computer.
The FPGA we have selected should have more than adequate space for all of the various application modules we are using. The Altera devices offer several different levels of performance and capacity in the same basic physical package, so that we can further customize boards for specific functions and optimization of power consumption by selecting the appropriate version of the FPGA. One future option includes a faster version of the Firewire bus (1394b). The IEEE approved the 1394-2008 specification in July 2008 and the extended standard supports both 1600 and 3200 mbps speeds using the same basic connectors as the current 1394b standard. Further, the new standard uses the same basic packet and protocol definitions as the 1394b standard, thus we can expect our existing FPGA code to work with the faster speeds. We should only have to change the 1394b transceiver chip to implement the 1394-2008 specification and adjust our data packet sizes to optimize the bus bandwidth utilization.
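To give a feel for how packet size and bus speed interact, the following sketch computes the ideal wire time of one data packet at the nominal bus speeds mentioned in this paper. It is a rough upper-bound calculation only: it assumes 1 mbps = 10^6 bits per second and ignores packet headers, bus arbitration, and any other protocol overhead, and the function names are hypothetical.

```python
def packet_time_us(bus_mbps: float, packet_bytes: int) -> float:
    """Ideal time on the wire for one packet, in microseconds.

    bits / (megabits per second) gives microseconds directly.
    Protocol overhead is deliberately ignored in this sketch.
    """
    return packet_bytes * 8 / bus_mbps


def max_packets_per_second(bus_mbps: float, packet_bytes: int) -> float:
    """Ideal upper bound on packets per second at full bus utilization."""
    return bus_mbps * 1e6 / (packet_bytes * 8)


if __name__ == "__main__":
    cases = [
        (400, 2048),    # original MiCES system
        (800, 4096),    # Phase II system (1394b)
        (1600, 4096),   # 1394-2008 speeds with the current packet size
        (3200, 4096),
    ]
    for mbps, size in cases:
        print(mbps, size, packet_time_us(mbps, size),
              max_packets_per_second(mbps, size))
```

Under these idealized assumptions a 2048 byte packet at 400 mbps and a 4096 byte packet at 800 mbps occupy the bus for the same time, while keeping 4096 byte packets at the higher speeds shortens the time per packet proportionally.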
As of October 2008, we have completed the board specifications, component selections, prototype software testing, and basic pin assignments for the FPGA. The boards are now in the final design phase and the first version should be fabricated early in 2009. The system design has taken into account all of the various detector modules we are working with and, we believe, is adaptable to a very wide variety of module designs. This new version of our basic Firewire based data acquisition system will certainly meet our current needs and will provide us with a flexible, easily configurable system that can also work within an MRI system environment.
IV. References

