High-resolution high-purity germanium (HPGe) spectrometers are needed for safeguards applications such as spent fuel assay and uranium hexafluoride cylinder verification. In addition, these spectrometers would be applicable to other high-rate applications such as nondestructive assay of nuclear materials using nuclear resonance fluorescence. Count-rate limitations of today's HPGe technologies, however, lead to concessions in their use and reduction in their efficacy. Large-volume, very high-rate HPGe spectrometers are needed to enable a new generation of nondestructive assay systems. The Ultra-High Rate Germanium (UHRGe) project is developing HPGe spectrometer systems capable of operating at unprecedented rates, 10 to 100 times those available today. This report documents the current status of developments in the analog electronics and analysis software.
Introduction
The UHRGe project is developing the analog readout, digital signal processing, and data reconstruction software necessary to operate HPGe detectors at unprecedented rates (~1 MHz). The project is taking a three-step approach to achieving, or exceeding, this goal. The first step will use a conventional coaxial HPGe detector as a test bed for new preamplifier designs. The coaxial detector [a] that will be used is relatively small (37% relative efficiency, 62 mm diameter by 45 mm long), has charge collection times of ~300 ns, and has a cold front-end package with resistive feedback (2 GΩ resistance). This will allow us to evaluate resolution performance at lower rates using this well-characterized detector and to compare directly with the commercial preamplifier, which produces typical laboratory-quality noise and resolution performance. While we may explore running the feedback circuit at high voltage, the rate limitation for this test bed will come from the steady-state current flow through the large feedback resistor due to the charge deposition in the detector. Assuming a 15 V feedback voltage range, this limits the rate to roughly 200 kHz for 662 keV gammas. In parallel, a new digitizer system capable of continuously streaming data to disk at 400 MHz will be used to collect digitized data for offline data analysis and algorithm development in ROOT. This system has a dedicated Virtex-5 FPGA for user algorithms that can ultimately host the digital filtering routines that are developed offline. The second step is to move the electronics developed with the coaxial detector to a p-type point contact (PPC) detector (Figure 1), which has much lower capacitance than the coaxial detector but long drift times that may adversely impact resolution at high rates.
Finally, PHDs Co and PNNL are collaborating on the development of a multi-contact detector (Figure 2) that will have 7 pads read out independently, each of which will have low capacitance (~1 pF) and short drift times (<100 ns) for the entire detector volume.
Figure 1: Point contact detector. Note that there is no bore hole as is found in a semi-coax detector and that the aspect ratio (thickness to diameter) is larger than is typical for a planar detector. The electric fields are quasi-hemispherical in these detectors rather than radial (coax) or axial (planar).
[a] This is one of the first MARS detectors and remains in the original vendor dipstick cryostat. It has been retained in this configuration to provide a reference system for electronics and DAQ development for that program. The use on UHRGe is parasitic to the needs of the MARS program.
Analog Electronics
The analog electronics of a typical HPGe system are divided into two parts: a front-end package and a preamplifier. The front-end package is typically located inside the vacuum cryostat and is operated at low temperature to reduce thermal noise in the critical first-stage amplification components. Two classes of front-end circuit are employed: pulsed reset and resistive feedback. In the former, the output voltage rises in steps as each charge pulse (photon interaction) enters the system until the maximum output (rail) voltage is reached. At this point a reset pulse is delivered that clamps the output node to ground. This results in some ringing in the system, and a settling time is required before the system is stable and ready to receive inputs again. During the reset and settling period the system is dead. In such a system the dead time rises steadily with rate until the inter-event spacing is comparable to the reset time, at which point the dead time approaches 100%. Previous attempts [1, 2] to achieve high-throughput systems employed this front-end configuration and were limited to throughputs of ~50-100 kHz, where the higher throughput was achieved only with low-energy X-ray or gamma-ray sources, which deposit less total charge per unit time. These systems do not "fail gracefully"; rather, they saturate, and the throughput drops to near zero when the input rate is increased. This is easy to understand: the charge deposited during the settling period raises the output voltage to the rail voltage, and another reset is issued before the system becomes live again, resulting in 100% dead time.
The alternative is resistive feedback. In this configuration a resistor continuously bleeds the deposited charge off the detector. The main drawback of this design is that the resistor thermal noise contributes to the system noise. In order to keep this noise term manageable, a typical resistor value of 2 GΩ is used.
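The steady-state rate limit of a resistive-feedback front end, quoted in the introduction as roughly 200 kHz for 662 keV gammas, can be checked with a short calculation. The sketch below assumes that every event deposits the full photon energy and that one electron-hole pair is produced per 2.96 eV in germanium near 77 K. That figure and the function name `max_rate_hz` are assumptions introduced here, not values taken from this report. A 250 V rail is included for comparison.

```python
# Assumed physical constants (not stated in the report).
E_PAIR_EV = 2.96              # mean energy per electron-hole pair in Ge near 77 K, eV
E_CHARGE_C = 1.602176634e-19  # elementary charge, C


def max_rate_hz(rail_v, r_feedback_ohm, gamma_kev):
    """Event rate at which the mean detector current drops the full rail
    voltage across the feedback resistor (full-energy deposition assumed)."""
    max_current_a = rail_v / r_feedback_ohm
    charge_per_event_c = gamma_kev * 1e3 / E_PAIR_EV * E_CHARGE_C
    return max_current_a / charge_per_event_c


conventional = max_rate_hz(15.0, 2e9, 662.0)    # 15 V rail, 2 GOhm resistor
high_voltage = max_rate_hz(250.0, 2e9, 662.0)   # 250 V rail, same resistor

print(f"15 V rail:  {conventional / 1e3:.0f} kHz")
print(f"250 V rail: {high_voltage / 1e6:.2f} MHz")
```

Under these assumptions the 15 V case lands close to the 200 kHz figure. The limit scales linearly with rail voltage and inversely with feedback resistance, which is why both a higher rail and a lower effective resistance are of interest.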
As discussed in the introduction, at high rates the instantaneous current flow through this resistor leads to a voltage drop across the resistor that exceeds the supply (rail) voltage. For a conventional system with a 15 V rail and 2 GΩ resistor, this limit is around 200 kHz for 662 keV gamma rays. In practice, the rate limit is more typically 20-50 kHz, with the limit coming from pileup of the typical 50 microsecond tail pulses from the preamplifiers. Our development path is to reduce the tail pulse length in the shaping stage of the preamplifier to a few microseconds and, to eliminate this dead-time limit, to use more complex digital filtering that resolves pileup events rather than discarding them. Next, the resistive feedback must be adapted to accommodate more instantaneous current flow. Two approaches are possible. The first is simply to tie the feedback circuit to a high-voltage source. This method has been employed previously, and successfully, to make measurements in the Hanford tank farm using a 250 V rail voltage. The second path is to replace the physical resistor with a dynamic resistance produced using FETs. This will allow the resistance to fall at higher currents and hence reduce the required voltage. We anticipate employing both techniques to achieve our most aggressive rate goals. It should be noted that the nature of the noise from the FET-based feedback "resistor" is different from that of a conventional resistor (it is no longer thermal noise) and will favor shorter digital filters, which is precisely what is needed for high rates. However, the achievable noise may be higher than that of a conventional resistor, which will impact achievable detector resolution.
The schematic of a generic front-end preamplifier (Figure 3) includes the detector, the feedback network (in this case resistive), and the single FET charge amplifier that are normally inside the cryogenic housing.
The FET bias current circuitry, the first-stage gain amplifier, the pole-zero network, and the output amplifier constitute the primary components of the preamplifier and are located outside the detector housing. The feedback signal is taken from the output of the first gain amplifier stage and is routed through the feedback network to create a closed-loop network. In a mature variant of a custom pre-amplifier, the blocks above are realized with various sub-circuits that achieve the lower-frequency functionality. The circuit schematic (Figure 4) is overlaid with the location of each functional block.
Figure 4: The SPICE schematic of a custom-made traditional preamplifier with various functional blocks individually highlighted.
The current control circuit uses the fixed current through a C170 LED, which has a 25 mA DC forward current that is primarily dropped across a 13 kΩ resistor, to create a reference voltage into the base of a 2N5087 BJT with the same diode drop properties as the LED. The emitter of the BJT has a potentiometer that can be adjusted to control the current flow through the BJT from about 1 to 20 mA. This structure acts as a current source that drives current into the drain of the cold FET used as a preamplifier for the detector signal. The output from the FET detector amplifier is driven into a gain amplifier before feedback. In this circuit, the signal is driven into the high-impedance positive terminal of an ADA4898-1 operational amplifier (op-amp). This is done for two reasons. First, the negative and positive inputs of an op-amp are at the same voltage. The negative terminal of the ADA4898-1 op-amp is biased to a specific voltage (about 4 V) so that the positive terminal is also biased at 4 V. This maintains the voltage bias on the cold FET amplifier at the correct level for proper functionality. The second reason the signal is sent into the positive terminal is that the high-impedance nature of the op-amp input prevents current flow. This maintains the integrity of the current biasing network and reduces potential variability caused by a low-impedance amplifier. A small feedback network runs from the output of the gain op-amp to the negative input terminal. This feedback network consists of a 14 MΩ resistor and a 5-30 pF capacitor. It acts as a basic signal integrator and adjusts the shape of the input, as can be seen in Figure 5. The output from this stage is fed back to the FET through the resistive feedback network and is also shaped by the resistor-capacitor (RC) pole-zero (PZ) network, which further conditions the signal.
The RC PZ network uses a resistor to ground, a capacitor, and another resistor to ground in a modified pi configuration. The capacitor blocks DC and low-frequency voltages from transferring across the PZ filter, and a sub-circuit is used to set the DC voltage at the output of the PZ filter, while the high-frequency content is passed across the capacitor to the output amplifier. This filter is put in place to prevent oscillations, and the PZ network defines the decay time of a pulse tail. This decay time ultimately limits the rate of pulses that can be detected before pile-up occurs and the detector gets 'swamped' with too many signals. The custom circuit was designed to operate at traditional frequencies, but a modification was made to reduce the decay time from the typical ~50 µs to about 10 µs. This change allowed the pre-amplifier to process a greater number of events, but it also significantly increased the noise when used with pulse-counters and shorter shaping times. The measured results of the pre-amplifier alteration followed the same trends, in the same parameters, as the simulations. The modification can be considered a mixed success, with the primary takeaway confirming the need to develop a new preamplifier that is specifically designed to reduce noise with short tail pulses.
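The link between tail decay time and pile-up described above can be made concrete with an idealized estimate. The sketch below assumes Poisson-distributed arrivals and purely exponential tails of equal amplitude, and it ignores the AC coupling of the real circuit. Under those assumptions the mean height of the summed tails, in units of one pulse amplitude, is the product of rate and decay time (Campbell's theorem). The function names are introduced here for illustration only.

```python
import math


def mean_tail_pileup(rate_hz, tau_s):
    """Mean summed tail height in units of one pulse amplitude,
    for Poisson arrivals and exponential tails (Campbell's theorem)."""
    return rate_hz * tau_s


def fraction_isolated(rate_hz, tau_s):
    """Poisson probability that no earlier pulse began within one decay time."""
    return math.exp(-rate_hz * tau_s)


for tau_us in (50.0, 10.0):
    for rate_khz in (20.0, 200.0, 1000.0):
        r, tau = rate_khz * 1e3, tau_us * 1e-6
        print(f"tau = {tau_us:4.0f} us, rate = {rate_khz:6.0f} kHz: "
              f"mean pile-up = {mean_tail_pileup(r, tau):6.2f} pulse heights, "
              f"isolated fraction = {fraction_isolated(r, tau):.3g}")
```

With a 50 µs tail, the mean pile-up already reaches about one full pulse height at 20 kHz, which is in line with the 20-50 kHz practical limit quoted earlier for conventional preamplifiers.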
The new FAST pre-amplifier was developed under the constraint of using the current detectors, FET charge amplifier, and resistive feedback network. As such, the circuit architecture is very similar to the previous traditional preamplifier, with some modifications to reduce noise and improve performance. The circuit (Figure 6) uses a fixed current source and cascaded current mirrors to mitigate some of the noise that can enter the first stage of the amplifier through non-idealities in the current source. Additionally, the first operational amplifier has been replaced with a LT4168, which has better frequency and noise performance than the ADA4898-1 used in previous designs. The feedback network around the first op-amp was changed from an integrator architecture (with a 14 MΩ bleed resistor) to a more complicated hybrid network with a 42 pF capacitor in series with a 1 kΩ resistor to reduce thermal noise voltage from the large capacitors. The pole-zero network was designed to allow full isolation, with adjustable potentiometers on each side so that current biasing can be tuned and does not diminish performance. Lastly, a third amplifier was added as a buffer stage so that the magnitude of the output can be adjusted to match the dynamic range of the analog-to-digital converter in the digitizer or multi-channel analyzer used to read out the signals.
Figure 6: The SPICE schematic of the first generation FAST preamplifier.
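As a rough check on the time scales of the two feedback networks described above, the sketch below treats each as an isolated first-order RC network. That is an assumption made here for illustration: in the actual circuits the response is set by the closed loop around the op-amp, so these numbers are only order-of-magnitude guides. The function names are hypothetical.

```python
import math


def rc_time_constant_s(r_ohm, c_farad):
    """Time constant of a first-order RC network."""
    return r_ohm * c_farad


def corner_frequency_hz(r_ohm, c_farad):
    """Corner (-3 dB) frequency of a first-order RC network."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_farad)


# Traditional integrator feedback: 14 MOhm with a 5 to 30 pF capacitor.
for c_pf in (5.0, 30.0):
    tau = rc_time_constant_s(14e6, c_pf * 1e-12)
    print(f"14 MOhm, {c_pf:4.0f} pF: tau = {tau * 1e6:.0f} us")

# FAST hybrid feedback: 42 pF in series with 1 kOhm.
tau_fast = rc_time_constant_s(1e3, 42e-12)
f_fast = corner_frequency_hz(1e3, 42e-12)
print(f"1 kOhm, 42 pF: tau = {tau_fast * 1e9:.0f} ns, corner = {f_fast / 1e6:.2f} MHz")
```

The traditional network sits in the tens to hundreds of microseconds, while the series RC in the FAST design sits in the tens of nanoseconds, consistent with the move toward much shorter pulses.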
The first performance metric that was simulated was the transient response of the output for an event.
One priority in the design of the FAST pre-amplifier was to dramatically reduce the decay time of the pulses. In order to improve from the typical detection rates that are pileup limited at 20-50 kHz to detection rates that exceed 1 MHz, the decay time needs to be reduced by two orders of magnitude. The FAST pre-amplifier has a typical decay time of 800 ns, compared to the 80 µs that the previous preamplifier demonstrated (Figure 7). The signal appears to have some oscillation on the leading edge, but the same high-frequency oscillation is exhibited on the significantly slower signals of other amplifiers as well. It is not perceived as prominent there because it occurs over a 50 ns time scale that is often ignored by traditional detection tools. This transient is likely to be reduced, or may not appear at all, for real signals, whose finite leading-edge rise time will roll off the high-frequency component at the input. Additionally, the FAST response magnitude for the same input signal is very close to the magnitude of the traditional pre-amplifier. This was done intentionally for ease of integration of the pre-amplifier with the additional detector components.
The second important design metric is the noise that the circuit contributes to the final signal. With these types of detectors and pre-amplifiers, the two most influential noise factors are the input-referred current noise and the output-referred voltage noise. The frequency response of the circuit determines the rise time and the decay time of the amplifier. As a result of moving toward faster response times, the frequencies that contribute to noise are increased. To compensate for this, a circuit needs to be designed with lower total noise contributions at these higher frequencies. The mature traditional design and the FAST pre-amp were both simulated in LTspice to determine higher-frequency noise contributions (Figure 8).
The most important noise contributor is input referred current noise. It is apparent that this noise contribution from a traditional amplifier dramatically increases at frequencies greater than 1 MHz. This was empirically noted when one such pre-amp was modified to run with faster decay times. In contrast, the FAST preamplifier is specifically designed to minimize noise at higher frequencies. The performance is no better than the traditional amplifier at frequencies below 2 MHz, but simulations suggest it will exhibit almost two orders of magnitude less noise at frequencies near 10 MHz. This noise improvement should allow for the necessary signal clarity for high speed operations.
The FAST pre-amp design represents an intermediary step in achieving very high speed detector electronics. The design, fabrication, and testing cycle currently underway are serving two purposes. The first purpose is to develop a faster detector using the current feedback and FET amplifiers to determine challenges associated with aspects of high speed detection that are unrelated to the preamplifier. The second purpose is to refine the simulation and development tools to more accurately and more quickly develop complicated cutting edge circuits for nuclear detectors.
The path forward starts with two or three development cycles of the FAST preamplifier. Each cycle includes design, simulation, fabrication, and testing. In each subsequent cycle, the design and simulation tools will be refined. Once development of a FAST preamplifier with acceptable performance is completed, several actions will proceed in parallel to dramatically improve higher-speed performance. The tools developed during the FAST design stages will be used to design new 'cold side' electronics that include the feedback network, the FET amplifier, and the arrangement for multiple channels. One expected change is the replacement of the resistive feedback network with a transistor. This design will be done concurrently with a new pre-amplifier designed to work with the electronics and detector changes made on the cold side. A third path will be undertaken in which the signal processing for the FAST detector will be improved and initially implemented in a computer. The improvements from this signal processing will then be moved from a post-processing environment toward real-time processing of high-speed multichannel detector data by incorporating them into a field programmable gate array (FPGA) on the digitizer hardware.
Digital Electronics
The Signatec data recorder system selected will utilize the PX1440 high-speed digitizer board for signal collection. The model selected contains two high-end Xilinx Virtex-5 FPGAs; one is used for system control and data handling, while the other is fully programmable by the user and targeted for more advanced digital signal processing. When used for signal processing, the data stream will pass through both FPGAs before going out on the PCIe bus to the data storage. A firmware development kit is provided by Signatec.
In the initial configuration, the signal processing FPGA (Xilinx Virtex-5 SX50T) will support multiple FIFOs transferring high-speed serial data between the data handling FPGA and 512 MB of dedicated DDR2 SDRAM. The FPGA also contains 288 DSP "slices", each with its own multiplier and 48-bit math support. These slices are capable of performing complex mathematical operations on streaming data, including logic operations, high-performance filters, and cascaded operations.
Initially, this functionality is expected to be used to filter the data before it is recorded, possibly reducing the data size through either decimation (bit reduction) or simple event rejection. However, a significant strength of FPGAs is their capability to perform multiple calculations simultaneously. For example, two different filters could be applied to incoming data and both results stored to disk. Alternatively, if each filter were optimized for a different type of detection event, the FPGA could compare the outputs of the two filters and select the more applicable one to store, flagging the data accordingly before writing. Another possible FPGA application would be keeping a continuous pipeline of calculations in operation as the data streams in, dumping the data only when a trigger event is received.
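The last idea, a continuously running pipeline that writes data out only on a trigger, can be illustrated in software with a pre-trigger ring buffer. The sketch below is one possible interpretation, not the firmware design: the function name `capture_on_trigger`, the simple threshold trigger, and the fixed pre- and post-trigger lengths are all assumptions made for illustration.

```python
from collections import deque


def capture_on_trigger(samples, pre, post, threshold):
    """Stream through `samples`, keeping only a short history. When a sample
    exceeds `threshold`, emit the last `pre` samples, the trigger sample,
    and the next `post` samples as one event."""
    history = deque(maxlen=pre)   # ring buffer of the most recent samples
    events = []
    current = None                # event being filled, if any
    remaining = 0                 # post-trigger samples still to collect
    for x in samples:
        if current is None and x > threshold:
            current = list(history) + [x]
            remaining = post
        elif current is not None:
            current.append(x)
            remaining -= 1
        if current is not None and remaining == 0:
            events.append(current)
            current = None
        history.append(x)
    return events


stream = [0, 0, 1, 0, 9, 8, 7, 0, 0, 0]
print(capture_on_trigger(stream, pre=2, post=3, threshold=5))
```

In this sketch, samples that exceed the threshold while an event is still being filled do not start a new event. A real high-rate system would need a deliberate policy for such overlapping triggers.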
Analysis Software
Analysis software is written in C++, with utilization of free open-source software whenever available and appropriate. The ROOT package [3] is freely available and is the standard package for data handling, analysis and visualization in the fields of nuclear and particle physics. We have chosen it because of our own long and positive experience with it as well as its natural integration with GEANT4 [4, 5] . GEANT4 is a standard package for simulation of radiation transport problems which we will use in the future to model the UHRGe apparatus.
The current analysis is focused on using maximum likelihood fits of known pulse shapes to those in the data. The traditional method using so-called trapezoidal filtering [6, 7] cannot be used to determine a pulse height spectrum given our targeted event rates and the speed of available digital and analog electronics. However, trapezoidal filtering can be used as a trigger to give a very good estimate of the start times of detector pulses as shown for simulated pulses in Figure 9 . Our simulated pulses have the form
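One pulse shape consistent with the description that follows is sketched here. It is an assumption rather than the report's own expression: $Q$ is the pulse amplitude, $t_0$ the start time, $\tau$ the decay time constant, and $\tau_r$ is a notation introduced here for the rise time constant. The signal is zero for $t < t_0$, and the approximation on the right applies once $t - t_0 \gg \tau_r$.

```latex
V(t) = Q \left( 1 - e^{-(t - t_0)/\tau_r} \right) e^{-(t - t_0)/\tau}
\;\approx\; Q \, e^{-(t - t_0)/\tau},
\qquad t \ge t_0
```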
This represents a pulse with a fast exponential rise, characterized by a rise time constant, followed by a much slower exponential decay with time constant τ. As long as the rise time constant is much smaller than τ, the latter approximation holds for times more than a few rise time constants from a rising edge. The pulses arrive randomly at a mean rate r and the preamp output is sampled continuously at a rate s. A fast trapezoidal filter with rise time L and flat top G is run recursively. The raw preamp output (black) is the input for the first iteration, producing the (green) trapezoidal outputs. This first filter output is then used as input to the same filter, producing the (red) bipolar pulses. Zero crossings of this second filter output flag the beginning of the steep rising edges of the preamp output pulses, delayed by two filter lengths. Piled-up pulses are resolved as long as they are separated by two filter lengths.
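The two-pass filter and zero-crossing trigger described above can be sketched as follows. This is an illustration under stated assumptions, not the project's code: the trapezoid is implemented as the difference of two moving averages of length L separated by a gap G, the pulse shape is an assumed product of a fast exponential rise and a slow exponential decay, the trace is noise free, and the arming level used to reject baseline crossings is a hypothetical parameter. Time is measured in samples, so 100 MHz sampling with a 10 µs decay, a 50 ns rise, and L = 0.5 µs becomes 1000, 5, and 50 samples.

```python
import math


def trapezoid(x, L, G=0):
    """Mean of the newest L samples minus the mean of the L samples that
    end L + G samples earlier. Samples before the record count as zero."""
    csum = [0.0]
    for v in x:
        csum.append(csum[-1] + v)

    def window(lo, hi):
        # Sum of x[lo:hi], with indices clipped at the start of the record.
        return csum[max(hi, 0)] - csum[max(lo, 0)]

    return [(window(n - L + 1, n + 1) - window(n - 2 * L - G + 1, n - L - G + 1)) / L
            for n in range(len(x))]


def find_pulse_starts(x, L, G=0, arm_level=0.2):
    """Apply the filter twice, then report each downward zero crossing of the
    second output, shifted back by the filter delay (2L when G = 0)."""
    second = trapezoid(trapezoid(x, L, G), L, G)
    starts, armed = [], False
    for n, z in enumerate(second):
        if z > arm_level:            # positive lobe seen: arm the trigger
            armed = True
        elif armed and z <= 0.0:     # first non-positive sample after arming
            starts.append(n - (2 * L + G))
            armed = False
    return starts


def simulate(n_samples, starts, q=1.0, tau=1000.0, tau_r=5.0):
    """Noise-free sum of pulses with the assumed rise-and-decay shape."""
    x = [0.0] * n_samples
    for n0 in starts:
        for n in range(n0, n_samples):
            m = n - n0
            x[n] += q * (1.0 - math.exp(-m / tau_r)) * math.exp(-m / tau)
    return x


true_starts = [400, 1600, 2900]
trace = simulate(4000, true_starts)
found = find_pulse_starts(trace, L=50)
print("true starts: ", true_starts)
print("found starts:", found)
```

The recovered start times are expected to lag the true ones by a few samples, mainly because of the finite rise time. With noisy data the arming level would have to be chosen relative to the noise, and closely spaced pulses would need separate study.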
While the filtering described above is currently done in software, available hardware could do it in real time even at event rates ~1 MHz. We therefore assume the input to our software analysis tools (whether those tools run offline or are embedded as real-time analysis in the FPGA) will be a continuously sampled detector output tagged with the start times of individual pulses. A fitting algorithm using built-in ROOT capabilities has been developed (a similar algorithm has also been implemented in MATLAB). The voltage signal in the region of the Nth pulse in a continuous sample has the form
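A form consistent with the two terms and five parameters described below is sketched here as an assumption. The fast rise is neglected, $\Theta$ denotes the unit step function, and the expression is meant to apply from $t_{\mathrm{past}}$ until the start of the next pulse.

```latex
V(t) = Q_N \, e^{-(t - t_N)/\tau} \, \Theta(t - t_N)
     + V(t_{\mathrm{past}}) \, e^{-(t - t_{\mathrm{past}})/\tau}
```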
Figure 9: The digitized preamp output (black), the result of the filter applied to the preamp output (green), and the result of the same filter applied to the first filter output (red). The lower plot is a magnification around the rising edge of the first pulse in the upper plot. The trapezoid has rise time L=0.5 µs and flat top G=0. The start time of the pulse, marked by a dashed red line, is 2L ahead of the zero crossing in the second filter output. This data is simulated for s=100 MHz, r=10 kHz, τ=10 µs and a rise time constant of 50 ns, with 1 mV r.m.s. white noise.
The first term is just the Nth pulse, while the second term represents the persistent tails of all past pulses. There is one variable (t) and five parameters: the start time of the Nth pulse t_N, the amplitude of the Nth pulse Q_N, some past time t_past, the value of the voltage waveform V(t_past) at that time, and the characteristic decay time τ. Real-time digital electronic filters can be used to determine t_N up to pile-up limitations, as described above. The decay constant τ is easily measured in offline calibrations and will not change during a measurement. The past time t_past is arbitrary so long as t_(N−1) < t_past < t_N and it is not so close to a rising edge as to violate the conditions for neglecting the fast rise. Once a choice of t_past is made, V(t_past) is determined simply by sampling the data. So out of five possible free fit parameters, only one (Q_N) really needs to be free. However, one must consider systematic and statistical uncertainties associated with the way these other parameters are determined. Software tools have been developed to study these issues. The computational complexity of a maximum likelihood fit is (at least) exponential in the number of free parameters, so it is highly desirable to minimize them, especially for a system like UHRGe that runs at high rate and needs results in near real time. The result of running the algorithm on a 10 ms sample of simulated pulses with Q = 1 V at r = 10 kHz and other conditions as in the caption of Figure 9 is shown in Figure 10. Studies of the performance of this algorithm under various conditions (event rate, sample rate, r.m.s. noise, etc.) are ongoing.
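With only Q_N left free, the fit becomes linear in its single parameter. The sketch below is a simplified stand-in for the ROOT-based fit, not the project's implementation. It assumes Gaussian white noise, for which the maximum likelihood estimate coincides with least squares and has a closed form, an instantaneous rise, and a known decay constant. The function names and the choice of fit window are hypothetical, and time is measured in samples.

```python
import math


def fit_amplitude(v, n_start, n_past, tau, n_stop):
    """Closed-form least-squares estimate of Q_N from samples
    n_start .. n_stop - 1, after subtracting the tail of earlier pulses
    extrapolated from the sample at n_past."""
    v_past = v[n_past]
    num = den = 0.0
    for n in range(n_start, n_stop):
        tail = v_past * math.exp(-(n - n_past) / tau)   # past pulses
        shape = math.exp(-(n - n_start) / tau)          # unit-amplitude pulse
        num += (v[n] - tail) * shape
        den += shape * shape
    return num / den


def simulate(n_samples, pulses, tau):
    """Noise-free sum of instantaneous-rise exponential pulses."""
    v = [0.0] * n_samples
    for n0, q in pulses:
        for n in range(n0, n_samples):
            v[n] += q * math.exp(-(n - n0) / tau)
    return v


tau = 1000.0                                 # decay constant in samples
trace = simulate(1200, [(200, 1.0), (700, 0.6)], tau)

q_first = fit_amplitude(trace, n_start=200, n_past=100, tau=tau, n_stop=700)
q_second = fit_amplitude(trace, n_start=700, n_past=600, tau=tau, n_stop=1000)
print(f"first pulse:  Q = {q_first:.6f}")
print(f"second pulse: Q = {q_second:.6f}")
```

The second pulse sits on the tail of the first, and the tail term removes that contribution before the amplitude is estimated. With real data, the fit window would start a few rise time constants after the edge, and the uncertainties on t_N, V(t_past), and τ would propagate into Q_N.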
Conclusions
The UHRGe project is developing an HPGe detector that can operate at >1 MHz throughput. The initial phase of the development effort has focused on preamplifier designs that will provide short tail pulses to the digitizer. A very high performance digitizer has been ordered that can stream data to disk at a 400 MHz rate. A simulation framework has been constructed to generate simulated pulse trains as input for developing digital filter techniques. Development has begun, in ROOT, on software tools to identify each pulse start time and then extract the energy by fitting the surrounding digitized data from the prior pulse to the subsequent pulse. Finally, a multi-anode HPGe detector 
