This paper presents the design, implementation, and performance analysis of an open source readout system for arrays of microwave kinetic inductance detectors (MKIDs) for mm/submm astronomy. The readout system performs frequency domain multiplexed, real-time complex microwave transmission measurements in order to monitor the instantaneous resonance frequency and dissipation of superconducting microresonators. Each readout unit can cover up to 550 MHz of bandwidth and read out 256 complex frequency channels simultaneously. The digital electronics include the customized DAC, ADC, and IF system and the FPGA based signal processing hardware developed by the CASPER group.
INTRODUCTION
Since MKID technology was first introduced at Caltech/JPL, it has developed rapidly owing to its numerous advantages and potential applications. [8] [9] [10] [11] [12] [13] [14] [15] One of the most important advantages of MKIDs is that they allow superconducting microresonators (which serve as detectors) to be multiplexed in the frequency domain at microwave frequencies. Since the transmission far from the resonance frequency is not affected by a resonator, many MKIDs can be multiplexed off a single transmission line by lithographically setting the resonant frequency of each MKID to be slightly different.
The MKID resonators are read out using IQ homodyne mixing, which is essentially a dual-phase lock-in detection technique. A comb of probe frequencies, one for each resonator, is generated and sent through the MKID array, where each probe signal is modified in both amplitude and phase by the change in the surface impedance of the superconductor caused by incident photons. After amplification by a cryogenic high electron mobility transistor (HEMT) amplifier, the comb is sent to room temperature electronics to be digitized and analyzed. Aside from the HEMT and the MKIDs themselves, there are no other cryogenic components, which shifts the complexity and challenge of the readout to the room temperature electronics. Microwave frequency multiplexing and digital readout have been demonstrated by several groups. [15] [16] [17] [18] [19]
In our proposed readout system, we designed and fabricated both digital to analog converter (DAC) and analog to digital converter (ADC) boards that connect easily to our FPGA processing board through Zdok connectors. Each board carries two converter chips (two DACs or two ADCs), and these chips can easily be replaced, allowing us to take advantage of the latest semiconductor technology. In this implementation, we use two 16 bit DACs to play back pre-computed in-phase and quadrature look up tables (LUTs) that contain the carrier comb. An IQ mixer is then used to exploit the full sampling rate and up-convert the baseband comb tones to the resonator frequencies in the IF band. After passing through the device, the signal is first amplified by the HEMT amplifier inside the cryostat. Room temperature electronics then provide further amplification and down-convert the signal to baseband in-phase and quadrature data streams. A baseband amplifier adjusts the power level in front of the ADC so that the ADC dynamic range is fully used. Anti-aliasing low pass filters with cutoff at the Nyquist frequency are used at both the DAC output and the ADC input.
The reprogrammable firmware running on the FPGA channelizes the data, selects the resonator bins, low pass filters the spectrum, decimates to the correct output rate, and then sends the data of interest over 10 Gbit Ethernet at the designed 100 Hz rate to the data acquisition pipeline.
This MKID readout technology can benefit many fields, including mm/submm, IR, optical-UV, X-ray, and dark matter detectors, MSQUID, bioinformatics, and quantum computing. Compared with other readout systems, the open source MKID readout proposed in this paper has several advantages: it has successfully read out a large MKID array; both hardware and software have been demonstrated on sky at the Caltech Submillimeter Observatory (CSO); the system meets all the synchronization, SNR, bandwidth, and resolution requirements for MKID readout; it is easy to reprogram and implement, and it is flexible with respect to different system requirements, e.g. different channelizing methods or sizes, different sampling rates, and different probe signal types; by reprogramming the software, the same readout system can be used for different cameras, applications, and requirements; hardware and software built to the same industry standards can easily be shared; and system upgrades are easy and cost efficient. The whole system, including the hardware and software we have developed, is open source, which allows us to share the development effort with collaborators all over the world.
Using MKID readout as an example, this paper presents the design, implementation, and performance of the full large scale readout system, which includes the FPGA processing board, the customized DAC, ADC, and IF system digital electronics, and the software/firmware developed for the MKID application.
READOUT DESIGN AND DEVELOPMENT

Readout Requirement
The proposed MKID readout system was successfully tested with the MKID camera at CSO in June 2010. The readout electronics have the general task of performing multiple real-time complex microwave transmission measurements in order to monitor the instantaneous resonance frequency and dissipation of the superconducting microresonators that serve as mm/submm photon detectors. The full camera array will have a total of 576 spatial pixels, and each pixel will simultaneously cover 4 different frequency bands. The resulting 2304 detectors will be divided into 16 tiles, and each MKID readout unit will read out one tile of 144 frequency multiplexed resonators. After channelizing the data, the MKID readout system outputs the complex S21 measurements at 100 Hz.
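The array bookkeeping above can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in the text; the variable names are ours and are purely illustrative.

```python
# Array bookkeeping using the figures quoted in the text.
SPATIAL_PIXELS = 576      # spatial pixels in the full camera
BANDS_PER_PIXEL = 4       # frequency bands covered by each pixel
TILES = 16                # one readout unit per tile

total_detectors = SPATIAL_PIXELS * BANDS_PER_PIXEL
resonators_per_tile, leftover = divmod(total_detectors, TILES)

print("total detectors:", total_detectors)
print("resonators per tile:", resonators_per_tile)
print("detectors left over:", leftover)
```

The division is exact, so sixteen identical readout units cover the full array with no partially filled tile.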
The readout electronics are designed so that they do not add noise to the system: the noise is dominated by the cryogenic HEMT amplifier, which has a noise temperature of around 2 to 5 K. After the HEMT, the ADC chip is the next limiting factor for the noise performance of the readout.
The sampling rate is chosen to match the bandwidth occupied by the resonators, based on their physical frequency spacing. The sampling rate of the proposed readout system is flexible; at present it goes up to 550 MHz, which is the limit of the ADC chip. We are developing a new ADC board using the latest high SNR (12-16 bit), high sampling rate (Gsps) ADCs.
Hardware and IF System
We use the open source Reconfigurable Open Architecture Computing Hardware (ROACH) [20] from the CASPER group as the FPGA processing board, and we developed our own DAC and ADC boards and software. To synchronize the system, we added a synchronization port on the ADC board to lock the FPGA to GPS. Both the 16 bit DAC and the 12 bit ADC have been shown to meet their datasheet specifications, e.g. SNR, SFDR, and IMD. The DAC works up to 1 Gsps with an SNR of 75 dBFS; the ADC works up to 550 Msps with an SNR of 60 dBFS. As new ADC chips become available (e.g. e2v has announced a 12 bit, 3 Gsps ADC chip), we will continue to develop new ADC boards.
The centerpiece of the ROACH board is a Xilinx Virtex 5 FPGA, and an independent PowerPC running Linux is used to control the FPGA board. In addition to the memory on the FPGA, one DDR DRAM and two QDR SRAMs provide extra memory capacity. Two Zdok connectors allow DAC, ADC, or other interface boards to attach to the FPGA. Four CX4 connectors provide up to 40 Gbit per second of data rate to download data to a computer or to connect multiple ROACH boards together. [20]
Commercial IQ mixers are used to convert the baseband signal to the resonator frequencies. The frequency comb is carefully designed to avoid the intermodulation products caused by the mixers.
The IF system configuration is shown in figure 1. Each component in the IF system is selected and configured carefully so that: all the amplifiers and mixers work in their optimal range; the noise level reaching the ADC is dominated by the HEMT noise (the other components in the system, e.g. the amplifiers and the ADC, contribute negligibly); the two synthesizers, the FPGA, and the DAC/ADC are all locked to the same frequency standard to avoid frequency drift; the DAC and ADC dynamic ranges are fully used; and the probe signal power level and frequency reaching the MKID device are optimized for each individual resonator across the whole readout bandwidth.
After the DAC board, the signal passes through an LPF, an IQ up-converter, and a digital attenuator, and then goes into the dewar. The DAC at full range gives an output power of 2 dBm. As a rough estimate, with 126 carriers each carrier has an average power of -19 dBm. For a more accurate figure, we use the FPGA network analyzer mode (another firmware we developed, discussed in section 2.3.5) to record the roll-off pattern of all 131K frequency bins. By adjusting the DAC LUT and the digital attenuator, we can make sure the power level and frequency at each resonator are as expected.
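The per-carrier figure quoted above follows from splitting the DAC full-range power equally among the tones. A minimal sketch of that arithmetic, assuming an equal split and ignoring the roll-off corrections described in the text (the function name is illustrative):

```python
import math

def per_carrier_dbm(total_dbm: float, n_carriers: int) -> float:
    """Average power per tone when the total power is shared equally."""
    return total_dbm - 10.0 * math.log10(n_carriers)

carrier_power = per_carrier_dbm(total_dbm=2.0, n_carriers=126)
print(f"{carrier_power:.1f} dBm per carrier")
```

Sharing 2 dBm among 126 tones costs about 21 dB per tone, which gives the quoted value of roughly -19 dBm.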
As a general guide, the HEMT and LNA have a gain of 34-35 dB, and the baseband amplifier has a gain of 20 dB. Taking into account all the attenuation, IQ conversion loss, LPF loss, and other losses through the readout, we can calculate the signal and noise levels through the system. To determine the signal level reaching the ADC more accurately, we look at the digitized ADC time stream data, first to make sure the ADC does not overflow, and then to use as much of the ADC dynamic range as possible.
Based on a HEMT noise temperature of 2-5 K and a total gain of 61 dB from the HEMT to the ADC input, the HEMT noise temperature referred to the ADC input is around 2.5e6 K, while the ADC full scale is 4.1 mW. The HEMT noise at the ADC therefore corresponds to an SNR of around 55-59 dB, whereas the ADC has an SNR of 64 dB. The whole readout system is thus dominated by the HEMT noise, with a margin of 5-9 dB, and this is confirmed by measured results. We have started to design and fabricate a customized IF board that integrates most of the IF system components (e.g. IQ modulator, digital attenuator, amplifier, VCO synthesizer, and filter) into a single compact board that connects directly to our ADC, DAC, and ROACH boards.
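The noise budget above can be sketched numerically. The code below refers the HEMT noise temperature to the ADC input using the quoted 61 dB of gain and computes the margin between the quoted SNR figures. The noise bandwidth is not stated in the text, so the value used in `snr_db` is an assumption chosen only for illustration, and the resulting SNR should not be read as a reproduction of the quoted 55-59 dB.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K

def referred_noise_temperature(t_hemt_k: float, gain_db: float) -> float:
    """Noise temperature at the ADC input after `gain_db` of net gain."""
    return t_hemt_k * 10 ** (gain_db / 10)

def snr_db(full_scale_w: float, noise_temp_k: float, bandwidth_hz: float) -> float:
    """Ratio of ADC full-scale power to thermal noise power k*T*B, in dB."""
    noise_w = BOLTZMANN * noise_temp_k * bandwidth_hz
    return 10 * math.log10(full_scale_w / noise_w)

# Values quoted in the text.
t_at_adc = referred_noise_temperature(t_hemt_k=2.0, gain_db=61.0)
print(f"HEMT noise referred to ADC input: {t_at_adc:.2e} K")

ADC_SNR_DB = 64.0
margins = [ADC_SNR_DB - hemt_snr for hemt_snr in (59.0, 55.0)]
print("margin range (dB):", margins)

# ASSUMPTION: noise bandwidth is not given in the text; 170 MHz is a placeholder.
ASSUMED_BANDWIDTH_HZ = 170e6
example_snr = snr_db(4.1e-3, t_at_adc, ASSUMED_BANDWIDTH_HZ)
print(f"example SNR for the assumed bandwidth: {example_snr:.1f} dB")
```

A 2 K input noise temperature scaled by 61 dB gives about 2.5e6 K, consistent with the figure in the text, and the 5-9 dB margin is simply the difference between the ADC SNR and the HEMT-limited SNR.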
FPGA Core
We have designed and implemented FPGA firmware with a 131K point channelizer at 2600 Hz resolution, together with the corresponding FIR filter, band selection, and timestamp functions. The firmware running on the FPGA can be divided into the following parts:
Comb Lookup Table Generation
The look up table for the DAC is stored directly in FPGA memory, which gives faster and more stable access than the other memories on the ROACH; it can also be quickly re-uploaded through the PowerPC on the ROACH even while the FPGA channelizer is running. The LUT length is equal to, or an integer multiple of, the channelizer size so that the phase of each bin is consistent. All the resonator tones are summed together for playback from the LUT buffer. An anti-clipping program maximizes the power in each resonator tone by transferring the clipping distortion (due to summing) into the off-resonator bins (power is sent out in both on-carrier and off-carrier bins). To obtain the optimal power level at each resonator frequency, the effects of the LPF at the DAC output, the IQ mixer, the DAC output transformer, impedance mismatch, and the intrinsic sinc roll-off of the DAC are all taken into account when the buffer is generated, so that both power and frequency are optimized across the whole readout bandwidth when the signal reaches the MKID device.
In the current implementation, we use a DAC frequency step size of 2.6 kHz, the same as the FPGA channelizer bin width (there is still plenty of spare memory on the FPGA, which would allow a DAC step size even below 50 Hz). The buffer size, the frequency resolution, the sampling rates of the ADC and DAC chips, and the number of tones played back from the buffer are all programmable in the firmware and can easily be modified to suit different requirements.
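To illustrate how a comb LUT of this kind can be built, the sketch below sums complex tones that sit exactly on channelizer bins and scales the result to a 16 bit range. It is a minimal illustration, not the firmware generator: the anti-clipping step and the roll-off corrections described above are omitted, the tone phases are arbitrary, the demonstration uses a short 256 point buffer, and the 340 MHz complex sampling rate is an assumption taken from the bandwidth quoted for the CSO run.

```python
import cmath
import math

def comb_lut(n_points, tone_bins, phases, full_scale=32767):
    """Return integer (I, Q) tables holding a sum of tones on exact bins.

    Because every tone completes a whole number of cycles in n_points
    samples, the buffer can be looped without a phase jump.
    """
    samples = []
    for n in range(n_points):
        s = sum(cmath.exp(1j * (2 * math.pi * k * n / n_points + ph))
                for k, ph in zip(tone_bins, phases))
        samples.append(s)
    # Scale so the largest I or Q excursion just reaches full scale.
    peak = max(max(abs(s.real), abs(s.imag)) for s in samples)
    scale = full_scale / peak
    i_lut = [round(s.real * scale) for s in samples]
    q_lut = [round(s.imag * scale) for s in samples]
    return i_lut, q_lut

# Frequency step for the full-size design (sampling rate is an assumption).
ASSUMED_SAMPLE_RATE_HZ = 340e6
FULL_LUT_POINTS = 2048 * 64
step_hz = ASSUMED_SAMPLE_RATE_HZ / FULL_LUT_POINTS
print(f"frequency step: {step_hz:.0f} Hz")

# Small demonstration buffer; negative bins lie below the LO frequency.
demo_bins = [5, 17, 40, -23]
demo_phases = [0.0, 1.3, 2.1, 4.0]   # arbitrary illustrative phases
i_lut, q_lut = comb_lut(256, demo_bins, demo_phases)
```

With these assumed numbers the step comes out close to the 2.6 kHz quoted in the text. In practice the tone phases would be chosen to keep the peak of the summed waveform low so that each tone can carry more power.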
ADC digitization
A clock rate above 500 MHz is too fast for the FPGA, even a state-of-the-art one, when it carries a large design. To process the data from the ADC and DAC in real time, we first deserialize the data on the FPGA, which reduces the FPGA clock rate by a factor of 2 in the current design at the cost of twice the number of logic cells.
As the bandwidth requirement and the ADC sampling rate increase, we can adjust the serializer/deserializer factor to match the higher rate.
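The trade described above, a lower clock rate in exchange for parallel data paths, can be pictured with a software model. The sketch below is only a conceptual model of demultiplexing a sample stream into parallel lanes; the function names are illustrative and it does not represent the firmware itself.

```python
def deserialize(samples, factor=2):
    """Split one fast sample stream into `factor` parallel lanes.

    Each lane then advances at 1/factor of the original sample rate.
    """
    usable = len(samples) - len(samples) % factor
    return [samples[lane:usable:factor] for lane in range(factor)]

def serialize(lanes):
    """Inverse operation: interleave the lanes back into one stream."""
    return [s for group in zip(*lanes) for s in group]

stream = list(range(12))
lanes = deserialize(stream, factor=2)
print(lanes)
print(serialize(lanes) == stream)
```

With a factor of 2, the logic processes two samples per clock cycle, so the clock runs at half the sample rate while the processing logic is duplicated.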
Channelizing design
Different channelizing implementations on the FPGA, e.g. digital down converter (DDC), direct 131K FFT, FFT zoom, polyphase filter bank, and co-addition mode, were considered and compared; we eventually implemented a combination of a 2048 point polyphase filter bank and a 64 point FFT. [7]
A direct 131K FFT would be too large to fit alongside the other parts of the design and offers little flexibility to expand to larger sizes for more demanding future requirements. A DDC would not be able to process enough channels as the number of resonators increases. A simple FFT zoom (two cascaded FFTs) would have significant power leakage in the first FFT stage, especially when the frequency is not centered on a bin of the first FFT stage. The combination of a polyphase filter bank and an FFT reduces the leakage significantly.
In this design, we use a 4 tap, 2048 point Hamming window complex polyphase filter bank (PFB), followed by a transpose function implemented in the QDR memory of the ROACH, and then a 64 point complex FFT. The 2048 point PFB together with the 64 point FFT gives the equivalent of 131K channels. Compared with a direct 131072 point FFT, the combined PFB and FFT solution saves a large number of logic cells on the FPGA, which is critical for a large design like this one. The design also allows the channelizer to be expanded up to 16 million points on a single ROACH (or even larger by combining ROACH boards through their 10 Gbit connectors), which will be useful for future MKID applications.
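The two-stage structure described above can be illustrated with a small software model. The sketch below is a toy version with 8 coarse channels and a 4 point fine FFT instead of 2048 and 64, written in plain Python with a naive DFT. The Hamming-weighted sinc prototype filter is a common choice that we assume here; the actual firmware coefficients are not given in the text.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (adequate for these tiny sizes)."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * b * k / n) for k in range(n))
            for b in range(n)]

def pfb_window(n_chan, n_taps):
    """Hamming-weighted sinc prototype filter of length n_chan * n_taps."""
    length = n_chan * n_taps
    h = []
    for i in range(length):
        hamming = 0.54 - 0.46 * math.cos(2 * math.pi * i / (length - 1))
        t = (i - (length - 1) / 2) / n_chan
        sinc = 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)
        h.append(hamming * sinc)
    return h

def pfb(x, n_chan, n_taps):
    """Critically sampled polyphase filter bank; returns out[frame][channel]."""
    h = pfb_window(n_chan, n_taps)
    n_frames = len(x) // n_chan - (n_taps - 1)
    out = []
    for m in range(n_frames):
        # Weight n_taps consecutive blocks and fold them onto one block.
        folded = [sum(h[t * n_chan + p] * x[(m + t) * n_chan + p]
                      for t in range(n_taps)) for p in range(n_chan)]
        out.append(dft(folded))
    return out

def channelize(x, n_coarse, n_fine, n_taps=4):
    """PFB, then transpose, then a fine FFT per coarse channel."""
    coarse = pfb(x, n_coarse, n_taps)
    fine = []
    for c in range(n_coarse):
        series = [coarse[m][c] for m in range(n_fine)]  # the transpose step
        fine.append(dft(series))
    return fine  # fine[coarse_channel][fine_bin]

# Toy sizes; the design in the text uses 2048 coarse channels and a 64 point FFT.
N_COARSE, N_FINE, N_TAPS = 8, 4, 4
print("equivalent channels in the full design:", 2048 * 64)

coarse_bin, fine_bin = 3, 1
cycles_per_sample = (coarse_bin + fine_bin / N_FINE) / N_COARSE
n_samples = (N_FINE + N_TAPS - 1) * N_COARSE
tone = [cmath.exp(2j * math.pi * cycles_per_sample * n) for n in range(n_samples)]
spectrum = channelize(tone, N_COARSE, N_FINE, N_TAPS)
peak = max(((c, d) for c in range(N_COARSE) for d in range(N_FINE)),
           key=lambda cd: abs(spectrum[cd[0]][cd[1]]))
print("tone found at (coarse, fine):", peak)
```

The test tone sits a quarter of a coarse channel above the center of coarse channel 3. The prototype filter keeps most of its power in that coarse channel, and the fine FFT then resolves the offset within it.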
After channelizing, only the 126 bins that carry resonator information are selected out of the 131072 bins, based on another LUT implemented on the FPGA. This LUT, which contains the positions of the resonator bins, is generated together with the DAC LUT and is updated automatically whenever the DAC LUT changes. The ability to reprogram the LUT buffers on the FPGA while channelizing is running is important for real observations: the MKID probe frequencies and powers need to be optimized for the sky loading, so the LUT must be updated whenever the telescope points to a different part of the sky.
After resonator bin selection, only the 126 bins that carry resonator information are further processed and stored. Instead of simple co-addition or averaging, each selected resonator bin data stream passes through a 126 channel, 52 tap Hamming window FIR filter and is then decimated to 100 Hz, which gives better noise performance.
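The filter-and-decimate step for a single resonator bin can be sketched as follows. This is a minimal model that uses a normalized 52 tap Hamming window directly as the FIR coefficients, which is one plausible reading of the description above. The decimation factor of 26 is an assumption (roughly the ratio of the 2.6 kHz bin output rate to 100 Hz), not a value stated in the text.

```python
import math

def hamming_taps(n_taps):
    """Hamming window coefficients normalized to unit DC gain."""
    w = [0.54 - 0.46 * math.cos(2 * math.pi * i / (n_taps - 1))
         for i in range(n_taps)]
    total = sum(w)
    return [x / total for x in w]

def fir_decimate(stream, taps, factor):
    """Apply the FIR filter, keeping only every `factor`-th output sample."""
    out = []
    for start in range(0, len(stream) - len(taps) + 1, factor):
        window = stream[start:start + len(taps)]
        out.append(sum(t * s for t, s in zip(taps, window)))
    return out

N_TAPS = 52
ASSUMED_DECIMATION = 26   # assumption: about 2.6 kHz in, about 100 Hz out

taps = hamming_taps(N_TAPS)
# A constant complex S21 value should pass through unchanged.
stream = [complex(1.0, -0.5)] * 260
decimated = fir_decimate(stream, taps, ASSUMED_DECIMATION)
print(len(decimated), decimated[0])
```

Computing only the retained output samples, as done here, is equivalent to filtering at the full rate and then discarding samples, but needs far fewer multiplications, which matters when 126 channels are processed in parallel.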
Synchronization
A 1 pps signal is brought into the FPGA from a GPS locked frequency standard, providing a TTL signal with a rising edge on the second boundary. Both the DAC and the channelizer start on exactly the same second edge, ensuring a consistent phase for all the carrier bins. To synchronize with the absolute time of day, a C program running on the PowerPC transfers the current Unix time on the PPC (which is locked to a Network Time Protocol server) to the FPGA, and a 1 pps locked counter on the FPGA counts integer seconds from that time. Another counter, running at the FPGA clock rate, is reset by every 1 pps pulse and provides the fractional part of the second for each data packet, with a resolution of one FPGA clock period (on the order of 1e-8 s). The internal delay in the FPGA between the signal received at the ADC and the final 100 Hz output is also taken into account. The delay in the IF system is measured with the FPGA network analyzer mode we designed and is likewise included in the calculation.
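The timestamp construction described above can be summarized in a short sketch. The function and parameter names are hypothetical, and the 170 MHz clock rate and 1 microsecond delay in the example are assumed values used only to make the arithmetic concrete.

```python
def packet_timestamp(start_unix_s, pps_count, clock_count, clock_hz, delay_s=0.0):
    """Combine the two counters into (integer seconds, fractional second).

    start_unix_s : Unix time loaded from the PowerPC when counting began
    pps_count    : number of 1 pps edges seen since then
    clock_count  : FPGA clock cycles since the most recent 1 pps edge
    delay_s      : known FPGA plus IF system delay to subtract
    """
    seconds = start_unix_s + pps_count
    fraction = clock_count / clock_hz - delay_s
    # Borrow from the seconds field if the delay crosses a second boundary.
    while fraction < 0.0:
        seconds -= 1
        fraction += 1.0
    return seconds, fraction

ASSUMED_CLOCK_HZ = 170e6   # assumption, not a value given in the text

sec, frac = packet_timestamp(1277000000, 10, 85_000_000, ASSUMED_CLOCK_HZ,
                             delay_s=1e-6)
print(sec, frac)
```

Keeping the integer and fractional parts separate avoids the loss of precision that would occur if a Unix time of order 1e9 s and a clock-cycle fraction were added in one floating point number.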
Each data packet thus contains a timestamp (both seconds and fractional seconds), a header, and 126 complex resonator data values. The data packets are sent over 10 Gbit Ethernet at 100 Hz to the DAQ computer.
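For illustration, a packet with these fields could be serialized as shown below. The byte layout is entirely hypothetical: the field order, the 32 bit field widths, and the single precision I/Q values are assumptions made for this sketch and are not the actual wire format of the readout.

```python
import struct

N_TONES = 126
# HYPOTHETICAL layout: seconds, fractional clock count, header word,
# then interleaved I/Q as 32 bit floats, all in network byte order.
PACKET_FORMAT = f"!III{2 * N_TONES}f"
PACKET_SIZE = struct.calcsize(PACKET_FORMAT)

def pack_packet(seconds, frac_count, header, tones):
    """Serialize one packet; `tones` is a list of 126 complex values."""
    flat = [part for z in tones for part in (z.real, z.imag)]
    return struct.pack(PACKET_FORMAT, seconds, frac_count, header, *flat)

def unpack_packet(payload):
    """Inverse of pack_packet."""
    fields = struct.unpack(PACKET_FORMAT, payload)
    seconds, frac_count, header = fields[:3]
    iq = fields[3:]
    tones = [complex(iq[i], iq[i + 1]) for i in range(0, len(iq), 2)]
    return seconds, frac_count, header, tones

example_tones = [complex(0.5 * k, -0.25 * k) for k in range(N_TONES)]
payload = pack_packet(1277000010, 85_000_000, 0xABCD, example_tones)
print(PACKET_SIZE, "bytes per packet,", PACKET_SIZE * 100, "bytes per second")
```

Under this assumed layout, the 100 Hz output stream occupies only a small fraction of a 10 Gbit link.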
As shown in figure 1, the FPGA clock is taken from the ADC board and shared with the ADC and DAC, which ensures that the synthesizers, ADC, DAC, and FPGA are all synchronized.
Network Analyzer Mode of FPGA
Besides the normal channelizing mode used for observation, we also designed FPGA firmware that makes the whole readout system work as a network analyzer: it sends out a chirp signal or white noise, then co-adds the ADC digitized time domain data and stores the result on the computer. Network analyzer mode allows us to make resonator sweeps quickly; by comparing the phase of each frequency bin, we can calculate the cable delay of the readout setup; we can also use it to check the current ADC dynamic range and to perform system checks. More importantly, all of these different designs can be implemented simply by reprogramming the FPGA firmware, without any hardware change.
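The cable delay estimate mentioned above relies on the fact that a pure delay produces a phase that falls linearly with frequency, with slope equal to -2π times the delay. A minimal sketch of one way to extract the delay, using phase unwrapping and a least-squares fit, is shown below; the frequency grid and the 50 ns delay are synthetic values chosen for the example.

```python
import math

def cable_delay(freqs_hz, phases_rad):
    """Estimate a pure delay from measured phase versus frequency.

    Assumes the true phase step between adjacent frequencies is below pi.
    """
    # Unwrap: remove the 2*pi jumps between neighbouring points.
    unwrapped = [phases_rad[0]]
    for p in phases_rad[1:]:
        step = p - unwrapped[-1]
        step -= 2 * math.pi * round(step / (2 * math.pi))
        unwrapped.append(unwrapped[-1] + step)
    # Least-squares slope of phase against frequency.
    n = len(freqs_hz)
    f_mean = sum(freqs_hz) / n
    p_mean = sum(unwrapped) / n
    num = sum((f - f_mean) * (p - p_mean) for f, p in zip(freqs_hz, unwrapped))
    den = sum((f - f_mean) ** 2 for f in freqs_hz)
    return -(num / den) / (2 * math.pi)

# Synthetic check: wrapped phases produced by a 50 ns delay.
true_delay = 50e-9
freqs = [k * 1e6 for k in range(100)]
phases = [math.atan2(math.sin(-2 * math.pi * f * true_delay),
                     math.cos(-2 * math.pi * f * true_delay)) for f in freqs]
estimate = cable_delay(freqs, phases)
print(f"estimated delay: {estimate * 1e9:.3f} ns")
```

On real sweeps the resonances add their own phase structure, so the fit would normally be restricted to off-resonance bins.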
Discussion
Noise sources in the whole system include the readout electronics, the HEMT, sky noise, and the MKID device itself.
We also studied the noise performance of the readout electronics, especially in the low frequency range (below 10 Hz): in the 100 Hz audio stream output of the readout system, the noise in both the amplitude and phase directions rises at frequencies below 1 Hz. Some components in the readout electronics, e.g. the voltage regulators and the ADC/DAC chips themselves, are known to exhibit such 1/f noise behavior, and any clock jitter or aperture jitter in the synthesizer, ADC, or DAC also appears as low frequency phase noise.
We use the frequency standard locked, low phase noise option of the synthesizer, and the same synthesizer serves both the output and the input signal of the readout system, so the noise contributed by synthesizer phase noise is largely canceled when the signal loops back from the DAC to the ADC.
We are currently developing a second generation of the DAC and ADC boards that uses a low noise common voltage regulator design and a stable external reference for the DAC and ADC chips, with the aim of eliminating the low frequency noise.
For the current setup, we see a very high correlation between all 126 tones in both the amplitude and phase directions (average correlation coefficient greater than 0.93), as shown in figure 5. We can therefore recover a signal that suffers from the 1/f noise of the readout electronics by comparing multiple carrier tones that were sent out and processed at the same time and so experienced the same 1/f noise. An example is shown in figure 4. The black dotted line indicates the measured noise floor of the readout system, which agrees very well with the theoretical calculation of -195 dBFS. The red curve shows the signal noise floor after 1/f noise removal; it also matches the -195 dB noise floor very well.
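One simple way to exploit this correlation is to subtract from each tone the common-mode fluctuation estimated from all the other tones. The text does not specify the exact removal algorithm, so the sketch below is only one possible approach, shown on synthetic data with a slow drift shared by four tones.

```python
def remove_common_mode(tones):
    """Subtract from each tone the fluctuation shared by the other tones.

    tones[k][t] is the time stream of tone k. For each tone, the template
    is the per-sample average of all OTHER tones with its time average
    removed, so a signal on the tone itself is not subtracted.
    """
    n_tones = len(tones)
    n_samples = len(tones[0])
    totals = [sum(tones[k][t] for k in range(n_tones)) for t in range(n_samples)]
    cleaned = []
    for k in range(n_tones):
        others = [(totals[t] - tones[k][t]) / (n_tones - 1)
                  for t in range(n_samples)]
        baseline = sum(others) / n_samples
        cleaned.append([tones[k][t] - (others[t] - baseline)
                        for t in range(n_samples)])
    return cleaned

# Synthetic data: a slow drift common to all tones, plus a signal on tone 0.
drift = [0.01 * t for t in range(10)]
raw = [[drift[t] + 0.001 * k for t in range(10)] for k in range(4)]
raw[0][5] += 1.0   # astronomical signal present on tone 0 only
cleaned = remove_common_mode(raw)
print([round(v, 6) for v in cleaned[0]])
```

In this model the drift is removed from tone 0 while its signal is preserved. A fraction of that signal leaks with opposite sign into the other tones, which becomes negligible as the number of reference tones grows.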
Summary
We have successfully run the full readout system test at CSO, with 126 complex carriers analyzed simultaneously. We covered 340 MHz of bandwidth in the IF band centered around 3 GHz (the bandwidth and LO frequency for this run were chosen to match the current device), with a DAC step size and channelizer bin width of 2600 Hz. The whole system is synchronized and all the output data are time stamped, and we achieved a noise level dominated by the HEMT amplifier.
The sampling rate, frequency step size, and channelizer bin width were designed specifically for this implementation. The sampling rate, channelizer size and type, and other signal processing algorithms can easily be expanded or changed to suit different requirements.
A summary of the results is shown in table 2.
DISCUSSION AND FUTURE PLAN
We successfully demonstrated the readout electronics at CSO for 2 weeks, and almost all the requirements for the readout system were met and tested.
We are currently developing a 2nd generation of the ADC and DAC boards with better ADC and DAC chips and an improved design for low frequency electronics noise. We are also working on the IF board. All the new hardware should have prototypes ready for testing by this summer.
In addition to the readout system, we also successfully demonstrated the data acquisition pipeline, optimized band-defining filters, and improved magnetic shielding at CSO. More results from the readout system and scientific data observed with this MKID readout can be found in other proceedings papers from our group. [9-14]
