This article describes the development of a full-field range imaging system employing a high-frequency amplitude-modulated light source and image sensor. Depth images are produced at video frame rates, with each pixel in the image representing the distance from the sensor to objects in the scene. The hardware subsystems are described, along with the firmware and software that process the images in real time. The system is flexible in that precision can be traded off for decreased acquisition time. Results are reported to illustrate this versatility for both high-speed (reduced-precision) and high-precision operating modes.
INTRODUCTION
Range imaging is a field of electronics in which a digital image of a scene is acquired and every pixel in that image records both the intensity and the distance from the imaging system to the object. The technique described here achieves this by actively illuminating the scene and measuring the time-of-flight (TOF) for light to reflect back to an image sensor [1] [2] [3] [4] [5]. Alternative techniques for forming range or depth data include stereo vision, active triangulation, patterned light, point scanners and pulsed TOF [6] [7] [8].
Applications of range imaging can be found in mobile robotics, machine vision, 3D modelling, facial recognition and security systems.
This article describes the development of a full-field range imaging system assembled from off-the-shelf components. The system employs a modulated light source and shutter set up in a heterodyning configuration [9], as described in Section 2. Sections 3 and 4 introduce the hardware components of the system and the software and firmware used to process the image sequence in real time and to provide user control [10]. A summary of the system is presented in Section 5, with experimental results of captures taken both at high speed (reduced precision) for a dynamic scene and with high precision for a static scene.
TIME-OF-FLIGHT RANGE IMAGING THEORY
The full-field range imaging technique described here utilises an Amplitude Modulated Continuous Wave (AMCW) light source and image sensor to indirectly measure the time for light to travel to and from the target scene. In this section we describe two techniques, homodyne modulation and heterodyne modulation, that we employ for indirect acquisition of time-of-flight. In both cases the target scene is illuminated by an AMCW light source with frequency, f_M, typically between 10 and 100 MHz.
Homodyne Modulation
When illuminated by the AMCW light source, objects in the scene reflect the light back to an image sensor that is amplitude modulated at the same frequency, f_M, as the light source. Due to the time taken for the light to travel to and from the scene, a phase delay, θ, is evident in the modulation envelope of the received signal.

Fig. 1. Demonstration of pixel intensity as a function of phase delay for light returned from two objects at different distances. The reflected signals are mixed with a copy of the original modulation waveform, producing pixel 1 and pixel 2. The total shaded area is proportional to the returned intensity of the pixels, I_1 and I_2.
The modulated sensor effectively multiplies (or mixes) the returned light waveform with the original modulation signal. This is then integrated within the pixel to give an averaged intensity value relative to the phase offset. Figure 1 shows example waveforms with two objects at different distances, giving phase shifts θ_1 and θ_2 for the returned light, which produce two different intensities I_1 and I_2.
From a single frame it is impossible to know if reduced pixel intensity is a result of the phase offset of a distant object or due to other parameters such as object reflectance or simply the returned light intensity dropping with distance, d, by the relationship 1/d^2. To resolve this ambiguity, N frames are taken with an artificial phase step introduced in the modulated sensor signal relative to the light signal, incrementing by 2π/N radians between each frame. With I_i denoting the pixel intensity in frame i, the phase delay can now be calculated using a single-bin Discrete Fourier Transform and converted to a range, d, as [11]

$$\theta = \arctan\left(\frac{\sum_{i=0}^{N-1} I_i \sin(2\pi i/N)}{\sum_{i=0}^{N-1} I_i \cos(2\pi i/N)}\right), \qquad d = \frac{c\,\theta}{4\pi f_M}$$

where c is the speed of light and f_M is the modulation frequency.
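As a rough illustration of the single-bin DFT phase calculation and the phase-to-range conversion, the following Python sketch assumes an idealised pixel whose intensity varies as a cosine of the difference between the phase delay and the phase step. The function names, the pixel model and the sign convention (cosine terms in the real part, sine terms in the imaginary part) are assumptions made for illustration and are not taken from the system's implementation.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def phase_from_frames(intensities):
    """Single-bin DFT phase estimate from N equally phase-stepped samples."""
    n = len(intensities)
    if n < 3:
        raise ValueError("at least three frames are needed")
    real = sum(v * math.cos(2 * math.pi * i / n) for i, v in enumerate(intensities))
    imag = sum(v * math.sin(2 * math.pi * i / n) for i, v in enumerate(intensities))
    # atan2 resolves the quadrant; wrap the result into [0, 2*pi)
    return math.atan2(imag, real) % (2 * math.pi)


def range_from_phase(theta, f_mod_hz):
    """Convert phase delay to range: d = c * theta / (4 * pi * f_M)."""
    return C * theta / (4 * math.pi * f_mod_hz)


def simulate_pixel(theta, n, offset=100.0, amplitude=40.0):
    """Assumed ideal pixel: constant offset plus a cosine of (theta - step)."""
    return [offset + amplitude * math.cos(theta - 2 * math.pi * i / n)
            for i in range(n)]
```

Because the constant offset and the amplitude cancel in the ratio of the two sums, the estimate is insensitive to object reflectance and background level under this idealised model.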
Heterodyne Modulation
Heterodyne modulation differs from homodyne modulation in that the illumination and image sensor are modulated at slightly different frequencies. The illumination is modulated at the base frequency, f_M, and the sensor is modulated at f_M + f_B, where the beat frequency, f_B, is lower than the sampling rate of the system. The sampling rate, f_S, is ideally an integer multiple of f_B as calculated by Equation 2.5.
The effective result is a continuous phase shift rather than the discrete phase steps between samples of the homodyne method. The phase shift during the sensor integration period reduces the signal amplitude, in particular attenuating higher-frequency harmonic components which can contaminate the phase measurement and produce nonlinear range measurements. Through careful selection of N and the length of the integration time, the heterodyne method minimises the effects of harmonics better than is normally possible with a homodyne system. This enhances measurement accuracy [13].
Another advantage is the ability of the system to easily alter the value of N from three to several hundred simply by changing the beat frequency. This allows the user to select between high-speed, reduced-precision measurements with few frames (N small) or high-precision measurements with many frames (N large).
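The relationship between these parameters can be sketched in a few lines of Python. The sketch assumes that the sampling rate is exactly N times the beat frequency (f_S = N × f_B) and that one range image is produced per beat period; the function name and dictionary keys are hypothetical.

```python
import math


def heterodyne_plan(f_mod_hz, f_sample_hz, frames_per_beat):
    """Derive heterodyne operating parameters, assuming f_S = N * f_B."""
    if frames_per_beat < 3:
        raise ValueError("at least three frames per beat are needed")
    f_beat = f_sample_hz / frames_per_beat
    return {
        "illumination_hz": f_mod_hz,            # f_M
        "sensor_hz": f_mod_hz + f_beat,         # f_M + f_B
        "beat_hz": f_beat,                      # f_B
        "range_images_per_s": f_beat,           # one range image per beat
        "phase_step_rad": 2 * math.pi / frames_per_beat,
    }
```

Under these assumptions, a 125 Hz sampling rate with N = 5 gives a 25 Hz beat, while the same sampling rate with N = 125 would give one high-precision range image per second.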
HARDWARE COMPONENTS
A working prototype of a heterodyne ranging system has been constructed at the University of Waikato (Hamilton, New Zealand) from off-the-shelf components. This consists of an illumination subsystem, industrial digital video camera, image intensifier unit to provide the high-speed sensor modulation, high-frequency signal generation and support electronics for real-time range processing and control. Fig. 2 shows an overview of the hardware configuration. 
Camera
A Dalsa Pantera TF 1M60 [14] digital camera is used to record the image sequence. It has a 12-bit ADC and a resolution of up to 1024 × 1024 pixels at a frame rate of up to 60 Hz. For this application the camera is set to operate in 8 × 8 binning mode, improving the per-pixel signal-to-noise ratio and increasing the maximum frame rate to 220 Hz. The reduced 128 × 128 resolution also eases the real-time processing memory requirements. The video data stream is provided from the camera via the industry-standard CameraLink interface [15].
Image Intensifier
A Photek 25 mm single microchannel plate (MCP) image intensifier is used to provide high-speed gain modulation of the received light. Gated image intensifier applications typically pulse the photocathode voltage with 250 V p-p at low repetition rates to provide good contrast between on and off shutter states, but this is not feasible for continuous modulation at frequencies up to 100 MHz. Instead the photocathode is modulated with a 50 V p-p amplitude signal that can be switched at the desired high frequencies, but at the expense of reduced gain and reduced image resolution [16] . In this particular application, the reduction in imaging resolution does not impede system performance because the camera is also operated in a reduced resolution mode (8 × 8 binning mode).
Laser Illumination
Illumination is provided by a bank of four Mitsubishi ML120G21 658 nm wavelength laser diodes. These diodes are fibre-optically coupled to an illumination mounting ring surrounding the lens of the image intensifier. This enables the light source to be treated as co-axial with the camera lens and reduces the effect of shadowing. The fibre-optics also act as a mode scrambler, providing a circular illumination pattern, with divergence controlled by small lenses mounted at the end of each fibre. Maximum optical output power per diode is 80 mW, which is dispersed to illuminate the whole scene.
Modulation Signal Generation
The heterodyne ranging technique requires two high frequency signals with a very stable small difference in frequency. In order to calculate absolute range values, an additional third signal is required to synchronise the camera with the beat signal so that samples are taken at known phase offsets.
A circuit board containing three Analog Devices AD9952 Direct Digital Synthesizer (DDS) chips is used to meet these requirements [17] . The board uses a 20 MHz temperature-controlled oscillator which each DDS IC multiplies internally to generate a 400 MHz system clock. The DDS ICs also have the ability to synchronise their 400 MHz system clocks to remove any phase offset between the outputs.
The output signals are sinusoidal, with each frequency programmed by a 32-bit Frequency Tuning Word, FTW, via the SPI interface of an Atmel 89LS8252 microcontroller. The output frequency, f, is calculated as

$$f = \frac{\mathrm{FTW} \times f_{SYS}}{2^{32}}$$

where f_SYS is the 400 MHz DDS system clock frequency [18]. This implies a frequency resolution of 0.093 Hz, corresponding to the minimal increment of one in the FTW.
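The tuning word arithmetic can be checked with a short Python sketch. It assumes the standard DDS relation f = FTW × f_SYS / 2^32 with the 400 MHz system clock stated above; the function names are hypothetical, and the upper limit on the tuning word reflects the general Nyquist constraint of a DDS rather than anything specific to this board.

```python
F_SYS_HZ = 400e6   # DDS system clock from the text
FTW_BITS = 32      # width of the Frequency Tuning Word


def frequency_to_ftw(f_out_hz, f_sys_hz=F_SYS_HZ):
    """Nearest tuning word for a requested output frequency."""
    ftw = round(f_out_hz * 2**FTW_BITS / f_sys_hz)
    if not 0 <= ftw <= 2**(FTW_BITS - 1):
        raise ValueError("output must lie between 0 and f_SYS / 2")
    return ftw


def ftw_to_frequency(ftw, f_sys_hz=F_SYS_HZ):
    """Output frequency actually produced by a given tuning word."""
    return ftw * f_sys_hz / 2**FTW_BITS
```

Since every output is quantised to a multiple of roughly 0.093 Hz, the realised beat frequency is the difference between two tuning words multiplied by that step, so a requested beat is only reproduced to within one resolution step.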
The three signals generated by the DDS board are used for: 1) the laser diode illumination, f_M, 2) the modulation of the image intensifier, f_M + f_B, and 3) the camera frame trigger multiplied by a constant, mf_S. Deriving all DDS outputs from the same master clock ensures appropriate frequency stability between the beat signal, f_B, and the camera trigger signal, mf_S.
A Xilinx Spartan 2 Field Programmable Gate Array (FPGA) receives the sinusoidal mf_S signal and, using a digital counter, divides it down to produce a CMOS signal at f_S that is used to trigger the camera. This produces less jitter than the alternative of passing a sinusoidal signal of less than 200 Hz directly to a comparator to effect a sinusoidal-to-digital conversion.
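The divide-down step can be modelled in software as a counter clocked by the rising edges of the squared-up input. The sketch below is a behavioural model only; the value of m, the even-m restriction and the 50 % output duty cycle are assumptions for illustration, since the text does not give these details.

```python
def divide_by_m(samples, m):
    """Behavioural model of a divide-by-m counter.

    `samples` is a sequence of booleans representing the squared-up
    m * f_S input; the output completes one cycle per m input cycles.
    """
    if m < 2 or m % 2:
        raise ValueError("this sketch assumes an even m of at least 2")
    count = 0
    previous = False
    output = []
    for level in samples:
        if level and not previous:      # rising edge of the input
            count = (count + 1) % m
        previous = level
        output.append(count < m // 2)   # high for half of each output cycle
    return output
```

Each output edge is tied to a single fast input edge, which is one way to see why the divided signal can carry less timing uncertainty than a slow sinusoid crossing a comparator threshold.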
Control Hardware
The Spartan 2 board controls the analogue gain of the image intensifier and the laser diodes using a pair of Digital to Analogue Converter ICs.
The gain control of the laser diodes is used to ensure that they are slowly ramped on over a period of several minutes to protect them from overpower damage before they reach the stable operating temperature (operating in constant current mode). Controlling the gain of the image intensifier ensures the full dynamic range of the camera sensor is utilised while preventing pixel saturation.
Top level control of the system is provided through an Altera Stratix II FPGA resident on an Altera Nios II Development Kit connected via a JTAG connection to a PC. Control instructions are processed by a Nios II soft processor core and written to either special function registers within the Stratix II FPGA or passed on via RS232 to the microcontroller of the DDS board. The Spartan 2 board is controlled through an 8-bit port of the DDS board microcontroller.
Processing Hardware
In addition to providing top level control of the various hardware components of the system, the Stratix II FPGA is used to process the image sequence and calculate the range data using Equation 2.2.
Frames are directly transferred from the Dalsa camera over the CameraLink interface to the FPGA. A daughter board utilising a National Semiconductor DS90CR286 ChannelLink Receiver IC is used to convert from the serial LVDS CameraLink format to a parallel CMOS format more acceptable for the general purpose I/O pins of the FPGA.
The camera has two data taps, each streaming pixel information at a rate of 40 MHz. The images from the camera are processed internally within the FPGA and temporarily stored in on-chip block RAM ready to be presented to the user.
User Interfaces
The system is controlled through the Altera Nios II terminal program, which communicates with the Nios II processor through a JTAG interface. This is used to set system parameters including frame rate, modulation frequency, beat frequency, number of frames per beat, image intensifier gain, laser diode gain and various other special function registers.
For real-time range data display, a daughter board with a Texas Instruments THS8134 Triple 8-bit DAC IC is used to drive a standard VGA monitor. For longer-term storage and higher-level processing, the range data are transferred to a PC using an SMSC LAN91C111 Ethernet MAC/PHY chip resident on the Nios II Development Kit. The embedded Nios II software fetches the temporarily stored range data from the block RAM and sends it through a TCP connection to a host PC.
SOFTWARE AND FIRMWARE COMPONENTS
There are three distinct sets of software involved in operating the system, namely:
1) Firmware for the real-time calculation of range data on the Stratix II FPGA,
2) Software on the Nios II processor, and
3) Host software on the PC.
Range Processing Firmware
The data stream coming from the camera consists of two pixel channels at 40 MHz each. It is very difficult to calculate the phase shift of each pixel using Equation 2.2 in real time with current sequential processors, whereas the configurable hardware of the FPGA is ideally suited to this task as it can perform a large number of basic operations in parallel at the same clock rate as the incoming pixels. The range processing firmware can be divided into four sections: the CameraLink interface to the Dalsa camera, the multiply and accumulate calculation, the arc tangent calculation, and the interface to the Nios II processor. All of the range calculation firmware is instantiated twice to independently process each of the two output taps of the camera. The process is fully pipelined with a latency of 24 clock cycles and a theoretical maximum frequency of 144 MHz. A 125 MHz system clock is used for all elements internal to the FPGA.
The firmware for this project is designed in VHDL and simulated using Aldec's Riviera Pro software. Altera's Quartus II software is used to synthesise the design and download it to the board.
The CameraLink Interface
The interface to the Dalsa camera consists of 27 parallel inputs with a 40 MHz clock input for providing the pixel data, one output for the frame trigger of the camera, and one line in each direction for UART serial communication [15] .
The frame trigger is provided directly from the Spartan 2 board and simply passes through the interface to the camera. The serial communication lines are handled by the Nios II processor.
All of the 40 MHz data lines are sampled at the 125 MHz FPGA system clock. The lval and fval control inputs are used to specify when a valid line and frame of data are being received respectively. These are monitored by a state machine as shown in Figure 4 to give feedback to the system when a new frame is complete.
The state machine first ensures that it is aware of which part of the transfer sequence is currently underway, either integration while the frame trigger is high or readout while fval is high. While the frame and line valid signals are both high, valid pixel data are sampled on the falling edge of the 40 MHz clock. Once the frame and line valid signals are both low, the readout is complete and a counter is started to introduce a delay before the frame increment signal is pulsed high. This allows the processing electronics time to complete before various registers and look-up-tables are updated for the next frame. Figure 5 shows the timing of the signals within the CameraLink interface.

Fig. 4. CameraLink finite state machine for tracking incoming frames. Data readout is initiated by a falling edge of trigger and is read from the camera during the FVAL_HIGH state. A high pulse of frame_inc is generated a short time after readout is complete to prepare the system for the next frame.

Multiply and Accumulate Calculation

The sine and cosine operations are pre-calculated based on the number of frames per beat, N, and stored in the look-up-tables (LUTs) sin_lut and cos_lut respectively. These values can be changed by the user through control registers when the desired value of N is changed. The values for real_old and imag_old are the values of real_acc and imag_acc from the previous frame, which have been stored in block RAM.

Each operation is implemented as a dedicated block of hardware as shown by Figure 6. Operations are pipelined such that the calculation for any newly received pixel can begin before the calculation of the previous pixel is fully completed. Not shown is the circuitry to reset the value of i to 0 after N frames have been processed. At the start of each beat sequence (i = 0) the values of real_old and imag_old are taken as 0 to clear the value of the accumulator. The block RAM is dual ported, which allows values to be written and read simultaneously.
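A software model of this per-frame accumulation might look like the following Python sketch. It reuses the names sin_lut, cos_lut, real_old, imag_old, real_acc and imag_acc from the text, but the pairing of the cosine table with the real accumulator and the sine table with the imaginary accumulator, the use of floating point instead of fixed-point hardware arithmetic, and the function names are all assumptions.

```python
import math


def make_luts(n):
    """Pre-calculate the sine and cosine tables for N frames per beat."""
    sin_lut = [math.sin(2 * math.pi * i / n) for i in range(n)]
    cos_lut = [math.cos(2 * math.pi * i / n) for i in range(n)]
    return sin_lut, cos_lut


def accumulate_frame(frame, i, sin_lut, cos_lut, real_old, imag_old):
    """Add frame i of the beat sequence to the per-pixel accumulators."""
    if i == 0:
        # Start of a beat sequence: treat the stored values as zero
        real_old = [0.0] * len(frame)
        imag_old = [0.0] * len(frame)
    real_acc = [r + p * cos_lut[i] for p, r in zip(frame, real_old)]
    imag_acc = [q + p * sin_lut[i] for p, q in zip(frame, imag_old)]
    return real_acc, imag_acc
```

After the N-th frame, real_acc and imag_acc hold the two sums needed by the arc tangent stage. In this model each pixel needs only two stored values regardless of N, which mirrors why the accumulators rather than the raw frames occupy the block RAM.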
For real-time processing, the amount of on-chip block RAM is the greatest limiting factor in the resolution of the range images, because every pixel of the image requires enough storage for both the real and imaginary accumulated values. In the current implementation using an Altera Stratix II EP2S60 FPGA this restricts the resolution to 128 × 128 pixels.

Fig. 7. Calculation of arc tangent using one eighth of the LUT requirements. The sign and relative magnitude of the real and imaginary inputs determine which region the result will be in. The index into the LUT is the scaled result of the smaller of the two divided by the larger. This is added to or subtracted from a constant to find the final arc tangent approximation.
Arc Tangent Calculation
The arc tangent is approximated using a look-up-table (LUT). Figure 7 shows how the symmetric nature of the arc tangent function is utilised to reduce the RAM requirements of the LUT to one eighth of the full cycle. The sector, out of the eight possible sectors, of the final result is determined by the sign and magnitude of the real and imaginary components. The division is calculated and used to index the LUT, and the intermediate result is added to or subtracted from a constant to give the final result.
The advantage of calculating the arc tangent in this way is that the result of the division is always a positive number between 0 and 1. This scales easily as an index into the look-up-table. With a 16-bit by 1024 point LUT, the arc tangent can be approximated to better than 0.8 milliradians.
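The octant-folding scheme can be modelled in Python as follows. This is a sketch under stated assumptions: the table holds 1024 entries of 16 bits covering ratios from 0 to 1, the index is rounded to the nearest entry, and the result is returned in the range 0 to 2π. The hardware's actual scaling, rounding and output format are not given in the text, and the names used here are hypothetical.

```python
import math

LUT_POINTS = 1024
LUT_SCALE = 65535 / (math.pi / 4)   # map [0, pi/4] onto 16-bit codes
ATAN_LUT = [round(math.atan(k / (LUT_POINTS - 1)) * LUT_SCALE)
            for k in range(LUT_POINTS)]


def lut_atan2(imag, real):
    """Approximate atan2(imag, real) in [0, 2*pi) from a one-octant LUT."""
    a_real, a_imag = abs(real), abs(imag)
    if a_real == 0 and a_imag == 0:
        return 0.0
    # Smaller magnitude over larger: always a ratio between 0 and 1
    small, large = min(a_real, a_imag), max(a_real, a_imag)
    index = round(small / large * (LUT_POINTS - 1))
    octant_angle = ATAN_LUT[index] / LUT_SCALE      # within [0, pi/4]
    # Fold into the first quadrant, then select the quadrant by sign
    angle = octant_angle if a_real >= a_imag else math.pi / 2 - octant_angle
    if real < 0:
        angle = math.pi - angle
    if imag < 0:
        angle = 2 * math.pi - angle
    return angle % (2 * math.pi)
```

With nearest-entry indexing the dominant error is half a table step in the ratio, which is about 0.5 milliradians for 1024 points and therefore consistent with the sub-0.8 milliradian figure quoted above.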
Interface to Nios II Processor
The output frame from the arc tangent function is stored in a portion of triple port block RAM. The three ports are designated as one write-only port for the result of the arc tangent calculator and two read-only ports: one for the VGA display driver and the other for the Nios II processor.
The Nios II processor addresses 32k memory locations in the hardware. The lower 16k reference the output RAM, and the upper 16k reference special function control registers. These registers control everything within the Stratix II FPGA such as sine and cosine LUT values, frames per beat and a synchronous reset. Two control registers are set up in relation to the transfer of data to the processor:
• OR_RDY : a 1-bit read-only register which is set high by hardware when a processed frame is available in the output RAM.
• OR_ACK : a 1-bit write-only register which is used to signal to the hardware that reading of the output RAM is complete.

While the OR_RDY bit is high, writing into the output RAM is disabled. This ensures that a frame being transferred out is not inadvertently overwritten.
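The handshake between the processing hardware and the processor can be illustrated with a small behavioural model in Python. The class and method names are hypothetical, and two details are assumptions: that a write to OR_ACK clears OR_RDY, and that a frame arriving while OR_RDY is high is dropped as a whole.

```python
class OutputRam:
    """Behavioural model of the output RAM and its two control bits."""

    def __init__(self, size):
        self.data = [0] * size
        self.or_rdy = False          # read-only from the processor's side

    def hardware_write_frame(self, frame):
        """Arc tangent stage offers a frame; refused while OR_RDY is high."""
        if self.or_rdy:
            return False             # writing disabled, frame is dropped
        self.data[:] = frame
        self.or_rdy = True           # signal that a frame is available
        return True

    def read_frame(self):
        """Processor copies the frame out once OR_RDY is high."""
        return list(self.data) if self.or_rdy else None

    def write_or_ack(self):
        """Processor signals that reading is complete (assumed to clear OR_RDY)."""
        self.or_rdy = False
```

In this model the processor polls or_rdy, calls read_frame, and then calls write_or_ack so that the next processed frame can be stored.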
Nios II Software
The Nios II processor is a reconfigurable microprocessor core provided by Altera. The features of the processor can be selected using the SOPC builder available with the Quartus II software. The Nios II processor is programmed in C using the Nios II IDE software. The main tasks performed by the processor are to:
1) Parse text commands from the user and either
   a) read or write on-chip control registers,
   b) forward the message on to the DDS board microcontroller, or
   c) forward the message on to the Dalsa camera,
2) Respond to the OR_RDY signal generated by the processing hardware, and
3) Handle the TCP connection for transferring frames to the PC.
A multi-threaded program is used for these tasks, with the TCP packet handler having top priority. Although the multi-threaded environment adds extra complexity and overhead to the code, it allowed an Altera example design with Ethernet and TCP/IP configuration functionality to be modified simply by including the additional required tasks.
PC Software
Software on the PC is programmed in C# using Microsoft Visual Studio. A basic terminal program establishes a TCP connection with the Nios II processor of the Stratix II FPGA, listening for incoming packets and writing them to a binary file.
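The host program itself is written in C#, but the receive-and-store loop it performs can be sketched in Python for illustration. The function name, chunk size and file handling are assumptions, and the sketch simply records bytes until the sender closes the connection; it does not reproduce any framing used by the actual software.

```python
import socket


def receive_to_file(sock, path, chunk_size=4096):
    """Write everything received on a connected socket to a binary file."""
    total = 0
    with open(path, "wb") as out:
        while True:
            chunk = sock.recv(chunk_size)
            if not chunk:            # empty read: the peer closed the connection
                break
            out.write(chunk)
            total += len(chunk)
    return total
```

A caller would typically obtain the socket with socket.create_connection((host, port)), where host and port are placeholders for the address configured on the Nios II side, and pass it to receive_to_file together with an output file name.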
The Nios II IDE software also includes a terminal for communicating with the Nios II processor through JTAG. This interface is used for sending text commands to the Nios II processor for setting various control parameters.
RESULTS AND SUMMARY
The following results give an indication of ranging performance for two operating modes of the system: a capture of a dynamic scene for high-speed range images at the expense of depth precision, and a capture of a static scene for high-precision measurements.
Test Capture of a Dynamic Scene
To demonstrate the operation of the range imaging system with video-rate output, a scene has been set up with a number of moving objects. These are a pendulum swinging in a circular motion, a bear figurine rotating on a turntable and a roll of paper towels rolling down a ramp towards the camera. System parameters for this capture are f_M = 40 MHz, f_S = 125 Hz and N = 5, resulting in 25 range images per second. Figure 8 shows the first frame of the sequence with a group of labelled test pixels. These are locations in the capture showing 1) the area above the bear imaging a cardboard background, 2) a region of the far wall of the room and 3) to 5) the path of the rolling paper towels. The rotating bear and the pendulum, which is the dark spot to the right of the bear, are not analysed in this capture.

Figure 9 shows a plot of the test pixels as their measured range changes throughout the 40 range image capture. As expected for stationary objects, the range at pixels 1 and 2 does not change throughout the capture and gives an indication of the precision of the system in this operating mode. Test pixel 1 has a mean of 3.11 m and standard deviation of 4.0 cm, and test pixel 2 has a mean of 1.27 m and standard deviation of 3.6 cm.
Although test pixel 2 represents the far wall, it is measured to be at a distance similar to pixel 5. This is a consequence of phase wrapping, where objects beyond the unambiguous range of c/(2f_M) = 3.75 m are measured incorrectly. It is possible to correct for this ambiguity [9] but this functionality is not currently incorporated into this system. The plot also shows an interconnecting slope between test pixels 3 to 5 where the paper towels roll towards the camera through these pixels. For example, test pixel 4 initially returns the range of a location on the ramp at approximately 2.4 m. Between frames 17 and 24 the paper towels roll towards the camera through the location imaged by this pixel, which is evident as a decreasing range measurement. After the paper towels have passed the location of pixel 4, the value returns to the initial range of 2.4 m.
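The phase wrapping effect follows directly from the unambiguous range and can be reproduced with a short Python sketch. The function names and the maximum range used to list candidates are hypothetical, and the candidate distances are purely arithmetic possibilities rather than measurements of the room.

```python
C = 299_792_458.0  # speed of light in m/s


def unambiguous_range(f_mod_hz):
    """Largest range measurable without wrapping: c / (2 * f_M)."""
    return C / (2 * f_mod_hz)


def wrapped_range(true_range_m, f_mod_hz):
    """Range the system would report for an object at true_range_m."""
    return true_range_m % unambiguous_range(f_mod_hz)


def candidate_ranges(measured_m, f_mod_hz, max_range_m):
    """All true ranges up to max_range_m consistent with a measurement."""
    step = unambiguous_range(f_mod_hz)
    candidates = []
    k = 0
    while measured_m + k * step <= max_range_m:
        candidates.append(measured_m + k * step)
        k += 1
    return candidates
```

At 40 MHz the step between candidates is about 3.75 m, so a reading of 1.27 m is equally consistent with an object roughly 5.02 m away, which is how a far wall can appear closer than nearer objects.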
Test Capture of a Static Scene

Summary
This article has described the construction of a heterodyne range imaging system with video-rate depth image output. The system has the following parameters:
• x-y resolution of 128 × 128 pixels
• variable output frame rate with real-time display up to 60 Hz
• variable frames per beat, N
• variable modulation frequency, f_M, up to 80 MHz
• data capture using a common Ethernet interface
• system parameters adjustable through a command line interface

With these easily adjustable system parameters it is possible to configure the system to meet the needs of a wide variety of applications, from 3D modelling to mobile robotics.
