Abstract. Synthetic aperture radar, SAR, is a high-resolution imaging radar. The direct back-projection algorithm allows for precise reconstruction of the SAR output image and can compensate for deviations in the flight track of airborne radars. Because the back-projection algorithm is computationally expensive and highly parallel, graphics processing units, GPUs, are often used for data processing. However, GPUs may not be an appropriate solution for applications with strict space and power constraints.
Introduction
Synthetic aperture radar, SAR, is a form of imaging radar that provides high-quality mapping independent of light and weather conditions. SAR is used across a wide range of scientific and military applications including environmental monitoring, earth-resource mapping, surveillance, and reconnaissance. In SAR operation, a radar antenna mounted on an aircraft or spacecraft transmits electromagnetic pulses and records their echoes.
An output image is reconstructed from the echo data, which is interpreted as a set of projections. The direct back-projection algorithm provides a precise output image reconstruction and can compensate for deviations in the flight track. A very high number of operations is required to reconstruct the output image because each pixel accumulates data from hundreds of projections. Therefore, graphics processing units, GPUs, are often used for this type of SAR data processing. However, for applications with strict space and power constraints, GPUs may not be an appropriate solution. For example, small unmanned aircraft systems would benefit from the direct back-projection algorithm to compensate for deviations in the flight track, but they do not provide the space and power needed for a computing system with a high-performance GPU.
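To illustrate why the cost grows with the number of pixels times the number of projections, the following is a minimal sketch of direct back-projection in Python. It is not the POLARIS processing chain. It assumes range-compressed complex echoes, linear interpolation between range bins, and an echo phase of exp(-j4πr/λ) for a target at range r. The function name `back_project` and all parameter names are hypothetical and chosen for illustration only.

```python
import cmath
import math


def back_project(echoes, antenna_positions, pixel_positions,
                 wavelength, range_start, range_step):
    """Direct back-projection (illustrative sketch, assumed data layout).

    echoes[k][n]         complex range-compressed sample n of pulse k
    antenna_positions[k] (x, y, z) of the antenna when pulse k was sent
    pixel_positions      list of (x, y, z) image points on the ground
    range_start          range in metres of sample 0
    range_step           spacing in metres between consecutive samples
    """
    image = []
    for pixel in pixel_positions:
        acc = 0j
        # Every pixel visits every pulse: cost is pixels x pulses.
        for echo, antenna in zip(echoes, antenna_positions):
            # The true antenna position is used for each pulse, which is
            # how deviations from the ideal flight track are compensated.
            r = math.dist(pixel, antenna)
            idx = (r - range_start) / range_step
            i0 = math.floor(idx)
            if i0 < 0 or i0 + 1 >= len(echo):
                continue  # pixel lies outside the recorded range swath
            frac = idx - i0
            # Linear interpolation between the two nearest range samples.
            sample = (1.0 - frac) * echo[i0] + frac * echo[i0 + 1]
            # Undo the assumed two-way propagation phase so that the
            # contributions of all pulses add coherently at a target.
            phase = 4.0 * math.pi * r / wavelength
            acc += sample * cmath.exp(1j * phase)
        image.append(acc)
    return image
```

The inner loop body is independent for every pixel, which is what makes the algorithm highly parallel and allows the pixels to be distributed over many processing elements.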
In this paper, we describe how we map a specific SAR data processing application to a multi-core system on an FPGA. We design a scalable multi-core system consisting of Tinuso processor cores [10] and a 2D mesh interconnect. We evaluate the system by simulating data processing of the airborne POLARIS SAR [5]. This radar is currently used in the evaluation process of the BIOMASS candidate mission [6] of the European Space Agency, ESA. This mission aims for a P-band SAR satellite that provides global-scale estimates of forest biomass.
To the best of our knowledge, we are the first to use a multi-core system on an FPGA for SAR data processing with the direct back-projection algorithm. The proposed system provides a number of advantages in terms of system integration, power, scalability, customization, and the use of industrial and space-grade components. As the power efficiency and logic capacity of FPGAs increase, they become an attractive technology for low-volume, large-scale systems. For example, Xilinx's Virtex-7 family includes devices with up to two million logic cells. These devices allow the processing power of hundreds of processor cores to be combined on a single FPGA. Moreover, the same device can also host the digital front-end used for SAR signal processing. FPGAs provide flexible I/O that allows a multitude of data links and memory units to be connected to a single device. We advocate a multi-core system because it raises the abstraction level for the application programmer while avoiding the current performance drawbacks of high-level synthesis [9]. The proposed system can be customized at all levels. For example, it is possible to add processor cores, define special instructions, and change the interconnect link width. FPGAs are available in industrial and space grades, which permits their use in harsh environments and in space.
We make the following contributions:
- We design a scalable multi-core system consisting of Tinuso processor cores and a high-throughput, low-latency network-on-chip, NoC.
- We demonstrate that we can integrate 64 processor cores on a single FPGA and clock the system at 300 MHz on a Xilinx Virtex-7 device.
- We evaluate the system by simulating the POLARIS SAR application that is based on direct back-projection. We achieve real-time data processing for a 3000 m wide area with a resolution of 2x2 meters. The multi-core fabric consisting of 64 processor cores and a 2D mesh network-on-chip utilizes 60% of the hardware resources of a Xilinx Virtex-7 device with 550 thousand logic cells and consumes about 10 W.
- We apply software pipelining by distributing subtasks to dedicated processing elements. This hides memory latency and reduces the hardware resources by 14%.
