Abstract. Satellite remote sensing (SRS) must make use of up-to-date technology. The trend in SRS missions has always been towards hardware devices with smaller size, lower cost, more flexibility, and higher computational power. For this reason, Field Programmable Gate Array (FPGA) technology is widely used by the scientific community and hardware developers to implement different SRS algorithms. FPGAs offer well-suited architectural features: a large number of programmable logic elements, distributed and block RAMs, DSP and register slices, and look-up tables. This paper describes an approach to implementing the Land Surface Temperature Split-Window (LST-SW) algorithm on FPGA technology.
Introduction
Recently, Field Programmable Gate Array (FPGA) technology has become a viable target for implementing algorithms in many fields, including Digital Signal Processing (DSP), video processing, control systems engineering, bioinformatics, and aerospace and defense systems. These developments offer new opportunities, especially in satellite remote sensing. Digital sensors mounted on board remote sensing satellites scan vast areas of the Earth's surface day and night and transmit the data to satellite ground stations for further processing and use in different applications. With their continuous development and improvement, remote sensing satellites [1] are increasingly utilized and make use of the latest available technological components. The emergence of devices such as FPGAs [2] can bridge the gap in on-board, real-time analysis of remote sensing data. Some FPGAs incorporate a large number of arithmetic blocks, ranging from low-complexity blocks such as simple multipliers to more complex Digital Signal Processing (DSP) units, which combine components such as multipliers, adders, accumulators, and shift registers. A DSP unit significantly accelerates the FPGA's performance and allows greater productivity and flexibility while decreasing cost and power consumption [3].
FPGAs are fully reconfigurable, a feature that combines the flexibility of traditional microprocessors with the performance of Application Specific Integrated Circuit (ASIC) devices [4]. Reconfigurable hardware has also been certified by international remote sensing agencies and is frequently used in remote sensing and space-borne Earth observation missions [5]. FPGAs also offer exact results with compact size and low cost, which makes reconfigurable systems attractive for on-board data processing. The FPGA architecture makes it possible to implement any combinational or sequential circuit, ranging from a simple logic function to a high-end soft processor [6]. In the past, FPGAs were mainly used in signal processing and network packet analysis. However, thanks to the high-speed embedded resources they include, such as DSP slices and fast memories, they are now also used for algorithm acceleration, either as coprocessors or as standalone Systems-on-Chip (SoCs). Moreover, several vendors reserve space in their FPGAs for soft or hard-IP (HIP) processors.
Recently, FPGAs have become a viable target technology for implementing different remote sensing algorithms, such as a real-time marker-based visual sensor based on an FPGA and a soft-core processor [7], an algorithm for automatically detecting targets [8], the N-FINDR algorithm [9], on-board ortho-rectification of images [10], and the JPEG-LS remote sensing image coding algorithm [11]. These computing systems combine the flexibility of general-purpose processors with the speed of application-specific processors. The objective of this work is to develop an FPGA-based hardware version of the Land Surface Temperature Split-Window (LST-SW) algorithm [12].
As image sizes and bit depths grow, software-only solutions have become less practical in the image processing realm. Recently developed hybrid FPGAs, such as the Xilinx Virtex-5 [13], offer the versatility of running diverse software applications on embedded processors while taking advantage of reconfigurable hardware resources, all in the same chip package. Radiation-hardened FPGAs are in great demand for military and space applications [14] [15], and Xilinx FPGAs have been used in more than 50 missions. In this work, we use a Xilinx Virtex-5 XC5VLX50T FPGA as the baseline architecture, since it is similar to existing FPGAs [8] that have been certified by several international agencies for remote sensing applications. Because these devices are based on the same architecture, our design could be implemented on them directly.
Role of FPGA in satellite remote sensing applications
The trend in remote sensing missions has always been towards using hardware devices with smaller size, lower cost, more flexibility, and higher computational power [16] [17]. Reconfigurable Hardware (RH) provides a flexible medium to implement hardware circuits.
The RH resources are reconfigurable post-fabrication, allowing a single base hardware design to implement a variety of circuits. FPGAs can also be reconfigured to avoid hardware faults [18], whether these result from fabrication or from the environment.
Reconfigurable hardware offers the necessary flexibility and performance with reduced energy consumption compared to other high-performance processors. By mapping functionality to FPGAs, the designer can optimize the hardware for a specific application, resulting in acceleration rates of several orders of magnitude over general-purpose computers. Furthermore, these devices are characterized by lower form/wrap factors than parallel platforms and by higher flexibility than ASIC solutions. Moreover, satellite-based remote sensing instruments can only include chips that have been certified for space conditions, because space-based systems must operate in an environment in which radiation effects have an adverse impact on integrated circuit operation [19]. Ionizing radiation can cause soft errors in the static cells used to hold the configuration data, which affects the circuit functionality and can cause system failure. Such environments therefore require special FPGAs that provide on-chip configuration error detection and/or correction circuitry. High-speed, radiation-hardened FPGA chips with million-gate densities have recently emerged that can support the high throughput requirements of remote sensing applications. Radiation-hardened FPGAs are in great demand for military and space applications. For instance, industrial partners such as Microsemi Corporation or Xilinx (www.xilinx.com) have been producing radiation-tolerant antifuse FPGAs for several years for high-reliability space-flight systems. Microsemi FPGAs have been on board more than 100 launches, and Xilinx FPGAs have been used in more than 50 missions.
Description of LST Split-Window algorithm
Land Surface Temperature (LST) plays a significant role in characterizing the land surface at local and global scales and is one of the key parameters in the biophysics of land surface processes [20]. LST also plays an important role in the evolution of natural ecosystems and in global change. The determination of LST from satellite data is mainly influenced by the atmosphere and by surface emissivity.
LST is defined as the surface radiometric temperature corresponding to the instantaneous field of view of the sensor [21] or, more precisely, as the "ensemble directional radiometric surface temperature" [22]. The term "ensemble" denotes the bulk contribution of an inhomogeneous pixel. For a given sensor viewing direction, LST depends on the distribution of temperature and emissivity within a pixel and on the spectral channel of measurement [23].
In 
where T4 and T5 (in K) are the brightness temperatures measured in Channels 4 and 5 of the Advanced Very High Resolution Radiometer (AVHRR) on board the National Oceanic and Atmospheric Administration (NOAA) satellite series, and W is the total amount of atmospheric water vapor (in g cm⁻²). ε and Δε are, respectively, the average effective emissivity in both channels and the spectral variation of emissivity. T4, T5, W, and ε are provided as binary images whose pixels are coded in 16 bits.
Proposed Hardware FPGA architecture for the LST-SW algorithm
Figure 1 shows the proposed hardware architecture used to implement the LST algorithm. In our implementation, we have divided the algorithm into two parts (see equations (2) and (3)).
For data input, we use the hard disk of a PC to store the binary data. The architecture contains two FIFOs: the first transmits the binary images T4 and W as well as the result of the first part, and the second transfers the binary images T5 and ε as well as the result of the second part. Finally, the two partial results are sent through the two FIFOs: FIFO 1 (part 1) and FIFO 2 (part 2). The LST module implements our version of the LST algorithm.
Fig. 1. Hardware architecture proposed for the implementation of the LST-SW algorithm
To calculate LST1, we send two images, T4 through FIFO 1 and T5 through FIFO 2, and the calculation takes place in the LST module (see Figure 1[a]). The result of part 1 is stored on the hard disk. For LST2, we send two images, W through FIFO 1 and ε through FIFO 2, and the calculation again takes place in the LST module (see Figure 1[b]). The result of part 2 is also stored on the hard disk. Finally, to calculate LST, we send both stored parts through the two FIFOs, LST1 through FIFO 1 and LST2 through FIFO 2 (see Figure 1[c]), add them, and send the sum to the hard disk.
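The three passes described above can be sketched in software as follows. This is only an illustrative Python model, not the VHDL design: the names `run_pass`, `part1`, `part2`, and `lst_three_passes` are hypothetical, and the bodies of `part1` and `part2` are placeholder arithmetic standing in for equations (2) and (3), which are not reproduced here.

```python
from collections import deque
from typing import Callable, Deque, List, Sequence


def run_pass(stream_a: Sequence[int], stream_b: Sequence[int],
             op: Callable[[int, int], int]) -> List[int]:
    """Push two pixel streams through two FIFOs and combine them pixel by pixel."""
    if len(stream_a) != len(stream_b):
        raise ValueError("both images must contain the same number of pixels")
    fifo1: Deque[int] = deque(stream_a)  # models FIFO 1
    fifo2: Deque[int] = deque(stream_b)  # models FIFO 2
    results: List[int] = []
    while fifo1:
        # The oldest pixel of each FIFO leaves first (first in, first out).
        results.append(op(fifo1.popleft(), fifo2.popleft()))
    return results


# Hypothetical stand-ins for the two parts; NOT the paper's equations.
def part1(t4: int, t5: int) -> int:
    return t4 + (t4 - t5)


def part2(w: int, eps: int) -> int:
    return w * eps


def lst_three_passes(t4: Sequence[int], t5: Sequence[int],
                     w: Sequence[int], eps: Sequence[int]) -> List[int]:
    lst1 = run_pass(t4, t5, part1)                    # pass (a): T4 and T5
    lst2 = run_pass(w, eps, part2)                    # pass (b): W and emissivity
    return run_pass(lst1, lst2, lambda a, b: a + b)   # pass (c): LST1 + LST2
```

The same `run_pass` routine serves all three passes, which mirrors the reuse of the two FIFOs and the single LST module in Figure 1; only the per-pixel operation changes between passes.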
The LST hardware module is the heart of our system, and we took special care in its definition, using VHDL as the hardware description language. This module is set as the top module of the system because it contains the different hardware parts, such as registers and arithmetic-logic units. All the calculations of the algorithm are performed in this module: addition, subtraction, and multiplication. In our implementation, we multiply the coefficients of the algorithm by 100 (see equations (4) and (5)) to avoid floating-point arithmetic and the difficulty of implementing it; once the result is obtained, the factor of 100 is divided out to return to the original equations.
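The scaling by 100 can be illustrated with a short Python sketch. The coefficients 1.25 and 0.50 and the function names below are hypothetical examples chosen only to show the arithmetic; they are not the coefficients of equations (4) and (5).

```python
from typing import Tuple

SCALE = 100  # scaling factor applied to the coefficients, as in the text


def to_fixed(coeff: float) -> int:
    """Turn a coefficient with two decimal places into an integer (coeff * 100)."""
    return round(coeff * SCALE)


def affine_fixed(x: int, a_fixed: int, b_fixed: int) -> int:
    """Evaluate a*x + b with integer arithmetic only; the result is scaled by SCALE."""
    return a_fixed * x + b_fixed


def unscale(value: int) -> Tuple[int, int]:
    """Divide the factor of 100 back out: returns (integer part, hundredths)."""
    return divmod(value, SCALE)


# Hypothetical example: y = 1.25 * x + 0.50 evaluated at x = 12.
a_fixed = to_fixed(1.25)                 # 125
b_fixed = to_fixed(0.50)                 # 50
scaled = affine_fixed(12, a_fixed, b_fixed)
whole, hundredths = unscale(scaled)      # 15 and 50, i.e. 15.50
```

Only integer multipliers and adders are needed for the scaled computation. Note that if two scaled quantities are multiplied together, the product carries a factor of 100 × 100, so the scaling has to be tracked term by term.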
The LST-SW algorithm has been described in VHDL (VHSIC Hardware Description Language), and the design has been synthesized and implemented using the Xilinx ISE 9.2i tool. As noted above, T4, T5, W, and ε are binary images generated by the ENVI software as PNG files, a file type that Xilinx ISE does not support. To support design verification through file manipulation, we therefore wrote a MATLAB program that converts these PNG files to a text file format that can be read by the VHDL code.
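The conversion step can be sketched as follows. The authors' program is written in MATLAB and its exact output format is not given, so this Python version is only an assumption: it takes pixel values that have already been decoded from the image and writes one 16-character binary string per pixel, in row-major order. The function names are hypothetical.

```python
from typing import Iterable, List, Sequence

BITS = 16  # pixel depth stated in the text


def pixel_to_line(value: int, bits: int = BITS) -> str:
    """Encode one unsigned pixel as a fixed-width binary string."""
    if not 0 <= value < (1 << bits):
        raise ValueError(f"pixel {value} does not fit in {bits} bits")
    return format(value, f"0{bits}b")


def image_to_lines(rows: Sequence[Sequence[int]]) -> List[str]:
    """Flatten an image row by row into one text line per pixel."""
    return [pixel_to_line(p) for row in rows for p in row]


def lines_to_pixels(lines: Iterable[str]) -> List[int]:
    """Inverse operation: parse the text lines back into integers."""
    return [int(line, 2) for line in lines if line.strip()]


def write_pixel_file(path: str, rows: Sequence[Sequence[int]]) -> None:
    """Write the text file that a test bench could read line by line."""
    with open(path, "w", encoding="ascii") as handle:
        handle.write("\n".join(image_to_lines(rows)) + "\n")
```

A fixed-width, one-value-per-line layout is assumed here because it is simple to parse; any other layout would work as long as the test bench reads it consistently.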
Implementation results

Input Data
In our proposed implementation, we used four images of 512 columns by 512 rows. We chose to work with images similar to real satellite images (T4, T5, W, ε) to check the operation of the system. We then converted each image to a text file containing the pixel values coded in 16 bits.
Test bench generation
The most time-consuming part of a system design of this nature is verifying that the VHDL code operates as intended. This is done by executing the device code in a simulator. The simulation of the device is only meaningful if the inputs to the system are manipulated in a way that causes the device to operate in its intended manner. This is easily done using VHDL to create a test bench for the entity under test. One of the major improvements made as test bench development progressed was the adoption of file I/O for loading processing parameters and image data into the system.
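The file-based checking idea can be sketched in Python as a comparison against a software reference model. This is a hedged illustration rather than the authors' VHDL test bench: the file layout (one 16-bit binary string per line), the function names, and the use of a simple addition as the reference operation are all assumptions.

```python
import tempfile
from pathlib import Path
from typing import Callable, List, Sequence, Tuple


def write_pixels(path: Path, values: Sequence[int]) -> None:
    """Write one 16-bit binary string per line (assumed file layout)."""
    path.write_text("\n".join(format(v, "016b") for v in values) + "\n")


def read_pixels(path: Path) -> List[int]:
    """Read the binary strings back as integers."""
    return [int(token, 2) for token in path.read_text().split()]


def check_against_reference(stim_a: Path, stim_b: Path, dut_out: Path,
                            reference: Callable[[int, int], int]
                            ) -> List[Tuple[int, int, int]]:
    """Return (pixel index, expected, obtained) for every mismatching pixel."""
    a, b, got = read_pixels(stim_a), read_pixels(stim_b), read_pixels(dut_out)
    if not len(a) == len(b) == len(got):
        raise ValueError("stimulus and output files differ in length")
    return [(i, reference(x, y), g)
            for i, (x, y, g) in enumerate(zip(a, b, got))
            if g != reference(x, y)]


def demo() -> List[Tuple[int, int, int]]:
    """Check a made-up output file that contains one deliberate error."""
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp)
        write_pixels(folder / "a.txt", [3, 4])
        write_pixels(folder / "b.txt", [1, 2])
        write_pixels(folder / "out.txt", [4, 7])  # second pixel should be 6
        return check_against_reference(folder / "a.txt", folder / "b.txt",
                                       folder / "out.txt", lambda x, y: x + y)
```

Keeping stimulus and results in plain text files means the same files can drive the simulator and the reference model, so a mismatch can be traced back to a specific pixel index.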
We created a test bench to feed satellite data into the system using file input/output: the data are the text files described above, which contain the pixel values. Figure 2 shows the Register Transfer Level (RTL) schematic of the first part, the second part, and the final result of the algorithm. The figure shows the input signals outlar (T4), outlar2 (T5), outlar3 (W), and outlar4 (ε), together with LST1 and LST2; all of these are satellite data read from text files containing pixels coded in 16 bits. The outputs carry the result of the first part (LST1), the result of the second part (LST2), and the LST result.
Register Transfer Level (RTL) schematic
Fig. 2. Register Transfer Level (RTL) schematic of the system
Figures 3, 4, and 5 show the simulation results of the proposed implementation of the LST-SW algorithm for each part. In these simulations, all data are sent into the system as binary numbers coded in 16 bits by the test bench, which reads the image files and was written specifically to replace image acquisition. The data are buffered by FIFOs. A FIFO is a structure used in hardware applications to buffer data; it behaves like a pipe in which the first element to enter is the first element to leave. For LST1 (see Figure 3), we send the first element of image T4 (outlar) into FIFO 1 and the first element of image T5 (outlar2) into FIFO 2, compute the result for this first pixel, and store it in the first line of the output. This operation continues until the last element.
Fig. 3. Simulation results for LST1
In the same way, to calculate LST2 (see Figure 4), we send the first element of image W (outlar3) into FIFO 1 and the first element of ε (outlar4) into FIFO 2, compute the result for this first pixel, and store it in the first line of the output. This operation continues until the last element. Figure 5 shows the result of adding the two parts. In the same way, we transfer the result of LST1 into FIFO 1 and the result of LST2 into FIFO 2, add the first pixel from each FIFO, and store the sum. This operation continues until the last pixel.
Fig. 4. Simulation results for LST2
LST-SW FPGA implementation results
In this subsection, we report the resources used by the proposed FPGA implementation. Table 1 shows the resources used by our hardware implementation of the proposed LST algorithm design on the Virtex-5 LX50T FPGA. This FPGA has a total of 7,200 slices, 28,800 slice registers, and 28,800 six-input look-up tables. In addition, the FPGA includes heterogeneous resources, such as 48 DSP48 slices. In our implementation, we took advantage of these resources to optimize the design: slice registers are used to implement the FIFOs without using block RAMs, while other slices implement the LST-SW algorithm together with the DSP48 multipliers. As expected, DSP usage is comparatively high because each part (LST1 and LST2) involves calculations that consume a large share of the FPGA resources.
Conclusion
Choosing an architecture for embedded image processing is not trivial: a compromise must be found between flexibility, power consumption, performance, cost, and speed of design in terms of Time-To-Market (TTM). In addition, the number of remote sensing applications requiring fast algorithm analysis has been growing rapidly in recent years. In this paper, we have proposed an architecture for the FPGA implementation of the Land Surface Temperature Split-Window (LST-SW) algorithm, one of the best-known approaches in the remote sensing community for determining Land Surface Temperature from satellite data. The system was designed in VHDL using a high-level design method. All parts of the design have been simulated and implemented using Xilinx tools, targeting the Virtex-5 LX50T FPGA.
