Abstract. The plenoptic wavefront sensor combines measurements at the pupil and image planes to obtain wavefront information from different points of view simultaneously, making it possible to sample the volume above the telescope and extract tomographic information on the atmospheric turbulence. The advantages of this sensor are presented elsewhere at this conference (José M. Rodríguez-Ramos et al). This paper concentrates on the processing required for pupil plane phase recovery and on its computation in real time using FPGAs (Field Programmable Gate Arrays). This technology eases the implementation of massively parallel processing and allows the system to be tailored to the requirements while maintaining flexibility, speed and cost figures.
Introduction
The plenoptic wavefront sensor, also known as the plenoptic camera or the CAFADIS camera, was originally created to capture the "Light Field" (LF) [1], a concept extremely useful in computer graphics, where most of the optics are treated exclusively within the geometric paradigm. The use of plenoptic optics for wavefront measurement was described by Clare and Lane (2005) [4] for the case of point sources, and has been proposed for extended sources by our group [5]. After a brief description of the sensor itself, this paper describes the processing required to use the plenoptic camera as a wavefront sensor, oriented in particular towards computing the tomography of the atmospheric turbulence for adaptive optics in solar telescopes. A conceptual design is provided for its real-time implementation using FPGA technology, along with estimated figures for the relevant parameters involved, obtained both from assessment and from pilot developments completed with real components.
Description of the Plenoptic Wavefront Sensor
A microlens array with the same f-number as the telescope is placed at its focus (fig 1), in such a way that many pupil images are obtained at the detector, each of them representing a slightly different point of view. Using the same f-number guarantees that each pupil image is as large as possible without overlapping its neighbours, thus making optimum use of the detector surface. To understand the behavior of the plenoptic wavefront sensor, it is important to identify the direct relationship between every pupil point and its corresponding image under each microlens. Every pupil coordinate is imaged through each microlens in such a way that all rays passing through the indicated square will arrive at one of the pupil images, depending only on the angle of arrival. This means that the image that would be obtained if the pupil were restricted to this small square can be reconstructed by post-processing the plenoptic image: the value at the corresponding coordinate is selected under every microlens, and an image is built from all of them. Figure 2 outlines the image obtained at the detector together with the relevant parameters involved: the total number of detector pixels $N_{\mathrm{pixel}}^2$, the number of microlenses $N_{\mu L}^2$, and the number of pupil samples $N_{\mathrm{pupil}}^2$, the three being roughly related by $N_{\mathrm{pupil}} = N_{\mathrm{pixel}} / N_{\mu L}$. It also shows the pixel closest to the pupil coordinate being sampled and its eight neighbours involved in the interpolation described below.
Correspondence between pupil and pupil images. Every pupil coordinate is re-imaged on the corresponding position of each pupil image, depending on the arriving angle of the incoming ray. Relevant parameters are shown.
Pupil wavefront computation
Recovering the pupil plane phase distribution will generally start by correcting the detector images for zero level and gain equalization (bias and flat). The figure depicts the images of the pupil of the VTT telescope at the Observatorio del Teide, Canary Islands, where the spider and some vignetting can be easily recognised.
Image recomposition will follow in order to extract the images coming from every pupil coordinate. Pixel rearrangement consists of collecting all pixels located at the same relative position within each pupil image, with some optional interpolation, and building an image with them. This image is (roughly) the image formed by the rays passing through this part of the aperture, and thus it represents the phase at this aperture point. The figure shows copies of the image of solar granulation with the intensity reduced by the presence of the spider, as a clue for the reader to understand the processing.
Fig. 3. Pupil wavefront processing outline.
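The pixel rearrangement can be illustrated with a minimal Python sketch. It assumes an idealised case that the text does not rely on: perfect registration between microlenses and pixels, with an integer number of pixels per microlens, so that no interpolation is needed. The function name `recompose` and the array layout are hypothetical choices made only for this illustration.

```python
import numpy as np

def recompose(plenoptic, n_ulens):
    """Rearrange a plenoptic frame into one image per pupil coordinate.

    Illustrative assumption: every microlens covers exactly
    n_pupil x n_pupil pixels, with n_pupil = n_pixel // n_ulens.
    Returns an array of shape (n_pupil, n_pupil, n_ulens, n_ulens);
    out[u, v] is the image seen through pupil coordinate (u, v).
    """
    n_pixel = plenoptic.shape[0]
    if plenoptic.shape != (n_pixel, n_pixel) or n_pixel % n_ulens:
        raise ValueError("expected a square frame with an integer "
                         "number of pixels per microlens")
    n_pupil = n_pixel // n_ulens
    # Split each axis into (microlens index, position inside the pupil image).
    blocks = plenoptic.reshape(n_ulens, n_pupil, n_ulens, n_pupil)
    # Move the pupil coordinate to the front: (u, v, lens_row, lens_col).
    return blocks.transpose(1, 3, 0, 2)
```

For example, a 6x6 frame with 3 microlenses per line gives four 3x3 images, one per pupil coordinate. With real hardware the ratio of pixels to microlenses is generally not an integer, which is why the design described below interpolates.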
Slopes in the wavefront can be computed by estimating the relative displacement between images, basically by correlation, owing to the extended nature of the solar image and its lack of contrast. One of the images will be used as a reference, preferably the one that is read out first. Finally, the reconstruction step from the slopes to the wavefront is addressed in a separate work at this conference (José J. Díaz et al).
Image recomposition
All modules have been conceptually designed to deal with the stream of pixels coming out of the detector following a conventional column/line scheme, in order to minimize the overall latency and the amount of memory needed for intermediate results. Processing is organised so that it is carried out as soon as the data are available and finishes shortly after the last pixel is read out.
The image recomposition is planned to implement a 3x3 kernel interpolation in order to cope with the quite probable misregistration between the microlenses and the detector pixels. This capability will greatly relax the alignment procedure when installing the microlens array, and will also allow the use of commercially available combinations of microlens pitches and detector sizes. With this conceptual scheme, every pupil sample will be computed by interpolating over the nine pixels around the theoretical position of the pupil coordinate in every detector image of the pupil. The positions of the centers of the microlens images will be measured for every pupil image, and the weighting factors for every 3x3 interpolating filter will be computed off-line and stored in dynamic RAM, because a rather large number of them will be needed.
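As a rough illustration of the 3x3 interpolation, the following Python sketch computes one pupil sample as a weighted sum of the nine pixels around the nearest detector pixel. The text does not specify how the off-line weights are derived; the separable linear (bilinear) weights used here are purely an assumption for the example, and the names `interpolate_sample` and `bilinear_weights` are hypothetical.

```python
import numpy as np

def interpolate_sample(frame, row, col, weights):
    """Weighted sum of the 3x3 neighbourhood centred on pixel (row, col).

    (row, col) must be at least one pixel away from the frame border.
    """
    patch = frame[row - 1:row + 2, col - 1:col + 2]
    return float(np.sum(patch * weights))

def bilinear_weights(dy, dx):
    """Assumed 3x3 weights for a sub-pixel offset (dy, dx), each in [-1, 1].

    The offset is measured from the centre pixel to the theoretical sample
    position. This is one possible choice, not the one used by the authors.
    """
    def taps(d):
        # Weights for the previous, centre and next pixel along one axis.
        return np.array([max(-d, 0.0), 1.0 - abs(d), max(d, 0.0)])
    return np.outer(taps(dy), taps(dx))
```

In a setup like the one described, the weights would be computed once from the measured microlens centers and only the weighted sum would run at pixel rate.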
The figure shows the conceptual design of the real-time interpolator, capable of handling the readout stream and performing the interpolation at pixel rate. Every new pixel will "participate" in the nine (3x3) interpolation computations of its neighbours. The processing follows a row-based scheme in which the present row, the previous one and the next one are computed simultaneously. The pixel data are fed to three multiply-accumulator modules, where the interpolation is computed with the weights loaded off-line and the intermediate results are updated. Once all neighbours have been accounted for, the interpolation result is available at the module output. The relevant memory sizes are included in the figure, each in its respective box.
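The streaming behaviour can be modelled in software with the short Python sketch below: pixels arrive in raster order, each one is multiplied by its preloaded weight and added to the accumulator of every sample it contributes to, and a sample is emitted once its ninth neighbour has arrived. This is only a behavioural model under assumed data structures (a lookup table `contrib` and the hypothetical function `stream_interpolate`); it does not reproduce the three-row multiply-accumulator architecture or the memory organisation of the figure.

```python
from collections import defaultdict

def stream_interpolate(pixels, width, centers, weights):
    """Accumulate 3x3 interpolations while pixels arrive in raster order.

    pixels  : iterable of pixel values, row by row
    width   : number of pixels per detector row
    centers : list of (row, col) pixels nearest to each pupil sample,
              assumed to lie at least one pixel inside the frame
    weights : list of 3x3 nested lists, one per sample (loaded off-line)
    Yields (sample_index, value) as soon as a sample is complete.
    """
    # Off-line step: for every pixel, the samples it feeds and the weights.
    contrib = defaultdict(list)
    for k, (r, c) in enumerate(centers):
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                contrib[(r + dr, c + dc)].append((k, weights[k][dr + 1][dc + 1]))

    acc = [0.0] * len(centers)        # running multiply-accumulate results
    remaining = [9] * len(centers)    # neighbours still to arrive
    for n, value in enumerate(pixels):
        pos = divmod(n, width)        # (row, col) of the incoming pixel
        for k, w in contrib.get(pos, ()):
            acc[k] += w * value
            remaining[k] -= 1
            if remaining[k] == 0:
                yield k, acc[k]
```

Because each result is emitted one row and one pixel after its centre pixel, the model shows why the last interpolation completes shortly after the last detector pixel is read.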
Correlation-based slope estimation
The correlation system (Fig. 5) has been designed to compute 5x5 samples of the cross-correlation function near its peak. (With quadratic interpolation, displacements of up to ±1.5 pixels will be measured accurately, and up to ±2.5 pixels with progressively diminished linearity.) Every pupil sample will therefore be involved in only 25 correlation values for each pair of images being cross-correlated. A two-line memory accepts samples of the reference and of the images. The multiply-accumulator module will check (asynchronously) whether any two matching samples are available and, if so, multiply them, accumulate the product and clear the position. The processing has been divided into five simultaneous calculators because the pixel rate is expected to be only a few times slower than the FPGA clock, and 25 MAC operations are considered too many to complete within one pixel period. Quadratic interpolation is needed to obtain subpixel displacement information, and involves the peak value and its closest neighbours along both Cartesian axes. As the quadratic interpolation requires a division, a number of parallel modules may be needed to meet the latency requirements, the number being a compromise between speed and the FPGA resources used. FIFO memories are recommended to decouple both the inputs and the outputs of the quadratic interpolators.
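The slope estimation can be sketched in Python as follows: a 5x5 window of cross-correlation values is computed for shifts of -2 to +2 pixels on each axis, and the peak is refined with a three-point parabolic fit along each axis. The function names are hypothetical, the frame-based NumPy formulation is only a reference model (not the streaming design of Fig. 5), and details such as mean subtraction or normalisation, which the text does not discuss, are left out.

```python
import numpy as np

def correlation_5x5(reference, image):
    """Cross-correlation values for shifts of -2..+2 pixels on each axis.

    corr[dy + 2, dx + 2] = sum over y, x of
        reference[y, x] * image[y + dy, x + dx],
    restricted to the region where all 25 shifts stay inside the image.
    """
    h, w = reference.shape
    core = reference[2:h - 2, 2:w - 2]
    corr = np.zeros((5, 5))
    for dy in range(-2, 3):
        for dx in range(-2, 3):
            shifted = image[2 + dy:h - 2 + dy, 2 + dx:w - 2 + dx]
            corr[dy + 2, dx + 2] = np.sum(core * shifted)
    return corr

def parabolic_offset(left, peak, right):
    """Vertex of the parabola through three equally spaced samples."""
    denom = left - 2.0 * peak + right
    return 0.0 if denom == 0 else 0.5 * (left - right) / denom

def subpixel_shift(corr):
    """Estimate the (dy, dx) displacement from a 5x5 correlation window."""
    py, px = np.unravel_index(np.argmax(corr), corr.shape)
    # Clamp so that the three-point fit stays inside the 5x5 window.
    py = min(max(int(py), 1), 3)
    px = min(max(int(px), 1), 3)
    dy = parabolic_offset(corr[py - 1, px], corr[py, px], corr[py + 1, px])
    dx = parabolic_offset(corr[py, px - 1], corr[py, px], corr[py, px + 1])
    return py - 2 + dy, px - 2 + dx
```

The single division in `parabolic_offset` is the operation that, in hardware, motivates the parallel interpolator modules mentioned above.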
Pilot development
In order to verify the viability of the proposed concept, and also to gain insight into the nature of the practical problems involved, a pilot laboratory development was undertaken using real components: a PULNIX RM-4200GE CCD camera (2048x2048 pixels, GigE output), a microlens array of 130 microns pitch and 8.3 mm focal length (SUSS MicroOptics) and an imaging lens of 300 mm. The FPGA processing was developed using a Xilinx ML401 board with an SX-35 Virtex-4 chip. Modules were developed for accepting the GigE input stream and the array of microlens centers, for the image recomposition, and for real-time direct display using the available VGA output. Figure 6 shows a screen capture of the VGA display. In the top left corner the plenoptic image is shown, reduced from its original size to 256 x 256 pixels by numerical binning. In the top right corner a user-selectable subwindow of the original plenoptic image is displayed; in this case it corresponds to the center. The centers of the microlenses are also displayed on the image (in red in the original display). Finally, five images are recomposed and displayed in real time from five different pupil coordinates, located at the optical axis (center) and at 80% of displacement towards the right, left, top and bottom. It can be observed that the center image receives more light than the others, as expected.
Conclusions
A conceptual design for the real-time processing required by the plenoptic wavefront sensor has been outlined, driven by the need to obtain minimum latency with a reasonable amount of memory and FPGA resources. The estimates given in the results table provide the main figures for a typical system, and absolute numbers have been added in the last column for the case of a 2048x2048 detector with 200 microlenses per line, with a readout pixel speed of several tens of MHz and an FPGA clock ($t_{ck}$) of 100 to 150 MHz.
