Real-Time Refocusing using an FPGA-based Standard Plenoptic Camera by Hahne, Christopher et al.
IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS 1
Real-Time Refocusing Using an FPGA-Based
Standard Plenoptic Camera
Christopher Hahne, Andrew Lumsdaine, Senior Member, IEEE, Amar Aggoun,
and Vladan Velisavljevic, Senior Member, IEEE
Abstract—Plenoptic cameras are receiving increased attention in scientific and commercial applications because they capture the entire structure of light in a scene, enabling optical transforms (such as focusing) to be applied computationally after the fact, rather than once and for all at the time a picture is taken. In many settings, real-time interactive performance is also desired, which in turn requires significant computational power due to the large amount of data required to represent a plenoptic image. Although GPUs have been shown to provide acceptable performance for real-time plenoptic rendering, their cost and power requirements make them prohibitive for embedded uses (such as in-camera). On the other hand, the computation to accomplish plenoptic rendering is well structured, suggesting the use of specialized hardware. Accordingly, this paper presents an array of switch-driven finite impulse response filters, implemented with an FPGA to accomplish high-throughput spatial-domain rendering. The proposed architecture provides a power-efficient rendering hardware design suitable for full-video applications as required in broadcasting or cinematography. A benchmark assessment of the proposed hardware implementation shows that real-time performance can readily be achieved, with one order of magnitude performance improvement over a GPU implementation and three orders of magnitude performance improvement over a general-purpose CPU implementation.
Index Terms—.
Manuscript received June 11, 2017; revised January 10, 2018; accepted March 1, 2018. This work was supported by the EU’s 7th Framework Programme under Grant EU-FP7 ICT-2010-248420. (Corresponding author: Christopher Hahne.)
C. Hahne is with the trinamiX GmbH (BASF), Ludwigshafen 67063, Germany (e-mail: info@christopherhahne.de).
A. Lumsdaine is with the Northwest Institute for Advanced Computing, University of Washington, Seattle, WA 98195 USA (e-mail: al75@uw.edu).
A. Aggoun is with the School of Mathematics and Computer Science, University of Wolverhampton, Wolverhampton WV1 1LZ, U.K. (e-mail: a.aggoun@wlv.ac.uk).
V. Velisavljevic is with the School of Computer Science and Technology, University of Bedfordshire, Luton LU1 3JU, U.K. (e-mail: vladan.velisavljevic@beds.ac.uk).
This paper has supplementary downloadable material available at http://ieeexplore.ieee.org, provided by the author.
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TIE.2018.2818644
I. INTRODUCTION
Over the last two decades, several studies have reported methods to computationally render varyingly focused images from a single lightfield photograph [1]–[8]. In addition to
spatial information, lightfields contain directional information, acquired by capturing an array of two-dimensional (2-D) spatial images either with multiple conventional cameras [1], [9]–[11] or by attaching a microlens array (MLA) to a single image recording device [2], [12], [13]. In science, lightfield cameras are also known as plenoptic cameras, a name derived from the Latin and Greek roots meaning “full view” [13], [14]. For industrial applications, MLAs are preferred to simple pinholes or coded-aperture patterns due to improved light-gathering capability and to multiaperture systems due to their compact form factor. A study carried out by Ng et al. [15] found that the maximum directional information is recorded when placing the microlenses one focal length away from the image sensor. However, a follow-up study reinvestigated this and showed that it is possible to flexibly trade off directional and spatial resolution by shifting the MLA with respect to the sensor [4], [16]. In this paper, we refer to the former design as the standard plenoptic camera (SPC) and the latter as the focused plenoptic camera (FPC). While researchers have developed a number of approaches to plenoptic camera design [17], [18], the rendering (or focusing) process remains computationally intensive, posing a core challenge to the computer vision field.
One motivating performance-sensitive industrial application for plenoptic cameras is cinematography, where the use of plenoptic source video can greatly enhance the flexibility and creativity in capture and production. For example, since the optical parameters are not irrevocably set at the time the video is captured, focus or depth of field can easily be adjusted in postproduction. Moreover, new creative effects can be applied, including nonphysical optical effects. Plenoptic video can also be used to create stereo pairs for three-dimensional (3-D) viewing, with the important advantage over stereo capture that different videos can be created for different devices, each having parallax suited for the particular device [19]. Finally, 2-D and 3-D production can use significantly different effects for directing the viewer’s attention (depth of field is not as useful in 3-D as in 2-D, for example). With plenoptic source video, 2-D and 3-D can be rendered from the same source, with different creative effects for each. We note that Lytro, one of the earliest manufacturers of plenoptic cameras, has recently announced a video lightfield camera for the broadcast and cinematography market [20]. In any of these scenarios, high rendering performance is essential. For preview and for postproduction, rendering of each video frame must be accomplished at the video frame rate, regardless of the effects and adjustments being applied.
0278-0046 © 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications standards/publications/rights/index.html for more information.
An early attempt at high-performance rendering was based on the projection slice theorem, which rendered images with lower dimensional slices of the lightfield in the Fourier domain [3], [21]. This procedure is also known as Fourier slice photography (FSP). Although FSP has the potential to be efficient when rendering a large number of focused images from the same lightfield, there are significant overheads in this approach that limit its practical application. Real-time rendering in the spatial domain has been achieved with graphics processing units (GPUs) [22], but the cost and power associated with GPUs make their use impractical in embedded settings, for example. Accordingly, it is the goal of this study to devise and demonstrate a special-purpose hardware architecture that performs real-time rendering in the spatial domain based on serially incoming video frames. We propose an array of semisystolic finite impulse response (FIR) filters designed for high data throughput. Moreover, we realize the rendering convolution kernel in FIR fashion by introducing switches to the filter distribution network. For power efficiency and configuration flexibility, the proposed design is implemented with a field programmable gate array (FPGA). As distinguished from previous studies, our hardware design accomplishes a computation time of less than 100 μs for a single refocused frame with 3201-by-3201 pixel resolution when running at 100-MHz pixel clock frequency. This outperforms earlier studies in the field, which we further demonstrate with benchmarks against a GPU and a CPU MATLAB implementation.
The organization of this paper is as follows. Section II presents recent developments in the field of FSP and SPC lightfield modeling to serve as a starting point for refocusing in the spatial domain. Section III imposes requirements on the filter module architecture and presents a solution based on switch-driven FIR filters. The proposed hardware design is examined in Section IV, using a hardware description language (HDL) for FPGAs (see supplementary material) and by benchmarks with an alternative GPU-based implementation. Conclusions and suggestions for further work are presented in Section V.
II. RELATED WORK
A. Background
A lightfield can be retrieved by light rays intersecting two consecutively placed 2-D planes of known relative position [9]. Intersections of a single ray at two 2-D planes yield four coordinates in total, thus making up a four-dimensional (4-D) light ray parametrization. Because of its simplicity, this conceptual model has gained popularity among scientists in the field of computer vision. A related one-plane parameterization based on position and angle can also be used [4], [16]. In the celebrated work by Ng et al. [3], a raw captured 4-D lightfield is transformed to the Fourier domain to achieve refocusing using the projection-slice theorem. Unfortunately, the process of taking Fourier transforms, interpolating for slicing, and then taking inverse transforms introduces significant computational overhead, making FSP unsuitable for real-time rendering. This assumption was confirmed by Mhabary et al. [21], who have worked to advance FSP by employing a fractional Fourier transform. However, the authors conclude that the integral projection operator in the spatial domain is faster when computing only a single refocused image from a lightfield. The suitability of refocusing in the spatial domain was further confirmed by Lumsdaine et al., who demonstrated real-time rendering performance using GPU hardware [22]. For these reasons, our approach in this paper is based on rendering in the spatial domain.
The main concept of computation time improvements using FPGAs builds on the principle of parallelization and pipelining [23]. A pipeline comprises chained processor blocks fed with serialized data that are processed sequentially. Speedup is obtained by processing data chunks in one processor unit while subsequent data chunks are handled in preceding units. Hence, the benefit of pipelining is that serialized data chunks are processed at the same time while processor units perform different tasks. While data serialization limits a specific task to be computed with one single operation at a time, e.g., one pixel after another, parallelized data streams allow a computing system to perform at least two operations of the same type simultaneously. Parallelization can be thought of as duplicating processor pipelines, which requires synchronized parallel data streams as input signals. Letting the degree of parallelization be ι, the computation time in image processing may be minimized to $O(K^2/\iota)$ if 2-D image dimensions consist of K samples each and provided that both computation systems run at the same clock frequency. Consequently, the one-dimensional (1-D) parallelization limit is reached where ι = L for image rows and ι = K for image columns, which is the ideal scenario in terms of parallelizing data processes.
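As a rough illustration of the scaling argument above, the following Python sketch counts processing steps for a K-by-L image when ι pipelines each consume one pixel per clock cycle. The function names are hypothetical, and pipeline fill latency is deliberately ignored, so this is only a back-of-the-envelope model rather than a description of the proposed hardware.

```python
def serial_steps(K, L):
    """Steps needed when a single pipeline handles one pixel per clock cycle."""
    return K * L


def parallel_steps(K, L, iota):
    """Steps needed when iota identical pipelines run side by side.

    Assumption: the pixels are split evenly across the pipelines and
    pipeline fill latency is neglected.
    """
    if iota < 1:
        raise ValueError("degree of parallelization must be at least 1")
    # Integer ceiling of (K * L) / iota.
    return (K * L + iota - 1) // iota


if __name__ == "__main__":
    K = L = 3201
    for iota in (1, 3, L):
        print(iota, parallel_steps(K, L, iota))
```

With ι = L the step count drops to K, which corresponds to the 1-D parallelization limit described in the text.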
Early work in the field of embedded plenoptic imaging was reported by Rodríguez-Ramos et al. [24], who employed an FPGA to process plenoptic data with the aim of analyzing wavefront measurements. Another interesting approach, reported by Wimalagunarathne et al. [25], proposed a design to render computationally focused photographs from a set of multiview images using infinite impulse response filters. Work on real-time rendering from FPC captures was presented in [22]. The first reported hardware design for performing real-time rendering from SPC captures was presented by Hahne et al. [6]. Shortly thereafter, Pérez et al. [7] published an article addressing the same topic. The authors demonstrated significant computation time improvements compared with run times based on a central processing unit (CPU) system that was programmed using an object-oriented language. A theoretical comparison of our method with that of Pérez et al. [7] is carried out at the end of Section III.
B. SPC Ray Model
Development of a computationally efficient refocusing algorithm requires knowledge about the ray geometrical properties in a plenoptic camera. To conceive a refocusing hardware architecture in the spatial domain, we employ a ray model reported by Hahne et al. [8], which is based on paraxial optics. The model is depicted in Fig. 1 and builds on the assumption that the image sensor plane and MLA are separated by one focal length $f_s$ such that the MLA is focused to infinity, which is in accordance with Ng’s concept of a plenoptic camera [15]. To understand lightfield imaging in an SPC, as in the Lytro setup [20], one may regard a main lens image of an object plane to be focused on the MLA plane. In this case, the focused light rays converge to the microlens and diverge when leaving it to form a microimage (see Fig. 1). A pixelated light-sensitive detector placed behind the MLA captures angular portions of the incident divergent beam. Each angular sample in this microimage corresponds to the same focused spatial point in space observed from different views. This point’s intensity is recovered when integrating all microimage samples.
Fig. 1. SPC ray model (borrowed from [8]) with microlens chief rays traveling through the MLA plane s and main lens plane U, which is depicted as a thin lens. Lightfield intensities captured at the sensor plane are denoted as $E_{f_s}[s_j, u_{c+i}]$ for the 1-D case. Chief ray colors in a microimage indicate angular samples $u_{c+i}$.
Fig. 2. Processing requirements for the hardware architecture. The diagram shows exemplary input illuminance values $E_{f_s}$ (see Fig. 1) subdivided into microimages $s_j$ and synthesized output values $E'_a$ at a desired refocused image plane a.
We denote a lightfield captured by an SPC in the following way. For clarity, only the horizontal cross-section is regarded hereafter. In the angular domain u, we start counting samples from microimage centers (MICs), which serve as reference positions c = (M − 1)/2 where M denotes a consistent total number of samples for each microimage in one dimension. Microimages are seen to be radially symmetric and horizontally indexed by c + i, with i ∈ [−c .. c]. Horizontal lightfield positions are then given as $[s_j, u_{c+i}]$ with j as the 1-D index of a respective microlens $s_j$. All microimages together form a lightfield image with its cross-sectional representation $E_{f_s}[s_j, u_{c+i}]$ where $E_{f_s}$ denotes a pixel’s illuminance. As demonstrated in [8], a horizontal cross-section of a lightfield image can be refocused by employing
$$E'_a[s_j] = \sum_{i=-c}^{c} \frac{1}{M} E_{f_s}\left[s_{j+a(c-i)},\, u_{c+i}\right], \quad a \in \mathbb{Q} \tag{1}$$
where a adjusts the synthetic focus. Equation (1) can also be applied to the vertical dimension.
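To make the summation in (1) concrete, the sketch below evaluates it in Python for a 1-D lightfield stored as a list of microimages. It is only a minimal illustration under stated assumptions: the function name `refocus_1d` is hypothetical, a is restricted to integers so that every shifted index is a valid microlens index, M is odd, and samples that fall outside the lightfield contribute zero.

```python
def refocus_1d(lightfield, a):
    """Evaluate Eq. (1) for an integer refocusing parameter a.

    lightfield[j][u] holds the illuminance under microlens j at angular
    sample u, with u = c + i and c = (M - 1) / 2 (M assumed odd).
    Out-of-range microlens indices are treated as zero (an assumption).
    """
    num_lenses = len(lightfield)
    M = len(lightfield[0])
    c = (M - 1) // 2
    output = []
    for j in range(num_lenses):
        acc = 0
        for i in range(-c, c + 1):
            src = j + a * (c - i)          # shifted microlens index s_{j+a(c-i)}
            if 0 <= src < num_lenses:
                acc += lightfield[src][c + i]
        output.append(acc / M)             # equal weighting 1/M
    return output


if __name__ == "__main__":
    lf = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    print(refocus_1d(lf, 0))  # each output is the mean of one microimage
    print(refocus_1d(lf, 1))  # samples are taken across neighboring microimages
```

For a = 0 every output pixel is simply the average of its own microimage, which matches the statement that a point's intensity is recovered by integrating all microimage samples.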
Since images acquired by an SPC do not feature the $E_{f_s}[s_j, u_{c+i}]$ notation, it is convenient to define an index translation formula considering the lightfield photograph to be of two regular sensor dimensions $[x_k, y_l]$ as if taken by a conventional sensor. Indices are then converted by
$$k = j \times M + c + i \tag{2}$$
in the horizontal dimension, meaning that $[x_k]$ is formed by $[x_{j \times M + c + i}]$ to replace $[s_j, u_{c+i}]$. This concept of index translation may be similarly extended to the vertical domain.
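The index translation in (2) and its inverse can be written in a few lines of Python. The helper names below are hypothetical and M is assumed to be odd so that c is an integer.

```python
def to_sensor_index(j, i, M):
    """Eq. (2): map microlens index j and angular offset i to sensor index k."""
    c = (M - 1) // 2
    return j * M + c + i


def from_sensor_index(k, M):
    """Inverse mapping: recover (j, i) from sensor index k."""
    c = (M - 1) // 2
    j, u = divmod(k, M)   # u = c + i is the position inside the microimage
    return j, u - c


if __name__ == "__main__":
    M = 3
    for k in range(6):
        j, i = from_sensor_index(k, M)
        print(k, (j, i), to_sensor_index(j, i, M))
```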
III. FILTER DESIGN
An efficient hardware design that enables an FPGA to refocus in real time may be conceptualized on the basis of the lightfield ray model presented in Section II. The upper data line of Fig. 2 depicts discrete and quantized illuminance values $E_{f_s}[x_k]$ of a single horizontal row that is part of a calibrated lightfield image. Lightfield calibration implies MIC detection and rendering procedures to obtain a consistent microimage size (M). The computational refocusing synthesis given in Section II reveals that pixels involved in the integration process expose interleaved neighborhood relations, which exclusively depend on a. This phenomenon is illustrated by the data flow diagram in Fig. 2, where respective pixels are highlighted for two exemplary refocusing settings: a = 0/3 and a = 2/3. Here, each color corresponds to a chief ray in the model in Fig. 1, with M = 3 where yellow represents the MIC pixel. In this section, a hardware architecture is devised that accomplishes signal processing according to (1) as depicted in Fig. 2.
On the supposition that a horizontal cross-section of a captured lightfield $E_{f_s}[x_k]$ is a linear, time-invariant system, the integral projection in (1) may be represented as a discrete FIR convolution formula. Following the $[s_j, u_{c+i}]$ to $[x_k]$ translation in Section II, 1-D refocusing can be given by
$$E'_a[x_k] = \sum_{i=0}^{M-1} \frac{1}{M} E_{f_s}\left[x_{k'+i(aM-1)}\right], \quad a \in \mathbb{Z} \tag{3}$$
with
$$k' = (k + 1) \times M - 1 \tag{4}$$
taking care of a correct integral projection, which inevitably reduces the number of samples in the rendered output image. Equation (3) aims at complying with the classical FIR filter notation, however with indices in subscripts for consistency reasons and to let x signify the domain and coordinate direction. Upon closer examination, one may notice that the impulse response is represented by a constant coefficient 1/M, which is a consequence of weighting pixels equally during the integration process. Note that i ∈ [0 .. M − 1] in the following.
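The following Python sketch evaluates (3) and (4) directly on a flat sensor row. It is meant only to illustrate the indexing: the function name is hypothetical, a is an integer as required by (3), and pixels addressed outside the row are treated as zero, which is an assumption not stated in the text.

```python
def refocus_row(row, M, a):
    """Evaluate Eqs. (3) and (4) on a flat sensor row for integer a.

    Produces len(row) // M output samples, reflecting the sample
    reduction of the integral projection. Out-of-range pixels count
    as zero (an assumption).
    """
    K = len(row)
    output = []
    for k in range(K // M):
        k_prime = (k + 1) * M - 1            # Eq. (4)
        acc = 0
        for i in range(M):
            idx = k_prime + i * (a * M - 1)  # subscript in Eq. (3)
            if 0 <= idx < K:
                acc += row[idx]
        output.append(acc / M)
    return output


if __name__ == "__main__":
    row = [1, 2, 3, 4, 5, 6, 7, 8, 9]   # three microimages with M = 3
    print(refocus_row(row, 3, 0))
    print(refocus_row(row, 3, 1))
```

For a = 0 the subscript walks backwards through one microimage, so each output is the microimage mean; for a = 1 the stride of M − 1 picks one pixel from each of M neighboring microimages.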
In contrast to (3), we seek to reproduce an output image with a resolution numerically equal to that of the raw sensor image. To compensate for sample reduction in the integral projection process, the overall sensor resolution may be retained by upsampling the spatial domain during image formation. In addition, it will be shown hereafter that our proposed upsampling scheme enables interpolation of refocused depth planes.
To break down the complexity, we devise one filtering function per refocusing slice a that qualifies for FIR filter implementation. Regardless of the microimage resolution M, a filter that computes a refocusing slice with a = 0 in the horizontal direction reads
$$E'_{0/M}[x_k] = \sum_{i=0}^{M-1} \frac{1}{M} E_{f_s}\left[x_{k-i-\mathrm{mod}(k+1,\,M)}\right] \tag{5}$$
when k ∈ {0, . . . , K − 1}. The term mod(k + 1, M) comprises a nearest-neighbor (NN) interpolation ensuring that the numerical output image resolution matches that of the input. A synthetically focused image where a = 1 is formed by
$$E'_{M/M}[x_k] = \sum_{i=0}^{M-1} \frac{1}{M} E_{f_s}\left[x_{k+i(M-1)}\right]. \tag{6}$$
Synthesis equations for different a = a′/M are retrieved by reverse engineering. Probably the most straightforward refocusing filter kernel function is given by
$$E'_{1/M}[x_k] = \sum_{i=0}^{M-1} \frac{1}{M} E_{f_s}[x_{k-i}] \tag{7}$$
which computes refocusing slice a = 1/M. When implementing (7) as an FIR filter, it becomes obvious that the number of filter taps amounts to M. A VHDL implementation using this filter type with M = 5 is provided in the supplementary material. In the following, we demonstrate a refocusing hardware architecture that is adapted to an SPC with M = 3. Then, a photograph refocused with a = 2/3 is computed by
$$E'_{2/3}[x_k] = \sum_{i=0}^{3-1} \frac{1}{3} E_{f_s}\left[x_{k-i+\left|\lceil \mathrm{mod}(k+1,\,3)/3 \rceil - 1\right| \times (i-1)}\right] \tag{8}$$
where ⌈·⌉ is the ceiling and | · | the absolute value operator. An exemplary step in the computation of $E'_{2/3}[x_k]$ would be
$$E'_{2/3}[x_3] = \frac{1}{3} E_{f_s}[x_3] + \frac{1}{3} E_{f_s}[x_2] + \frac{1}{3} E_{f_s}[x_1]. \tag{9}$$
Here, fractions 1/3 can be regarded as multipliers, denoted as $h_0$, which are identical for each pixel such that $h_0 = 1/M$. On the condition that incoming images are underexposed and clipping is prevented, it is noteworthy that multipliers are redundant and thus can be left out.
Fig. 3. 1-D semisystolic FIR filter for sub-pixel shift a = 0/3.
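As a software reference for the two simplest kernels, the Python sketch below evaluates (5) and (7) sample by sample. The function names are hypothetical, and pixels with a negative index (before the start of the row) are taken as zero, which is an assumption about boundary handling rather than something stated in the text.

```python
def _sample(row, idx):
    """Return row[idx], or zero outside the row (assumed boundary rule)."""
    return row[idx] if 0 <= idx < len(row) else 0


def filter_a0(row, M):
    """Eq. (5): slice a = 0 with nearest-neighbor upsampling by factor M."""
    return [
        sum(_sample(row, k - i - (k + 1) % M) for i in range(M)) / M
        for k in range(len(row))
    ]


def filter_a1_over_m(row, M):
    """Eq. (7): slice a = 1/M, an M-tap moving average."""
    return [
        sum(_sample(row, k - i) for i in range(M)) / M
        for k in range(len(row))
    ]


if __name__ == "__main__":
    row = [3, 6, 9, 12, 15, 18]   # two microimages with M = 3
    print(filter_a0(row, 3))          # [0.0, 0.0, 6.0, 6.0, 6.0, 15.0]
    print(filter_a1_over_m(row, 3))   # [1.0, 3.0, 6.0, 9.0, 12.0, 15.0]
```

The output of `filter_a0` shows how the mod term repeats each microimage mean M times, which is the NN interpolation that keeps the output length equal to the input length.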
A. Semisystolic Modules
Equations (5)–(8) are implemented with a systolic filter design. Systolic arrays broadcast input data to many processing elements (PEs). As shown, all wired connections in a systolic filter contain at least one latch driven by the same clock signal. Semisystolic designs omit these latches. All of the remaining designs that we consider are semisystolic, but latches can be added for systolic FPGA implementation purposes. Descriptive information about systolic arrangements can be found in [26].
A positive side effect of the systolic filter is that it can be exploited for an NN-interpolation in microimages. By letting the upsampling factor be the number of microimage samples M, the resolution loss in integral projection is compensated, since incoming and outgoing resolution are the same. Naturally, the interpolation method can be more sophisticated, which in turn requires intermediate calculations, causing delays and an increasing number of occupied logic gates. Closer inspection of (6) reveals that pixels that need to be integrated are interlaced. Thereby, gaps between merged pixels grow with ascending a and extend the filter length. The omission of pixels within gaps is realized with switches. A switch-controlled semisystolic FIR filter design of (5) with multiplier $h_0$ is depicted in Fig. 3. In this design, switch states are controlled by bits in a 2-D vector field denoted as s(a, w, p) that is given by
$$s(0/3, w, p) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \tag{10}$$
if a = 0/3. Depending on refocusing parameter a, switch state matrices s(a, w, p) contain binary numbers with columns indexed by w for the state of each switch in the FIR filter and with rows indexed by p, which loads a new row of switch states when incremented. In addition, a write enable switch helps to prevent intermediate falsified values from being streamed out.
Fig. 4. Timing diagram of FIR filter module with a = 0/3.
Fig. 5. 1-D semisystolic FIR filter for sub-pixel shift a = 1/3.
For better comprehension, a timing diagram in Fig. 4 visualizes the computational concept of the FIR design from Fig. 3. Here, the pixel clock signal is given as PCLK. Furthermore, the proposed architecture employs the doubled pixel clock PCLKx2 with a time period $T_{\mathrm{PCLKx2}} = T_{\mathrm{PCLK}}/2$ to shift and add pixel values in a single pixel clock cycle $T_{\mathrm{PCLK}}$. It is also seen that a new row of switch states is called by incrementing p every pixel clock cycle. Numbers in the data streams represent unsigned decimal 8-bit gray-scale values, which are multiplied with $h_0 = 1/3$. Pixel colors match those of the SPC ray model in Fig. 1 representing chief ray positions in microimages with M = 3. Orange color highlights interim results and red signifies 1-D refocused output data. Ovals indicate that the sum of divided microimage pixels is reflected in the output pixel $E'_{0/3}[x_k]$. The filter includes an NN-interpolation upsampling the microimage resolution by factor 3. To refocus with a = 1/3, another FIR filter module is conceived based on (7) and depicted in Fig. 5. In reference to the previous FIR filter where a = 0/3, it becomes obvious that the arrangements are identical except for different switch states. The switch state matrix s(1/3, w, p) is given by
$$s(1/3, w, p) = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} \tag{11}$$
which means that switches remain closed at all times. A corresponding timing diagram is shown in Fig. 6. Fig. 7 depicts an FIR filter according to (8), which occupies more PEs due to the fact that the distance between added pixels grows. The corresponding switch state matrix s(2/3, w, p) is as follows:
$$s(2/3, w, p) = \begin{bmatrix} 0 & 0 & 1 & 1 & 1 \\ 0 & 1 & 1 & 1 & 0 \\ 1 & 1 & 1 & 0 & 0 \end{bmatrix} \tag{12}$$
producing a filter behavior shown in Fig. 8. As Fig. 7 demonstrates, a large 1-D semisystolic filter module may imply long wires when broadcasting multiplier outputs. Long wires would cause a low-pass filter behavior in the signal transmission, which affects the readability of falling and rising edges and therefore has to be avoided. To keep wires short in the broadcast net, incoming bit words can be distributed to several synchronized latches (buffers) before being merged in adders.
Fig. 6. Timing diagram of FIR filter module with a = 1/3.
Fig. 7. 1-D semisystolic FIR filter for sub-pixel shift a = 2/3.
Fig. 8. Timing diagram of FIR filter module with a = 2/3.
Fig. 9. Parallelized 2-D processing module array with ι = 3.
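The behavior of a switch-driven filter can be explored with a small software model. The Python sketch below is one possible reading of the switch matrices, not a transcription of Figs. 3 to 8: it assumes a transposed-form adder chain in which the multiplier output is broadcast to every PE, column w gates the broadcast into PE w, and row p = k mod (number of rows) is active for pixel k. The function name `switched_fir` is hypothetical, and neither the write enable switch nor the PCLKx2 timing is modeled.

```python
def switched_fir(row, switches, h0=1.0):
    """Software model of a switch-gated, transposed-form FIR filter.

    switches[p][w] is 1 if the broadcast product is added in PE w while
    switch row p is active. Assumptions: one new switch row per pixel,
    cycling through the rows; PE w adds to the value held by PE w - 1.
    """
    num_rows = len(switches)
    num_pes = len(switches[0])
    regs = [0.0] * num_pes            # one accumulator register per PE
    output = []
    for k, pixel in enumerate(row):
        p = k % num_rows              # active switch row for this pixel
        product = h0 * pixel          # multiplier output that is broadcast
        next_regs = []
        for w in range(num_pes):
            carried = regs[w - 1] if w > 0 else 0.0
            gated = product if switches[p][w] else 0.0
            next_regs.append(carried + gated)
        regs = next_regs
        output.append(regs[-1])       # last PE drives the output
    return output


if __name__ == "__main__":
    identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # matrix (10)
    all_closed = [[1, 1, 1]] * 3                   # matrix (11)
    row = [1, 2, 3, 4, 5, 6]
    print(switched_fir(row, identity))
    print(switched_fir(row, all_closed))
```

Under these assumptions, matrix (11) yields the running sum of (7), and matrix (10) yields one complete microimage sum at every third output sample. The samples in between are zero in this model, whereas the described hardware repeats the sum via NN-interpolation and masks invalid values with the write enable switch.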
B. 2-D Module Array
The proposed FIR filter modules process data in 1-D and thus in horizontal or vertical directions only. Fig. 9 shows a 2-D construct of 1-D semisystolic processor modules to accomplish refocusing by processing data in both dimensions. In this example, the degree of parallelization amounts to ι = 3, but could be scaled as desired until limits are reached (ι = L for image rows, ι = K for image columns).
The data flow in Fig. 9 is described in the following. First, pixels coming from the sensor are fed into horizontal processor blocks representing semisystolic FIR filter modules as proposed in the previous section. All semisystolic processor modules are identical, whereas the type relies on the refocusing parameter a. In the second stage, horizontally processed data rows $E'_a[x_k, y_l]$ are delayed using skewed registers and assigned to another arrangement of semisystolic modules, making it possible to form an incoming image column (e.g., $E'_a[x_0, y_l]$). Here, demultiplexers are driven by a pixel counter to assist in the correct assignment of pixel values. This assures that pixels from different rows sharing index k are sent to the same vertical processing unit that produces an image column (e.g., $E''_a[x_0, y_l]$) of the final refocused image. For synchronization purposes, an additional array of skewed registers can be optionally placed behind column processor blocks.
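The two-stage data flow (rows first, then columns) can be mimicked in software by applying the same 1-D filter along both image axes. The Python sketch below does this with the moving-average kernel of (7). It is a functional illustration only: the function names are hypothetical, pixels before the image border are taken as zero, and the skewed registers, demultiplexers, and parallel execution of the hardware are not modeled.

```python
def moving_average(samples, M):
    """Eq. (7) along one dimension; samples before index 0 count as zero."""
    return [
        sum(samples[k - i] for i in range(M) if k - i >= 0) / M
        for k in range(len(samples))
    ]


def refocus_2d(image, M):
    """Apply the 1-D filter to every row, then to every column."""
    horizontal = [moving_average(row, M) for row in image]            # stage 1
    columns = [moving_average(list(col), M) for col in zip(*horizontal)]  # stage 2
    return [list(row) for row in zip(*columns)]                       # back to rows


if __name__ == "__main__":
    image = [[9, 9, 9], [9, 9, 9], [9, 9, 9]]
    for row in refocus_2d(image, 3):
        print(row)
```

Because both stages reuse the same 1-D routine, the sketch mirrors the point made above that the horizontal and vertical processor modules are identical.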
TABLE I
BENCHMARK OF PROPOSED ARCHITECTURE
In order to estimate the computation time, it is assumed hereafter that the hardware system refers to the ideal case of maximum parallelization where ι = L or ι = K for each dimension, respectively. Besides, it is supposed that color channels are also parallelized, causing no extra time delay. The shift and integration for a single output pixel refocused with a = 1/M takes M pixel clock cycles in 1-D when using twice the pixel clock to process them. Taking this as an example, the overall number of steps η to compute a single image $E''_{1/3}$ with K-by-L resolution is given by
$$\eta = 2(\Lambda + M) + 2(K - 1) + L - 1 \tag{13}$$
where Λ represents a single clock cycle step to compute the mathematical product of an incoming pixel value. The total computation time O for a single image can be obtained by
$$O(\eta) = \eta \times T_{\mathrm{PCLK}}. \tag{14}$$
This duration reflects the theoretical time that elapses from the moment the first pixel $E_{f_s}[x_k, y_l]$ enters the logic gate until the final output pixel $E''_a[x_k, y_l]$ is available. When pipelining the data stream, output pixels of a subsequent image arrive directly thereafter, so that the overall computation time for a single frame is represented by the delay time of the computational focusing system. Once the first refocused photograph is received, the number of remaining computational steps $\eta_{\mathrm{sub}}$ for every following image amounts to
$$\eta_{\mathrm{sub}} = L - 1 + K - 1. \tag{15}$$
To assess performance limits of the presented architecture, we performed a benchmark comparison between this approach, the FPGA-based implementation of Pérez et al. [7], and a GPU-based approach [22]. In this comparison, a 3201-by-3201 pixel image (K = L = 3201) with 291-by-291 microlenses was computationally refocused in 105.9 ms at 100-MHz clock frequency. Thereby, the microimage resolution is M = 11 and the output image resolution amounts to 589-by-589, which is less than 1/6 of the incoming image. Conversely, the proposed semisystolic method numerically preserves the incoming spatial resolution by employing an NN-interpolation in η = 1 + 11 + 3200 + 1 + 11 + 3200 + 3200 steps, yielding O(η) = 96.2 μs computation time for a single frame when running at 100-MHz pixel clock. Each subsequent frame, however, can be processed in $\eta_{\mathrm{sub}}$ = 3200 + 3200 steps, so that a new frame is available every $O(\eta_{\mathrm{sub}})$ = 64 μs. In comparison, an equivalent computation based on the GPU implementation by Lumsdaine et al. [22] takes approximately 1.38 ms on average, whereas a MATLAB implementation takes approximately 12.1 s per image on average, as seen in the overview in Table I. In this comparison, we employed the Spartan-6 XC6SLX45 chip using the ISE WebPACK design software from Xilinx. The refocusing shaders were executed on a Fermi architecture GeForce 480M GTX with 2 GB of GDDR5 RAM running at 1200 MHz, connected to a 256-bit bus [22]. For the CPU environment, we used MATLAB 7.11.0.584 (R2010b) on an Intel Core i7-3770 CPU @ 3.40 GHz without multithreading.
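The step counts and latencies quoted above follow directly from (13) to (15). The short Python sketch below reproduces the arithmetic for K = L = 3201, M = 11, Λ = 1, and a 100-MHz pixel clock; the function names are hypothetical.

```python
def steps_first_frame(K, L, M, lam=1):
    """Eq. (13): steps until the first refocused frame is complete."""
    return 2 * (lam + M) + 2 * (K - 1) + L - 1


def steps_subsequent_frame(K, L):
    """Eq. (15): steps for every following frame in the pipeline."""
    return (L - 1) + (K - 1)


def computation_time(steps, pixel_clock_hz):
    """Eq. (14): steps multiplied by the pixel clock period, in seconds."""
    return steps / pixel_clock_hz


if __name__ == "__main__":
    K = L = 3201
    eta = steps_first_frame(K, L, M=11)
    eta_sub = steps_subsequent_frame(K, L)
    print(eta, computation_time(eta, 100e6))          # 9624 steps, about 96.2 microseconds
    print(eta_sub, computation_time(eta_sub, 100e6))  # 6400 steps, 64 microseconds
```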
IV. VALIDATION
In this section, we evaluate the functionality of the proposed FPGA-based refocusing hardware design. For that purpose, the VHSIC HDL (VHDL) is used to configure the FPGA, where VHSIC stands for very high speed integrated circuit. A schematic file, generated from a VHDL compiler, is then flashed onto the FPGA chip model XC6SLX45. Fig. 10 contains a block diagram illustrating the implemented processing architecture used to validate the design proposed in the previous section. The FPGA board features high-definition multimedia interface (HDMI) connectors such that video frame transmission is accomplished using the transition minimized differential signaling (TMDS) protocol. TMDS receiver and transmitter designs have been integrated on the FPGA to perform deserialization and serialization as well as decoding and encoding tasks. Off-chip memory is used for buffering decoded and serialized video frames outside the FPGA since the amount of image data exceeds internal memory storage.
Fig. 10. Block diagram (borrowed from [6]) for experimental validation. Single arrows denote serialized whereas three arrows indicate parallelized data streams. Row buffers are employed to simulate data parallelization in the experiment.
TABLE II
UTILIZATION SUMMARY FOR XC6SLX45–CSG324
In our implementation, a row of switch settings is loaded
from a look-up table (LUT) every clock cycle, starting again from
the first row after the last one is reached. The switch-state
LUTs can be stored in block random-access memories
(BRAMs), which are part of the FPGA. The multiplier $h_0$ is
also integrated using on-chip memory, which is why it is
referred to as a stored product. In accordance with the TMDS protocol
specification, a decoded pixel value has a depth of 8 bits per color
channel, which yields a manageable number of 256 possible
results when dividing by M. Thus, quotients can be precalculated
for a specific divisor M and stored in one BRAM per
color channel for each image row. Note that these BRAMs are
read-only memories.
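The stored-product idea can be illustrated with a short Python sketch, given here as a software analogy rather than the VHDL used in the experiment. It precomputes the 256 quotients for a given divisor M and then replaces each division by a table lookup. The rounding mode (round to nearest) and the function names are assumptions made for illustration; the paper does not specify them.

```python
def build_quotient_rom(m, bit_depth=8):
    """Precompute v / m, rounded to nearest, for every possible pixel value v."""
    if m <= 0:
        raise ValueError("divisor m must be positive")
    # Integer-only rounding: (2*v + m) // (2*m) equals floor(v/m + 1/2).
    return [(2 * v + m) // (2 * m) for v in range(1 << bit_depth)]


def integrate_pixels(pixels, rom):
    """Sum pre-divided pixel values, mimicking a ROM lookup followed by adders."""
    return sum(rom[p] for p in pixels)


if __name__ == "__main__":
    rom3 = build_quotient_rom(3)           # read-only table for M = 3
    print(len(rom3))                       # one entry per 8-bit pixel value
    print(integrate_pixels([255, 255, 255], rom3))
    print(integrate_pixels([1, 1, 1], rom3))
```

Because each quotient is rounded before summation, the integrated result can deviate slightly from the exact mean. In this sketch, three pixels of value 1 with M = 3 sum to 0 instead of 1.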
Fig. 11. Timing diagram example from ISE simulator.
Fig. 12. Refocused photographs using the proposed architecture. (a)
$E'_{0/3}$. (b) $E'_{5/3}$. (c) $E''_{0/3}$. (d) $E''_{5/3}$. (e) $E''_{0/5}$. (f) $E''_{8/5}$.
Input and output spatial image resolutions amount to 843-by-561 pixels with M = 3 in
(a)–(d). Intermediate horizontally processed images are shown in (a)
and (b), whereas (c) and (d) depict fully refocused images after horizontal
and vertical processing with varying a. In comparison, output images in
(e) and (f) with 1405-by-935 pixel resolution expose improved synthetic
blur by using a linear interpolation of whole microimages with M = 5.
Reducing a lightfield’s angular sampling rate M extends the depth of
field [8] and leads to blur aliasing in case of angular undersampling [15].
A screenshot from an exemplary timing diagram simulation
where a = 1/3 and $T_{PCLK}$ = 60 ns is provided in Fig. 11
with the code attached to this article. This VHDL-implemented
hardware simulation shows that the filter behaves as expected,
justifying the conceived architecture. PCLKx2 can be obtained
with a phase-locked loop (PLL). An overview of the implemented
design comprising a single FIR filter with a = 1/5 is
presented in Table II, where it can be seen that inputs/outputs
(IOs) and PLLs account for by far the largest share of the power
consumption. This is due to the included HDMI transceiver, memory
controller block (MCB), and color conversion modules. Parts
of these modules may be omitted or replaced by on-board
integrated circuits (ICs) in a prototyping stage. Furthermore,
Table II indicates that adding more FIR filters for full
parallelization (maximum L and K) is noncritical to power, but
may be limited by the number of logic slices in a Spartan-6
device.

Fig. 13. (a) NN interp. $E''_{5/5}$ (while refocusing). (b) NN interp. $E''_{4/5}$
(while refocusing). (c) Lin. interp. $E''_{5/5}$ (while refocusing). (d) NN interp.
$E''_{5/5}$ (after refocusing). (e) NN interp. $E''_{6/5}$ (while refocusing). (f) Lin.
interp. $E''_{5/5}$ (after refocusing). Resolution comparison where (a), (c),
(d), and (f) show the same region refocused with a = 5/5 using different
interpolation techniques during and after shift and integration. Images
in (b) and (e) are NN-interpolated versions with varying a, indicating
significant variation of the spatial resolution when compared with (a) and
(d). Effective resolution is more consistent when using linear interpolation
[e.g., compare (d), (e), and (f)].
The presented refocusing synthesis formulas require all microimages
to be of a consistent size. This is not the case, however,
in raw lightfield photographs. As indicated by the experimental
architecture in Fig. 10, microimage cropping remains an
external process performed prior to streaming the data to the
FPGA. Embedding this process on an FPGA is essential for
prototyping, but is left for future work. To comply with the FIR filter
designs in Section III, the microimage size is reduced to M = 3
and M = 5 for comparison. Lightfield images have been acquired
by our custom-built plenoptic camera with an MLA of
281 microlenses per row and 188 per column. Details
on the camera calibration can be found in [27].
Fig. 12 depicts refocused photographs computed by the proposed
2-D module array to accomplish real-time refocusing.
Intermediate results after processing images in the horizontal
direction are seen in Fig. 12(a) and (b). Their fully refocused
counterparts are found in Fig. 12(c) and (d). Closer inspection
of Fig. 12(d) indicates aliasing in blurred regions. This is due
to an undersampled directional domain, as there are only 3-by-3
samples per microimage (M = 3) in the incoming lightfield
capture. Aliasing in synthetic image blur is an observation Ng
already pointed out in his thesis [15]. To combat the aliasing
problem, the author suggests sufficiently increasing the
microimage sampling rate M. Fig. 12(e) and (f) show refocused
images obtained from a raw capture with a native microimage
resolution of 5-by-5 pixels (M = 5) using linear interpolation
instead of NN. There, it can be seen that aliasing artifacts
are satisfactorily suppressed. A comparison of output image
resolutions using the inherent NN-interpolation of the proposed FIR
filters is provided in Fig. 13. Results in Fig. 13(a)–(f) suggest
that interpolating microimages while refocusing with $a \in \mathbb{Z}$
using (6) corresponds to a conventional 2-D image interpolation.
On the contrary, an effective resolution enhancement can
be observed when comparing Fig. 13(a), where a = 5/5, with
Fig. 13(b), where a = 4/5, which are both computed from the
same raw image using NN-interpolation. Given that the respective
objects are acceptably well covered by their depth of field and
exhibit best focus, it is possible to state that improved resolution
is obtained by refocusing with noninteger numbers ($a \notin \mathbb{Z}$).
This effective resolution variation is a consequence of the
microimage repetition and the interleaving filter kernel for the
refocusing synthesis, yielding identical values for adjacent output
pixels when $a \in \mathbb{Z}$, but varying intensities for contiguous
pixels if $a \in \mathbb{R}$. This can be seen by inspecting output data
streams $E'_a[x_k]$ of the timing diagrams in Figs. 4 and 6. To work
toward consistency in spatial resolutions for varying a, it is thus
essential to employ linear interpolation prior to distributing
microimage pixels through the FIR broadcast net. A positive side
effect of upsampling microimages is that refocused image slices
$E''_a[x_k, y_l]$ are not only interpolated in the spatial domain, but also
subsampled along depth, as demonstrated in [8].
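The difference between microimage repetition and linear interpolation can be sketched in a few lines of Python. The example assumes that interpolation acts on whole microimages as units along one image row, as the caption of Fig. 12 suggests, and that the final microimage is held at the border. Both the border handling and the function names are hypothetical and not taken from the proposed hardware design.

```python
def repeat_microimages(mics, factor):
    """NN-interpolation: every microimage is repeated `factor` times."""
    return [list(m) for m in mics for _ in range(factor)]


def blend_microimages(mics, factor):
    """Linear interpolation between neighboring microimages (last one is held)."""
    out = []
    for i, cur in enumerate(mics):
        nxt = mics[i + 1] if i + 1 < len(mics) else cur
        for k in range(factor):
            # Weight k/factor moves from the current toward the next microimage.
            out.append([c + (n - c) * k / factor for c, n in zip(cur, nxt)])
    return out


if __name__ == "__main__":
    row = [[0, 30, 60], [30, 60, 90]]   # two microimages with M = 3 pixels each
    print(repeat_microimages(row, 3))
    print(blend_microimages(row, 3))
```

With repetition, consecutive output microimages are identical, which is the source of the repeated output values discussed above, whereas blending yields a gradual transition between neighbors.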
V. CONCLUSION
This paper demonstrated methods to derive optimized FIR
refocusing filter kernels for a time- and cost-efficient hardware
implementation. Simulating the conceived architecture showed
that real-time refocusing can be accomplished with a computation
time of 96.24 μs per frame, reducing the delay time by
99.91% in comparison with a previous state-of-the-art attempt.
By interpolating microimages, it was shown how to retain the
numerical sensor resolution in refocused photographs. The proposed
architecture can serve as a groundwork for application-specific
integrated circuit chips.
A limitation of the results is that timing delays have been simulated
and need to be verified using chip analyzing tools. As the
number of required PEs grows with higher image resolutions, it
may exceed the gate count capacity of the FPGA in full
parallelization. Besides this, care needs to be taken to prevent long
wires in the broadcast net. For the hardware system’s reliability,
it is also recommended to convert semisystolic arrays into a
full-systolic architecture. To achieve consistency in microimage
size (M), microimage cropping has to be integrated as a preceding
processing stage on the FPGA chip. Furthermore, a bilinear
interpolation ought to be implemented to replace microimage
repetition (NN-interpolation) and work toward consistent effective
resolutions in refocused images, although this will cause
additional delays.
A competing design approach may conceive a refocusing
architecture based on the FSP theorem. It is, however, expected
that the Fourier transform produces larger time delays. An
alternative to an FPGA-based implementation worth considering is the
employment of a GPU, as this takes less design effort, albeit
at the cost of larger delays and higher power consumption.
Deploying the proposed design to an FPC is thought to be
impractical, since there is a fundamental difference between
SPC and FPC with regard to the optical design (number of
microlenses and focus position of the MLA). On the algorithmic
level, SPC refocusing is a pixel-based integration, whereas an
FPC requires the integration of overlapping areas of shifted
microimage patches, such that a refocusing algorithm has to be
designed specifically for the type of plenoptic camera.
REFERENCES
[1] A. Isaksen, L. McMillan, and S. J. Gortler, “Dynamically reparameterized
light fields,” in Proc. 27th Ann. Conf. Comput. Graph. Interactive Tech.,
ser. SIGGRAPH ’00. New York, NY, USA, 2000, pp. 297–306. [Online].
Available: http://dx.doi.org/10.1145/344779.344929
[2] R. Ng, M. Levoy, M. Brédif, G. Duval, M. Horowitz, and P. Hanrahan,
“Light field photography with a hand-held plenoptic camera,” Stanford
University, Tech. Rep. CTSR 2005-02, 2005.
[3] R. Ng, “Fourier slice photography,” ACM Trans. Graph., vol. 24,
no. 3, pp. 735–744, Jul. 2005. [Online]. Available:
http://doi.acm.org/10.1145/1073204.1073256
[4] A. Lumsdaine and T. Georgiev, “Full resolution lightfield rendering,”
Adobe Systems, Inc., Tech. Rep., Jan. 2008.
[5] C. Perwass and L. Wietzke, “Single lens 3D-camera with extended
depth-of-field,” Proc. SPIE, vol. 8291, Feb. 2012. [Online]. Available:
http://dx.doi.org/10.1117/12.909882
[6] C. Hahne and A. Aggoun, “Embedded FIR filter design for real-time
refocusing using a standard plenoptic video camera,” Proc. SPIE, vol. 9023,
2014. [Online]. Available: http://hdl.handle.net/10547/313167
[7] J. Pérez, E. Magdaleno, F. Pérez, M. Rodríguez, D. Hernández, and
J. Corrales, “Super-resolution in plenoptic cameras using FPGAs,”
Sensors, vol. 14, no. 5, pp. 8669–8685, 2014. [Online]. Available:
http://www.mdpi.com/1424-8220/14/5/8669
[8] C. Hahne, A. Aggoun, V. Velisavljevic, S. Fiebig, and M. Pesch,
“Refocusing distance of a standard plenoptic camera,” Opt. Exp.,
vol. 24, no. 19, pp. 21521–21540, Sep. 2016. [Online]. Available:
http://www.opticsexpress.org/abstract.cfm?URI=oe-24-19-21521
[9] M. Levoy and P. Hanrahan, “Light field rendering,” Stanford University,
Stanford, CA, USA, Tech. Rep., 1996.
[10] J. C. Yang, M. Everett, C. Buehler, and L. McMillan, “A real-time
distributed light field camera,” in Proc. 13th Eurographics Workshop
Rendering. Aire-la-Ville, Switzerland, 2002, pp. 77–86. [Online].
Available: http://dl.acm.org/citation.cfm?id=581896.581907
[11] K. Venkataraman et al., “PiCam: An ultra-thin high performance
monolithic camera array,” ACM Trans. Graph., vol. 32, no. 6, pp. 166:1–166:13,
Nov. 2013. [Online]. Available: http://doi.acm.org/10.1145/2508363.2508390
[12] G. Lippmann, “Épreuves réversibles donnant la sensation du relief,”
Académie Des Sci., pp. 446–451, Mar. 1908.
[13] E. H. Adelson and J. Y. Wang, “Single lens stereo with a plenoptic camera,”
IEEE Trans. Pattern Anal. Mach. Intell., vol. 14, no. 2, pp. 99–106, Feb.
1992.
[14] E. H. Adelson and J. R. Bergen, “The plenoptic function and the elements
of early vision,” in Proc. Comput. Models Visual Process., Cambridge,
MA, USA: MIT Press, 1991, pp. 3–20.
[15] R. Ng, “Digital light field photography,” Ph.D. dissertation, Stanford
University, Stanford, CA, USA, Jul. 2006.
[16] T. Georgiev and A. Lumsdaine, “The focused plenoptic camera,” in Proc.
Int. Conf. Comput. Photography, 2009.
[17] A. Veeraraghavan, R. Raskar, A. Agrawal, A. Mohan, and J. Tumblin,
“Dappled photography: Mask enhanced cameras for heterodyned
light fields and coded aperture refocusing,” ACM Trans. Graph.,
vol. 26, no. 3, Jul. 2007. [Online]. Available:
http://doi.acm.org/10.1145/1276377.1276463
[18] Z. Xu, J. Ke, and E. Y. Lam, “High-resolution lightfield photography using
two masks,” Opt. Exp., vol. 20, no. 10, pp. 10971–10983, May 2012.
[Online]. Available:
http://www.opticsexpress.org/abstract.cfm?URI=oe-20-10-10971
[19] C. Hahne, A. Aggoun, V. Velisavljevic, S. Fiebig, and M. Pesch, “Baseline
and triangulation geometry in a standard plenoptic camera,” Int. J. Comput.
Vis., vol. 126, no. 1, pp. 21–35, Jan. 2018.
[20] Lytro, Inc., “Lytro-press releases,” 2016. [Online]. Available:
https://www.lytro.com/press/releases/lytro-brings-revolutionary-light-field-technology-to-film-and-tv-production-with-lytro-cinema.
Accessed on: Aug. 24, 2016.
[21] Z. Mhabary, O. Levi, E. Small, and A. Stern, “Fast and exact method for
computing a stack of images at various focuses from a four-dimensional
light field,” J. Electron. Imaging, vol. 25, no. 4, 2016, Art. no. 043002.
[Online]. Available: http://dx.doi.org/10.1117/1.JEI.25.4.043002
[22] A. Lumsdaine, G. Chunev, and T. Georgiev, “Plenoptic rendering with
interactive performance using GPUs,” Proc. SPIE, vol. 8295,
pp. 829513–829513-15, 2012. [Online]. Available:
http://dx.doi.org/10.1117/12.909683
[23] D. G. Bailey, Design for Embedded Image Processing on FPGAs.
Hoboken, NJ, USA: Wiley, 2011.
[24] L. F. Rodríguez-Ramos, Y. Marín, J. J. Díaz, J. Piqueras,
J. García-Jiménez, and J. M. Rodríguez-Ramos, “FPGA-based real time processing
of the plenoptic wavefront sensor,” in Proc. Adaptative Opt. Extremely
Large Telescopes, 2010, Paper 7007.
[25] R. Wimalagunarathne, A. Madanayake, D. G. Dansereau, and L. T. Bruton,
“A systolic-array architecture for first-order 4-D IIR frequency-planar
digital filters,” in Proc. IEEE Int. Symp. Circuits Syst., May 2012,
pp. 3069–3072.
[26] Xilinx, Inc., “A 1D Systolic FIR,” 2002. [Online]. Available:
http://www.iro.umontreal.ca/aboulham/F6221/Xilinx.htm. Accessed on:
Dec. 12, 2015.
[27] C. Hahne, “The standard plenoptic camera – Applications of a geometrical
light field model,” Ph.D. dissertation, University of Bedfordshire, Luton,
U.K., Jan. 2016.
Christopher Hahne received the B.Sc. degree
from the University of Applied Sciences, Hamburg,
Germany, in 2012, and the Doctoral degree
from the University of Bedfordshire, Luton, U.K.,
in 2016, in a bursary-funded Ph.D. program.
He is affiliated with BASF subsidiary trinamiX
GmbH, Ludwigshafen, Germany, where he currently
works as the Manager of Simulation &
Software on adaptive three-dimensional sensing.
He worked at R&D departments of Rohde
& Schwarz GmbH & Co. KG, Munich, Germany,
in 2010, and Arnold & Richter Cinetechnik GmbH & Co. KG,
Munich, Germany, in 2011. Subsequently, he became a Visiting Student
with Brunel University, London, U.K., in 2012.
Andrew Lumsdaine (SM’15) is an internationally
recognized expert in the area of
high-performance computing (HPC) who has made
important contributions in many of the constitutive
areas of HPC. In particular, he has contributed
in the areas of HPC systems, programming
languages, software libraries, and performance
modeling. His work in HPC has been
motivated by data-driven problems (large-scale
graph analytics), as well as more traditional
computational science problems. In addition, outside
of the realm of HPC, he has done seminal work in the area of
computational photography and plenoptic cameras. In his career, he has authored
or coauthored more than 200 articles in top journals and conferences and
holds 15 patents. He has also contributed important software artifacts
to the research community, especially in the area of message passing
interface (MPI). He is active in a number of standardization efforts with
important contributions to the MPI specification, the C++ programming
language, and the Graph 500.
Amar Aggoun received the “Ingénieur d’état”
degree in electronics engineering from Ecole
Nationale Polytechnique d’Alger, Algiers,
Algeria, and the Ph.D. degree in electronic
engineering from the University of Nottingham,
Nottingham, U.K.
He is currently the Head of School of Mathematics
and Computer Science and Professor
of Visual Computing with the University of
Wolverhampton, Wolverhampton, U.K. His
academic career started at the University of
Nottingham, where he held the positions of Research Fellow in low
power DSP architectures and Visiting Lecturer in electronic engineering
and mathematics. In 1993, he joined De Montfort University as a
Lecturer and progressed to the position of Principal Lecturer in 2000.
In 2005, he joined Brunel University as a Reader with Information
and Communication Technologies. From 2013 to 2016, he was at the
University of Bedfordshire as the Head of School of Computer Science
and Technology. He was also the Director of the Institute for Research
in Applicable Computing, which oversees all the research within the
School. His research is mainly focused on three-dimensional (3-D)
Imaging and Immersive Technologies, and he successfully secured
and delivered research contracts worth in excess of 6.9M, funded by
Research Councils UK, Innovate UK, the European Commission,
and industry. Among the successful projects, he was the initiator and
the principal coordinator and manager of a project sponsored by the
EU-FP7 ICT-4-1.5-Networked Media and 3-D Internet, namely live
immerse video-audio interactive multimedia. He holds three filed patents,
has authored or coauthored more than 200 peer-reviewed journal and
conference publications, and has contributed to two white papers for the
European Commission on the future internet.
Dr. Aggoun also served as an Associate Editor for the IEEE/OSA
JOURNAL OF DISPLAY TECHNOLOGIES.
Vladan Velisavljevic (M’06–SM’12) received
the Ph.D. degree in the field of signal and image
processing from École Polytechnique Fédérale
de Lausanne (EPFL), Lausanne, Switzerland, in
2005.
He has been a Reader (Associate Professor) in visual
systems engineering with the School of Computer
Science and Technology and the Head of
the Centre for Research in Signals, Sensors and
Wireless Technology, University of Bedfordshire,
Luton, U.K., since 2011. Previously, he was a Senior
Research Scientist with Deutsche Telekom Laboratories, University
of Technology Berlin, Germany, in 2006–2011, and a Doctoral Assistant
with LCAV, EPFL, Switzerland, in 2001–2005. He has authored or
coauthored more than 60 peer-reviewed journal and conference publications
and two book chapters.
Dr. Velisavljevic serves as an Associate Editor for Elsevier Signal
Processing: Image Communication and for IET Journal of Engineering
and he is a Co-Chair of the IEEE ComSoc MMTC Interest Group on 3-D
Processing and Communications. He was a General Chair of the IEEE
MMSP 2017, a Lead Guest Editor for the special issue on Visual Signal
Processing for Wireless Networks at the IEEE JOURNAL OF SELECTED TOPICS
IN SIGNAL PROCESSING in February 2015, and special session organizer
at 3DTV-Con 2015 and IEEE ICIP 2011. He was also Associate Editor
for IEEE ComSoc MMTC R-Letters and Member of the Review Board for
the IEEE ComSoc Multimedia Communications TC. He has served as a
TPC member and reviewer for a number of conferences and journals.
IEE
E P
ro
of
IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS 1
Real-Time Refocusing Using an FPGA-Based
Standard Plenoptic Camera
1
2
Christopher Hahne , Andrew Lumsdaine, Senior Member, IEEE, Amar Aggoun,
and Vladan Velisavljevic, Senior Member, IEEE
3
4
Abstract—Plenoptic cameras are receiving increased5
attention in scientific and commercial applications because6
they capture the entire structure of light in a scene, en-7
abling optical transforms (such as focusing) to be applied8
computationally after the fact, rather than once and for all at9
the time a picture is taken. In many settings, real-time inter-10
active performance is also desired, which in turn requires11
significant computational power due to the large amount12
of data required to represent a plenoptic image. Although13
GPUs have been shown to provide acceptable performance14
for real-time plenoptic rendering, their cost and power15
requirements make them prohibitive for embedded uses16
(such as in-camera). On the other hand, the computation17
to accomplish plenoptic rendering is well structured,18
suggesting the use of specialized hardware. Accordingly,19
this paper presents an array of switch-driven finite impulse20
response filters, implemented with FPGA to accomplish
Q1
21
high-throughput spatial-domain rendering. The proposed22
architecture provides a power-efficient rendering hardware23
design suitable for full-video applications as required in24
broadcasting or cinematography. A benchmark assess-25
ment of the proposed hardware implementation shows that26
real-time performance can readily be achieved, with a one27
order of magnitude performance improvement over a GPU28
implementation and three orders of magnitude performance29
improvement over a general-purpose CPU implementation.
Q2
Q3
30
Index Terms—.31
I. INTRODUCTION32
OVER the last two decades, several studies have reported33 methods to computationally render varyingly focused im-34
ages from a single lightfield photograph [1]–[8]. In addition to35
Manuscript received June 11, 2017; revised January 10, 2018; ac-
cepted March 1, 2018. This work was supported by the EU’s 7th
Framework Programme under Grant EU-FP7 ICT-2010-248420. (Cor-
responding author: Christopher Hahne.)
C. Hahne is with the trinamiX GmbH (BASF), Ludwigshafen 67063,
Germany (e-mail: info@christopherhahne.de).
A. Lumsdaine is with the Northwest Institute for Advanced Comput-
ing, University of Washington, Seattle, WA 98195 USA (e-mail: al75@
uw.edu).
A. Aggoun is with the School of Mathematics and Computer Science,
University of Wolverhampton, Wolverhampton WV1 1LZ, U.K. (e-mail:
a.aggoun@wlv.ac.uk).
V. Velisavljevic is with the School of Computer Science and Technol-
ogy, University of Bedfordshire, Luton LU1 3JU, U.K. (e-mail: vladan.
velisavljevic@beds.ac.uk).
This paper has supplementary downloadable material available at
http://ieeexplore.ieee.org, provided by the author.
Color versions of one or more of the figures in this paper are available
online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TIE.2018.2818644
spatial information, lightfields contain directional information, 36
acquired by capturing an array of two-dimensional (2-D) spatial 37
images with either multiple conventional cameras [1], [9]–[11] 38
or by attaching a micro lens array (MLA) to a single image 39
recording device [2], [12], [13]. In science, lightfield cameras 40
are also known as plenoptic cameras derived from the Latin 41
and Greek roots meaning “full view” [13], [14]. For industrial 42
applications, MLAs are preferred to simple pinholes or coded- 43
aperture patterns due to improved light-gather capability and 44
to multiaperture systems due to compact form-factor. A study 45
carried out by Ng et al. [15] has found that the maximum direc- 46
tional information is recorded when placing the microlenses one 47
focal length away from the image sensor. However, a follow-up 48
study reinvestigated this and showed that it is possible to flex- 49
ibly tradeoff directional and spatial resolution by shifting the 50
MLA with respect to the sensor [4], [16]. In this paper, we refer 51
to the former design as the standard plenoptic camera (SPC) 52
and the latter as the focused plenoptic camera (FPC). While re- 53
searchers have developed a number of approaches to plenoptic 54
camera design [17], [18], the rendering (or focusing) process 55
remains computationally intensive, posing a core challenge to 56
the computer vision field. 57
One motivating industrial performance-sensitive application 58
for plenoptic cameras is in cinematography, where the use of 59
plenoptic source video can greatly enhance the flexibility and 60
creativity in capture and production. For example, since the opti- 61
cal parameters are not irrevocably set at the time the video is cap- 62
tured, focus or depth of field can easily be adjusted in postpro- 63
duction. Moreover, new creative effects can be applied, includ- 64
ing nonphysical optical effects. Plenoptic video can also be used 65
to create stereo pairs for three-dimensional (3-D) viewing—with 66
the important advantage over stereo capture that different videos 67
can be created for different devices, each having parallax suited 68
for the particular device [19]. Finally, 2-D and 3-D production 69
can use significantly different effects for directing the viewer’s 70
attention (depth of field is not as useful in 3-D as 2-D, for exam- 71
ple). With plenoptic source video, 2-D and 3-D can be rendered 72
from the same source, with different creative effects for each. 73
We note that Lytro, one of the earliest manufacturers of plenop- 74
tic cameras, has recently announced a video lightfield camera to 75
the broadcast and cinematography market [20]. In any of these 76
scenarios, high rendering performance is essential. For preview 77
and for postproduction, rendering of each video frame must be 78
accomplished at the video frame rate, regardless of the effects 79
and adjustments being applied. 80
0278-0046 © 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications standards/publications/rights/index.html for more information.
IEE
E P
ro
of
2 IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS
An early attempt at high-performance rendering was based81
on the projection slice theorem, which rendered images with82
lower dimensional slices of the lightfield in the Fourier do-83
main [3], [21]. This procedure is also known as Fourier slice84
photography (FSP). Although FSP has the potential to be effi-85
cient when rendering a large number of focused images from86
the same lightfield, there are significant overheads in this ap-87
proach that limit its practical application. Real-time rendering88
in the spatial-domain has been achieved with graphical pro-89
cessing units (GPUs) [22], but the cost and power associated90
with GPUs make their use in embedded settings (for example)91
impractical. Accordingly, it is the goal of this study to devise92
and demonstrate a special-purpose hardware architecture that93
performs real-time rendering in the spatial-domain based on se-94
rially incoming video frames. We propose an array of semisys-95
tolic finite impulse response (FIR) filters designed for high data96
throughput. Moreover, we realize the rendering convolution ker-97
nel in FIR fashion by introducing switches to the filter distribu-98
tion network. For power efficiency and configuration flexibility,99
the proposed design is implemented with a field programmable100
gate array (FPGA). As distinguished from previous studies, our101
hardware design accomplishes a computation time of less than102
100 μs for a single refocused frame with 3201-by-3201 pixel res-103
olution when running at 100-MHz pixel clock frequency. This104
outperforms earlier studies in the field, which we further demon-105
strate with benchmarks against a GPU and a CPU MATLAB106
implementation.107
The organization of this paper is as follows. Section II presents108
recent developments in the field of FSP and SPC lightfield mod-109
eling to serve as a starting point for refocusing in spatial-domain.110
Section III imposes requirements on the filter module architec-111
ture and presents a solution based on switch-driven FIR filters.112
The proposed hardware design is examined in Section IV, us-113
ing a hardware description language (HDL) for FPGAs (see114
supplementary material) and by benchmarks with an alternative115
GPU-based implementation. Conclusions and suggestions for116
further work are presented in Section V.117
II. RELATED WORK118
A. Background119
A lightfield can be retrieved by light rays intersecting two120
consecutively-placed 2-D planes of known relative position [9].121
Intersections of a single ray at two 2-D planes yield four co-122
ordinates in total, thus making up a four-dimensional (4-D)123
light ray parametrization. Because of its simplicity, this concep-124
tual model has gained popularity among scientists in the field of125
computer vision. A related one-plane parameterization based on126
position and angle can also be used [4], [16]. In the celebrated127
work by Ng et al. [3], a raw captured 4-D lightfield is trans-128
formed to the Fourier domain to achieve refocusing using the129
projection-slice theorem. Unfortunately, the process of taking130
Fourier transforms, interpolating for slicing, and then taking in-131
verse transforms introduces significant computational overhead,132
making FSP unsuitable for real-time rendering. This assump-133
tion was confirmed by Mhabary et al. [21], who have worked to134
advance FSP by employing a fractional Fourier transform. How-135
ever, the authors conclude that the integral projection operator 136
in the spatial-domain is faster when computing only a single 137
refocused image from a lightfield. The suitability of refocusing 138
in the spatial-domain was further confirmed by Lumsdaine et al. 139
who demonstrated real-time rendering performance using GPU 140
hardware [22]. For these reasons, our approach in this paper is 141
based on rendering in the spatial-domain. 142
The main concept of computation time improvements using FPGAs builds on the principles of
parallelization and pipelining [23]. A pipeline comprises chained processor blocks fed with
serialized data that are processed sequentially. Speedup is obtained by processing data
chunks in one processor unit while subsequent data chunks are handled in preceding units.
Hence, the benefit of pipelining is that serialized data chunks are processed at the same
time while processor units perform different tasks. While data serialization limits a
specific task to be computed with one single operation at a time, e.g., one pixel after
another, parallelized data streams allow a computing system to perform at least two
operations of the same type simultaneously. Parallelization can be thought of as duplicating
processor pipelines, which requires synchronized parallel data streams as input signals.
Letting the degree of parallelization be ι, the computation time in image processing may be
minimized to $O(K^2/\iota)$ if 2-D image dimensions consist of K samples each and provided
that both computation systems run at the same clock frequency. Consequently, the
one-dimensional (1-D) parallelization limit is reached where ι = L for image rows and ι = K
for image columns, which is the ideal scenario in terms of parallelizing data processes.
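As a rough illustration of the two speed-up mechanisms described above, the following Python sketch counts clock cycles under an idealized cost model. The function names and the model itself (one data chunk enters a stage per cycle, no stalls, perfectly synchronized streams) are assumptions made for illustration and do not describe the proposed hardware.

```python
def sequential_cycles(n_items: int, n_stages: int) -> int:
    """Cycles if every item passes all stages before the next item starts."""
    return n_items * n_stages


def pipelined_cycles(n_items: int, n_stages: int) -> int:
    """Cycles for an ideal pipeline: fill latency plus one item per cycle."""
    if n_items == 0:
        return 0
    return n_stages + n_items - 1


def parallel_cycles(K: int, iota: int) -> int:
    """Cycles to stream a K-by-K image through iota duplicated pipelines,
    i.e., ceil(K^2 / iota) under the idealized model."""
    return -(-(K * K) // iota)  # integer ceiling division


if __name__ == "__main__":
    print(sequential_cycles(10, 3), pipelined_cycles(10, 3))
    print(parallel_cycles(6, 1), parallel_cycles(6, 3), parallel_cycles(6, 6))
```

With ι = K, the count drops from K^2 to K cycles, which corresponds to the 1-D parallelization limit mentioned in the text.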
Early work in the field of embedded plenoptic imaging was reported by Rodríguez-Ramos et al.
[24], who employed an FPGA to process plenoptic data with the aim of analyzing wavefront
measurements. Another interesting approach, reported by Wimalagunarathne et al. [25],
proposed a design to render computationally focused photographs from a set of multiview
images using infinite impulse response filters. Work on real-time rendering from FPC
captures was presented in [22]. The first reported hardware design for performing real-time
rendering from SPC captures was presented by Hahne et al. [6]. Shortly thereafter, Pérez et
al. [7] published an article addressing the same topic. The authors demonstrated significant
computation time improvements compared with run times based on a central processing unit
(CPU) system that was programmed using an object-oriented language. A theoretical comparison
of our method with that of Pérez et al. [7] is carried out at the end of Section III.
B. SPC Ray Model
Development of a computationally efficient refocusing algorithm requires knowledge about the
ray geometrical properties in a plenoptic camera. To conceive a refocusing hardware
architecture in spatial-domain, we employ a ray model reported by Hahne et al. [8], which is
based on paraxial optics. The model is depicted in Fig. 1 and builds on the assumption that
image sensor plane and MLA are separated by one focal length $f_s$ such that the MLA is
focused to infinity, which is in accordance with Ng’s concept of a plenoptic camera [15]. To
understand lightfield imaging in an SPC, as in the Lytro setup [20], one may regard a main
lens image of an object plane to be focused on the MLA plane. In this case, the focused
light rays converge to the microlens and diverge when leaving it to form a microimage (see
Fig. 1). A pixelated light-sensitive detector placed behind the MLA captures angular
portions of the incident-divergent beam. Each angular sample in this microimage corresponds
to the same focused spatial point in space observed from different views. This point’s
intensity is recovered when integrating all microimage samples.
Fig. 1. SPC ray model (borrowed from [8]) with microlens chief rays traveling through the
MLA plane $s$ and main lens plane $U$, which is depicted as a thin lens. Lightfield
intensities captured at the sensor plane are denoted as $E_{f_s}[s_j, u_{c+i}]$ for the 1-D
case. Chief ray colors in a microimage indicate angular samples $u_{c+i}$.
Fig. 2. Processing requirements for the hardware architecture. The diagram shows exemplary
input illuminance values $E_{f_s}$ (see Fig. 1) subdivided into microimages $s_j$ and
synthesized output values $E'_a$ at a desired refocused image plane a.
We denote a lightfield captured by an SPC in the following way. For clarity, only the
horizontal cross-section is regarded hereafter. In the angular domain u, we start counting
samples from microimage centers (MICs), which serve as reference positions c = (M − 1)/2
where M denotes a consistent total number of samples for each microimage in one dimension.
Microimages are seen to be radially symmetric and horizontally indexed by c + i, with
i ∈ [−c .. c]. Horizontal lightfield positions are then given as $[s_j, u_{c+i}]$ with j as
the 1-D index of a respective microlens $s_j$. All microimages together form a lightfield
image with its cross-sectional representation $E_{f_s}[s_j, u_{c+i}]$ where $E_{f_s}$
denotes a pixel’s illuminance. As demonstrated in [8], a horizontal cross-section of a
lightfield image can be refocused by employing
$$E'_a[s_j] = \sum_{i=-c}^{c} \frac{1}{M} E_{f_s}\left[s_{j+a(c-i)},\, u_{c+i}\right], \quad a \in \mathbb{Q} \tag{1}$$
where a adjusts the synthetic focus. Equation (1) can also be applied to the vertical
dimension.
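The shift-and-integrate operation in (1) can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: it is restricted to integer a so that all indices stay integral, it stores the lightfield as a list of microimages `E[j][u]` with angular index u = c + i, and it treats samples falling outside the microlens array as zero. All three choices, as well as the function name `refocus_1d`, are assumptions.

```python
def refocus_1d(E, a):
    """Evaluate Eq. (1) for a 1-D lightfield E[j][u] and integer a.

    E is a list of J microimages with M angular samples each, where the
    angular index u = c + i runs from 0 to M - 1. The spatial shift
    a * (c - i) then equals a * (M - 1 - u). Out-of-range samples are
    skipped, i.e., treated as zero (assumption).
    """
    J, M = len(E), len(E[0])
    out = []
    for j in range(J):
        acc = 0.0
        for u in range(M):
            src = j + a * (M - 1 - u)
            if 0 <= src < J:
                acc += E[src][u] / M
        out.append(acc)
    return out


if __name__ == "__main__":
    # Four microimages with M = 3 angular samples each.
    E = [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]]
    print(refocus_1d(E, 0))  # a = 0 averages each microimage
    print(refocus_1d(E, 1))  # a = 1 combines samples of neighboring microimages
```

For a = 0, every output sample is the mean of one microimage, which matches the statement that a focused point's intensity is recovered by integrating all microimage samples.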
Since images acquired by an SPC do not feature the $E_{f_s}[s_j, u_{c+i}]$ notation, it is
convenient to define an index translation formula considering the lightfield photograph to
be of two regular sensor dimensions $[x_k, y_l]$ as if taken by a conventional sensor.
Indices are then converted by
$$k = j \times M + c + i \tag{2}$$
in the horizontal dimension, meaning that $[x_k]$ is formed by $[x_{j \times M + c + i}]$ to
replace $[s_j, u_{c+i}]$. This concept of index translation may be similarly extended to the
vertical domain.
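The index translation in (2) and its inverse can be written as two small helper functions. The sketch below is illustrative only: the function names are hypothetical, and an odd microimage size M is assumed so that c = (M − 1)/2 is an integer.

```python
def to_sensor_index(j: int, i: int, M: int) -> int:
    """Eq. (2): map microlens index j and angular offset i to sensor index k."""
    c = (M - 1) // 2  # microimage center, assumes odd M
    return j * M + c + i


def from_sensor_index(k: int, M: int) -> tuple:
    """Inverse of Eq. (2): recover (j, i) from sensor index k."""
    c = (M - 1) // 2
    j, u = divmod(k, M)  # u = c + i is the position inside the microimage
    return j, u - c


if __name__ == "__main__":
    M = 3
    for k in range(9):
        j, i = from_sensor_index(k, M)
        print(k, "->", (j, i), "->", to_sensor_index(j, i, M))
```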
III. FILTER DESIGN
An efficient hardware design that enables an FPGA to refocus in real-time may be
conceptualized on the basis of the lightfield ray model presented in Section II. The upper
data line of Fig. 2 depicts discrete and quantized illuminance values $E_{f_s}[x_k]$ of a
single horizontal row that is part of a calibrated lightfield image. Lightfield calibration
implies MIC detection and rendering procedures to obtain a consistent microimage size (M).
The computational refocusing synthesis given in Section II reveals that pixels involved in
the integration process expose interleaved neighborhood relations, which exclusively depend
on a. This phenomenon is illustrated by the data flow diagram in Fig. 2, where respective
pixels are highlighted for two exemplary refocusing settings: a = 0/3 and a = 2/3. Here,
each color corresponds to a chief ray in the model in Fig. 1, with M = 3 where yellow
represents the MIC pixel. In this section, a hardware architecture is devised that
accomplishes signal processing according to (1) as depicted in Fig. 2.
On the supposition that a horizontal cross-section of a captured lightfield $E_{f_s}[x_k]$
is a linear, time-invariant system, the integral projection in (1) may be represented as a
discrete FIR convolution formula. Following the $[s_j, u_{c+i}]$ to $[x_k]$ translation in
Section II, 1-D refocusing can be given by
$$E'_a[x_k] = \sum_{i=0}^{M-1} \frac{1}{M} E_{f_s}\left[x_{k'+i(aM-1)}\right], \quad a \in \mathbb{Z} \tag{3}$$
with
$$k' = (k + 1) \times M - 1 \tag{4}$$
taking care of a correct integral projection, which inevitably reduces the number of samples
in the rendered output image. Equation (3) aims at complying with the classical FIR filter
notation, however with indices in subscripts for consistency reasons and to let x signify
the domain and coordinate direction. Upon closer examination, one may notice that the
impulse response is represented by a constant coefficient 1/M, which is a consequence of
weighting pixels equally during the integration process. Note that i ∈ [0 .. M − 1] in the
following.
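To make the index arithmetic of (3) and (4) concrete, the following Python sketch evaluates the sum on a flat sensor row. It is a software illustration under stated assumptions rather than a model of the hardware: the function name `refocus_fir` is hypothetical, a is an integer as required by (3), and samples whose index falls outside the row are treated as zero.

```python
def refocus_fir(x, M, a):
    """Evaluate Eq. (3) with k' from Eq. (4) on a flat sensor row x.

    x holds K = J * M samples (J microimages of size M). The output has
    only J samples, reflecting the sample reduction mentioned in the text.
    Indices outside the row are skipped, i.e., treated as zero (assumption).
    """
    K = len(x)
    J = K // M
    out = []
    for k in range(J):
        k_prime = (k + 1) * M - 1              # Eq. (4)
        acc = 0.0
        for i in range(M):
            idx = k_prime + i * (a * M - 1)    # tap position from Eq. (3)
            if 0 <= idx < K:
                acc += x[idx] / M              # constant coefficient 1/M
        out.append(acc)
    return out


if __name__ == "__main__":
    row = list(range(12))  # four microimages with M = 3
    print(refocus_fir(row, 3, 0))
    print(refocus_fir(row, 3, 1))
```

For a = 0 the taps k', k' − 1, ..., k' − (M − 1) cover exactly one microimage; for a = 1 consecutive taps are M − 1 samples apart, so each tap comes from a different microimage.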
In contrast to (3), we seek to reproduce an output image with a resolution numerically equal
to that of the raw sensor image. To compensate for sample reduction in the integral
projection process, the overall sensor resolution may be retained by upsampling the
spatial-domain during image formation. Besides, it will be shown hereafter that our proposed
upsampling scheme enables interpolation of refocused depth planes.
To break down the complexity, we devise one filtering function per refocusing slice a that
qualifies for FIR filter implementation. Regardless of the microimage resolution M, a filter
that computes a refocusing slice with a = 0 in horizontal direction reads
$$E'_{0/M}[x_k] = \sum_{i=0}^{M-1} \frac{1}{M} E_{f_s}\left[x_{k-i-\operatorname{mod}(k+1,\,M)}\right] \tag{5}$$
when k ∈ {0, . . . , K − 1}. Term mod(k + 1, M) comprises a nearest-neighbor (NN)
interpolation ensuring that the numerical output image resolution matches that of the input.
A synthetically focused image where a = 1 is formed by
$$E'_{M/M}[x_k] = \sum_{i=0}^{M-1} \frac{1}{M} E_{f_s}\left[x_{k+i(M-1)}\right]. \tag{6}$$
Synthesis equations for different a = a′/M are retrieved by reverse-engineering. Probably,
the most straightforward refocusing filter kernel function is given by
$$E'_{1/M}[x_k] = \sum_{i=0}^{M-1} \frac{1}{M} E_{f_s}\left[x_{k-i}\right] \tag{7}$$
which computes refocusing slice a = 1/M. When implementing (7) as an FIR filter, it becomes
obvious that the number of filter taps amounts to M. A VHDL implementation using this filter
type with M = 5 is provided in supplementary material. In the following, we demonstrate a
refocusing hardware architecture that is adapted to an SPC with M = 3. Then, a photograph
refocused with a = 2/3 is computed by
3−1∑
i=0
1
3
Efs
[
xk−i+ | mod (k+1, 3)/3−1|×(i−1)
] (8)
Fig. 3. 1-D semisystolic FIR filter for sub-pixel shift a = 0/3.
where · is the ceiling and | · | the absolute value operator. An 285
exemplary step in the computation of E ′2/3 [xk ] would be 286
E ′2/3 [x3 ] =
1
3
Efs [x3 ] +
1
3
Efs [x2 ] +
1
3
Efs [x1 ]. (9)
Here, fractions 1/3 can be regarded as multipliers, denoted ash0 , 287
which are identical for each pixel such that h0 = 1/M . On the 288
condition that incoming images are underexposed and clipping 289
is prevented, it is noteworthy that multipliers are redundant and 290
thus can be left out. 291
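The filter kernels in (5), (7), and (8) can be checked numerically with a short Python sketch. The code below evaluates the index expressions literally and is not a model of the hardware architecture. The function names are hypothetical, and samples with an index outside the row (for instance before the first pixel has arrived) are assumed to be zero.

```python
import math


def sample(x, idx):
    """Return x[idx], or zero outside the row (assumption)."""
    return x[idx] if 0 <= idx < len(x) else 0


def slice_0(x, M):
    """Eq. (5): refocusing slice a = 0/M with NN-interpolation."""
    return [sum(sample(x, k - i - (k + 1) % M) for i in range(M)) / M
            for k in range(len(x))]


def slice_1_over_m(x, M):
    """Eq. (7): refocusing slice a = 1/M, an M-tap moving average."""
    return [sum(sample(x, k - i) for i in range(M)) / M
            for k in range(len(x))]


def slice_2_over_3(x):
    """Eq. (8): refocusing slice a = 2/3 for M = 3."""
    out = []
    for k in range(len(x)):
        gate = abs(math.ceil(((k + 1) % 3) / 3) - 1)  # 1 only if mod(k+1, 3) = 0
        out.append(sum(sample(x, k - i + gate * (i - 1)) for i in range(3)) / 3)
    return out


if __name__ == "__main__":
    row = [3, 6, 9, 12, 15, 18]  # two microimages with M = 3
    print(slice_0(row, 3))
    print(slice_1_over_m(row, 3))
    print(slice_2_over_3(row))
```

For k = 3 the third function sums x[3], x[2], and x[1], which reproduces the exemplary step in (9). The output of `slice_0` repeats each microimage sum M times, which is the NN-interpolation that keeps the output length equal to the input length.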
A. Semisystolic Modules
Equations (5)–(8) are implemented with a systolic filter design. Systolic arrays broadcast
input data to many processing elements (PEs). As shown, all wired connections in a systolic
filter contain at least one latch driven by the same clock signal. Semisystolic designs omit
these latches. All of the remaining designs that we consider are semisystolic, but latches
can be added for systolic FPGA implementation purposes. Descriptive information about
systolic arrangements can be found in [26].
A positive side effect of the systolic filter is that it can be exploited for an
NN-interpolation in microimages. By letting the upsampling factor be the number of
microimage samples M, the resolution loss in integral projection is compensated, since
incoming and outgoing resolution are the same. Naturally, the interpolation method can be
more sophisticated, which in turn requires intermediate calculations, causing delays and an
increasing number of occupied logic gates. Closer inspection of (6) reveals that pixels that
need to be integrated are interlaced. Thereby, gaps between merged pixels grow with
ascending a and extend the filter length. The omission of pixels within gaps is realized
with switches. A switch-controlled semisystolic FIR filter design of (5) with multiplier
$h_0$ is depicted in Fig. 3. In this design, switch states are controlled by bits in a 2-D
vector field denoted as s(a, w, p) that is given by
$$s(0/3, w, p) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \tag{10}$$
if a = 0/3. Depending on refocusing parameter a, switch state matrices s(a, w, p) contain
binary numbers with columns indexed by w for the state of each switch in the FIR filter and
with rows indexed by p, which loads a new row of switch states when incremented. In
addition, a write enable switch helps to prevent intermediate falsified values from being
streamed out.
Fig. 4. Timing diagram of FIR filter module with a = 0/3.
Fig. 5. 1-D semisystolic FIR filter for sub-pixel shift a = 1/3.
For better comprehension, a timing diagram in Fig. 4 visualizes the computational concept of
the FIR design from Fig. 3. Here, the pixel clock signal is given as PCLK. Furthermore, the
proposed architecture employs the doubled pixel clock PCLKx2 with a time period
$T_{\mathrm{PCLKx2}} = T_{\mathrm{PCLK}}/2$ to shift and add pixel values in a single pixel
clock cycle $T_{\mathrm{PCLK}}$. It is also seen that a new row of switch states is called
by incrementing p every pixel clock cycle. Numbers in the data streams represent unsigned
decimal 8-bit gray-scale values, which are multiplied with $h_0 = 1/3$. Pixel colors match
those of the SPC ray model in Fig. 1 representing chief ray positions in microimages with
M = 3. Orange color highlights interim results and red signifies 1-D refocused output data.
Ovals indicate that the sum of divided microimage pixels is reflected in the output pixel
$E'_{0/3}[x_k]$. The filter includes an NN-interpolation upsampling the microimage
resolution by factor 3. To refocus with a = 1/3, another FIR filter module is conceived
based on (7) and depicted in Fig. 5. In reference to the previous FIR filter where a = 0/3,
it becomes obvious that the arrangements are identical except for different switch states.
The switch state matrix s(1/3, w, p) is given by
$$s(1/3, w, p) = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} \tag{11}$$
which means that switches remain closed at all times. A corresponding timing diagram is
shown in Fig. 6. Fig. 7 depicts an FIR filter according to (8), which occupies more PEs due
to the fact that the distance between added pixels grows. The corresponding switch state
matrix s(2/3, w, p) is as follows:
$$s(2/3, w, p) = \begin{bmatrix} 0 & 0 & 1 & 1 & 1 \\ 0 & 1 & 1 & 1 & 0 \\ 1 & 1 & 1 & 0 & 0 \end{bmatrix} \tag{12}$$
producing a filter behavior shown in Fig. 8. As Fig. 7 demonstrates, a large 1-D
semisystolic filter module may imply long wires when broadcasting multiplier outputs. Long
wires would cause a low-pass filter behavior in the signal transmission, which affects the
readability of falling and rising edges and therefore has to be avoided. To keep wires short
in the broadcast net, incoming bit words can be distributed to several synchronized latches
(buffers) before being merged in adders.
Fig. 6. Timing diagram of FIR filter module with a = 1/3.
Fig. 7. 1-D semisystolic FIR filter for sub-pixel shift a = 2/3.
Fig. 8. Timing diagram of FIR filter module with a = 2/3.
Fig. 9. Parallelized 2-D processing module array with ι = 3.
B. 2-D Module Array
The proposed FIR filter modules process data in 1-D and thus in horizontal or vertical
directions only. Fig. 9 shows a 2-D construct of 1-D semisystolic processor modules to
accomplish refocusing by processing data in both dimensions. In this example, the degree of
parallelization amounts to ι = 3, but could be scaled as desired until limits are reached
(ι = L for image rows, ι = K for image columns).
The data flow in Fig. 9 is described in the following. First, pixels coming from the sensor
are fed into horizontal processor blocks representing semisystolic FIR filter modules as
proposed in the previous section. All semisystolic processor modules are identical whereas
the type relies on the refocusing parameter a. In the second stage, horizontally processed
data rows $E'_a[x_k, y_l]$ are delayed using skewed registers and assigned to another
arrangement of semisystolic modules making it possible to form an incoming image column
(e.g., $E'_a[x_0, y_l]$). Here, demultiplexers are driven by a pixel counter to assist in
the correct assignment of pixel values. This assures that pixels from different rows sharing
index k are sent to the same vertical processing unit that produces an image column (e.g.,
$E''_a[x_0, y_l]$) of the final refocused image. For synchronization purposes, an additional
array of skewed registers can be optionally placed behind column processor blocks.
TABLE I
BENCHMARK OF PROPOSED ARCHITECTURE
In order to estimate the computation time, it is assumed hereafter that the hardware system
refers to the ideal case of maximum parallelization where ι = L or ι = K for each dimension,
respectively. Besides, it is supposed that color channels are also parallelized causing no
extra time delay. The shift and integration for a single output pixel refocused with
a = 1/M takes M pixel clock cycles in 1-D when using twice the pixel clock to process them.
Taking this as an example, the overall number of steps η to compute a single image
$E''_{1/3}$ with K-by-L resolution is given by
$$\eta = 2(\Lambda + M) + 2(K - 1) + L - 1 \tag{13}$$
where Λ represents a single clock cycle step to compute the mathematical product of an
incoming pixel value. The total computation time O for a single image can be obtained by
$$O(\eta) = \eta \times T_{\mathrm{PCLK}}. \tag{14}$$
This duration reflects the theoretical time that elapses from the moment the first pixel
$E_{f_s}[x_k, y_l]$ enters the logic gate until the final output pixel $E''_a[x_k, y_l]$ is
available. When pipelining the data stream, output pixels of a subsequent image arrive
directly after that, letting the overall computation time for a single frame be represented
by the delay time of the computational focusing system. Once the first refocused photograph
is received, the number of remaining computational steps $\eta_{\mathrm{sub}}$ for every
following image amounts to
$$\eta_{\mathrm{sub}} = L - 1 + K - 1. \tag{15}$$
To assess performance limits of the presented architecture, we performed a benchmark
comparison between this approach, the FPGA-based implementation of Pérez et al. [7], and a
GPU-based approach [22]. In the implementation of Pérez et al. [7], a 3201-by-3201 pixel
image (K = L = 3201) with 291-by-291 microlenses was computationally refocused in 105.9 ms
at 100-MHz clock frequency. Thereby, the microimage resolution is M = 11 and the output
image resolution amounts to 589-by-589, which is less than 1/6 of the incoming image.
Conversely, the proposed semisystolic method numerically preserves the incoming spatial
resolution by employing an NN-interpolation in
η = 1 + 11 + 3200 + 1 + 11 + 3200 + 3200 = 9624 steps, yielding O(η) = 96.2 μs computation
time for a single frame when running at 100 MHz pixel clock. Each subsequent frame, however,
can be processed in $\eta_{\mathrm{sub}}$ = 3200 + 3200 steps, so that a new frame is
available every $O(\eta_{\mathrm{sub}})$ = 64 μs. In comparison, an identical implementation
based on the GPU implementation by Lumsdaine et al. [22] takes approximately 1.38 ms on
average, whereas a MATLAB implementation takes approximately 12.1 s per image on average as
seen in the overview in Table I.
In this comparison, we employed the Spartan-6 XC6SLX45 chip using the ISE WebPACK design
software from Xilinx. The refocusing shader was executed on a Fermi architecture GeForce
480M GTX with 2 GB of GDDR5 RAM running at 1200 MHz, connected to a 256 bit bus [22]. For
the CPU environment, we used MATLAB 7.11.0.584 (R2010b) on an Intel Core i7-3770 CPU @
3.40 GHz without multithreading.
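The step counts in (13) and (15) and the resulting durations from (14) can be reproduced with a few lines of Python. The function names are hypothetical; the parameter values (Λ = 1, M = 11, K = L = 3201, 100 MHz pixel clock) are those stated in the benchmark description.

```python
def total_steps(M: int, K: int, L: int, lam: int = 1) -> int:
    """Eq. (13): steps until the first refocused frame is complete."""
    return 2 * (lam + M) + 2 * (K - 1) + L - 1


def subsequent_steps(K: int, L: int) -> int:
    """Eq. (15): steps for every following frame in a pipelined stream."""
    return L - 1 + K - 1


def duration_seconds(steps: int, pclk_hz: float) -> float:
    """Eq. (14): steps multiplied by the pixel clock period T_PCLK."""
    return steps / pclk_hz


if __name__ == "__main__":
    eta = total_steps(M=11, K=3201, L=3201)
    eta_sub = subsequent_steps(K=3201, L=3201)
    print(eta, duration_seconds(eta, 100e6))          # first frame
    print(eta_sub, duration_seconds(eta_sub, 100e6))  # every following frame
```

With these values the first frame takes 9624 steps (about 96.24 μs) and every following frame takes 6400 steps (64 μs), in agreement with the figures quoted in the text.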
IV. VALIDATION
In this section, we evaluate the functionality of the proposed FPGA-based refocusing
hardware design. For that purpose, the VHSIC HDL (VHDL) is used to configure the FPGA where
VHSIC stands for very high speed integrated circuit. A schematic file, generated from a VHDL
compiler, is then flashed onto the FPGA chip model XC6SLX45. Fig. 10 contains a block
diagram illustrating the implemented processing architecture used to validate the design
proposed in the previous section. The FPGA board features high-definition multimedia
interface (HDMI) connectors such that video frame transmission is accomplished using the
transition minimized differential signaling (TMDS) protocol. TMDS receiver and transmitter
designs have been integrated on the FPGA to fulfill deserialization and serialization as
well as decoding and encoding tasks. Off-chip memory is used for buffering decoded and
serialized video frames outside the FPGA since the amount of image data exceeds internal
memory storage.
Fig. 10. Block diagram (borrowed from [6]) for experimental validation. Single arrows denote
serialized whereas three arrows indicate parallelized data streams. Row buffers are employed
to simulate data parallelization in the experiment.
TABLE II
UTILIZATION SUMMARY FOR XC6SLX45–CSG324
In our implementation, a row of switch settings is loaded from a look-up table (LUT) every
clock cycle, starting from the first row again after the last one is reached. The
switch-state LUTs can be stored in block random-access memories (BRAMs), which are part of
the FPGA. The integration of multiplier $h_0$ is also achieved using on-chip memory, an
approach known as stored product. In accordance with the TMDS protocol specification, a
decoded pixel value is of 8-bit depth per color channel, which yields a manageable number of
256 possible results when dividing by M. Thus, quotients can be precalculated for a specific
divisor M and stored in one BRAM per color channel for each image row. Note that these BRAMs
are read-only memories.
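The two look-up mechanisms described above can be mimicked in software as follows. This is only a sketch of the idea and not the VHDL implementation: the function names are hypothetical, truncating integer division is an assumed rounding mode (the text does not specify one), and the switch-state matrix is the one given in (12).

```python
from itertools import cycle, islice


def stored_product_lut(M: int) -> list:
    """Precompute the 256 quotients of an 8-bit pixel value divided by M.

    Truncating division is an assumption; the rounding mode is not stated.
    """
    return [value // M for value in range(256)]


# Switch-state matrix from Eq. (12) for a = 2/3; rows are indexed by p.
SWITCH_STATES_2_3 = [
    [0, 0, 1, 1, 1],
    [0, 1, 1, 1, 0],
    [1, 1, 1, 0, 0],
]


def switch_rows(matrix, n_cycles: int) -> list:
    """Return the switch-state row loaded in each of n_cycles clock cycles,
    wrapping around to the first row after the last one."""
    return list(islice(cycle(matrix), n_cycles))


if __name__ == "__main__":
    lut = stored_product_lut(3)
    print(lut[255], lut[128], lut[7])
    for p, row in enumerate(switch_rows(SWITCH_STATES_2_3, 5)):
        print(p, row)
```

Replacing a divider with a 256-entry table trades a small amount of read-only memory for the removal of an arithmetic unit, which is the motivation for the stored-product approach.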
Fig. 11. Timing diagram example from ISE simulator.
Fig. 12. Refocused photographs using the proposed architecture. (a) $E'_{0/3}$. (b)
$E'_{5/3}$. (c) $E''_{0/3}$. (d) $E''_{5/3}$. (e) $E''_{0/5}$. (f) $E''_{8/5}$. Input and
output spatial image resolutions amount to 843-by-561 pixels with M = 3 in (a)–(d).
Intermediate horizontally processed images are shown in (a) and (b) whereas (c) and (d)
depict fully refocused images after horizontal and vertical processing with varying a. In
comparison, output images in (e) and (f) with 1405-by-935 pixel resolution expose improved
synthetic blur by using a linear interpolation of whole microimages with M = 5. Reducing a
lightfield’s angular sampling rate M extends the depth of field [8] and leads to blur
aliasing in case of angular undersampling [15].
A screenshot from an exemplary timing diagram simulation where a = 1/3 and
$T_{\mathrm{PCLK}}$ = 60 ns is provided in Fig. 11 with the code attached to this article.
This VHDL-implemented hardware simulation shows that the filter behaves as expected,
justifying the conceived architecture. PCLKx2 can be obtained with a phase-locked loop
(PLL). An overview of the implemented design comprising a single FIR filter with a = 1/5 is
presented in Table II where it can be seen that inputs/outputs (IOs) and PLLs make up by far
most of the power consumption. This is due to the included HDMI transceiver, memory
controller block (MCB) and color conversion modules. Parts of these modules may be omitted
or replaced by on-board integrated circuits (ICs) in a prototyping stage. Furthermore,
Table II gives indication that adding more FIR filters for full parallelization (maximum L
and K) is noncritical to power, but may be limited by the number of logic slices in a
Spartan-6 device.
Fig. 13. (a) NN interp. $E''_{5/5}$ (while refocusing). (b) NN interp. $E''_{4/5}$ (while
refocusing). (c) Lin. interp. $E''_{5/5}$ (while refocusing). (d) NN interp. $E''_{5/5}$
(after refocusing). (e) NN interp. $E''_{6/5}$ (while refocusing). (f) Lin. interp.
$E''_{5/5}$ (after refocusing). Resolution comparison where (a), (c), (d) and (f) show the
same region refocused with a = 5/5 using different interpolation techniques during and after
shift and integration. Images in (b) and (e) are NN-interpolated versions with varying a
indicating significant variation of the spatial resolution when compared with (a) and (d).
Effective resolution is more consistent when using linear interpolation [e.g., compare (d),
(e), and (f)].
The presented refocusing synthesis formulas require all microimages to be of a consistent
size. This is not the case, however, in raw lightfield photographs. As indicated with the
experimental architecture in Fig. 10, microimage cropping remains an external process
performed prior to streaming the data to the FPGA. Embedding this process on an FPGA is
essential for prototyping, but left for future work. To comply with FIR filter designs in
Section III, the microimage size is reduced to M = 3 and M = 5 for comparison. Lightfield
images have been acquired by our custom-built plenoptic camera with an MLA of 281
microlenses per row and 188 per column. Insightful details on the camera calibration can be
found in [27].
Fig. 12 depicts refocused photographs computed by the proposed 2-D module array to
accomplish real-time refocusing. Intermediate results after processing images in a
horizontal direction are seen in Fig. 12(a) and (b). Their fully refocused counterparts are
found in Fig. 12(c) and (d). Closer inspection of Fig. 12(d) indicates aliasing in blurred
regions. This is due to an undersampled directional domain as there are only 3-by-3 samples
per microimage (M = 3) in the incoming lightfield capture. Aliasing in synthetic image blur
is an observation Ng already pointed out in his thesis [15]. To combat the aliasing problem,
the author suggests sufficiently increasing the microimage sampling rate M. Fig. 12(e) and
(f) show refocused images obtained from a raw capture with a native microimage resolution of
5-by-5 pixels (M = 5) using a linear interpolation instead of NN. There, it can be seen that
aliasing artifacts are satisfyingly suppressed. A comparison of output image resolutions
using the inherent NN-interpolation of proposed FIR filters is provided in Fig. 13. Results
in Fig. 13(a)–(f) suggest that interpolating microimages while refocusing with
$a \in \mathbb{Z}$ using (6) corresponds to a conventional 2-D image interpolation. On the
contrary, an effective resolution enhancement can be observed when comparing Fig. 13(a)
where a = 5/5 with Fig. 13(b) where a = 4/5, which are both computed from the same raw image
using NN-interpolation. Given that respective objects are acceptably well covered by their
depth of field and exhibit best focus, it is possible to state that improved resolution is
obtained by refocusing with noninteger numbers ($a \notin \mathbb{Z}$). This effective
resolution variation is a consequence of the microimage repetition and the interleaving
filter kernel for the refocusing synthesis yielding identical values for adjacent output
pixels when $a \in \mathbb{Z}$, but varying intensities for contiguous pixels if
$a \in \mathbb{R}$. This can be seen by inspecting output data streams $E'_a[x_k]$ of the
timing diagrams in Figs. 4 and 6. To work toward consistency in spatial resolutions for
varying a, it is thus essential to employ linear interpolation prior to distributing
microimage pixels through the FIR broadcast net. A positive side effect in upsampling
microimages is that refocused image slices $E''_a[x_k, y_l]$ are not only interpolated in
spatial-domain, but also subsampled along depth as demonstrated in [8].
V. CONCLUSION
This paper demonstrated methods to derive optimized FIR refocusing filter kernels for a
time- and cost-efficient hardware implementation. Simulating the conceived architecture
proved that real-time refocusing can be accomplished with a computation time of 96.24 μs per
frame reducing the delay time by 99.91% in comparison with a previous state-of-the-art
attempt. By interpolating microimages, it was shown how to retain the numerical sensor
resolution in refocused photographs. The proposed architecture can serve as a groundwork for
application-specific integrated circuit chips.
A limitation of the results is that timing delays have been simulated and need to be
verified using chip analyzing tools. As the number of required PEs grows with higher image
resolutions, it may exceed the gate count capacity of the FPGA in full parallelization.
Besides this, care needs to be taken to prevent long wires in the broadcast net. For the
hardware system’s reliability, it is also recommended to convert semisystolic arrays into a
full-systolic architecture. To achieve consistency in microimage size (M), cropping of the
same has to be integrated as a preceding processing stage on the FPGA chip. Furthermore, a
bilinear interpolation ought to be implemented to replace microimage repetition
(NN-interpolation) and work toward consistent effective resolutions in refocused images,
although this will cause additional delays.
A competing design approach may conceive a refocusing architecture based on the FSP theorem.
It is, however, expected that the Fourier transform produces larger time delays. A viable
alternative to an FPGA-based implementation is the employment of a GPU as this takes less
design effort, albeit at the cost of larger delays and higher power consumption.
Deployment of the proposed design to an FPC is thought to be impractical, since there is a
fundamental difference between SPC and FPC with regard to the optical design (number of
microlenses and focus position of MLA). On the algorithmic level, SPC refocusing is a
pixel-based integration whereas an FPC requires the integration of overlapping areas of
shifted microimage patches such that a refocusing algorithm has to be designed specific to
the type of plenoptic camera.
REFERENCES
[1] A. Isaksen, L. McMillan, and S. J. Gortler, “Dynamically reparameterized light fields,”
in Proc. 27th Ann. Conf. Comput. Graph. Interactive Tech., ser. SIGGRAPH ’00. New York, NY,
USA, 2000, pp. 297–306. [Online]. Available: http://dx.doi.org/10.1145/344779.344929
[2] R. Ng, M. Levoy, M. Brédif, G. Duval, M. Horowitz, and P. Hanrahan, “Light field
photography with a hand-held plenoptic camera,” Stanford University, Tech. Rep. CTSR
2005-02, 2005.
[3] R. Ng, “Fourier slice photography,” ACM Trans. Graph., vol. 24, no. 3, pp. 735–744,
Jul. 2005. [Online]. Available: http://doi.acm.org/10.1145/1073204.1073256
[4] A. Lumsdaine and T. Georgiev, “Full resolution lightfield rendering,” Adobe Systems,
Inc., Tech. Rep., Jan. 2008.
[5] C. Perwass and L. Wietzke, “Single lens 3D-camera with extended depth-of-field,” Proc.
SPIE, vol. 8291, Feb. 2012. [Online]. Available: http://dx.doi.org/10.1117/12.909882
[6] C. Hahne and A. Aggoun, “Embedded FIR filter design for real-time refocusing using a
standard plenoptic video camera,” Proc. SPIE, vol. 9023, 2014. [Online]. Available:
http://hdl.handle.net/10547/313167
[7] J. Pérez, E. Magdaleno, F. Pérez, M. Rodríguez, D. Hernández, and J. Corrales,
“Super-resolution in plenoptic cameras using FPGAs,” Sensors, vol. 14, no. 5,
pp. 8669–8685, 2014. [Online]. Available: http://www.mdpi.com/1424-8220/14/5/8669
[8] C. Hahne, A. Aggoun, V. Velisavljevic, S. Fiebig, and M. Pesch, “Refocusing distance of
a standard plenoptic camera,” Opt. Exp., vol. 24, no. 19, pp. 21521–21540, Sep. 2016.
[Online]. Available: http://www.opticsexpress.org/abstract.cfm?URI=oe-24-19-21521
[9] M. Levoy and P. Hanrahan, “Light field rendering,” Stanford University,593
Stanford, CA, USA, Tech. Rep., 1996.
Q7
594
[10] J. C. Yang, M. Everett, C. Buehler, and L. McMillan, “A real-time distributed light field camera,” in Proc. 13th Eurographics Workshop Rendering. Aire-la-Ville, Switzerland, 2002, pp. 77–86. [Online]. Available: http://dl.acm.org/citation.cfm?id=581896.581907
[11] K. Venkataraman et al., “PiCam: An ultra-thin high performance monolithic camera array,” ACM Trans. Graph., vol. 32, no. 6, pp. 166:1–166:13, Nov. 2013. [Online]. Available: http://doi.acm.org/10.1145/2508363.2508390
[12] G. Lippmann, “Épreuves réversibles donnant la sensation du relief,” Académie Des Sci., pp. 446–451, Mar. 1908.
[13] E. H. Adelson and J. Y. Wang, “Single lens stereo with a plenoptic camera,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 14, no. 2, pp. 99–106, Feb. 1992.
[14] E. H. Adelson and J. R. Bergen, “The plenoptic function and the elements of early vision,” in Proc. Comput. Models Visual Process., Cambridge, MA, USA: MIT Press, 1991, pp. 3–20.
[15] R. Ng, “Digital light field photography,” Ph.D. dissertation, Stanford University, Stanford, CA, USA, Jul. 2006.
[16] T. Georgiev and A. Lumsdaine, “The focused plenoptic camera,” in Proc. Int. Conf. Comput. Photography, 2009.
[17] A. Veeraraghavan, R. Raskar, A. Agrawal, A. Mohan, and J. Tumblin, “Dappled photography: Mask enhanced cameras for heterodyned light fields and coded aperture refocusing,” ACM Trans. Graph., vol. 26, no. 3, Jul. 2007. [Online]. Available: http://doi.acm.org/10.1145/1276377.1276463
[18] Z. Xu, J. Ke, and E. Y. Lam, “High-resolution lightfield photography using two masks,” Opt. Exp., vol. 20, no. 10, pp. 10 971–10 983, May 2012. [Online]. Available: http://www.opticsexpress.org/abstract.cfm?URI=oe-20-10-10971
[19] C. Hahne, A. Aggoun, V. Velisavljevic, S. Fiebig, and M. Pesch, “Baseline and triangulation geometry in a standard plenoptic camera,” Int. J. Comput. Vis., vol. 126, no. 1, pp. 21–35, Jan. 2018.
[20] Lytro, Inc., “Lytro-press releases,” 2016. [Online]. Available: https://www.lytro.com/press/releases/lytro-brings-revolutionary-light-field-technology-to-film-and-tv-production-with-lytro-cinema. Accessed on: Aug. 24, 2016.
[21] Z. Mhabary, O. Levi, E. Small, and A. Stern, “Fast and exact method for computing a stack of images at various focuses from a four-dimensional light field,” J. Electron. Imaging, vol. 25, no. 4, 2016, Art. no. 043002. [Online]. Available: http://dx.doi.org/10.1117/1.JEI.25.4.043002
[22] A. Lumsdaine, G. Chunev, and T. Georgiev, “Plenoptic rendering with interactive performance using GPUs,” Proc. SPIE, vol. 8295, pp. 829 513–829 513–15, 2012. [Online]. Available: http://dx.doi.org/10.1117/12.909683
[23] D. G. Bailey, Design for Embedded Image Processing on FPGAs. Hoboken, NJ, USA: Wiley, 2011.
[24] L. F. Rodríguez-Ramos, Y. Marín, J. J. Díaz, J. Piqueras, J. García-Jiménez, and J. M. Rodríguez-Ramos, “FPGA-based real time processing of the plenoptic wavefront sensor,” in Proc. Adaptative Opt. Extremely Large Telescopes, 2010, Paper 7007.
[25] R. Wimalagunarathne, A. Madanayake, D. G. Dansereau, and L. T. Bruton, “A systolic-array architecture for first-order 4-D IIR frequency-planar digital filters,” in Proc. IEEE Int. Symp. Circuits Syst., May 2012, pp. 3069–3072.
[26] Xilinx, Inc., “A 1D Systolic FIR,” 2002. [Online]. Available: http://www.iro.umontreal.ca/aboulham/F6221/Xilinx.htm. Accessed on: Dec. 12, 2015.
[27] C. Hahne, “The standard plenoptic camera – Applications of a geometrical light field model,” Ph.D. dissertation, University of Bedfordshire, Luton, U.K., Jan. 2016.

Christopher Hahne received the B.Sc. degree from the University of Applied Sciences, Hamburg, Germany, in 2012, and the Doctoral degree from the University of Bedfordshire, Luton, U.K., in 2016, in a bursary-funded Ph.D. program.
He is affiliated with BASF subsidiary trinamiX GmbH, Ludwigshafen, Germany, where he currently works as the Manager of Simulation & Software on adaptive three-dimensional sensing. He worked at R&D departments of Rohde & Schwarz GmbH & Co. KG, Munich, Germany, in 2010, and Arnold & Richter Cinetechnik GmbH & Co. KG, Munich, Germany, in 2011. Subsequently, he became a Visiting Student with Brunel University, London, U.K., in 2012.

Andrew Lumsdaine (SM’15) is an internationally recognized expert in the area of high-performance computing who has made important contributions in many of the constitutive areas of HPC. In particular, he has contributed in the areas of HPC systems, programming languages, software libraries, and performance modeling. His work in HPC has been motivated by data-driven problems (large-scale graph analytics), as well as more traditional computational science problems. In addition, outside of the realm of HPC, he has done seminal work in the area of computational photography and plenoptic cameras. In his career, he has authored or coauthored more than 200 articles in top journals and conferences and holds 15 patents. He has also contributed important software artifacts to the research community, especially in the area of message passing interface (MPI). He is active in a number of standardization efforts with important contributions to the MPI specification, the C++ programming language, and the Graph 500.

Amar Aggoun received the “Ingenieur d’état” degree in electronics engineering from Ecole Nationale Polytechnique d’Alger, Algiers, Algeria, and the Ph.D. degree in electronic engineering from the University of Nottingham, Nottingham, U.K.
He is currently the Head of School of Mathematics and Computer Science and Professor of Visual Computing with the University of Wolverhampton, Wolverhampton, U.K. His academic career started at the University of Nottingham, where he held the positions of Research Fellow in low power DSP architectures and Visiting Lecturer in electronic engineering and mathematics. In 1993, he joined De Montfort University as a Lecturer and progressed to the position of Principal Lecturer in 2000. In 2005, he joined Brunel University as a Reader with Information and Communication Technologies. From 2013 to 2016, he was at the University of Bedfordshire as the Head of School of Computer Science and Technology. He was also the Director of the Institute for Research in Applicable Computing, which oversees all the research within the School. His research is mainly focused on three-dimensional (3-D) Imaging and Immersive Technologies, and he has successfully secured and delivered research contracts worth in excess of 6.9M, funded by the Research Councils UK, Innovate UK, the European Commission, and industry. Among the successful projects, he was the initiator, principal coordinator, and manager of a project sponsored by the EU-FP7 ICT-4-1.5-Networked Media and 3-D Internet, namely live immerse video-audio interactive multimedia. He holds three filed patents, has authored or coauthored more than 200 peer-reviewed journal and conference publications, and has contributed to two white papers for the European Commission on the future internet.
Dr. Aggoun also served as an Associate Editor for the IEEE/OSA JOURNAL OF DISPLAY TECHNOLOGIES.

Vladan Velisavljevic (M’06–SM’12) received the Ph.D. degree in the field of signal and image processing from École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland, in 2005.
He has been a Reader (Associate Professor) in visual systems engineering with the School of Computer Science and Technology and the Head of the Centre for Research in Signals, Sensors and Wireless Technology, University of Bedfordshire, Luton, U.K., since 2011. Previously, he was a Senior Research Scientist with Deutsche Telekom Laboratories, University of Technology Berlin, Germany, from 2006 to 2011, and a Doctoral Assistant with LCAV, EPFL, Switzerland, from 2001 to 2005. He has authored or coauthored more than 60 peer-reviewed journal and conference publications and two book chapters.
Dr. Velisavljevic serves as an Associate Editor for Elsevier Signal Processing: Image Communication and for IET Journal of Engineering, and he is a Co-Chair of the IEEE ComSoc MMTC Interest Group on 3-D Processing and Communications. He was a General Chair of the IEEE MMSP 2017, a Lead Guest Editor for the special issue on Visual Signal Processing for Wireless Networks of the IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING in February 2015, and a special session organizer at 3DTV-Con 2015 and IEEE ICIP 2011. He was also an Associate Editor for IEEE ComSoc MMTC R-Letters and a Member of the Review Board for the IEEE ComSoc Multimedia Communications TC. He has served as a TPC member and reviewer for a number of conferences and journals.
