Medical image denoising on field programmable gate array using finite Radon transform by Ahmad, Afandi et al.
Published in IET Signal Processing
Received on 25th October 2011
Revised on 11th August 2012
doi: 10.1049/iet-spr.2011.0392
Medical image denoising on field programmable gate 
array using finite Radon transform 
A. Ahmad¹, A. Amira²,³, H. Rabah⁴, Y. Berviller⁴
¹Department of Computer Engineering, Faculty of Electrical & Electronic Engineering, Universiti Tun Hussein Onn
Malaysia (UTHM), Batu Pahat, Malaysia
²NIBEC, University of Ulster, Jordanstown Campus, Newtownabbey, United Kingdom
³Department of Electrical Engineering, Qatar University, Doha, Qatar
⁴Department N2EV, Institut Jean Lamour, University of Lorraine, Nancy, France
E-mail: a.amira@ulster.ac.uk 
Abstract: This study presents the design and implementation of efficient architectures for finite Radon transform (FRAT) on a
field programmable gate array (FPGA). FPGA-based architectures with two design strategies have been proposed: direct
implementation of pseudo-code with a sequential or pipelined description, and a block random access memory-based
approach. Various medical image modalities have been deployed for both software evaluation and hardware implementation.
Xilinx DSP tool has been used to improve the implementation time and reduce the design cycle and the Xilinx software has
been used for generating a hardware description language from a high-level MATLAB description. Objective evaluation of
image denoising using FRAT is carried out and demonstrates promising results. Moreover, the impact of different block sizes
on image reconstruction has been analysed. Performance analysis in terms of area, maximum frequency and throughput is
presented and reveals significant achievements.
1 Introduction 
In medical imaging systems, noise can be classified as additive
or multiplicative [1]. Noise reduction in medical imaging
applications is very important, as the various types of noise
generated by medical imaging equipment limit
the effectiveness of medical image diagnosis [2].
The contributions of transform domains in various
applications including image denoising, enhancement and
compression are well established. As an example, the
wavelet transform has been extensively used as a solution
to the limitations of the short time Fourier transform (STFT)
and excels in isolating discontinuities and spikes [3].
However, the wavelet lacks flexible directionality, as
it does not isolate the smoothness along edges. This demerit
of the wavelet is well addressed by the ridgelet and curvelet
transforms, as they extend the functionality of the wavelets
to higher dimensional singularities, and they have proven to be
effective tools for sparse directional analysis [3]. The
basic building block of these transforms is the finite Radon
transform (FRAT).
Since medical images contain several objects and curves,
the curvelet and ridgelet, with their main building
block FRAT, play a major role in better image analysis. By
offloading the intensive processing procedures of these
transforms onto a suitable hardware platform, computational
acceleration can be achieved, and at the same time the
quality of the outcomes can be maintained.
The FRAT algorithm is demanding, as it is inherently serial,
iterative and has a long latency. To overcome these
limitations, there is a real need for hardware
implementation and acceleration of FRAT, especially for
medical imaging applications. The limited number of existing hardware
implementations of the FRAT in medical imaging
applications leaves a large gap to be filled [3-7].
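
To make the serial, iterative structure concrete, the following is a minimal software sketch of the FRAT and its inverse on a p x p block, assuming the standard finite-geometry definition (p prime, p + 1 projection directions, each coefficient a normalised sum over the p pixels of one line modulo p). The function names `frat` and `ifrat` and the mean-correction step in the inverse are illustrative assumptions, not the architectures proposed in this paper.

```python
import math


def frat(block):
    """Finite Radon transform of a p x p block (p prime); returns (p+1) x p coefficients."""
    p = len(block)
    if any(len(row) != p for row in block):
        raise ValueError("block must be square")
    if p < 2 or any(p % d == 0 for d in range(2, math.isqrt(p) + 1)):
        raise ValueError("block size p must be prime")
    scale = 1.0 / math.sqrt(p)
    r = [[0.0] * p for _ in range(p + 1)]
    for k in range(p):          # direction (slope) index
        for l in range(p):      # intercept index
            # Sum over the line {(i, j): j = (k*i + l) mod p}
            r[k][l] = scale * sum(block[i][(k * i + l) % p] for i in range(p))
    for l in range(p):
        # Remaining direction k = p: the lines {(l, j)} with fixed first index
        r[p][l] = scale * sum(block[l][j] for j in range(p))
    return r


def ifrat(r):
    """Inverse by finite back-projection, with a correction for the block sum."""
    p = len(r) - 1
    scale = 1.0 / math.sqrt(p)
    # Every projection sums to (sum of all pixels) / sqrt(p)
    total = math.sqrt(p) * sum(r[0])
    out = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            # Add the p + 1 coefficients of the lines passing through (i, j)
            acc = r[p][i]
            for k in range(p):
                acc += r[k][(j - k * i) % p]
            # Back-projection yields pixel + total/p; subtract the offset
            out[i][j] = scale * acc - total / p
    return out


if __name__ == "__main__":
    p = 7
    block = [[(i * p + j) % 11 for j in range(p)] for i in range(p)]
    restored = ifrat(frat(block))
    err = max(abs(restored[i][j] - block[i][j]) for i in range(p) for j in range(p))
    print("max round-trip error:", err)
```

The nested loops over direction, intercept and pixel index are what make a direct software evaluation slow for large p, and they are the natural candidates for pipelining or unrolling in hardware.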
This paper presents the design and implementation of
FRAT on reconfigurable hardware using a field
programmable gate array (FPGA) for medical image
denoising using the Xilinx DSP tool. Two design strategies
have been proposed: direct implementation of pseudo-code
with a sequential or pipelined description and block random
access memory (BRAM)-based method. Analysis for both
software simulation and hardware implementation with
different medical image modalities has been carried out and
discussed. An evaluation of FRAT's capability on medical
image denoising is also addressed.
The organisation of this paper is as follows. The related 
work is presented in Section 2. An overview of the 
algorithms used is presented in Section 3. Section 4 
explains the proposed system implementation in two 
aspects: denoising system and architectures. Experimental 
results and analysis of medical image denoising, using software
simulations and hardware implementations, are explained in
Section 5. Finally, a summary is given in Section 6.
1.1 Related work
Several existing hardware implementations of FRAT are
discussed in this section. In [4], two architectures are
proposed, a generic and a standard FRAT-based pseudo-code.
IET Signal Process., pp. 1-9
doi: 10.1049/iet-spr.2011.0392
© The Institution of Engineering and Technology 2012






Table 5 Comparison of performance with the existing architectures of FRAT for the case p = 7

Type         Platform    Design         F (MHz)   T (MPPS)   A (slices)
sequential   VirtexE     [3]                                 245
             Virtex-II   [7]                                 345
                         [3]                                 215
                         [5]                                 159
                         [6]:A1                              198
                         [6]:A2                              131
             Virtex-5    proposed                            110 + 1 BRAM
pipelined    Virtex-5    proposed (1)                        1687
BRAM-based   Virtex-5    proposed                            637 + 4 BRAMs

Note: (1) Loops unrolled
Table 6 Comparison of different FRAT architectures and system architectures

                           Proposed architecture
Parameter (unit)           Sequential     Loop        Loop
                           bit accurate   unrolled    unrolled
                           (FRAT)         (FRAT)      (System)
min. period, ns            4.20           4.97        4.98
max. frequency (MHz)       238.10         200         200
*latency, cycles           3297           49          69
*latency, time µs          13.847         0.244       0.344
throughput, MPPS           3.5            200.8       142.5
total power, mW            1122           1301        1608
total slices               110            1687        4627
power × time, W·µs         15.5           0.32        0.55

Note: *Execution for a 7 × 7 block of data
Fig. 7 Chip layout for sequential implementation
target device. The proposed architecture also exhibits 0.32 µJ
of energy, which can also be considered as efficient energy
consumption compared with the sequential architecture.
The results achieved for hardware implementation
demonstrate various trade-offs, with sequential and pipelined
descriptions yielding better achievement for maximum
frequency (F) and throughput (T), respectively. Moreover,
the BRAM-based method also reveals less area (A)
occupied and better maximum frequency. To visualise the
design and implementation of the FRAT, Fig. 7 illustrates
the chip layout for sequential implementation on the
XC5VLX110T FPGA device. Fig. 8 illustrates the chip
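
The derived rows of Table 6 appear to follow from the clock period, the latency in cycles and the total power. The sketch below recomputes them under that assumption (latency = cycles × period, throughput = 49 pixels / latency, energy = power × latency); the helper name `derived_metrics` is hypothetical and the input figures are read from Table 6, with total power taken as 1122, 1301 and 1608 mW.

```python
BLOCK_PIXELS = 7 * 7  # Table 6 reports execution for a 7 x 7 block


def derived_metrics(period_ns, latency_cycles, power_mw, pixels=BLOCK_PIXELS):
    """Return (latency in us, throughput in MPPS, energy in uJ) for one block."""
    latency_us = latency_cycles * period_ns / 1000.0
    throughput_mpps = pixels / latency_us        # pixels per microsecond
    energy_uj = (power_mw / 1000.0) * latency_us  # W x us = uJ
    return latency_us, throughput_mpps, energy_uj


# (min. period in ns, latency in cycles, total power in mW), as read from Table 6
designs = {
    "sequential FRAT": (4.20, 3297, 1122),
    "loop-unrolled FRAT": (4.97, 49, 1301),
    "loop-unrolled system": (4.98, 69, 1608),
}

if __name__ == "__main__":
    for name, params in designs.items():
        latency, throughput, energy = derived_metrics(*params)
        print(f"{name}: {latency:.3f} us, {throughput:.1f} MPPS, {energy:.2f} uJ")
```

Under these assumptions the recomputed latency and energy agree with the tabulated values to the printed precision, while the throughput differs slightly because the table seems to use the rounded latency.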
Fig. 8 Chip layout for parallel implementation of the full
denoising system (footprints for radon, denoise and iradon
modules depicted separately on the right side)
layout for the parallel implementation of the full denoising system, with the
footprints of 'radon', 'denoise' and 'iradon' sub-modules, 
depicted separately on the right side. 
A detailed comparison for both hardware implementation
and software simulation with test medical images has been
carried out. As shown in Table 7, software simulation
achieved better PSNR than hardware implementation, with
the percentage difference being 12.92, 21.47 and 33.09%
for p = 7, 17 and 31, respectively. This is due to the use of
Table 7 Comparison of PSNR values for CT images

Implementation   PSNR, dB, for block size p
                 7        17       31
Hardware         46.30    38.13    30.30
Software         53.71    48.56    45.29
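
As a brief illustration of how such figures can be obtained, the sketch below computes the PSNR with its usual definition (10·log10 of peak² over the mean squared error) and the relative PSNR drop of hardware against software. The peak value of 255 (8-bit images) and the helper names `psnr` and `relative_drop` are assumptions; the paper's exact evaluation settings are not given in this section.

```python
import math


def psnr(reference, test, peak=255.0):
    """PSNR in dB between two equally sized 2-D images given as lists of rows."""
    squared_error = 0.0
    count = 0
    for ref_row, test_row in zip(reference, test):
        for a, b in zip(ref_row, test_row):
            squared_error += (a - b) ** 2
            count += 1
    if count == 0:
        raise ValueError("images must not be empty")
    mse = squared_error / count
    # Identical images have zero error, i.e. unbounded PSNR
    return math.inf if mse == 0 else 10.0 * math.log10(peak * peak / mse)


def relative_drop(software_db, hardware_db):
    """Percentage by which the hardware PSNR falls below the software PSNR."""
    return 100.0 * (software_db - hardware_db) / software_db


if __name__ == "__main__":
    # Hardware and software PSNR values from Table 7
    for p, hw, sw in [(17, 38.13, 48.56), (31, 30.30, 45.29)]:
        print(f"p = {p}: {relative_drop(sw, hw):.2f}% lower PSNR in hardware")
```

With the Table 7 values, `relative_drop` reproduces the quoted percentages for p = 17 and p = 31 to within rounding; for p = 7 the same formula gives about 13.8% from the values as printed here.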

