Fast Sobel Edge Detection Using Parallel
Pipeline-based Architecture on FPGA
Mohammad Shokrolah Shirazi, Brendan Morris
Department of Electrical & Computer Engineering,
University of Nevada, Las Vegas

Abstract
Implementing image processing algorithms on FPGAs has
recently become more popular because it provides higher
speed than software-based approaches.
In this paper, we present two fast pipeline-based
architectures on FPGA for Sobel edge detection, one of
the most popular edge detection algorithms. The two
architectures exploit one-way and two-way parallelism,
respectively. We implemented both designs in Verilog
and synthesized each one for a Cyclone IV FPGA.
Experimental results show that the one-way and two-way
architectures perform edge detection more than 379 and
751 times faster, respectively, than a software-based
approach using MATLAB.

Fast FPGA-based Edge Detection
Architecture
• Accelerating the edge detection process by
designing a pipeline architecture
  - Fetch
  - Convolution
  - Computing $G_x$ and $G_y$
  - $|G| = |G_x| + |G_y|$

Real-Time Sobel Edge Detection
• Real-time edge detection is achieved by implementing
the edge detection algorithm in hardware such as an
FPGA.
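• Passing convolution kernels over the intensity image
yields the image gradients $G_x$ and $G_y$. The kernels
for $G_y$ and $G_x$ are, respectively:

\[
\begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix}
\qquad
\begin{bmatrix} 1 & 0 & -1 \\ 2 & 0 & -2 \\ 1 & 0 & -1 \end{bmatrix}
\]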
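As a minimal sketch of what passing a kernel over the image means for a single pixel, the snippet below multiplies a 3 × 3 window element-wise by each kernel and sums the products. The names `KX`, `KY`, `apply_kernel`, and the sample window are illustrative assumptions, not taken from the poster. No kernel flip is applied; for these two kernels a strict convolution would differ only in sign.

```python
# Sobel kernels with the sign convention listed above.
KX = [[1, 0, -1],
      [2, 0, -2],
      [1, 0, -1]]
KY = [[-1, -2, -1],
      [0, 0, 0],
      [1, 2, 1]]


def apply_kernel(window, kernel):
    """Sum of element-wise products of a 3x3 window and a 3x3 kernel."""
    return sum(window[r][c] * kernel[r][c]
               for r in range(3) for c in range(3))


# Hypothetical window: dark on the left, bright on the right (a vertical edge).
window = [[10, 10, 200],
          [10, 10, 200],
          [10, 10, 200]]

gx = apply_kernel(window, KX)  # responds to the left-to-right intensity change
gy = apply_kernel(window, KY)  # no top-to-bottom change in this window
print(gx, gy)  # -760 0
```

The vertical edge produces a large horizontal gradient and a zero vertical gradient, which is the behaviour the two kernels are designed to separate.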

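• The magnitude of the gradient is computed by (1):

\[
|G| = \sqrt{G_x^2 + G_y^2} \approx |G_x| + |G_y| \tag{1}
\]

• The direction of the gradient is computed by (2):

\[
\alpha(x, y) = \tan^{-1}\!\left(\frac{G_y}{G_x}\right) \tag{2}
\]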
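The following sketch compares the exact gradient magnitude with the sum-of-absolute-values approximation in (1) and evaluates the direction in (2). The function names are illustrative assumptions. `math.atan2` is used here in place of a plain arctangent of the ratio so that $G_x = 0$ does not cause a division by zero; this is a choice made for the example, not something the poster specifies.

```python
import math


def magnitude_exact(gx, gy):
    """Euclidean gradient magnitude, sqrt(gx^2 + gy^2)."""
    return math.sqrt(gx * gx + gy * gy)


def magnitude_approx(gx, gy):
    """Approximation |gx| + |gy|: needs no multiplier or square root."""
    return abs(gx) + abs(gy)


def direction(gx, gy):
    """Gradient direction in radians (atan2 handles gx == 0 safely)."""
    return math.atan2(gy, gx)


gx, gy = 3, 4
exact = magnitude_exact(gx, gy)
approx = magnitude_approx(gx, gy)
angle = direction(gx, gy)
print(exact, approx)  # 5.0 7
```

The approximation never underestimates the exact magnitude and overestimates it by at most a factor of $\sqrt{2}$, which is usually acceptable when the result is only used to mark edges.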

• In the one-way parallel method, the eight intensity
values surrounding each image pixel are fetched in each
clock cycle.
• Thanks to our pipeline, the image gradients $G_x$ and
$G_y$ are obtained in parallel in the same clock cycle.

• In the two-way parallel method, two separate paths are
used, and the sixteen intensity values surrounding two
pixels are fetched in the same clock cycle.
• The two image gradients, $G_x$ and $G_y$, are obtained
in the same clock cycle.
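The two methods can be modelled in software as a loop that handles one or two pixels per step. This is only a sketch under stated assumptions: border pixels are skipped, the two paths take consecutive pixels in raster order, and one loop step stands in for one clock cycle of a filled pipeline. None of these details are given in the poster, and the names `sobel_pixel` and `run` are hypothetical. Only eight values per pixel are needed because the centre weight is zero in both kernels.

```python
KX = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
KY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
OFFSETS = (-1, 0, 1)


def sobel_pixel(img, r, c):
    """|Gx| + |Gy| at (r, c), using only the eight neighbours."""
    gx = gy = 0
    for dr in OFFSETS:
        for dc in OFFSETS:
            if dr == 0 and dc == 0:
                continue  # centre weight is zero in both kernels
            v = img[r + dr][c + dc]
            gx += v * KX[dr + 1][dc + 1]
            gy += v * KY[dr + 1][dc + 1]
    return abs(gx) + abs(gy)


def run(img, ways):
    """Process `ways` pixels per step; return the output image and step count."""
    h, w = len(img), len(img[0])
    pixels = [(r, c) for r in range(1, h - 1) for c in range(1, w - 1)]
    out = [[0] * w for _ in range(h)]
    steps = 0
    for i in range(0, len(pixels), ways):
        # In hardware these pixels would be handled concurrently.
        for r, c in pixels[i:i + ways]:
            out[r][c] = sobel_pixel(img, r, c)
        steps += 1
    return out, steps


# Synthetic 128 x 128 test image (not one of the images used in the poster).
img = [[(r * 7 + c * 13) % 256 for c in range(128)] for r in range(128)]
out1, steps1 = run(img, 1)
out2, steps2 = run(img, 2)
print(steps1, steps2)  # 15876 7938
```

Both schedules produce the same output image, and the two-way schedule needs half as many steps. The step counts in this idealized model are slightly lower than the 15880 and 8005 clock cycles reported in the experimental results, which presumably also include pipeline fill and other overhead that the model leaves out.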

• Accelerating the edge detection process using
parallelism
  - One-way parallel method
  - Two-way parallel method

Experimental Results
• Image intensity values were written to a text file using
MATLAB.
• The two proposed architectures were implemented in
Verilog and simulated with ModelSim 10.1d.
• Edge detection was performed on the Cameraman and Lenna
images at 128 × 128 pixels.

• The one-way and two-way parallel architectures need
15880 and 8005 clock cycles, respectively.
• The filtered image intensity values are reassembled
into an image using MATLAB.

[Figure panels: $G_y$, $G_x$]

• The two edge detection architectures were synthesized
using Quartus II 9.1 for a Cyclone IV FPGA.
• The total memory bits consumed are identical, but the
two-way parallel architecture consumes more logic
elements.
• Our fast FPGA-based edge detection architectures are
more than 379 and 751 times faster, respectively, than
the software-based approach using MATLAB.

Table 1. Device utilization summary

Method                    Resource               Used      Available   Utilization
------------------------  ---------------------  --------  ----------  -----------
One-way parallel method   Total logic elements   666       149760      1%
                          Total memory bits      1048576   6635520     16%
Two-way parallel method   Total logic elements   735       149760      1%
                          Total memory bits      1048576   6635520     16%

Table 2. Run-time of the Sobel edge detection process

Image       MATLAB     One-way method   Two-way method
----------  ---------  ---------------  ---------------
Lenna       0.120315   0.000317         0.000160
Cameraman   0.124729   0.000317         0.000160
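The speed-up figures quoted on the poster can be checked directly from the run-times in Table 2. The short calculation below is a sketch of that arithmetic. It also derives the clock frequency implied by the reported cycle count, under the assumption (not stated in the poster) that the run-times are in seconds.

```python
# Run-times copied from Table 2.
matlab = {"Lenna": 0.120315, "Cameraman": 0.124729}
one_way = 0.000317
two_way = 0.000160

# Speed-up of each FPGA architecture over MATLAB, per image.
speedups = {name: (t / one_way, t / two_way) for name, t in matlab.items()}
for name, (s1, s2) in speedups.items():
    print(f"{name}: one-way {s1:.1f}x, two-way {s2:.1f}x")

# Assumption: run-times are in seconds, so cycles / time gives the clock in Hz.
implied_clock_hz = 15880 / one_way
print(f"implied clock: {implied_clock_hz / 1e6:.1f} MHz")
```

The Lenna row gives the smaller ratios, so it is the one that sets the "more than 379 and 751 times" claim. Under the seconds assumption, the cycle counts and run-times are consistent with a clock of roughly 50 MHz.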

