Hardware-based acceleration is widely used to speed up computationally
intensive mathematical operations. This paper proposes an FPGA-based
architecture to accelerate the convolution operation, a complex and expensive
computing step that appears in many Convolutional Neural Network
models. The design targets the standard convolution operation and is intended
as an edge-AI solution. The project's purpose is to produce an FPGA IP core
that processes one convolutional layer at a time. Because the architecture uses
Verilog HDL as its primary design language, system developers can deploy the
IP core on various FPGA families. The
experimental results show that a single computing core synthesized on a
simple edge-computing FPGA board delivers 0.224 GOPS. When the board is fully
utilized, 4.48 GOPS can be achieved.

Comment: 11 pages, 6 figures, accepted to The First International Conference
on Intelligence of Things (ICIT 2022)
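
For readers unfamiliar with the operation being accelerated, the following is a minimal software reference sketch of a standard convolution over one layer, written in Python rather than Verilog. It is not the paper's architecture. The function names, the stride of 1, the absence of padding, and the integer data are all assumptions made for illustration. The second helper only computes the ratio between the two reported throughput figures; the abstract does not state how many cores the fully utilized board holds, so reading that ratio as a core count assumes throughput scales linearly with the number of cores.

```python
def conv2d_layer(inputs, kernels):
    """Standard convolution for one layer (assumed: stride 1, no padding).

    inputs:  nested lists shaped [channels][height][width]
    kernels: nested lists shaped [filters][channels][k][k]
    Returns (outputs, macs), where outputs is shaped
    [filters][height - k + 1][width - k + 1] and macs counts the
    multiply-accumulate steps performed.

    As is usual in CNNs, the kernel is applied without flipping
    (strictly speaking, a cross-correlation).
    """
    channels = len(inputs)
    height, width = len(inputs[0]), len(inputs[0][0])
    k = len(kernels[0][0])
    out_h, out_w = height - k + 1, width - k + 1

    outputs = []
    macs = 0
    for kernel in kernels:  # one output feature map per filter
        fmap = [[0] * out_w for _ in range(out_h)]
        for y in range(out_h):
            for x in range(out_w):
                acc = 0
                for c in range(channels):
                    for i in range(k):
                        for j in range(k):
                            acc += inputs[c][y + i][x + j] * kernel[c][i][j]
                            macs += 1
                fmap[y][x] = acc
        outputs.append(fmap)
    return outputs, macs


def implied_core_ratio(single_core_gops, full_board_gops):
    """Rounded ratio of full-board to single-core throughput.

    Hypothetical helper: treating this as a core count assumes
    linear scaling, which the abstract does not state.
    """
    return round(full_board_gops / single_core_gops)
```

With the reported figures, `implied_core_ratio(0.224, 4.48)` gives 20, which would be consistent with roughly twenty core instances under the linear-scaling assumption. The six nested loops show why the operation is expensive: the multiply-accumulate count grows with the product of the filter count, output size, channel count, and kernel area.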