AddNet: Deep Neural Networks Using FPGA-Optimized Multipliers
Low-precision arithmetic operations to accelerate deep-learning applications
on field-programmable gate arrays (FPGAs) have been studied extensively,
because they offer the potential to save silicon area or increase throughput.
However, these benefits come at the cost of a decrease in accuracy. In this
article, we demonstrate that reconfigurable constant coefficient multipliers
(RCCMs) offer a better alternative for saving silicon area than utilizing
low-precision arithmetic. RCCMs multiply input values by a restricted choice of
coefficients using only adders, subtractors, bit shifts, and multiplexers
(MUXes), meaning that they can be heavily optimized for FPGAs. We propose a
family of RCCMs tailored to FPGA logic elements to ensure their efficient
utilization. To minimize information loss from quantization, we then develop
novel training techniques that map the possible coefficient representations of
the RCCMs to neural network weight parameter distributions. This enables the
usage of the RCCMs in hardware, while maintaining high accuracy. We demonstrate
the benefits of these techniques using AlexNet, ResNet-18, and ResNet-50
networks. The resulting implementations achieve up to 50% resource savings over
traditional 8-bit quantized networks, translating to significant speedups and
power savings. Our RCCM with the lowest resource requirements exceeds 6-bit
fixed-point accuracy, while all other RCCM implementations match or exceed the
accuracy of an 8-bit uniformly quantized design at significantly lower resource
cost.