Hardware/Software Co-verification of DNN for IoT Edge Devices: Leveraging FPGAs
By Katie Groves and Tolulope A. Odetola, Electrical and Computer Engineering Department
Abstract
Deep neural networks (DNNs) are an emerging type of machine
learning used to solve problems that are too complex to be solved
by humans. Field Programmable Gate Arrays (FPGAs) can be used
to implement DNNs on IoT devices because of their high
throughput, low-power operation, and portability. This research
shows how one type of DNN, a convolutional neural network
(CNN), can be placed on a PYNQ-Z1 board and maintain the same
prediction accuracy.
Introduction
• There is a need for IoT and edge devices to perform
real-time classification.
• Currently, the data for machine learning predictions is
sent over the network, the computation is performed off
site, and the answer is sent back to the device for output.
• Applying a DNN directly on an IoT device eliminates the
need for a connection to an off-site server and
consequently saves bandwidth and reduces response time.
Need:
• Because it requires an internet connection and time to send and
receive data, the off-site method is too slow [2].
• The portability and flexibility of an FPGA running a DNN allow
adaptation to changing environments and inputs from
peripherals.
• When a deep learning architecture is applied to an FPGA,
hardware/software co-verification is needed to ensure that
accuracy is maintained after implementation on the
hardware.
Research Question:
• How can the memory needed to store the weights and biases,
and the computationally intensive deep learning architecture
itself, be accommodated on the FPGA?
• Which methods can be used to ensure hardware/software
co-verification?
• A solution to this problem would be to have an FPGA on the
edge device to run computations and make predictions on
site.
• A compression of the machine learning framework is needed
in order to fit the architecture on the FPGA.
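To illustrate why compression matters, the following sketch estimates the storage needed for weights and biases at several bit widths. The layer shapes are hypothetical (a small LeNet-style CNN for 28x28 inputs) and are not taken from the network used in this work. The function names are likewise illustrative.

```python
# Hypothetical layer shapes for a small LeNet-style CNN (an assumption made
# for illustration; not the architecture reported on this poster).
# Each entry: (name, number of weights, number of biases)
LAYERS = [
    ("conv1", 5 * 5 * 1 * 20, 20),    # 5x5 kernels, 1 input channel, 20 filters
    ("conv2", 5 * 5 * 20 * 50, 50),   # 5x5 kernels, 20 input channels, 50 filters
    ("fc1", 4 * 4 * 50 * 500, 500),   # flattened 4x4x50 feature map to 500 units
    ("fc2", 500 * 10, 10),            # 500 units to 10 digit classes
]


def parameter_count(layers):
    """Total number of weights and biases across all layers."""
    return sum(weights + biases for _, weights, biases in layers)


def footprint_bytes(layers, bits_per_param):
    """Bytes needed to store every parameter at the given bit width."""
    total_bits = parameter_count(layers) * bits_per_param
    return (total_bits + 7) // 8  # round up to whole bytes


for bits in (32, 16, 8):
    kib = footprint_bytes(LAYERS, bits) / 1024
    print(f"{bits:2d}-bit parameters: {kib:8.1f} KiB")
```

Under these assumed shapes, halving the bit width halves the parameter storage, which is the motivation for testing reduced weight bit sizes before deployment.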
Acknowledgements: Dr. Hasan, the Electrical and Computer Engineering Department, and the Gower Fellowship 
Contact: kmgroves42@students.tntech.edu
Results
For the following results, an image of a handwritten 4, shown in Figure 2, was
passed through both the software and hardware networks. The results are shown
in Table 1.
Discussion
• After the original CNN code was reduced in overall code size
and parameter count, it ran successfully in Vivado HLS.
• The test bench code loads a 28x28 black-and-white image of a
handwritten number and passes it to the CNN.
• The CNN makes a prediction using its predetermined weights.
Given different handwritten images, the code has output the
correct prediction for handwritten numbers.
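The test bench flow described above can be sketched in plain Python. This is a simplified, single-channel illustration, not the authors' Vivado HLS code: the synthetic image, the averaging kernel, and all function names are assumptions, and the sketch only exercises one convolution and one pooling stage plus an argmax helper.

```python
def conv2d_valid(image, kernel, bias=0.0):
    """Single-channel 2D convolution (cross-correlation), stride 1, no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            bias + sum(image[r + i][c + j] * kernel[i][j]
                       for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool_2x2(fmap):
    """2x2 max pooling with stride 2 (an odd trailing row or column is dropped)."""
    return [
        [max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
         for c in range(0, len(fmap[0]) - 1, 2)]
        for r in range(0, len(fmap) - 1, 2)
    ]


def predict(scores):
    """Index of the largest class score."""
    return max(range(len(scores)), key=lambda k: scores[k])


# Test bench: build a synthetic 28x28 image (a vertical stroke) and run it
# through the first two stages. A real test bench would read the image from
# a file and use trained weights instead of this placeholder kernel.
image = [[0.0] * 28 for _ in range(28)]
for row in range(4, 24):
    image[row][14] = 1.0
kernel = [[1.0 / 25] * 5 for _ in range(5)]  # placeholder 5x5 averaging kernel

conv1 = conv2d_valid(image, kernel)
pool1 = max_pool_2x2(conv1)
print("conv1 shape:", len(conv1), "x", len(conv1[0]))
print("pool1 shape:", len(pool1), "x", len(pool1[0]))
```

Checking the intermediate shapes at each stage is a cheap first test before comparing actual values against the reference framework.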
References
[1] Freepik, “Chip Free Vector Icons Designed by Freepik,” Flaticon, www.flaticon.com/free-icon/chip_897219.
[2] H. Li, K. Ota, and M. Dong, “Learning IoT in Edge: Deep Learning for the Internet of Things with Edge Computing,” IEEE Network, vol. 32, no. 1, pp. 96–101, Jan. 2018.
[3] “Python Zynq = PYNQ, Which Runs on Digilent's New $229 Pink PYNQ-Z1 Python Productivity Package,” Community Forums, 28 June 2018, forums.xilinx.com/t5/Xcell-Daily-Blog-Archived/bg-p/Xcell/page/24.
[4] L. Stornaiuolo, M. Santambrogio, and D. Sciuto, “On How to Efficiently Implement Deep Learning Algorithms on PYNQ Platform,” 2018 IEEE Computer Society Annual Symposium on VLSI (ISVLSI), 2018.
Conclusion/Future Works
Our work shows that complex neural networks can be compressed
onto hardware and still produce results as accurate as those of the
original software implementation.
Implementing the final CNN code on the PYNQ board helped us
build a framework that will make testing and verification of deep
learning on an FPGA very convenient.
Future Work:
• Evaluating the effects of approximate computing on the accuracy of
the deployed model on the PYNQ-Z1 board, shown in Figure 3.
• Developing a framework for hardware/software co-verification of
DNNs that does not require intricate knowledge of FPGA design.
Method:
Train:
• Train the CNN on Caffe using a GPU
Test:
• Test the weights to ensure correct prediction
• Test reducing weight bit size and confirm its accuracy
C++:
• C++ layer abstraction of the network
• Convert parameters from Python to C++
Implement:
• Implement C++ layer abstraction in Vivado HLS
• Ensure the code compiles with no errors
Test:
• Read test image from file
• Ensure it mimics the Caffe output
Deploy:
• Package code and generate custom IP in Vivado
• Configure and deploy code to PYNQ-Z1
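Two of the steps above, reducing the weight bit size and converting parameters from Python to C++, can be sketched together. This is a minimal illustration assuming signed fixed-point quantization with saturation. The weight values, the array name `conv1_w`, and the helper names are hypothetical, and the poster does not state which number format was actually used (Vivado HLS also offers arbitrary-precision types such as `ap_fixed`).

```python
def quantize(value, frac_bits, total_bits):
    """Round a float to a signed fixed-point integer code, saturating on overflow."""
    scale = 1 << frac_bits
    lowest = -(1 << (total_bits - 1))
    highest = (1 << (total_bits - 1)) - 1
    code = int(round(value * scale))
    return max(lowest, min(highest, code))


def dequantize(code, frac_bits):
    """Convert a fixed-point integer code back to a float."""
    return code / (1 << frac_bits)


def to_cpp_array(name, codes, ctype="signed char"):
    """Emit a C++ constant array definition holding the integer codes."""
    body = ", ".join(str(code) for code in codes)
    return f"const {ctype} {name}[{len(codes)}] = {{{body}}};"


# Hypothetical trained weights (placeholders, not values from this work).
weights = [0.5, -0.25, 0.1234, 1.75]

# 8-bit codes with 6 fractional bits: step size 1/64, range [-2, 2).
codes = [quantize(w, frac_bits=6, total_bits=8) for w in weights]
errors = [abs(w - dequantize(c, 6)) for w, c in zip(weights, codes)]

print(to_cpp_array("conv1_w", codes))
print("largest quantization error:", max(errors))
```

The printed array definition can be pasted into a header for the C++ layer code. The quantization error reported alongside it gives a first indication of whether the reduced bit size is likely to preserve accuracy, which the co-verification step then confirms.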
Fig. 2: Handwritten 4
Table 1: Predicted results of both Caffe and Vivado (columns: Caffe, Vivado; rows: Conv1, Pool1, Conv2, Predict)
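The stage-by-stage comparison that Table 1 summarizes can be sketched as a small co-verification check. All numbers below are made-up placeholders, not measured Caffe or Vivado outputs, and the function names and the tolerance value are assumptions for illustration.

```python
def max_abs_diff(reference, candidate):
    """Largest element-wise absolute difference between two flat lists."""
    if len(reference) != len(candidate):
        raise ValueError("stage outputs differ in length")
    return max((abs(x - y) for x, y in zip(reference, candidate)), default=0.0)


def argmax(scores):
    """Index of the largest score, i.e. the predicted class."""
    return max(range(len(scores)), key=lambda k: scores[k])


def co_verify(software, hardware, tolerance):
    """Compare each stage's output; returns {stage: (max difference, passed)}."""
    report = {}
    for stage, reference in software.items():
        diff = max_abs_diff(reference, hardware[stage])
        report[stage] = (diff, diff <= tolerance)
    return report


# Placeholder stage outputs (flattened). Real values would come from the
# software framework and from the hardware implementation respectively.
software = {
    "conv1": [0.10, 0.50, -0.30],
    "pool1": [0.50, 0.20],
    "conv2": [0.30, -0.20],
    "predict": [0.0, 0.01, 0.02, 0.0, 0.93, 0.0, 0.01, 0.0, 0.02, 0.01],
}
hardware = {
    "conv1": [0.11, 0.49, -0.30],
    "pool1": [0.49, 0.20],
    "conv2": [0.30, -0.21],
    "predict": [0.0, 0.01, 0.03, 0.0, 0.92, 0.0, 0.01, 0.0, 0.02, 0.01],
}

report = co_verify(software, hardware, tolerance=0.02)
for stage, (diff, passed) in report.items():
    print(f"{stage:8s} max diff {diff:.4f} {'OK' if passed else 'MISMATCH'}")
print("same prediction:", argmax(software["predict"]) == argmax(hardware["predict"]))
```

Comparing intermediate stages as well as the final prediction helps localize where a hardware implementation starts to diverge from the software reference.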
Fig. 1: Intelligent Hardware [1]
Fig. 3: Xilinx PYNQ-Z1 [3]
