2 research outputs found

    Convolutional Neural Network Accelerator on System on a Chip: Application at the Intel/Movidius Myriad2

    This thesis considers the acceleration of Convolutional Neural Network (CNN) inference on the Myriad 2. It begins with an introduction to CNNs, focusing on the operations performed by their constituent layers. The Myriad 2, a Vision Processing Unit (VPU), and some of its applications are presented in the second chapter. Finally, a novel design implemented on the Myriad 2 accelerates a CNN solution to the problem of classifying handwritten digits, and this design is compared with other systems and designs.

    Resources and Power Efficient FPGA Accelerators for Real-Time Image Classification

    Many image and video applications involve complex processing that requires hardware accelerators to achieve real-time performance. Notable among these are Machine Learning (ML) tasks that use Convolutional Neural Networks (CNNs) to detect objects in image frames. To contribute to CNN accelerator solutions, this paper focuses on the design of Field-Programmable Gate Array (FPGA) accelerators for CNNs of limited feature space, with the aim of improving performance, power consumption and resource utilization. The proposed design approach targets designs that can use the logic and memory resources of a single FPGA device, and it mainly benefits edge, mobile and on-board satellite (OBC) computing, especially their image-processing-related applications. This work applies the approach to develop an FPGA accelerator for vessel detection on a Xilinx Virtex 7 XC7VX485T FPGA device (Advanced Micro Devices, Inc, Santa Clara, CA, USA). The resulting architecture operates on 80×80 RGB images or sliding windows. It is trained on the “Ships in Satellite Imagery” dataset and, by achieving a frequency of 270 MHz, completing the inference in 0.687 ms and consuming 5 watts, it validates the approach.