
    Machine Learning inference using PYNQ environment in an AWS EC2 F1 Instance

    In the past few years, using Machine and Deep Learning techniques has become increasingly viable, thanks to the availability of tools that reduce the need for specialist knowledge in data science and complex networks to achieve a satisfactory final result in a variety of research fields. This process has caused an explosion in the adoption of such techniques, e.g. in the context of High Energy Physics. The range of applications for ML becomes even larger if we consider the implementation of these algorithms on low-latency hardware such as FPGAs, which promise lower latency than traditional inference algorithms running on general-purpose CPUs. This paper presents and discusses the activity running at the University of Bologna and INFN-Bologna, where a new open-source project from Xilinx called PYNQ is being tested. Its purpose is to let designers exploit the benefits of programmable logic and microprocessors using the Python language and libraries. This new software environment can be deployed on a variety of Xilinx platforms, from the simplest ones, such as ZYNQ boards, to more advanced and high-performance ones, such as Alveo accelerator cards and AWS EC2 F1 instances. The use of cloud computing in this work lets us test the capabilities of this new workflow, from the creation and training of a Neural Network and the creation of an HLS project using HLS4ML, to testing the predictions of the NN using PYNQ APIs and functions written in Python.
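    A central step in the HLS4ML flow described above is mapping floating-point network weights and activations onto the fixed-point (ap_fixed-style) arithmetic that the FPGA fabric implements. The sketch below is not the hls4ml API; it is a minimal NumPy illustration, with hypothetical bit widths, of how quantizing a small dense layer to a 16-bit fixed-point format changes its output only slightly relative to the float reference.

```python
import numpy as np

def to_fixed(x, total_bits=16, int_bits=6):
    """Quantize to an ap_fixed<total_bits, int_bits>-style value:
    signed, with (total_bits - int_bits) fractional bits, saturating."""
    frac_bits = total_bits - int_bits
    scale = 2 ** frac_bits
    lo = -(2 ** (total_bits - 1)) / scale          # most negative representable value
    hi = (2 ** (total_bits - 1) - 1) / scale       # most positive representable value
    return np.clip(np.round(x * scale) / scale, lo, hi)

def dense_relu(x, w, b):
    # A single fully connected layer with ReLU activation.
    return np.maximum(0.0, x @ w + b)

rng = np.random.default_rng(0)
x = rng.normal(size=(1, 8))
w = rng.normal(scale=0.5, size=(8, 4))
b = rng.normal(scale=0.1, size=4)

y_float = dense_relu(x, w, b)
y_fixed = to_fixed(dense_relu(to_fixed(x), to_fixed(w), to_fixed(b)))

# The quantization error stays small for well-scaled weights.
print(np.max(np.abs(y_float - y_fixed)))
```

    In the real flow, hls4ml performs this precision selection per layer, and choosing `int_bits` too small causes the saturation visible in `to_fixed` rather than graceful rounding.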

    Integration of MFCC Extraction and LSTM Algorithm on PYNQ-Z2 for Enhanced Audio Analysis

    The need for Speech Emotion Recognition (SER) is growing, since researchers have found it difficult to interpret human emotions from speech data. SER is an interesting yet challenging task in human-computer interaction (HCI). An SER application can benefit from the choice of feature-extraction technique and classification model. Deep Learning has made a great impact in the fields of audio, image, video, EEG, and ECG classification. The speech-signal characteristics and the classification model affect how well an SER application performs. This paper describes the deployment of a Deep Learning algorithm on an FPGA-based board, the PYNQ-Z2. An MFCC feature-extraction technique and an LSTM model for classifying human emotion are implemented on the board, and the predicted emotion is indicated using the LEDs and buttons on the board.
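    The MFCC front end mentioned in the abstract follows a standard pipeline: framing and windowing, power spectrum, mel filterbank, log, and a DCT. The NumPy sketch below is a minimal illustration of that pipeline (frame sizes, filter counts, and the 440 Hz test tone are illustrative choices, not parameters from the paper).

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(signal, sr=16000, frame_len=400, hop=160, n_mels=26, n_ceps=13, n_fft=512):
    # 1. Frame the signal and apply a Hamming window.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    # 2. Power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # 3. Triangular mel filterbank spanning 0 .. sr/2.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        fbank[i, l:c] = (np.arange(l, c) - l) / max(c - l, 1)   # rising edge
        fbank[i, c:r] = (r - np.arange(c, r)) / max(r - c, 1)   # falling edge
    log_mel = np.log(power @ fbank.T + 1e-10)
    # 4. DCT-II decorrelates the log-mel energies; keep the first n_ceps.
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * n + 1) / (2 * n_mels))
    return log_mel @ dct.T

sig = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)  # 1 s, 440 Hz tone
feats = mfcc(sig)
print(feats.shape)  # one 13-coefficient vector per frame
```

    On the PYNQ-Z2, a sequence of such frame-level feature vectors would be what the LSTM classifier consumes.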

    Efficient Hardware Architectures for Accelerating Deep Neural Networks: Survey

    In the modern era of technology, a paradigm shift has been witnessed in areas involving applications of Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL). Specifically, Deep Neural Networks (DNNs) have emerged as a popular field of interest in most AI applications such as computer vision, image and video processing, and robotics. In the context of mature digital technologies and the availability of authentic data and data-handling infrastructure, DNNs have become a credible choice for solving complex real-life problems. In certain situations, the performance and accuracy of a DNN can even exceed human intelligence. However, it is noteworthy that DNNs are computationally cumbersome in terms of the resources and time needed to handle these computations. Furthermore, general-purpose architectures like CPUs have difficulty handling such computationally intensive algorithms. Therefore, considerable interest and effort have been invested by the research community in specialized hardware architectures such as the Graphics Processing Unit (GPU), Field Programmable Gate Array (FPGA), Application Specific Integrated Circuit (ASIC), and Coarse Grained Reconfigurable Array (CGRA) for the effective implementation of computationally intensive algorithms. This paper brings forward the various research works carried out on the development and deployment of DNNs using the aforementioned specialized hardware architectures and embedded AI accelerators. The review gives a detailed description of the specialized hardware-based accelerators used in the training and/or inference of DNNs. A comparative study based on factors like power, area, and throughput is also made of the various accelerators discussed. Finally, future research and development directions are discussed, such as trends in DNN implementation on specialized hardware accelerators. This review article is intended to serve as a guide to hardware architectures for accelerating and improving the effectiveness of deep learning research.
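    The comparative study on power, area, and throughput mentioned above can be distilled into a single energy-efficiency figure of merit, throughput per watt. The snippet below computes it for three accelerator classes; the numbers are purely hypothetical placeholders for illustration, not figures taken from the survey.

```python
# Hypothetical specs: throughput in GOPS, power in watts (illustrative only).
accelerators = {
    "GPU":  {"throughput_gops": 4000.0, "power_w": 250.0},
    "FPGA": {"throughput_gops": 1200.0, "power_w": 25.0},
    "ASIC": {"throughput_gops": 2000.0, "power_w": 5.0},
}

# Energy efficiency = throughput / power, in GOPS per watt.
for name, spec in accelerators.items():
    eff = spec["throughput_gops"] / spec["power_w"]
    print(f"{name}: {eff:.1f} GOPS/W")
```

    Under such placeholder numbers, an ASIC leads on efficiency despite lower raw throughput than a GPU, which is the trade-off the survey's comparison tables make explicit.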

    Hardware/Software Co-verification of DNN for IoT Edge Devices: Leveraging FPGAs

    Deep neural networks (DNNs) are one of the emerging types of machine learning being used to solve problems too complex to be solved by humans. However, with the growing complexity of practical applications today, DNNs demand more computation and greater portability. With the use of graphics processing units (GPUs) to perform the computations required by DNNs, deep learning is now becoming ubiquitous. The downside is that GPUs are not practical for deploying inference on non-specialized, general-purpose embedded system designs or internet-of-things (IoT) applications. An alternative is to use Field Programmable Gate Arrays (FPGAs), embedded devices designed for data-intensive processing that can even be used in low-power, battery-operated devices. FPGAs can achieve a high level of performance at a lower power cost. In this work, a complete methodology is shown for deploying the inference phase of a DNN. One of the most popular DNN architectures, the Convolutional Neural Network (CNN), has been implemented on an FPGA board known as PYNQ-Z1, using the PYNQ platform to execute the CNN. Our methodology takes advantage of the Python overlay provided for the PYNQ-Z1 to easily debug and verify the CNN IP generated by standard synthesis tools such as Xilinx's Vivado.
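    The co-verification idea in this abstract, checking a synthesized CNN IP against a software golden model, can be sketched in pure NumPy. Below, a direct sliding-window convolution plays the golden model, and an im2col + matrix-multiply version stands in for the accelerated datapath (many CNN IP cores use such a layout internally); both function names and the test sizes are illustrative, not from the paper.

```python
import numpy as np

def conv2d_reference(img, kernel):
    """Golden software model: direct sliding-window 2-D 'valid' correlation."""
    kh, kw = kernel.shape
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(img[i:i+kh, j:j+kw] * kernel)
    return out

def conv2d_im2col(img, kernel):
    """Stand-in for the device under test: im2col + matrix multiply."""
    kh, kw = kernel.shape
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    cols = np.stack([img[i:i+kh, j:j+kw].ravel()
                     for i in range(oh) for j in range(ow)])
    return (cols @ kernel.ravel()).reshape(oh, ow)

rng = np.random.default_rng(1)
img = rng.normal(size=(8, 8))
kernel = rng.normal(size=(3, 3))

ref = conv2d_reference(img, kernel)   # software golden model
dut = conv2d_im2col(img, kernel)      # "hardware" result to verify
print(np.allclose(ref, dut))          # True: DUT matches the golden model
```

    In the actual PYNQ flow, `dut` would instead come from the CNN IP on the fabric via the Python overlay, with the same element-wise comparison closing the verification loop.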

    Low power and high performance heterogeneous computing on FPGAs

    The abstract is provided in the attachment.