
    Chipmunk: A Systolically Scalable 0.9 mm², 3.08 Gop/s/mW @ 1.2 mW Accelerator for Near-Sensor Recurrent Neural Network Inference

    Recurrent neural networks (RNNs) are state-of-the-art in voice awareness/understanding and speech recognition. On-device computation of RNNs on low-power mobile and wearable devices would be key to applications such as zero-latency voice-based human-machine interfaces. Here we present Chipmunk, a small (<1 mm²) hardware accelerator for Long Short-Term Memory RNNs in UMC 65 nm technology, capable of operating at a measured peak efficiency of up to 3.08 Gop/s/mW at 1.24 mW peak power. To implement big RNN models without incurring huge memory transfer overhead, multiple Chipmunk engines can cooperate to form a single systolic array. In this way, the Chipmunk architecture in a 75-tile configuration can achieve real-time phoneme extraction on a demanding RNN topology proposed by Graves et al., consuming less than 13 mW of average power.
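The workload Chipmunk accelerates is the LSTM cell recurrence. A minimal scalar sketch of the standard gate equations is below; the weight dictionary `W` and scalar shapes are purely illustrative and do not reflect the paper's fixed-point datapath or systolic tiling.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_cell(x, h_prev, c_prev, W):
    """One step of the standard LSTM recurrence (scalar toy version).
    W is a hypothetical dict of scalar weights, not the paper's design."""
    i = sigmoid(W["wi"] * x + W["ui"] * h_prev + W["bi"])   # input gate
    f = sigmoid(W["wf"] * x + W["uf"] * h_prev + W["bf"])   # forget gate
    o = sigmoid(W["wo"] * x + W["uo"] * h_prev + W["bo"])   # output gate
    g = math.tanh(W["wg"] * x + W["ug"] * h_prev + W["bg"]) # candidate state
    c = f * c_prev + i * g        # new cell state
    h = o * math.tanh(c)          # new hidden state
    return h, c
```

In hardware, the matrix-vector products behind each gate dominate the cost, which is why splitting a large model across cooperating systolic tiles avoids off-chip weight traffic.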

    Energy-Efficient Inference Accelerator for Memory-Augmented Neural Networks on an FPGA

    Memory-augmented neural networks (MANNs) are designed for question-answering tasks. It is difficult to run a MANN efficiently on accelerators designed for other neural networks (NNs), particularly on mobile devices, because MANNs require recurrent data paths and various types of operations related to external memory access. We implement an accelerator for MANNs on a field-programmable gate array (FPGA) based on a data-flow architecture. Inference times are also reduced by inference thresholding, a data-based maximum inner-product search specialized for natural language tasks. Measurements on the bAbI data show that the energy efficiency of the accelerator (FLOPS/kJ) was higher than that of an NVIDIA TITAN V GPU by a factor of about 125, increasing to 140 with inference thresholding. Comment: Accepted to DATE 201
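The idea behind inference thresholding is a maximum inner-product search that can stop early once a candidate is confident enough. A rough sketch, assuming a simple fixed threshold as the stopping rule (the function name and rule are illustrative, not the paper's algorithm):

```python
def mips_with_threshold(query, keys, threshold):
    """Maximum inner-product search with an early-exit threshold.
    Scans keys in order; stops as soon as the running best score
    clears the (assumed, data-derived) confidence threshold."""
    best_idx, best_score = -1, float("-inf")
    for idx, key in enumerate(keys):
        score = sum(q * k for q, k in zip(query, key))  # inner product
        if score > best_score:
            best_idx, best_score = idx, score
        if best_score >= threshold:   # confident enough: exit early
            break
    return best_idx, best_score
```

Early exit saves the remaining inner products, which is where the extra energy-efficiency factor (125 → 140 vs. the GPU baseline) would come from.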

    Artificial Intelligence in Invoice Recognition: a Systematic Literature Review

    In the era marked by a flourishing economy and rapid advancements in information technology, the proliferation of invoice data has accentuated the urgent need for automated invoice recognition. Traditional manual methods, long relied upon for this task, have proven to be inefficient, error-prone, and incapable of coping with the rising volume of invoices. This research endeavours to address the imperative of automating invoice recognition by exploring, assessing, and advancing cutting-edge algorithms, techniques, and methods within the domain of Artificial Intelligence (AI). This research conducts a comprehensive Systematic Literature Review (SLR) to investigate Computer Vision (CV) approaches, encompassing image preprocessing, Layout Analysis (LA), Optical Character Recognition (OCR), and Information Extraction (IE). The objective is to provide valuable insights into these fundamental components of invoice recognition, emphasizing their significance in achieving accuracy and efficiency. This exploration aims to contribute to the development of more effective automated systems for extracting information from invoices, addressing the challenges posed by diverse formats and content. The results indicate that in LA, the combination of Mask Region-based Convolutional Neural Networks (M-RCNN) and Feature Pyramid Network (FPN) achieves good results. In OCR, algorithms like Convolutional Recurrent Neural Network (CRNN), You Only Look Once version 4 (YOLOv4), and models inspired by M-RCNN and Faster Region-based Convolutional Neural Network (F-RCNN) with ResNeXt-101 as the backbone demonstrate remarkable performance. When it comes to IE, algorithms inspired by F-RCNN and Region Proposal Network (RPN), Grid Convolutional Neural Network (G-CNN) and Layer Graph Convolutional Networks (LGCN), and Gated Graph Convolutional Network (GatedGCN) consistently deliver the best results.
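The review decomposes invoice recognition into a LA → OCR → IE pipeline. A skeleton of that decomposition, with the three stages as stand-in callables (hypothetical placeholders for concrete models such as an M-RCNN+FPN layout detector, a CRNN recognizer, or a GCN-based extractor):

```python
def recognize_invoice(image, detect_layout, run_ocr, extract_fields):
    """Skeleton of the LA -> OCR -> IE pipeline surveyed in the review.
    The three callables are illustrative stand-ins, not a specific system."""
    regions = detect_layout(image)                    # Layout Analysis
    texts = [run_ocr(image, r) for r in regions]      # OCR per region
    return extract_fields(list(zip(regions, texts)))  # Information Extraction
```

Keeping the stages as separate callables mirrors how the surveyed systems mix and match components (e.g., swapping the OCR model while keeping the layout detector).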

    Enhancing Automation with Label Defect Detection and Content Parsing Algorithms

    The stable operation of power transmission and distribution is closely related to the overall performance and construction quality of circuit breakers. Focusing on circuit breakers as the research subject, we propose a machine vision method for automated defect detection, which can be applied in intelligent robots to improve detection efficiency, reduce costs, and address issues related to performance and assembly quality. Based on the LeNet-5 convolutional neural network, a method for detecting character defects on labels is proposed. This method is then combined with Squeeze-and-Excitation networks to achieve more precise classification via a feature-map mechanism. The experimental results show that the accuracy of the LeNet-CB model can reach 99.75%, while the average time for single-character detection is 17.9 milliseconds. Although the LeNet-SE model demonstrates certain limitations in handling some easily confused characters, it maintains an average accuracy of 98.95%. Through further optimization, a label content detection method based on the LSTM framework is constructed, with an average accuracy of 99.57% and an average detection time of 84 milliseconds. Overall, the system meets the detection accuracy requirements and delivers a rapid response, making the results of this research a meaningful contribution to the practical foundation for ongoing improvements in robot intelligence and machine vision.