160 research outputs found

    Optimizing Bayesian Recurrent Neural Networks on an FPGA-based Accelerator

    Neural networks have demonstrated outstanding performance in a wide range of tasks. Specifically, recurrent architectures based on long short-term memory (LSTM) cells have shown excellent capability to model time dependencies in real-world data. However, standard recurrent architectures cannot estimate their uncertainty, which is essential for safety-critical applications such as medicine. In contrast, Bayesian recurrent neural networks (RNNs) are able to provide uncertainty estimation with improved accuracy. Nonetheless, Bayesian RNNs are computationally and memory demanding, which limits their practicality despite their advantages. To address this issue, we propose an FPGA-based hardware design to accelerate Bayesian LSTM-based RNNs. To further improve the overall algorithmic-hardware performance, a co-design framework is proposed to explore the most fitting algorithmic-hardware configurations for Bayesian RNNs. We conduct extensive experiments on healthcare applications to demonstrate the improvement of our design and the effectiveness of our framework. Compared with GPU implementation, our FPGA-based design can achieve up to 10 times speedup with nearly 106 times higher energy efficiency. To the best of our knowledge, this is the first work targeting acceleration of Bayesian RNNs on FPGAs.

    Review: Recent Directions in ECG-FPGA Researches

    The last few years have witnessed increasing interest in using field-programmable gate arrays (FPGAs) for a variety of applications. This interest is driven mostly by advances in flexible FPGA resource configuration, increased speed, relatively low cost, and low energy consumption. The introduction of FPGAs in medicine and health care generally aims to replace costly and usually bulky medical monitoring and diagnostic equipment with much smaller and possibly portable FPGA-based systems that exploit the design flexibility of FPGAs. Much recent research focuses on FPGA systems that process the well-known yet very important electrocardiogram (ECG) signal, to accelerate and improve performance and to find and propose new ideas for such implementations. This paper introduces the recent directions in ECG-FPGA research.

    Hardware Implementation of Deep Network Accelerators Towards Healthcare and Biomedical Applications

    With the advent of dedicated Deep Learning (DL) accelerators and neuromorphic processors, new opportunities are emerging for applying deep and Spiking Neural Network (SNN) algorithms to healthcare and biomedical applications at the edge. This can facilitate the advancement of the medical Internet of Things (IoT) systems and Point of Care (PoC) devices. In this paper, we provide a tutorial describing how various technologies ranging from emerging memristive devices, to established Field Programmable Gate Arrays (FPGAs), and mature Complementary Metal Oxide Semiconductor (CMOS) technology can be used to develop efficient DL accelerators to solve a wide variety of diagnostic, pattern recognition, and signal processing problems in healthcare. Furthermore, we explore how spiking neuromorphic processors can complement their DL counterparts for processing biomedical signals. After providing the required background, we unify the sparsely distributed research on neural network and neuromorphic hardware implementations as applied to the healthcare domain. In addition, we benchmark various hardware platforms by performing a biomedical electromyography (EMG) signal processing task and drawing comparisons among them in terms of inference delay and energy. Finally, we provide our analysis of the field and share a perspective on the advantages, disadvantages, challenges, and opportunities that different accelerators and neuromorphic processors introduce to healthcare and biomedical domains. This paper can serve a large audience, ranging from nanoelectronics researchers, to biomedical and healthcare practitioners in grasping the fundamental interplay between hardware, algorithms, and clinical adoption of these tools, as we shed light on the future of deep networks and spiking neuromorphic processing systems as proponents for driving biomedical circuits and systems forward. Comment: Submitted to IEEE Transactions on Biomedical Circuits and Systems (21 pages, 10 figures, 5 tables)

    Demonstrating Analog Inference on the BrainScaleS-2 Mobile System

    We present the BrainScaleS-2 mobile system as a compact analog inference engine based on the BrainScaleS-2 ASIC and demonstrate its capabilities at classifying a medical electrocardiogram dataset. The analog network core of the ASIC is utilized to perform the multiply-accumulate operations of a convolutional deep neural network. At a system power consumption of 5.6 W, we measure a total energy consumption of 192 µJ for the ASIC and achieve a classification time of 276 µs per electrocardiographic patient sample. Patients with atrial fibrillation are correctly identified with a detection rate of (93.7 ± 0.7)% at (14.0 ± 1.0)% false positives. The system is directly applicable to edge inference applications due to its small size, power envelope, and flexible I/O capabilities. It has enabled the BrainScaleS-2 ASIC to be operated reliably outside a specialized lab setting. In future applications, the system allows for a combination of conventional machine learning layers with online learning in spiking neural networks on a single neuromorphic platform.

    Wearable Technologies and AI at the Far Edge for Chronic Heart Failure Prevention and Management: A Systematic Review and Prospects

    Smart wearable devices enable personalized at-home healthcare by unobtrusively collecting patient health data and facilitating the development of intelligent platforms to support patient care and management. The accurate analysis of data obtained from wearable devices is crucial for interpreting and contextualizing health data and facilitating the reliable diagnosis and management of critical and chronic diseases. The combination of edge computing and artificial intelligence has provided real-time, time-critical, and privacy-preserving data analysis solutions. However, based on the envisioned service, evaluating the additive value of edge intelligence to the overall architecture is essential before implementation. This article aims to comprehensively analyze the current state of the art on smart health infrastructures implementing wearable and AI technologies at the far edge to support patients with chronic heart failure (CHF). In particular, we highlight the contribution of edge intelligence in supporting the integration of wearable devices into IoT-aware technology infrastructures that provide services for patient diagnosis and management. We also offer an in-depth analysis of open challenges and provide potential solutions to facilitate the integration of wearable devices with edge AI solutions to provide innovative technological infrastructures and interactive services for patients and doctors.

    Arrhythmia Classifier Based on Ultra-Lightweight Binary Neural Network

    Reasonably and effectively monitoring arrhythmias through ECG signals has significant implications for human health. With the development of deep learning, numerous ECG classification algorithms based on deep learning have emerged. However, most existing algorithms achieve high accuracy at the cost of complex models, resulting in high storage usage and power consumption. This also inevitably increases the difficulty of implementation on wearable Artificial Intelligence-of-Things (AIoT) devices with limited resources. In this study, we propose a universally applicable ultra-lightweight binary neural network (BNN) that is capable of 5-class and 17-class arrhythmia classification based on ECG signals. Our BNN achieves 96.90% (full precision 97.09%) and 97.50% (full precision 98.00%) accuracy for 5-class and 17-class classification, respectively, with state-of-the-art storage usage (3.76 KB and 4.45 KB). Compared to other binarization works, our approach excels in supporting two multi-classification modes while achieving the smallest known storage space. Moreover, our model achieves optimal accuracy in 17-class classification and boasts an elegantly simple network architecture. The algorithm we use is optimized specifically for hardware implementation. Our research showcases the potential of lightweight deep learning models in the healthcare industry, specifically in wearable medical devices, which hold great promise for improving patient outcomes and quality of life. Code is available at: https://github.com/xpww/ECG_BNN_Net. Comment: 6 pages, 3 figures

    Neuromorphic computing based on stochastic spiking reservoir for heartbeat classification

    Heart disease is the leading cause of mortality worldwide. Precise heartbeat classification usually requires a large number of extracted features, and heartbeats of the same class may also behave differently across patients. This leads to computation overhead and challenges in hardware implementation due to the large number of nodes utilized in reservoir computing (RC) networks. In this work, a reservoir computing-based stochastic spiking neural network (SSNN) is proposed for heartbeat rhythm classification, enabling a patient-adaptable and more efficient hardware implementation whose computation overhead is kept low by extracting a minimal number of features. Only a single feature is employed in template matching to achieve patient adaptability with minimal computation overhead. The single feature, QRS complexes, was extracted and fed into the neural reservoir with 20 neurons in a cyclic topology for arrhythmia similarity calculation and classification. 43 recordings of electrocardiogram (ECG) signals that included both normal and arrhythmic beats from the MIT-BIH arrhythmia database obtained from Physio-Net were used in this work. The proposed stochastic spiking reservoir achieves a sensitivity of 99.6% and an accuracy of 96.91%, signifying that the system is accurate and efficient in classifying normal and abnormal arrhythmias.

    Simulation and implementation of novel deep learning hardware architectures for resource constrained devices

    Corey Lammie designed mixed signal memristive-complementary metal–oxide–semiconductor (CMOS) and field programmable gate arrays (FPGA) hardware architectures, which were used to reduce the power and resource requirements of Deep Learning (DL) systems, both during inference and training. Disruptive design methodologies, such as those explored in this thesis, can be used to facilitate the design of next-generation DL systems.

    Reconfigurable acceleration of Recurrent Neural Networks

    Recurrent Neural Networks (RNNs) have been successful in a wide range of applications involving temporal sequences such as natural language processing, speech recognition and video analysis. However, RNNs often require a significant amount of memory and computational resources. In addition, the recurrent nature and data dependencies in RNN computations can lead to system stall, resulting in low throughput and high latency. This work describes novel parallel hardware architectures for accelerating RNN inference using Field-Programmable Gate Array (FPGA) technology, which consider the data dependencies and high computational costs of RNNs. The first contribution of this thesis is a latency-hiding architecture that utilizes column-wise matrix-vector multiplication instead of the conventional row-wise operation to eliminate data dependencies and improve the throughput of RNN inference designs. This architecture is further enhanced by a configurable checkerboard tiling strategy which allows large dimensions of weight matrices, while supporting element-based parallelism and vector-based parallelism. The presented reconfigurable RNN designs show significant speedup over CPU, GPU, and other FPGA designs. The second contribution of this thesis is a weight reuse approach for large RNN models with weights stored in off-chip memory, running with a batch size of one. A novel blocking-batching strategy is proposed to optimize the throughput of large RNN designs on FPGAs by reusing the RNN weights. Performance analysis is also introduced to enable FPGA designs to achieve the best trade-off between area, power consumption and performance. Promising power efficiency improvement has been achieved in addition to speedup over CPU and GPU designs. The third contribution of this thesis is a low latency design for RNNs based on a partially-folded hardware architecture. It also introduces a technique that balances initiation interval of multi-layer RNN inferences to increase hardware efficiency and throughput while reducing latency. The approach is evaluated on a variety of applications, including gravitational wave detection and Bayesian RNN-based ECG anomaly detection. To facilitate the use of this approach, we open-source an RNN template which enables the generation of low-latency FPGA designs with efficient resource utilization using high-level synthesis tools.
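
The column-wise matrix-vector multiplication mentioned in the abstract above can be illustrated with a minimal software sketch. This is not the thesis's hardware design: the function names `matvec_row_wise` and `matvec_column_wise` are hypothetical, and plain Python lists stand in for on-chip buffers. The sketch only shows why the column-wise order can start work as soon as a single input element is available, whereas each row-wise dot product needs the whole input vector.

```python
def matvec_row_wise(W, x):
    """Conventional order: every output element is a full dot product,
    so the complete vector x must be available before any row finishes."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]


def matvec_column_wise(W, x_stream):
    """Column-wise order: consume x one element at a time and add that
    element's scaled column of W into every partial output sum."""
    y = [0.0] * len(W)
    for j, x_j in enumerate(x_stream):
        # x_j may arrive late (e.g. produced by the previous time step);
        # all rows are updated as soon as this one element is known.
        for i in range(len(W)):
            y[i] += W[i][j] * x_j
    return y


# Illustrative values only (not taken from the source).
W = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
x = [1.0, 0.5, -1.0]

row_result = matvec_row_wise(W, x)
col_result = matvec_column_wise(W, iter(x))  # iterator mimics a stream
```

Both orders compute the same product; only the order in which elements of `x` are required differs, which is the property a latency-hiding design can exploit.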