    Accelerating Deterministic and Stochastic Binarized Neural Networks on FPGAs Using OpenCL

    Recent technological advances have proliferated the available computing power, memory, and speed of modern Central Processing Units (CPUs), Graphics Processing Units (GPUs), and Field Programmable Gate Arrays (FPGAs). Consequently, the performance and complexity of Artificial Neural Networks (ANNs) are burgeoning. While GPU-accelerated Deep Neural Networks (DNNs) currently offer state-of-the-art performance, they consume large amounts of power. Training such networks on CPUs is inefficient, as data throughput and parallel computation are limited. FPGAs are considered a suitable candidate for performance-critical, low-power systems, e.g., Internet of Things (IoT) edge devices. Using the Xilinx SDAccel or Intel FPGA SDK for OpenCL development environment, networks described using the high-level OpenCL framework can be accelerated on heterogeneous platforms. Moreover, the resource utilization and power consumption of DNNs can be further reduced by utilizing regularization techniques that binarize network weights. In this paper, we introduce, to the best of our knowledge, the first FPGA-accelerated stochastically binarized DNN implementations, and compare them to implementations accelerated using both GPUs and FPGAs. Our developed networks are trained and benchmarked using the popular MNIST and CIFAR-10 datasets, and achieve near state-of-the-art performance, while offering a >16-fold improvement in power consumption compared to conventional GPU-accelerated networks. Both our FPGA-accelerated deterministic and stochastic BNNs reduce inference times on MNIST and CIFAR-10 by >9.89x and >9.91x, respectively. Comment: 4 pages, 3 figures, 1 table.
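
    The deterministic and stochastic binarization schemes referred to above are commonly formulated as a sign-based quantizer and a probability-based quantizer of the real-valued weights. The following is a minimal NumPy sketch of that general formulation, not the paper's FPGA/OpenCL implementation; the function names and the hard-sigmoid probability are illustrative assumptions.

        import numpy as np

        def hard_sigmoid(x):
            """Clip (x + 1) / 2 to [0, 1]; used as the '+1' probability in stochastic binarization."""
            return np.clip((x + 1.0) / 2.0, 0.0, 1.0)

        def binarize_deterministic(w):
            """Deterministic binarization: map each real-valued weight to {-1, +1} by its sign."""
            return np.where(w >= 0.0, 1.0, -1.0)

        def binarize_stochastic(w, rng=None):
            """Stochastic binarization: +1 with probability hard_sigmoid(w), otherwise -1."""
            rng = rng or np.random.default_rng()
            return np.where(rng.random(w.shape) < hard_sigmoid(w), 1.0, -1.0)

        # Example: binarize a small weight matrix both ways.
        w = np.array([[0.3, -0.7], [0.05, -0.05]])
        print(binarize_deterministic(w))
        print(binarize_stochastic(w))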

    Efficient Design of Triplet Based Spike-Timing Dependent Plasticity

    Spike-Timing Dependent Plasticity (STDP) is believed to play an important role in learning and the formation of computational function in the brain. The classical model of STDP, which considers the timing between pairs of pre-synaptic and post-synaptic spikes (p-STDP), is incapable of reproducing synaptic weight changes similar to those seen in biological experiments that investigate the effect of either higher-order spike trains (e.g., triplets and quadruplets of spikes) or the simultaneous effect of the rate and timing of spike pairs on synaptic plasticity. In this paper, we first investigate synaptic weight changes using a p-STDP circuit and show how it fails to reproduce these more complex biological experiments. We then present a new STDP VLSI circuit that acts based on the timing among triplets of spikes (t-STDP) and is able to reproduce all of the mentioned experimental results. We believe that our new STDP VLSI circuit improves upon previous designs: its learning capacity exceeds theirs because it mimics the outcomes of biological experiments more closely, and it can therefore play a significant role in future VLSI implementations of neuromorphic systems.
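
    For reference, the triplet rule (t-STDP) adds a second, slower trace on each side of the synapse so that weight updates depend on spike triplets rather than only pairs. Below is a minimal software-level sketch in the style of the Pfister-Gerstner triplet model that this circuit family targets; the time constants and amplitudes are approximate published fits used purely for illustration, not the circuit's parameters.

        # Illustrative triplet-STDP parameters (approximate Pfister-Gerstner fits, not circuit values).
        TAU_PLUS, TAU_X = 16.8, 101.0      # presynaptic trace time constants (ms)
        TAU_MINUS, TAU_Y = 33.7, 125.0     # postsynaptic trace time constants (ms)
        A2_PLUS, A3_PLUS = 5e-10, 6.2e-3   # pair / triplet potentiation amplitudes
        A2_MINUS, A3_MINUS = 7e-3, 2.3e-4  # pair / triplet depression amplitudes

        def triplet_stdp(pre_spikes, post_spikes, w=0.5, dt=0.1, t_max=1000.0):
            """Time-stepped triplet STDP; spike times are in ms. Returns the final weight."""
            r1 = r2 = o1 = o2 = 0.0
            pre = set(int(round(t / dt)) for t in pre_spikes)
            post = set(int(round(t / dt)) for t in post_spikes)
            for step in range(int(t_max / dt)):
                # Euler decay of the fast (r1, o1) and slow (r2, o2) traces.
                r1 -= dt * r1 / TAU_PLUS;  r2 -= dt * r2 / TAU_X
                o1 -= dt * o1 / TAU_MINUS; o2 -= dt * o2 / TAU_Y
                if step in pre:
                    w -= o1 * (A2_MINUS + A3_MINUS * r2)  # depression scaled by the slow pre trace
                    r1 += 1.0; r2 += 1.0
                if step in post:
                    w += r1 * (A2_PLUS + A3_PLUS * o2)    # potentiation scaled by the slow post trace
                    o1 += 1.0; o2 += 1.0
            return w

        # Example: weight change produced by a post-pre-post triplet with 10 ms gaps.
        print(triplet_stdp(pre_spikes=[20.0], post_spikes=[10.0, 30.0]))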

    Design and Implementation of BCM Rule Based on Spike-Timing Dependent Plasticity

    The Bienenstock-Cooper-Munro (BCM) and Spike Timing-Dependent Plasticity (STDP) rules are two experimentally verified forms of synaptic plasticity in which the alteration of synaptic weight depends upon the rate and the timing of pre- and post-synaptic firing of action potentials, respectively. Previous studies have reported that, under specific conditions, i.e. when random trains of Poissonian-distributed spikes are used as inputs and weight changes occur according to STDP, the BCM rule is an emergent property. Here, the applied STDP rule can be either the classical pair-based STDP rule or the more powerful triplet-based STDP rule. In this paper, we demonstrate the use of two distinct VLSI circuit implementations of STDP to examine whether BCM learning is an emergent property of STDP. These circuits are stimulated with random Poissonian spike trains. The first circuit implements the classical pair-based STDP rule, while the second circuit realizes a previously described triplet-based STDP rule. Both circuits are simulated using a 0.35 um CMOS standard model in the HSpice simulator. Simulation results demonstrate that the proposed triplet-based STDP circuit clearly produces the threshold-based behaviour of the BCM rule. The results also show similar behaviour for the pair-based STDP VLSI circuit in generating the BCM rule.
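
    To make the stimulation protocol concrete, the sketch below generates Poissonian spike trains and accumulates pair-based (all-to-all) STDP weight changes from them; the dependence of this weight drift on the postsynaptic rate is the quantity that yields the BCM-like curve. It is an illustrative software model only, not the VLSI circuits described here, and the rates, time constants, and amplitudes are arbitrary assumptions.

        import numpy as np

        def poisson_train(rate_hz, t_max_s, dt=1e-3, rng=None):
            """Binary spike train sampled from a Poisson process at the given rate."""
            rng = rng or np.random.default_rng()
            return rng.random(int(t_max_s / dt)) < rate_hz * dt

        def pair_stdp_drift(pre, post, dt=1e-3, a_plus=0.005, a_minus=0.00525,
                            tau_plus=0.020, tau_minus=0.020):
            """Total weight change from all-to-all pair-based STDP for two spike trains."""
            x = y = dw = 0.0                      # x: presynaptic trace, y: postsynaptic trace
            for s_pre, s_post in zip(pre, post):
                x *= np.exp(-dt / tau_plus)
                y *= np.exp(-dt / tau_minus)
                if s_pre:
                    x += 1.0
                    dw -= a_minus * y             # pre after post -> depression
                if s_post:
                    y += 1.0
                    dw += a_plus * x              # post after pre -> potentiation
            return dw

        # Probe how the weight drift depends on the postsynaptic rate (the BCM-style curve).
        pre = poisson_train(10.0, 10.0)
        for rate in (5.0, 20.0, 50.0):
            print(rate, pair_stdp_drift(pre, poisson_train(rate, 10.0)))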

    Automated machine learning for healthcare and clinical notes analysis

    Machine learning (ML) has been slowly entering every aspect of our lives and its positive impact has been astonishing. To accelerate the embedding of ML in more applications and its incorporation into real-world scenarios, automated machine learning (AutoML) is emerging. The main purpose of AutoML is to provide seamless integration of ML in various industries, which will facilitate better outcomes in everyday tasks. In healthcare, AutoML has already been applied to easier settings with structured data such as tabular lab data. However, there is still a need for applying AutoML to the interpretation of medical text, which is being generated at a tremendous rate. For this to happen, a promising approach is AutoML for clinical notes analysis, an unexplored research area representing a gap in ML research. The main objective of this paper is to fill this gap and provide a comprehensive survey and analytical study of AutoML for clinical notes. To that end, we first introduce the AutoML technology and review its various tools and techniques. We then survey the literature on AutoML in the healthcare industry and discuss the developments specific to clinical settings, as well as those using general AutoML tools for healthcare applications. With this background, we then discuss the challenges of working with clinical notes and highlight the benefits of developing AutoML for medical notes processing. Next, we survey relevant ML research for clinical notes and analyze the literature and the field of AutoML in the healthcare industry. Furthermore, we propose future research directions and shed light on the challenges and opportunities this emerging field holds. With this, we aim to assist the community with the implementation of an AutoML platform for medical notes, which, if realized, could revolutionize patient outcomes.

    Neuromorphic engineering: neuromimetic computation for understanding the brain

    Neuromorphic engineering attempts to understand the computational properties of neural processing systems by building electronic circuits and systems that emulate the principles of computation in neural systems. The electronic systems developed in this process can serve both engineering and the life sciences in ways ranging from low-power brain-like embedded computing systems to neural-based control, brain-machine interfaces, and neuroprostheses. To realize such systems, various approaches and strategies, each with its own advantages and limitations, may be adopted. Here, we provide a summary of our recent article published in the Proceedings of the IEEE [1], where we discussed and reviewed the various approaches to the design and implementation of neuromorphic learning systems, and pointed out challenges and opportunities in these systems.

    Design and analysis of efficient QCA reversible adders

    Quantum-dot cellular automata (QCA), as an emerging nanotechnology, are envisioned to overcome the scaling and heat-dissipation issues of current CMOS technology. In a QCA structure, information destruction plays an essential role in the overall heat dissipation, and in turn in the power consumption of the system. Therefore, reversible logic, which significantly controls the information flow of the system, is deemed suitable for achieving ultra-low-power structures. In order to benefit from the opportunities QCA and reversible logic provide, in this paper we first review and implement prior reversible full-adder designs in QCA. We then propose a novel reversible design based on three- and five-input majority gates, and a robust one-layer crossover scheme. The new full adder significantly advances previous designs in terms of the optimization metrics, namely cell count, area, and delay. The proposed efficient full adder is then used to design reversible ripple-carry adders (RCAs) of different sizes (i.e., 4, 8, and 16 bits). It is demonstrated that the new RCAs produce 33% fewer garbage outputs, which can be essential for lowering power consumption. This, along with the achieved improvements in area, complexity, and delay, yields an ultra-efficient reversible QCA adder that can be beneficial in developing future computer arithmetic circuits and architectures.
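
    As a concrete illustration of the majority-gate formulation such adders rely on, the snippet below checks the standard identities Cout = M3(A, B, Cin) and Sum = M5(A, B, Cin, Cout', Cout') against the arithmetic definition of a full adder. It verifies the Boolean logic only; it says nothing about the QCA layout, crossover scheme, or reversibility overhead of the proposed design.

        from itertools import product

        def maj3(a, b, c):
            """Three-input majority gate."""
            return int(a + b + c >= 2)

        def maj5(a, b, c, d, e):
            """Five-input majority gate."""
            return int(a + b + c + d + e >= 3)

        def full_adder(a, b, cin):
            """Full adder built from one 3-input and one 5-input majority gate plus one inverter."""
            cout = maj3(a, b, cin)
            s = maj5(a, b, cin, 1 - cout, 1 - cout)
            return s, cout

        # Exhaustively check against the arithmetic definition of a full adder.
        for a, b, cin in product((0, 1), repeat=3):
            s, cout = full_adder(a, b, cin)
            assert a + b + cin == s + 2 * cout
        print("majority-gate full adder verified for all 8 input combinations")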

    Training Progressively Binarizing Deep Networks Using FPGAs

    While hardware implementations of inference routines for Binarized Neural Networks (BNNs) are plentiful, current realizations of efficient BNN hardware training accelerators, suitable for Internet of Things (IoT) edge devices, leave much to be desired. Conventional BNN hardware training accelerators perform forward and backward propagation with parameters adopting binary representations, and optimization with parameters adopting floating- or fixed-point real-valued representations, requiring two distinct sets of network parameters. In this paper, we propose a hardware-friendly training method that, contrary to conventional methods, progressively binarizes a single set of fixed-point network parameters, yielding notable reductions in power and resource utilization. We use the Intel FPGA SDK for OpenCL development environment to train our progressively binarizing DNNs on an OpenVINO FPGA. We benchmark our training approach on both GPUs and FPGAs using CIFAR-10 and compare it to conventional BNNs. Comment: Accepted at the 2020 IEEE International Symposium on Circuits and Systems (ISCAS).
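
    One simple way to picture progressive binarization of a single parameter set is a soft sign function whose gain is annealed upward during training, so the same weights drift from near-real-valued toward effectively binary values. The sketch below illustrates that idea only; the tanh formulation and gain schedule are illustrative assumptions, not the training method or hardware mapping proposed in the paper.

        import numpy as np

        def progressively_binarize(w, gain):
            """Soft binarization: tanh(gain * w) approaches sign(w) as the gain grows."""
            return np.tanh(gain * w)

        # As training progresses, the gain is annealed upward so a single set of weights
        # moves from near-real-valued toward effectively binary {-1, +1} values.
        w = np.array([0.4, -0.1, 0.02, -0.8])
        for gain in (1, 4, 16, 64):
            print(f"gain={gain:3d}:", np.round(progressively_binarize(w, gain), 3))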

    Semi-supervised and weakly-supervised deep neural networks and dataset for fish detection in turbid underwater videos

    Fish are key members of marine ecosystems, and they have a significant share in the healthy human diet. In addition, fish abundance is an excellent indicator of water quality, as fish have adapted to various levels of oxygen, turbidity, nutrients, and pH. To detect various fish in underwater videos, Deep Neural Networks (DNNs) can be of great assistance. However, training DNNs is highly dependent on large, labeled datasets, while labeling fish in turbid underwater video frames is a laborious and time-consuming task, hindering the development of accurate and efficient models for fish detection. To address this problem, we first collected a dataset called FishInTurbidWater, consisting of video footage gathered from turbid waters, and quickly and weakly labeled it (i.e., giving higher priority to speed over accuracy) in software running at 4x fast-forward. Next, we designed and implemented a semi-supervised contrastive learning fish detection model that is self-supervised using unlabeled data and then fine-tuned with a small fraction (20%) of our weakly labeled FishInTurbidWater data. We then trained, using our weakly labeled data, a novel weakly-supervised ensemble DNN with transfer learning from ImageNet. The results show that our semi-supervised contrastive model leads to a more than 20 times faster turnaround time between dataset collection and result generation, with reasonably high accuracy (89%). At the same time, the proposed weakly-supervised ensemble model can detect fish in turbid waters with high (94%) accuracy, while still cutting the development time by a factor of four compared to fully-supervised models trained on carefully labeled datasets. Our dataset and code are publicly available at the hyperlink FishInTurbidWater.
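
    The self-supervised stage described above is commonly built around a contrastive objective over pairs of augmented views of the same image. The snippet below is a minimal NumPy version of a SimCLR-style NT-Xent loss as a stand-in for that general idea; it is not the authors' detection model, and the embedding layout, temperature, and toy inputs are assumptions for illustration.

        import numpy as np

        def nt_xent_loss(z, temperature=0.5):
            """SimCLR-style NT-Xent loss; rows 2i and 2i+1 of z are two augmented views of one image."""
            z = z / np.linalg.norm(z, axis=1, keepdims=True)   # L2-normalize the embeddings
            sim = z @ z.T / temperature                        # scaled pairwise cosine similarities
            np.fill_diagonal(sim, -np.inf)                     # exclude self-similarity from the softmax
            n = z.shape[0]
            pos = np.arange(n) ^ 1                             # index of each row's positive partner
            log_prob = sim[np.arange(n), pos] - np.log(np.exp(sim).sum(axis=1))
            return -log_prob.mean()

        # Toy usage: 4 images -> 8 augmented views embedded in 16 dimensions.
        rng = np.random.default_rng(0)
        print(nt_xent_loss(rng.normal(size=(8, 16))))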

    Variation-aware binarized memristive networks

    The quantization of weights to binary states in Deep Neural Networks (DNNs) can replace resource-hungry multiply-accumulate operations with simple accumulations. Such Binarized Neural Networks (BNNs) exhibit greatly reduced resource and power requirements. In addition, memristors have been shown to be promising synaptic weight elements in DNNs. In this paper, we propose and simulate novel Binarized Memristive Convolutional Neural Network (BMCNN) architectures employing hybrid weight and parameter representations. We train the proposed architectures offline and then map the trained parameters to our binarized memristive devices for inference. To take into account the variations in memristive devices, and to study their effect on performance, we introduce variations in R_ON and R_OFF. Moreover, we introduce means to mitigate the adverse effect of memristive variations in the proposed networks. Finally, we benchmark our BMCNNs and variation-aware BMCNNs using the MNIST dataset.
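
    To illustrate the mapping and the variation model at a software level, the sketch below assigns binary weights to memristor conductances (+1 to the low-resistance state, -1 to the high-resistance state), injects lognormal device-to-device variation into R_ON and R_OFF, and performs the crossbar multiply-accumulate. The nominal resistances, spread, and single-device (non-differential) mapping are illustrative assumptions, not the devices or architectures characterized in the paper.

        import numpy as np

        R_ON, R_OFF = 10e3, 1e6          # nominal low/high resistance states (ohms, illustrative)

        def map_binary_weights(w_bin, sigma=0.1, rng=None):
            """Map binary weights {-1, +1} to conductances with lognormal variation of spread sigma."""
            rng = rng or np.random.default_rng(0)
            r_on = R_ON * rng.lognormal(0.0, sigma, size=w_bin.shape)
            r_off = R_OFF * rng.lognormal(0.0, sigma, size=w_bin.shape)
            return np.where(w_bin > 0, 1.0 / r_on, 1.0 / r_off)   # +1 -> low resistance -> high conductance

        def crossbar_mac(g, v_in):
            """Analog multiply-accumulate: column currents are conductance-weighted sums of input voltages."""
            return v_in @ g

        # Toy 4-input, 2-output layer with +/-1 weights and binary input voltages.
        w = np.sign(np.random.default_rng(1).normal(size=(4, 2)))
        print(crossbar_mac(map_binary_weights(w), np.array([1.0, 0.0, 1.0, 1.0])))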