4,812 research outputs found

    A user configurable data acquisition and signal processing system for high-rate, high channel count applications

    Get PDF
    Real-time signal processing in plasma fusion experiments is required for control and for data reduction as plasma pulse times grow longer. The development time and cost for these high-rate, multichannel signal processing systems can be significant. This paper proposes a new digital signal processing (DSP) platform for the data acquisition system that will allow users to easily customize real-time signal processing systems to meet their individual requirements. The D-TACQ reconfigurable user in-line DSP (DRUID) system carries out the signal processing tasks in hardware co-processors (CPs) implemented in an FPGA, with an embedded microprocessor (μP) for control. In the fully developed platform, users will be able to choose co-processors from a library and configure programmable parameters through the μP to meet their requirements. The DRUID system is implemented on a Spartan 6 FPGA, on the new rear transition module (RTM-T), a field upgrade to existing D-TACQ digitizers. As proof of concept, a multiply-accumulate (MAC) co-processor has been developed, which can be configured as a digital chopper-integrator for long pulse magnetic fusion devices. The DRUID platform allows users to set options for the integrator, such as the number of masking samples. Results from the digital integrator are presented for a data acquisition system with 96 channels simultaneously acquiring data at 500 kSamples/s per channel

    Learning Transferable Architectures for Scalable Image Recognition

    Full text link
    Developing neural network image classification models often requires significant architecture engineering. In this paper, we study a method to learn the model architectures directly on the dataset of interest. As this approach is expensive when the dataset is large, we propose to search for an architectural building block on a small dataset and then transfer the block to a larger dataset. The key contribution of this work is the design of a new search space (the "NASNet search space") which enables transferability. In our experiments, we search for the best convolutional layer (or "cell") on the CIFAR-10 dataset and then apply this cell to the ImageNet dataset by stacking together more copies of this cell, each with their own parameters to design a convolutional architecture, named "NASNet architecture". We also introduce a new regularization technique called ScheduledDropPath that significantly improves generalization in the NASNet models. On CIFAR-10 itself, NASNet achieves 2.4% error rate, which is state-of-the-art. On ImageNet, NASNet achieves, among the published works, state-of-the-art accuracy of 82.7% top-1 and 96.2% top-5 on ImageNet. Our model is 1.2% better in top-1 accuracy than the best human-invented architectures while having 9 billion fewer FLOPS - a reduction of 28% in computational demand from the previous state-of-the-art model. When evaluated at different levels of computational cost, accuracies of NASNets exceed those of the state-of-the-art human-designed models. For instance, a small version of NASNet also achieves 74% top-1 accuracy, which is 3.1% better than equivalently-sized, state-of-the-art models for mobile platforms. Finally, the learned features by NASNet used with the Faster-RCNN framework surpass state-of-the-art by 4.0% achieving 43.1% mAP on the COCO dataset

    Low power digital signal processing

    Get PDF

    A 64mW DNN-based Visual Navigation Engine for Autonomous Nano-Drones

    Full text link
    Fully-autonomous miniaturized robots (e.g., drones), with artificial intelligence (AI) based visual navigation capabilities are extremely challenging drivers of Internet-of-Things edge intelligence capabilities. Visual navigation based on AI approaches, such as deep neural networks (DNNs) are becoming pervasive for standard-size drones, but are considered out of reach for nanodrones with size of a few cm2{}^\mathrm{2}. In this work, we present the first (to the best of our knowledge) demonstration of a navigation engine for autonomous nano-drones capable of closed-loop end-to-end DNN-based visual navigation. To achieve this goal we developed a complete methodology for parallel execution of complex DNNs directly on-bard of resource-constrained milliwatt-scale nodes. Our system is based on GAP8, a novel parallel ultra-low-power computing platform, and a 27 g commercial, open-source CrazyFlie 2.0 nano-quadrotor. As part of our general methodology we discuss the software mapping techniques that enable the state-of-the-art deep convolutional neural network presented in [1] to be fully executed on-board within a strict 6 fps real-time constraint with no compromise in terms of flight results, while all processing is done with only 64 mW on average. Our navigation engine is flexible and can be used to span a wide performance range: at its peak performance corner it achieves 18 fps while still consuming on average just 3.5% of the power envelope of the deployed nano-aircraft.Comment: 15 pages, 13 figures, 5 tables, 2 listings, accepted for publication in the IEEE Internet of Things Journal (IEEE IOTJ

    Low-Power and Reconfigurable Asynchronous ASIC Design Implementing Recurrent Neural Networks

    Get PDF
    Artificial intelligence (AI) has experienced a tremendous surge in recent years, resulting in high demand for a wide array of implementations of algorithms in the field. With the rise of Internet-of-Things devices, the need for artificial intelligence algorithms implemented in hardware with tight design restrictions has become even more prevalent. In terms of low power and area, ASIC implementations have the best case. However, these implementations suffer from high non-recurring engineering costs, long time-to-market, and a complete lack of flexibility, which significantly hurts their appeal in an environment where time-to-market is so critical. The time-to-market gap can be shortened through the use of reconfigurable solutions, such as FPGAs, but these come with high cost per unit and significant power and area deficiencies over their ASIC counterparts. To bridge these gaps, this dissertation work develops two methodologies to improve the usability of ASIC implementations of neural networks in these applications. The first method demonstrates a method for substantial reductions in design time for asynchronous implementations of a set of AI algorithms known as Recurrent Neural Networks (RNN) by analyzing the possible architectures and implementing a library of generic or easily altered components that can be used to quickly implement a chosen RNN architecture. A tapeout of this method was completed using as few as 112 hours of labor by the designer from RNN selection to a DRC/LVS clean chip layout ready for fabrication. The second method develops a flow to implement a set of RNNs in a single reconfigurable ASIC, offering a middle ground between fully reconfigurable solutions and completely application-specific implementations. This reconfigurable design is capable of representing thousands of possible RNN configurations in a single IC. A tapeout of this design was also completed, with both tapeouts using the TSMC 65nm bulk CMOS process

    Spatially Varying Nanophotonic Neural Networks

    Full text link
    The explosive growth of computation and energy cost of artificial intelligence has spurred strong interests in new computing modalities as potential alternatives to conventional electronic processors. Photonic processors that execute operations using photons instead of electrons, have promised to enable optical neural networks with ultra-low latency and power consumption. However, existing optical neural networks, limited by the underlying network designs, have achieved image recognition accuracy much lower than state-of-the-art electronic neural networks. In this work, we close this gap by introducing a large-kernel spatially-varying convolutional neural network learned via low-dimensional reparameterization techniques. We experimentally instantiate the network with a flat meta-optical system that encompasses an array of nanophotonic structures designed to induce angle-dependent responses. Combined with an extremely lightweight electronic backend with approximately 2K parameters we demonstrate a nanophotonic neural network reaches 73.80\% blind test classification accuracy on CIFAR-10 dataset, and, as such, the first time, an optical neural network outperforms the first modern digital neural network -- AlexNet (72.64\%) with 57M parameters, bringing optical neural network into modern deep learning era
    corecore