355 research outputs found

    Slotted ALOHA Overlay on LoRaWAN: a Distributed Synchronization Approach

    LoRaWAN is one of the most promising standards for IoT applications. Nevertheless, the high density of end-devices expected for each gateway and the absence of an effective synchronization scheme between gateway and end-devices challenge the scalability of these networks. In this article, we propose to regulate the communication of LoRaWAN networks using Slotted ALOHA (S-ALOHA) instead of the classic ALOHA approach used by LoRa. The implementation is an overlay on top of the standard LoRaWAN; thus no modification of pre-existing LoRaWAN firmware and libraries is necessary. Our method is based on a novel distributed synchronization service that is suitable for low-cost IoT end-nodes. S-ALOHA supported by our synchronization service significantly improves the performance of traditional LoRaWAN networks regarding packet loss rate and network throughput. Comment: 4 pages, 8 figures
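
The benefit of slotting can be illustrated with the textbook Poisson-traffic model of ALOHA. This is a minimal sketch under that idealized model, not the evaluation used in the article; the function names are hypothetical. With offered load G (frames per frame time), pure ALOHA delivers G·e^(-2G) and slotted ALOHA delivers G·e^(-G).

```python
import math


def pure_aloha_throughput(g: float) -> float:
    """Successful frames per frame time for offered load g under pure ALOHA.

    A frame survives only if no other frame starts within two frame times.
    """
    return g * math.exp(-2.0 * g)


def slotted_aloha_throughput(g: float) -> float:
    """Successful frames per slot for offered load g under slotted ALOHA.

    Synchronized slots halve the vulnerable period to one frame time.
    """
    return g * math.exp(-g)


if __name__ == "__main__":
    for load in (0.25, 0.5, 1.0, 2.0):
        pure = pure_aloha_throughput(load)
        slotted = slotted_aloha_throughput(load)
        print(f"G={load:4.2f}  pure={pure:.4f}  slotted={slotted:.4f}")
```

In this model the peak throughput doubles, from 1/(2e) at G = 0.5 to 1/e at G = 1, which is why a synchronization service is worth its overhead.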

    Ciaran Carson’s constellations of ideas: theories on traditional culture from within

    Ever since the dawn of folklore studies, Ireland has been identified as one of the richest stores of traditional lore and music in the world. In the twentieth century, the focus shifted from folklore studies to anthropology, which provided scholars with new tools and inaugurated a new approach to Irish traditional culture. Both the folkloric and the anthropological approaches have their shortcomings: if the study of Irish traditional society as a tribe could lead us to forget its positioning in the history of the West, the emphasis on folklore often dilutes itself into a quest for “local colour”. Moreover, the task of describing a different culture is complicated by the invasiveness of our epistemological approach; confronting traditional culture with the filters of literacy might lead us to perceive it as banal and simple. Ciaran Carson is a poet, a traditional musician, and the son of an accomplished storyteller. Carson approaches traditional culture as an insider: his descriptions of traditional music and cooking are not travelogues, nor do they resemble the detached structuralist approach of anthropologists. Carson does not treat Irish traditional culture as a sample, nor as a fragile item to be kept isolated: his poetic discourse is in constant dialogue with world literature, from Japanese Haiku to the philosophy of Walter Benjamin; like James Joyce, Carson makes Ireland the centre of the world by turning to the outside. Ciaran Carson’s perspective on Irish traditional culture is highly articulated and could indeed be described as a theory, as long as we accept a theory that is not formulated in the language of criticism. Carson often describes traditional culture as a web of motifs, a constellation of narratives; the descriptive paradigm he adopts could also be described as a constellation, a web of ideas that, as in McLuhan’s mosaic technique, are juxtaposed and accumulated in order to avoid the snares of literacy.

    XNOR Neural Engine: a Hardware Accelerator IP for 21.6 fJ/op Binary Neural Network Inference

    Binary Neural Networks (BNNs) promise to deliver accuracy comparable to conventional deep neural networks at a fraction of the cost in terms of memory and energy. In this paper, we introduce the XNOR Neural Engine (XNE), a fully digital configurable hardware accelerator IP for BNNs, integrated within a microcontroller unit (MCU) equipped with an autonomous I/O subsystem and hybrid SRAM / standard cell memory. The XNE is able to fully compute convolutional and dense layers autonomously or in cooperation with the core in the MCU to realize more complex behaviors. We show post-synthesis results in 65 nm and 22 nm technology for the XNE IP and post-layout results in 22 nm for the full MCU, indicating that this system can drop the energy cost per binary operation to 21.6 fJ per operation at 0.4 V, and at the same time is flexible and performant enough to execute state-of-the-art BNN topologies such as ResNet-34 in less than 2.2 mJ per frame at 8.9 fps. Comment: 11 pages, 8 figures, 2 tables, 3 listings. Accepted for presentation at CODES'18 and for publication in IEEE Transactions on Computer-Aided Design of Circuits and Systems (TCAD) as part of the ESWEEK-TCAD special issue
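
The name of the engine refers to the standard trick behind BNN arithmetic: when activations and weights are both restricted to +1/-1 and stored as bits, a dot product reduces to a bitwise XNOR followed by a population count. The sketch below shows that identity in software. It is an illustration of the general technique, not of the XNE datapath, and the helper names are hypothetical.

```python
def pack_bipolar(values):
    """Pack a sequence of +1/-1 values into an int; bit i is 1 when values[i] == +1."""
    word = 0
    for i, value in enumerate(values):
        if value == 1:
            word |= 1 << i
        elif value != -1:
            raise ValueError("values must be +1 or -1")
    return word


def xnor_popcount_dot(a_bits: int, b_bits: int, n: int) -> int:
    """Dot product of two length-n bipolar vectors given in packed form.

    XNOR marks the positions where the signs agree. Each agreement adds +1
    and each disagreement adds -1, so dot = matches - (n - matches).
    """
    mask = (1 << n) - 1
    agree = ~(a_bits ^ b_bits) & mask
    matches = bin(agree).count("1")
    return 2 * matches - n


if __name__ == "__main__":
    a = [1, -1, 1, 1]
    b = [1, 1, -1, 1]
    print(xnor_popcount_dot(pack_bipolar(a), pack_bipolar(b), len(a)))
```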

    A sub-mW IoT-endnode for always-on visual monitoring and smart triggering

    This work presents a fully-programmable Internet of Things (IoT) visual sensing node that targets sub-mW power consumption in always-on monitoring scenarios. The system features a spatial-contrast 128x64 binary pixel imager with focal-plane processing. The sensor, when working at its lowest power mode (10 μW at 10 fps), provides as output the number of changed pixels. Based on this information, a dedicated camera interface, implemented on a low-power FPGA, wakes up an ultra-low-power parallel processing unit to extract context-aware visual information. We evaluate the smart sensor on three always-on visual triggering application scenarios. Triggering accuracy comparable to RGB image sensors is achieved at nominal lighting conditions, while consuming an average power between 193 μW and 277 μW, depending on context activity. The digital sub-system is extremely flexible, thanks to a fully-programmable digital signal processing engine, but still achieves 19x lower power consumption compared to MCU-based cameras with significantly lower on-board computing capabilities. Comment: 11 pages, 9 figures, submitted to IEEE IoT Journal

    YodaNN: An Architecture for Ultra-Low Power Binary-Weight CNN Acceleration

    Convolutional neural networks (CNNs) have revolutionized the world of computer vision over the last few years, pushing image classification beyond human accuracy. The computational effort of today's CNNs requires power-hungry parallel processors or GP-GPUs. Recent developments in CNN accelerators for system-on-chip integration have reduced energy consumption significantly. Unfortunately, even these highly optimized devices are above the power envelope imposed by mobile and deeply embedded applications and face hard limitations caused by CNN weight I/O and storage. This prevents the adoption of CNNs in future ultra-low power Internet of Things end-nodes for near-sensor analytics. Recent algorithmic and theoretical advancements enable competitive classification accuracy even when limiting CNNs to binary (+1/-1) weights during training. These new findings bring major optimization opportunities in the arithmetic core by removing the need for expensive multiplications, as well as reducing I/O bandwidth and storage. In this work, we present an accelerator optimized for binary-weight CNNs that achieves 1510 GOp/s at 1.2 V on a core area of only 1.33 MGE (Million Gate Equivalent) or 0.19 mm² and with a power dissipation of 895 μW in UMC 65 nm technology at 0.6 V. Our accelerator significantly outperforms the state-of-the-art in terms of energy and area efficiency, achieving 61.2 TOp/s/W at 0.6 V and 1135 GOp/s/MGE at 1.2 V, respectively.
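
The claim that binary weights remove the need for multiplications can be made concrete. With weights restricted to +1/-1, each multiply-accumulate becomes an add or a subtract selected by one stored bit per weight. The following is a minimal software sketch of that idea, not a model of the YodaNN hardware; the function name and the boolean weight encoding are assumptions made for illustration.

```python
def binary_weight_dot(activations, weight_is_positive):
    """Dot product with +1/-1 weights using only additions and subtractions.

    weight_is_positive[i] is True for a +1 weight and False for a -1 weight,
    so each weight needs a single bit of storage.
    """
    if len(activations) != len(weight_is_positive):
        raise ValueError("activations and weights must have the same length")
    acc = 0.0
    for x, positive in zip(activations, weight_is_positive):
        # The weight only selects the sign of the accumulation.
        acc = acc + x if positive else acc - x
    return acc


if __name__ == "__main__":
    print(binary_weight_dot([0.5, -2.0, 3.0], [True, False, True]))
```

A convolution is a sliding window of such dot products, so the same substitution applies to every output pixel.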

    Hyperdrive: A Multi-Chip Systolically Scalable Binary-Weight CNN Inference Engine

    Deep neural networks have achieved impressive results in computer vision and machine learning. Unfortunately, state-of-the-art networks are extremely compute- and memory-intensive, which makes them unsuitable for mW-devices such as IoT end-nodes. Aggressive quantization of these networks dramatically reduces the computation and memory footprint. Binary-weight neural networks (BWNs) follow this trend, pushing weight quantization to the limit. Hardware accelerators for BWNs presented up to now have focused on core efficiency, disregarding I/O bandwidth and system-level efficiency that are crucial for deployment of accelerators in ultra-low power devices. We present Hyperdrive: a BWN accelerator that dramatically reduces I/O bandwidth by exploiting a novel binary-weight streaming approach. It can be used for arbitrarily sized convolutional neural network architectures and input resolutions by exploiting the natural scalability of the compute units at both chip level and system level: Hyperdrive chips are arranged systolically in a 2D mesh and process the entire feature map together in parallel. Hyperdrive achieves 4.3 TOp/s/W system-level efficiency (i.e., including I/Os), 3.1x higher than state-of-the-art BWN accelerators, even though its core uses resource-intensive FP16 arithmetic for increased robustness.

    PULP-HD: Accelerating Brain-Inspired High-Dimensional Computing on a Parallel Ultra-Low Power Platform

    Computing with high-dimensional (HD) vectors, also referred to as hypervectors, is a brain-inspired alternative to computing with scalars. Key properties of HD computing include a well-defined set of arithmetic operations on hypervectors, generality, scalability, robustness, fast learning, and ubiquitous parallel operations. HD computing is about manipulating and comparing large patterns (binary hypervectors with 10,000 dimensions), making its efficient realization on minimalistic ultra-low-power platforms challenging. This paper describes HD computing's acceleration and its optimization of memory accesses and operations on a silicon prototype of the PULPv3 4-core platform (1.5 mm², 2 mW), surpassing the state-of-the-art classification accuracy (on average 92.4%) with simultaneous 3.7× end-to-end speed-up and 2× energy saving compared to its single-core execution. We further explore the scalability of our accelerator by increasing the number of inputs and classification window on a new generation of the PULP architecture featuring bit-manipulation instruction extensions and a larger number of cores (8). These together enable a near ideal speed-up of 18.4× compared to the single-core PULPv3.
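
To show what "arithmetic operations on hypervectors" can look like, here is a small sketch using operations commonly paired with binary hypervectors: XOR for binding, bitwise majority for bundling, and Hamming distance for comparison. Whether the paper uses exactly these operations is an assumption, and all function names are hypothetical; only the 10,000-bit dimensionality is taken from the abstract.

```python
import random

DIM = 10_000  # dimensionality quoted in the abstract


def random_hypervector(rng, dim=DIM):
    """Draw a dense random binary hypervector as a list of 0/1 bits."""
    return [rng.getrandbits(1) for _ in range(dim)]


def bind(a, b):
    """Bind two hypervectors with element-wise XOR (its own inverse)."""
    return [x ^ y for x, y in zip(a, b)]


def bundle(vectors):
    """Superimpose hypervectors by bitwise majority; use an odd count to avoid ties."""
    half = len(vectors) / 2
    return [1 if sum(bits) > half else 0 for bits in zip(*vectors)]


def hamming(a, b):
    """Number of positions where two hypervectors differ."""
    return sum(x != y for x, y in zip(a, b))


def classify(query, prototypes):
    """Return the label of the prototype closest to the query in Hamming distance."""
    return min(prototypes, key=lambda label: hamming(query, prototypes[label]))


if __name__ == "__main__":
    rng = random.Random(0)
    a, b, c = (random_hypervector(rng) for _ in range(3))
    prototypes = {"A": bundle([a, b, c]), "B": random_hypervector(rng)}
    noisy = list(a)
    for i in rng.sample(range(DIM), 1000):  # flip 10% of the bits
        noisy[i] ^= 1
    print(classify(noisy, prototypes))
```

Every operation is bitwise and independent across dimensions, which is the property that lends itself to parallel cores and bit-manipulation instructions.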

    A 64mW DNN-based Visual Navigation Engine for Autonomous Nano-Drones

    Fully-autonomous miniaturized robots (e.g., drones) with artificial intelligence (AI) based visual navigation capabilities are extremely challenging drivers of Internet-of-Things edge intelligence capabilities. Visual navigation based on AI approaches, such as deep neural networks (DNNs), is becoming pervasive for standard-size drones, but is considered out of reach for nano-drones with a size of a few cm². In this work, we present the first (to the best of our knowledge) demonstration of a navigation engine for autonomous nano-drones capable of closed-loop end-to-end DNN-based visual navigation. To achieve this goal we developed a complete methodology for parallel execution of complex DNNs directly on board resource-constrained milliwatt-scale nodes. Our system is based on GAP8, a novel parallel ultra-low-power computing platform, and a 27 g commercial, open-source CrazyFlie 2.0 nano-quadrotor. As part of our general methodology we discuss the software mapping techniques that enable the state-of-the-art deep convolutional neural network presented in [1] to be fully executed on-board within a strict 6 fps real-time constraint with no compromise in terms of flight results, while all processing is done with only 64 mW on average. Our navigation engine is flexible and can be used to span a wide performance range: at its peak performance corner it achieves 18 fps while still consuming on average just 3.5% of the power envelope of the deployed nano-aircraft. Comment: 15 pages, 13 figures, 5 tables, 2 listings, accepted for publication in the IEEE Internet of Things Journal (IEEE IOTJ)

    Fast and Accurate Multiclass Inference for MI-BCIs Using Large Multiscale Temporal and Spectral Features

    Accurate, fast, and reliable multiclass classification of electroencephalography (EEG) signals is a challenging task towards the development of motor imagery brain-computer interface (MI-BCI) systems. We propose enhancements to different feature extractors, along with a support vector machine (SVM) classifier, to simultaneously improve classification accuracy and execution time during training and testing. We focus on the well-known common spatial pattern (CSP) and Riemannian covariance methods, and significantly extend these two feature extractors to multiscale temporal and spectral cases. The multiscale CSP features achieve 73.70±15.90% (mean ± standard deviation across 9 subjects) classification accuracy that surpasses the state-of-the-art method [1], 70.6±14.70%, on the 4-class BCI competition IV-2a dataset. The Riemannian covariance features outperform the CSP by achieving 74.27±15.5% accuracy and executing 9x faster in training and 4x faster in testing. Using more temporal windows for Riemannian features results in 75.47±12.8% accuracy with 1.6x faster testing than CSP. Comment: Published as a conference paper at the IEEE European Signal Processing Conference (EUSIPCO), 201

    Low-cost and distributed health monitoring system for critical buildings

    In this paper we present a low-cost distributed embedded system for Structural Health Monitoring (SHM) that uses very cost-effective MEMS accelerometers instead of more expensive piezoelectric analog transducers. The proposed platform provides online filtering and fusion of the collected data directly on-board. Data are transmitted after processing using a WiFi transceiver. Low-cost and synchronized devices permit more fine-grained measurements and a comprehensive assessment of the whole building by evaluating its response to vibrations. The challenge addressed in this paper is to execute computationally demanding digital filtering on a low-cost STM32 microcontroller, and to improve the low signal-to-noise ratio typical of MEMS devices through spatial redundancy of the sensors. Our work lays the basis for low-cost methods for performing complex modal analysis of buildings and structures.
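
Spatial redundancy helps with sensor noise for a simple statistical reason. If N co-located sensors observe the same vibration and their noise is independent, zero-mean, and of equal variance, averaging their readings leaves the signal unchanged and divides the noise standard deviation by sqrt(N), for an SNR gain of 10·log10(N) dB. The sketch below computes that idealized gain. That the platform fuses its sensors by plain averaging is an assumption, and the function names are hypothetical.

```python
import math


def averaged_noise_std(single_sensor_std: float, num_sensors: int) -> float:
    """Noise standard deviation after averaging num_sensors independent readings."""
    if num_sensors < 1:
        raise ValueError("num_sensors must be at least 1")
    return single_sensor_std / math.sqrt(num_sensors)


def snr_gain_db(num_sensors: int) -> float:
    """Idealized SNR improvement in dB from averaging num_sensors readings.

    Assumes a common signal and independent, equal-variance, zero-mean noise.
    """
    if num_sensors < 1:
        raise ValueError("num_sensors must be at least 1")
    return 10.0 * math.log10(num_sensors)


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(f"{n} sensors: +{snr_gain_db(n):.2f} dB")
```

Under these assumptions each doubling of the sensor count buys about 3 dB; correlated noise would reduce the gain.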