A Low-cost Sensing System for Cooperative Air Quality Monitoring in Urban Areas
Air quality in urban areas is an important topic because it directly affects the health of citizens. Recent studies highlight that exposure to polluted air can increase the incidence of diseases and reduce quality of life. Hence, it is necessary to develop tools for real-time air quality monitoring, so as to allow appropriate and timely decisions. In this paper, we present uSense, a low-cost cooperative monitoring tool that reports, in real time, the concentrations of polluting gases in various areas of the city. Specifically, users monitor the areas of their interest by deploying low-cost and low-power sensor nodes. In addition, they can share the collected data following a social networking approach. uSense has been tested through in-field experiments performed in different areas of a city. The obtained results are in line with those provided by the local environmental control authority and show that uSense can be profitably used for air quality monitoring.
The “Eyeballing” technique: an emerging and alerting trend of alcohol misuse
Alternative methods of alcohol consumption have recently emerged among adolescents and young adults, including alcohol “eyeballing”, which consists in pouring alcoholic substances directly onto the ocular surface epithelium. In a context of changing drug and behavioural addictions, “eyeballing” can be seen as one of the latest and potentially highly risky new trends. We aimed to analyze the existing medical literature as well as online material on this emerging trend of alcohol misuse.
Peer reviewed. Final published version.
Enabling Mixed-Precision Quantized Neural Networks in Extreme-Edge Devices
The deployment of Quantized Neural Networks (QNN) on advanced
microcontrollers requires optimized software to exploit digital signal
processing (DSP) extensions of modern instruction set architectures (ISA). As
such, recent research proposed optimized libraries for QNNs (from 8-bit to
2-bit) such as CMSIS-NN and PULP-NN. This work presents an extension to the
PULP-NN library targeting the acceleration of mixed-precision Deep Neural
Networks, an emerging paradigm able to significantly shrink the memory
footprint of deep neural networks with negligible accuracy loss. The library,
composed of 27 kernels, one for each permutation of input feature maps,
weights, and output feature maps precision (considering 8-bit, 4-bit and
2-bit), enables efficient inference of QNN on parallel ultra-low-power (PULP)
clusters of RISC-V based processors, featuring the RV32IMCXpulpV2 ISA. The
proposed solution, benchmarked on an 8-core GAP-8 PULP cluster, reaches peak
performance of 16 MACs/cycle on 8 cores, performing 21x to 25x faster than an
STM32H7 (powered by an ARM Cortex M7 processor) with 15x to 21x better energy
efficiency.
Comment: 4 pages, 6 figures, published in 17th ACM International Conference on
Computing Frontiers (CF '20), May 11--13, 2020, Catania, Italy
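The count of 27 kernels follows from taking every combination of three precisions over three tensors (3 x 3 x 3). The sketch below enumerates those combinations and shows how sub-byte packing shrinks storage. It is only an illustration: the names `PRECISIONS`, `kernel_configs`, and `packed_bytes` are hypothetical and are not part of the PULP-NN API.

```python
from itertools import product

# Bit widths named in the abstract.
PRECISIONS = (8, 4, 2)

# One configuration per (input, weights, output) precision triple.
kernel_configs = list(product(PRECISIONS, repeat=3))


def packed_bytes(num_elements, bits):
    """Bytes needed to store num_elements values packed at `bits` bits each."""
    return (num_elements * bits + 7) // 8


print(len(kernel_configs))    # 27
print(packed_bytes(1024, 8))  # 1024
print(packed_bytes(1024, 2))  # 256
```

Under this simple packing model, moving a tensor from 8-bit to 2-bit values cuts its storage by a factor of four, which is the memory saving that mixed-precision networks trade against accuracy.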
Enabling mixed-precision quantized neural networks in extreme-edge devices
The deployment of Quantized Neural Networks (QNN) on advanced microcontrollers requires optimized software to exploit digital signal processing (DSP) extensions of modern instruction set architectures (ISA). As such, recent research proposed optimized libraries for QNNs (from 8-bit to 2-bit) such as CMSIS-NN and PULP-NN. This work presents an extension to the PULP-NN library targeting the acceleration of mixed-precision Deep Neural Networks, an emerging paradigm able to significantly shrink the memory footprint of deep neural networks with negligible accuracy loss. The library, composed of 27 kernels, one for each permutation of input feature maps, weights, and output feature maps precision (considering 8-bit, 4-bit and 2-bit), enables efficient inference of QNN on parallel ultra-low-power (PULP) clusters of RISC-V based processors, featuring the RV32IMCXpulpV2 ISA. The proposed solution, benchmarked on an 8-core GAP-8 PULP cluster, reaches peak performance of 16 MACs/cycle on 8 cores, performing 21x to 25x faster than an STM32H7 (powered by an ARM Cortex M7 processor) with 15x to 21x better energy efficiency.
DORY: Automatic End-to-End Deployment of Real-World DNNs on Low-Cost IoT MCUs
The deployment of Deep Neural Networks (DNNs) on end-nodes at the extreme
edge of the Internet-of-Things is a critical enabler to support pervasive Deep
Learning-enhanced applications. Low-Cost MCU-based end-nodes have limited
on-chip memory and often replace caches with scratchpads, to reduce area
overheads and increase energy efficiency -- requiring explicit DMA-based memory
transfers between different levels of the memory hierarchy. Mapping modern DNNs
on these systems requires aggressive topology-dependent tiling and
double-buffering. In this work, we propose DORY (Deployment Oriented to memoRY)
- an automatic tool to deploy DNNs on low cost MCUs with typically less than
1MB of on-chip SRAM memory. DORY abstracts tiling as a Constraint Programming
(CP) problem: it maximizes L1 memory utilization under the topological
constraints imposed by each DNN layer. Then, it generates ANSI C code to
orchestrate off- and on-chip transfers and computation phases. Furthermore, to
maximize speed, DORY augments the CP formulation with heuristics promoting
performance-effective tile sizes. As a case study for DORY, we target
GreenWaves Technologies GAP8, one of the most advanced parallel ultra-low power
MCU-class devices on the market. On this device, DORY achieves up to 2.5x
better MAC/cycle than the GreenWaves proprietary software solution and 18.1x
better than the state-of-the-art result on an STM32-F746 MCU on single layers.
Using our tool, GAP-8 can perform end-to-end inference of a 1.0-MobileNet-128
network consuming just 63 pJ/MAC on average @ 4.3 fps - 15.4x better than an
STM32-F746. We release all our developments - the DORY framework, the optimized
backend kernels, and the related heuristics - as open-source software.
Comment: 14 pages, 12 figures, 4 tables, 2 listings. Accepted for publication
in IEEE Transactions on Computers
(https://ieeexplore.ieee.org/document/9381618)
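The idea of choosing tiles that maximize L1 utilization can be illustrated with a brute-force search. This is a minimal sketch under stated assumptions, not DORY's Constraint Programming formulation: it assumes an int8 convolution with stride 1 and no padding, doubles every buffer to model double-buffering, and uses a hypothetical function name `best_tile` and a hypothetical default L1 size of 64 kB.

```python
from itertools import product


def best_tile(c_in, c_out, h_out, w_out, k=3, l1_bytes=64 * 1024):
    """Return the (c_out, height, width) output tile that fills the most L1.

    Assumptions (illustrative only): 1 byte per element, stride 1,
    no padding, and all buffers doubled for double-buffering.
    """
    best, best_use = None, -1
    ranges = (range(1, c_out + 1), range(1, h_out + 1), range(1, w_out + 1))
    for tc_out, th, tw in product(*ranges):
        # The input tile needs a halo of k - 1 rows and columns.
        in_bytes = c_in * (th + k - 1) * (tw + k - 1)
        weight_bytes = tc_out * c_in * k * k
        out_bytes = tc_out * th * tw
        use = 2 * (in_bytes + weight_bytes + out_bytes)
        # Keep the tile with the highest utilization that still fits.
        if use <= l1_bytes and use > best_use:
            best, best_use = (tc_out, th, tw), use
    return best, best_use


tile, used = best_tile(c_in=16, c_out=32, h_out=16, w_out=16, l1_bytes=8192)
print(tile, used)
```

An exhaustive search like this is only practical for small layers. A constraint solver with performance heuristics, as the abstract describes, scales to real networks and can also favor tile shapes that suit the compute kernels.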
GVSoC: A Highly Configurable, Fast and Accurate Full-Platform Simulator for RISC-V based IoT Processors
Bruschi, Nazareno; Haugou, Germain; Tagliavini, Giuseppe; Conti, Francesco; Benini, Luca; Rossi, Davide
Characterization and modeling of CMOS-compatible acoustical particle velocity sensors for applications requiring low supply voltages
Acoustic particle velocity sensors have been obtained by applying simple low-resolution micromachining steps to chips fabricated using a standard microelectronic process. Each sensor consists of four silicided polysilicon wires, suspended over cavities etched into the substrate, and connected to form a Wheatstone bridge. Full compatibility of the micromachining procedure with the original process is demonstrated by integrating a simple pre-amplifier on the same chip as the sensors and showing that both blocks are functional. Proper design of the sensing structures allows them to operate with a single 3.3 V power supply. Sensitivity and noise measurements, performed to estimate the sensor detection limit, are described. Excess noise with a flicker-like behavior, not ascribable to the amplifier, is found when the bridges are biased in working conditions. In addition, the dependence of the sensitivity on the dc bias voltage of the bridges is investigated, comparing the experimental data with the results of a simple analytical model and finite element method simulations.