85 research outputs found

    Energy-Sustainable IoT Connectivity: Vision, Technological Enablers, Challenges, and Future Directions

    Full text link
    Technology solutions must effectively balance economic growth, social equity, and environmental integrity to achieve a sustainable society. Notably, although the Internet of Things (IoT) paradigm constitutes a key sustainability enabler, critical issues such as the increasing maintenance operations, energy consumption, and manufacturing/disposal of IoT devices have long-term negative economic, societal, and environmental impacts and must be efficiently addressed. This calls for self-sustainable IoT ecosystems requiring minimal external resources and intervention, effectively utilizing renewable energy sources, and recycling materials whenever possible, thus encompassing energy sustainability. In this work, we focus on energy-sustainable IoT during the operation phase, although our discussions sometimes extend to other sustainability aspects and IoT lifecycle phases. Specifically, we provide a fresh look at energy-sustainable IoT and identify energy provision, transfer, and energy efficiency as the three main energy-related processes whose harmonious coexistence pushes toward realizing self-sustainable IoT systems. Their main related technologies, recent advances, challenges, and research directions are also discussed. Moreover, we overview relevant performance metrics to assess the energy-sustainability potential of a certain technique, technology, device, or network and list some target values for the next generation of wireless systems. Overall, this paper offers insights that are valuable for advancing sustainability goals for present and future generations.Comment: 25 figures, 12 tables, submitted to IEEE Open Journal of the Communications Societ

    Architecture and Advanced Electronics Pathways Toward Highly Adaptive Energy- Efficient Computing

    Get PDF
    With the explosion of the number of compute nodes, the bottleneck of future computing systems lies in the network architecture connecting the nodes. Addressing the bottleneck requires replacing current backplane-based network topologies. We propose to revolutionize computing electronics by realizing embedded optical waveguides for onboard networking and wireless chip-to-chip links at 200-GHz carrier frequency connecting neighboring boards in a rack. The control of novel rate-adaptive optical and mm-wave transceivers needs tight interlinking with the system software for runtime resource management

    An efficient asynchronous spatial division multiplexing router for network-on-chip on the hardware platform

    Get PDF
    The quasi-delay-insensitive (QDI) based asynchronous network-on-chip (ANoC) has several advantages over clock-based synchronous network-on-chips (NoCs). The asynchronous router uses a virtual channel (VC) as a primary flow-control mechanism however, the spatial division multiplexing (SDM) based mechanism performs better over input traffics over VC. This manuscript uses an asynchronous spatial division multiplexing (ASDM) based router for NoC architecture on a field-programmable gate array (FPGA) platform. The ASDM router is configurable to different bandwidths and VCs. The ASDM router mainly contains input-output (I/O) buffers, a switching allocator, and a crossbar unit. The 4-phase 1-of-4 dual-rail protocol is used to construct the I/O buffers. The performance of the ASDM router is analyzed in terms of lower urinary tract symptoms (LUTs) (chip area), delay, latency, and throughput parameters. The work is implemented using Verilog-HDL with Xilinx ISE 14.7 on artix-7 FPGA. The ASDM router achieves % chip area and obtains 0.8 ns of latency with a throughput of 800 Mfps. The proposed router is compared with existing asynchronous approaches with improved latency and throughput metrics

    Optimizing AI at the Edge: from network topology design to MCU deployment

    Get PDF
    The first topic analyzed in the thesis will be Neural Architecture Search (NAS). I will focus on two different tools that I developed, one to optimize the architecture of Temporal Convolutional Networks (TCNs), a convolutional model for time-series processing that has recently emerged, and one to optimize the data precision of tensors inside CNNs. The first NAS proposed explicitly targets the optimization of the most peculiar architectural parameters of TCNs, namely dilation, receptive field, and the number of features in each layer. Note that this is the first NAS that explicitly targets these networks. The second NAS proposed instead focuses on finding the most efficient data format for a target CNN, with the granularity of the layer filter. Note that applying these two NASes in sequence allows an "application designer" to minimize the structure of the neural network employed, minimizing the number of operations or the memory usage of the network. After that, the second topic described is the optimization of neural network deployment on edge devices. Importantly, exploiting edge platforms' scarce resources is critical for NN efficient execution on MCUs. To do so, I will introduce DORY (Deployment Oriented to memoRY) -- an automatic tool to deploy CNNs on low-cost MCUs. DORY, in different steps, can manage different levels of memory inside the MCU automatically, offload the computation workload (i.e., the different layers of a neural network) to dedicated hardware accelerators, and automatically generates ANSI C code that orchestrates off- and on-chip transfers with the computation phases. On top of this, I will introduce two optimized computation libraries that DORY can exploit to deploy TCNs and Transformers on edge efficiently. I conclude the thesis with two different applications on bio-signal analysis, i.e., heart rate tracking and sEMG-based gesture recognition

    Real-time implementation of 3D LiDAR point cloud semantic segmentation in an FPGA

    Get PDF
    Dissertação de mestrado em Informatics EngineeringIn the last few years, the automotive industry has relied heavily on deep learning applications for perception solutions. With data-heavy sensors, such as LiDAR, becoming a standard, the task of developing low-power and real-time applications has become increasingly more challenging. To obtain the maximum computational efficiency, no longer can one focus solely on the software aspect of such applications, while disregarding the underlying hardware. In this thesis, a hardware-software co-design approach is used to implement an inference application leveraging the SqueezeSegV3, a LiDAR-based convolutional neural network, on the Versal ACAP VCK190 FPGA. Automotive requirements carefully drive the development of the proposed solution, with real-time performance and low power consumption being the target metrics. A first experiment validates the suitability of Xilinx’s Vitis-AI tool for the deployment of deep convolutional neural networks on FPGAs. Both the ResNet-18 and SqueezeNet neural networks are deployed to the Zynq UltraScale+ MPSoC ZCU104 and Versal ACAP VCK190 FPGAs. The results show that both networks achieve far more than the real-time requirements while consuming low power. Compared to an NVIDIA RTX 3090 GPU, the performance per watt during both network’s inference is 12x and 47.8x higher and 15.1x and 26.6x higher respectively for the Zynq UltraScale+ MPSoC ZCU104 and the Versal ACAP VCK190 FPGA. These results are obtained with no drop in accuracy in the quantization step. A second experiment builds upon the results of the first by deploying a real-time application containing the SqueezeSegV3 model using the Semantic-KITTI dataset. A framerate of 11 Hz is achieved with a peak power consumption of 78 Watts. The quantization step results in a minimal accuracy and IoU degradation of 0.7 and 1.5 points respectively. A smaller version of the same model is also deployed achieving a framerate of 19 Hz and a peak power consumption of 76 Watts. The application performs semantic segmentation over all the point cloud with a field of view of 360°.Nos últimos anos a indústria automóvel tem cada vez mais aplicado deep learning para solucionar problemas de perceção. Dado que os sensores que produzem grandes quantidades de dados, como o LiDAR, se têm tornado standard, a tarefa de desenvolver aplicações de baixo consumo energético e com capacidades de reagir em tempo real tem-se tornado cada vez mais desafiante. Para obter a máxima eficiência computacional, deixou de ser possível focar-se apenas no software aquando do desenvolvimento de uma aplicação deixando de lado o hardware subjacente. Nesta tese, uma abordagem de desenvolvimento simultâneo de hardware e software é usada para implementar uma aplicação de inferência usando o SqueezeSegV3, uma rede neuronal convolucional profunda, na FPGA Versal ACAP VCK190. São os requisitos automotive que guiam o desenvolvimento da solução proposta, sendo a performance em tempo real e o baixo consumo energético, as métricas alvo principais. Uma primeira experiência valida a aptidão da ferramenta Vitis-AI para a implantação de redes neuronais convolucionais profundas em FPGAs. As redes ResNet-18 e SqueezeNet são ambas implantadas nas FPGAs Zynq UltraScale+ MPSoC ZCU104 e Versal ACAP VCK190. Os resultados mostram que ambas as redes ultrapassam os requisitos de tempo real consumindo pouca energia. Comparado com a GPU NVIDIA RTX 3090, a performance por Watt durante a inferência de ambas as redes é superior em 12x e 47.8x e 15.1x e 26.6x respetivamente na Zynq UltraScale+ MPSoC ZCU104 e na Versal ACAP VCK190. Estes resultados foram obtidos sem qualquer perda de accuracy na etapa de quantização. Uma segunda experiência é feita no seguimento dos resultados da primeira, implantando uma aplicação de inferência em tempo real contendo o modelo SqueezeSegV3 e usando o conjunto de dados Semantic-KITTI. Um framerate de 11 Hz é atingido com um pico de consumo energético de 78 Watts. O processo de quantização resulta numa perda mínima de accuracy e IoU com valores de 0.7 e 1.5 pontos respetivamente. Uma versão mais pequena do mesmo modelo é também implantada, atingindo uma framerate de 19 Hz e um pico de consumo energético de 76 Watts. A aplicação desenvolvida executa segmentação semântica sobre a totalidade das nuvens de pontos LiDAR, com um campo de visão de 360°

    Behind the last line of defense: Surviving SoC faults and intrusions

    Get PDF
    Today, leveraging the enormous modular power, diversity and flexibility of manycore systems-on-a-chip (SoCs) requires careful orchestration of complex and heterogeneous resources, a task left to low-level software, e.g., hypervisors. In current architectures, this software forms a single point of failure and worthwhile target for attacks: once compromised, adversaries can gain access to all information and full control over the platform and the environment it controls. This article proposes Midir, an enhanced manycore architecture, effecting a paradigm shift from SoCs to distributed SoCs. Midir changes the way platform resources are controlled, by retrofitting tile-based fault containment through well known mechanisms, while securing low-overhead quorum-based consensus on all critical operations, in particular privilege management and, thus, management of containment domains. Allowing versatile redundancy management, Midir promotes resilience for all software levels, including at low level. We explain this architecture, its associated algorithms and hardware mechanisms and show, for the example of a Byzantine fault tolerant microhypervisor, that it outperforms the highly efficient MinBFT by one order of magnitude

    Design Space Exploration and Resource Management of Multi/Many-Core Systems

    Get PDF
    The increasing demand of processing a higher number of applications and related data on computing platforms has resulted in reliance on multi-/many-core chips as they facilitate parallel processing. However, there is a desire for these platforms to be energy-efficient and reliable, and they need to perform secure computations for the interest of the whole community. This book provides perspectives on the aforementioned aspects from leading researchers in terms of state-of-the-art contributions and upcoming trends

    Neue Implementierungsmethoden für eingebettete geberlose Motor-Controller basierend auf dem Einsatz von Multi-Core-Mikrocontrollern

    Get PDF
    Diese Arbeit behandelt den Einsatz von Multi-Core-Mikrocontrollern in softwarebasierten Motor-Controllern zur geberlosen Stromregelung von PMSM. Hierbei werden die Steigerung der Motor-Controller-Performance durch Parallelisierung und die Auswirkungen von Cross-Core-Interferenzen auf das zeitliche Verhalten von Motor-Controllern fokussiert. Es wird eine Generalisierung von geberlosen Stromregelungen durchgeführt, um die allgemeine Parallelisierbarkeit dieser Anwendungen zu beschreiben. Hierauf aufbauend wird gezeigt, unter welchen Rahmenbedingungen eine effektive Steigerung der Regelfrequenz durch Parallelisierung realisierbar ist. Durch die Konsolidierung dedizierter Motor-Controller in ein Multi-Core-System kann Hardware eingespart und dadurch der Energiebedarf um bis zu 50 % gesenkt werden. Eine solche Konsolidierung verursacht regelmäßig Cross-Core-Interferenzen. Um diesem Problem entgegenzuwirken, wird ein Verfahren vorgestellt, das die negativen Einflüsse dieser Seiteneffekte auf die Laufzeiten der Motor-Controller analysiert und quantifiziert. Hierauf aufbauend werden Strategien zur Reduktion der Interferenzen beschrieben und evaluiert. Durch die erzielten Ergebnisse werden Multi-Core-Mikrocontroller als Basis neuer Implementierungsmethoden für Motor-Controller erschlossen. Sie erweitern deren Design- und Implementierungsprozesse, um hier Multi-Core-Mikrocontroller durch Parallelisierung und Konsolidierung effektiv und effizient einzusetzen.This thesis addresses the use of multi-core microcontrollers in software-based motor controllers for the position sensorless control of PMSM. The focus is on increasing the motor controller performance through parallelization and on the effects of cross-core interferences on the temporal behaviour of motor controllers. A generalization of position sensorless current controls is carried out in order to describe the general possibilities to parallelize these applications. Building on this, it is shown under which conditions an effective increase in the control frequency can be achieved through parallelization. By consolidating dedicated motor controllers into one multi-core system, hardware can be saved to reduce energy costs by up to 50 %. Such consolidation causes cross-core interference on a regular basis. To address this problem, a procedure is presented that analyses and quantifies the negative influences of these side effects on the runtimes of the motor controllers. Based on this, strategies for reducing interference are described and evaluated. The results of this work open up multi-core microcontrollers as a base for new implementation methods for motor controllers. They expand the design and implementation processes of motor controllers in order to use multi-core microcontrollers effectively and efficiently through parallelization and consolidation

    nDimNoC: Real-Time D-dimensional NoC

    Get PDF
    The growing demand of powerful embedded systems to perform advanced functionalities led to a large increase in the number of computation nodes integrated in Systems-on-chip (SoC). In this context, network-on-chips (NoCs) emerged as a new standard communication infrastructure for multi-processor SoCs (MPSoCs). In this work, we present nDimNoC, a new D-dimensional NoC that provides real-time guarantees for systems implemented upon MPSoCs. Specifically, (1) we propose a new router architecture and a new deflection-based routing policy that use the properties of circulant topologies to ensure bounded worst-case communication delays, and (2) we develop a generic worst-case communication time (WCCT) analysis for packets transmitted over nDimNoC. In our experiments, we show that the WCCT of packets decreases when we increase the dimensionality of the NoC using nDimNoC 19s topolgy and routing policy. By implementing nDimNoC in Verilog and synthesizing it for an FPGA platform, we show that a 3D-nDimNoC requires "485-times less silicon than routers that use virtual channels (VC). We computed the maximum operating frequency of a 3D-nDimNoC with Xilinx Vivado. Increasing the number dimensions in the NoC improves WCCT at the cost of a more complex routing logic that may result in a reduced operating clock frequency.info:eu-repo/semantics/publishedVersio

    Self-aware reliable monitoring

    Get PDF
    Cyber-Physical Systems (CPSs) can be found in almost all technical areas where they constitute a key enabler for anticipated autonomous machines and devices. They are used in a wide range of applications such as autonomous driving, traffic control, manufacturing plants, telecommunication systems, smart grids, and portable health monitoring systems. CPSs are facing steadily increasing requirements such as autonomy, adaptability, reliability, robustness, efficiency, and performance. A CPS necessitates comprehensive knowledge about itself and its environment to meet these requirements as well as make rational, well-informed decisions, manage its objectives in a sophisticated way, and adapt to a possibly changing environment. To gain such comprehensive knowledge, a CPS must monitor itself and its environment. However, the data obtained during this process comes from physical properties measured by sensors and may differ from the ground truth. Sensors are neither completely accurate nor precise. Even if they were, they could still be used incorrectly or break while operating. Besides, it is possible that not all characteristics of physical quantities in the environment are entirely known. Furthermore, some input data may be meaningless as long as they are not transferred to a domain understandable to the CPS. Regardless of the reason, whether erroneous data, incomplete knowledge or unintelligibility of data, such circumstances can result in a CPS that has an incomplete or inaccurate picture of itself and its environment, which can lead to wrong decisions with possible negative consequences. Therefore, a CPS must know the obtained data’s reliability and may need to abstract information of it to fulfill its tasks. Besides, a CPS should base its decisions on a measure that reflects its confidence about certain circumstances. Computational Self-Awareness (CSA) is a promising solution for providing a CPS with a monitoring ability that is reliable and robust — even in the presence of erroneous data. This dissertation proves that CSA, especially the properties abstraction, data reliability, and confidence, can improve a system’s monitoring capabilities regarding its robustness and reliability. The extensive experiments conducted are based on two case studies from different fields: the health- and industrial sectors
    • …
    corecore