    SeLoC-ML: Semantic Low-Code Engineering for Machine Learning Applications in Industrial IoT

    The Internet of Things (IoT) is transforming industry by bridging the gap between Information Technology (IT) and Operational Technology (OT). Machines are being integrated with connected sensors and managed by intelligent analytics applications, accelerating digital transformation and business operations. Bringing Machine Learning (ML) to industrial devices is an advancement aiming to promote the convergence of IT and OT. However, developing an ML application in industrial IoT (IIoT) presents various challenges, including hardware heterogeneity, non-standardized representations of ML models, device and ML model compatibility issues, and slow application development. Successful deployment in this area requires a deep understanding of hardware, algorithms, software tools, and applications. Therefore, this paper presents a framework called Semantic Low-Code Engineering for ML Applications (SeLoC-ML), built on a low-code platform to support the rapid development of ML applications in IIoT by leveraging Semantic Web technologies. SeLoC-ML enables non-experts to easily model, discover, reuse, and matchmake ML models and devices at scale. The project code can be automatically generated for deployment on hardware based on the matching results. Developers can benefit from semantic application templates, called recipes, to rapidly prototype end-user applications. The evaluations confirm an engineering effort reduction by a factor of at least three compared to traditional approaches on an industrial ML classification case study, showing the efficiency and usefulness of SeLoC-ML. We share the code and welcome any contributions. Comment: Accepted by the 21st International Semantic Web Conference (ISWC2022).

    Opening the “Black Box” of Silicon Chip Design in Neuromorphic Computing

    Neuromorphic computing, a bio-inspired computing architecture that transfers neuroscience to silicon chips, has the potential to achieve the same level of computation and energy efficiency as mammalian brains. Meanwhile, three-dimensional (3D) integrated circuit (IC) design with non-volatile memory crossbar arrays uniquely unveils its intrinsic vector-matrix computation with parallel computing capability in neuromorphic computing designs. In this chapter, the state-of-the-art research trends in electronic circuit designs for neuromorphic computing will be introduced. Furthermore, a practical bio-inspired spiking neural network with delay-feedback topology will be discussed. In the endeavor to imitate how human beings process information, our fabricated spiking neural network chip is capable of processing analog signals directly, resulting in high energy efficiency with a small hardware implementation cost. Mimicking the neurological structure of mammalian brains, the potential of the 3D-IC implementation technique with memristive synapses is investigated. Finally, applications to chaotic time series prediction and video frame recognition will be demonstrated.

    Fast Obstacle Detection System for UAS Based on Complementary Use of Radar and Stereoscopic Camera

    Autonomous unmanned aerial systems (UAS) are having an increasing impact in the scientific community. One of the most challenging problems in this research area is the design of robust real-time obstacle detection and avoidance systems. In the automotive field, applications of obstacle detection systems combining radar and vision sensors are common and widely documented. However, these technologies are not currently employed in the UAS field due to the greater complexity of the flight scenario, especially in urban environments. In this paper, a real-time obstacle-detection system based on a 77 GHz radar and a stereoscopic camera is proposed for use in small UASs. The resulting system is capable of detecting obstacles in a broad spectrum of environmental conditions. In particular, the vision system guarantees a high resolution at short distances, while the radar has a lower resolution but can cover greater distances and is insensitive to poor lighting conditions. The developed hardware and software architecture and the related obstacle-detection algorithm are illustrated within the European project AURORA. Experimental results obtained with a small UAS show the effectiveness of the obstacle detection system and of a simple avoidance strategy during several autonomous missions on a test site.

    Full-System Simulation of Mobile CPU/GPU Platforms

    Graphics Processing Units (GPUs) critically rely on a complex system software stack comprising kernel- and userspace drivers and just-in-time (JIT) compilers. Yet, existing GPU simulators typically abstract away details of the software stack and GPU instruction set. Partly, this is because GPU vendors rarely release sufficient information about their latest GPU products. However, this is also due to the lack of an integrated CPU/GPU simulation framework that is complete and powerful enough to drive the complex GPU software environment. This has led to a situation where research on GPU architectures and compilers is largely based on outdated or greatly simplified architectures and software stacks, undermining the validity of the generated results. In this paper, we develop a full-system simulation environment for a mobile platform, which enables users to run a complete and unmodified software stack for a state-of-the-art mobile Arm CPU and Mali-G71 GPU powered device. We validate our simulator against a hardware implementation and Arm’s stand-alone GPU simulator, achieving 100% architectural accuracy across all available toolchains. We demonstrate the capability of our GPU simulation framework by optimizing an advanced Computer Vision application using simulated statistics unavailable with other simulation approaches or physical GPU implementations. We demonstrate that performance optimizations for desktop GPUs trigger bottlenecks on mobile GPUs, and show the importance of efficient memory use.

    Reducing Library Characterization Time for Cell-aware Test while Maintaining Test Quality

    Cell-aware test (CAT) explicitly targets faults caused by defects inside library cells to improve test quality, compared with conventional automatic test pattern generation (ATPG) approaches, which target faults only at the boundaries of library cells. The CAT methodology consists of two stages. In Stage 1, based on dedicated analog simulation, library characterization per cell identifies which cell-level test pattern detects which cell-internal defect; this detection information is encoded in a defect detection matrix (DDM). In Stage 2, with the DDMs as inputs, cell-aware ATPG generates chip-level test patterns per circuit design that is built up of interconnected instances of library cells. This paper focuses on Stage 1, library characterization, as both test quality and cost are determined by the set of cell-internal defects identified and simulated in the CAT tool flow. With the aim of achieving the best test quality, we first propose an approach to identify a comprehensive set, referred to as the full set, of potential open- and short-defect locations based on cell layout. However, the full set of defects can be large even for a single cell, making the time cost of the defect simulation in Stage 1 unaffordable. Subsequently, to reduce the simulation time, we collapse the full set to a compact set of defects, which serves as input to the defect simulation. The full set is stored for diagnosis and failure analysis. By inspecting the simulation results, we propose a method to verify the test quality based on the compact set of defects and, if necessary, to compensate the test quality to the same level as that based on the full set of defects. For 351 combinational library cells in Cadence’s GPDK045 45nm library, we simulate only 5.4% of the defects from the full set to achieve the same test quality as that based on the full set of defects. In total, the simulation time, via linear extrapolation per cell, would be reduced by 96.4% compared with the time based on the full set of defects.

    Performance-aware NILM model optimization for edge deployment

    Non-Intrusive Load Monitoring (NILM) describes the extraction of the individual consumption pattern of a domestic appliance from the aggregated household consumption. Nowadays, the NILM research focus has shifted towards practical NILM applications, such as edge deployment, to accelerate the transition towards a greener energy future. NILM applications at the edge eliminate privacy concerns and data transmission-related problems. However, edge resource restrictions pose additional challenges to NILM. NILM approaches are usually not designed to run on edge devices with limited computational capacity, and therefore model optimization is required for better resource management. Recent works have started investigating NILM model optimization, but they utilize compression approaches arbitrarily, without considering the trade-off between model performance and computational cost. In this work, we present a NILM model optimization framework for edge deployment. The proposed edge optimization engine optimizes a NILM model for edge deployment depending on the edge device’s limitations and includes a novel performance-aware algorithm to reduce the model’s computational complexity. We validate our methodology on three edge application scenarios for four domestic appliances and four model architectures. Experimental results demonstrate that the proposed optimization approach can lead to up to a 36.3% average reduction of model computational complexity and a 75% reduction of storage requirements.

    Revisiting the high-performance reconfigurable computing for future datacenters

    Modern datacenters are reinforcing their computational power and energy efficiency by assimilating field programmable gate arrays (FPGAs). The sustainability of this large-scale integration depends on enabling multi-tenant FPGAs. This requisite amplifies the importance of a communication architecture and virtualization method with the required features in order to meet the high-end objective. Consequently, in the last decade, academia and industry proposed several virtualization techniques and hardware architectures for addressing resource management, scheduling, adoptability, segregation, scalability, performance-overhead, availability, programmability, time-to-market, security, and mainly, multitenancy. This paper provides an extensive survey covering three important aspects: a discussion of non-standard terms used in existing literature, network-on-chip evaluation choices as a means to explore the communication architecture, and virtualization methods under the latest classification. The purpose is to emphasize the importance of choosing an appropriate communication architecture, virtualization technique, and standard language to evolve multi-tenant FPGAs in datacenters. None of the previous surveys encapsulated these aspects in one writing. Open problems are also indicated for the scientific community.