
    Deep generative models for network data synthesis and monitoring

    Measurement and monitoring are fundamental tasks in all networks, enabling the downstream management and optimization of the network. Although networks inherently generate abundant monitoring data, accessing and effectively measuring that data is another story. The challenges exist in many aspects. First, network monitoring data is inaccessible to external users, and it is hard to provide a high-fidelity dataset without leaking commercially sensitive information. Second, effective data collection covering a large-scale network system can be very expensive, as networks keep growing in size, e.g., the number of cells in a radio network and the number of flows in an Internet Service Provider (ISP) network. Third, it is difficult to ensure fidelity and efficiency simultaneously in network monitoring, as the resources available in network elements to support the measurement function are too limited to implement sophisticated mechanisms. Finally, understanding and explaining the behavior of the network becomes challenging due to its size and complex structure. Various emerging optimization-based solutions (e.g., compressive sensing) and data-driven solutions (e.g., deep learning) have been proposed for the aforementioned challenges. However, the fidelity and efficiency of existing methods cannot yet meet current network requirements. The contributions made in this thesis significantly advance the state of the art in the domain of network measurement and monitoring techniques. Throughout the thesis, we leverage cutting-edge machine learning technology, namely deep generative modeling. First, we design and realize APPSHOT, an efficient city-scale network traffic sharing system based on a conditional generative model, which only requires open-source contextual data during inference (e.g., land use information and population distribution). Second, we develop GENDT, an efficient drive testing system based on a generative model, which combines graph neural networks, conditional generation, and quantified model uncertainty to enhance the efficiency of mobile drive testing. Third, we design and implement DISTILGAN, a high-fidelity, efficient, versatile, and real-time network telemetry system with latent GANs and spectral-temporal networks. Finally, we propose SPOTLIGHT, an accurate, explainable, and efficient anomaly detection system for the Open RAN (Radio Access Network). The lessons learned through this research are summarized, and interesting topics are discussed for future work in this domain. All proposed solutions have been evaluated with real-world datasets and applied to support different applications in real systems.
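
    To make the conditional-generation idea concrete, the following is a minimal PyTorch sketch of a generator that maps random noise plus open contextual features (e.g., land use, population density) to a synthetic traffic series. It is illustrative only and is not the APPSHOT architecture; all layer sizes, feature dimensions, and names are assumptions.

```python
# Minimal sketch of conditional traffic generation (illustrative only; not the
# APPSHOT architecture). A generator maps random noise plus open contextual
# features (e.g., land use, population density) to a synthetic traffic volume
# series for one city region. All layer sizes are arbitrary assumptions.
import torch
import torch.nn as nn

class ConditionalGenerator(nn.Module):
    def __init__(self, noise_dim=16, context_dim=8, traffic_len=24):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(noise_dim + context_dim, 128),
            nn.ReLU(),
            nn.Linear(128, 128),
            nn.ReLU(),
            nn.Linear(128, traffic_len),
            nn.Softplus(),          # traffic volumes are non-negative
        )

    def forward(self, noise, context):
        # Condition the generator by concatenating noise with contextual data.
        return self.net(torch.cat([noise, context], dim=-1))

gen = ConditionalGenerator()
noise = torch.randn(4, 16)               # 4 regions, random latent codes
context = torch.rand(4, 8)               # normalized land-use/population features
synthetic_traffic = gen(noise, context)  # shape: (4, 24), e.g. hourly volumes
print(synthetic_traffic.shape)
```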

    Joint multi-objective MEH selection and traffic path computation in 5G-MEC systems

    Multi-access Edge Computing (MEC) is an emerging technology that makes it possible to reduce service latency and traffic congestion and to enable cloud offloading and context awareness. MEC consists of deploying computing devices, called MEC Hosts (MEHs), close to the user. Given the mobility of the user, several problems arise. The first problem is to select a MEH to run the service requested by the user. Another problem is to select the path to steer the traffic from the user to the selected MEH. The paper jointly addresses these two problems. First, the paper proposes a procedure to create a graph that captures both network-layer and application-layer performance. Then, the proposed graph is used to apply the Multi-objective Dijkstra Algorithm (MDA), a technique for multi-objective optimization problems, in order to solve the addressed problems while simultaneously considering different performance metrics and constraints. To evaluate the performance of MDA, the paper implements a testbed based on AdvantEDGE and Kubernetes to migrate a VideoLAN application between two MEHs. A controller has been realized to integrate MDA with the 5G-MEC system in the testbed. The results show that MDA is able to perform the migration with a limited impact on the network performance and user experience. The lack of migration would instead lead to a severe reduction of the user experience.
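
    As an illustration of the multi-objective search idea, below is a minimal Python sketch of a label-setting, Pareto-based shortest-path computation in the spirit of MDA. The toy graph, the two cost metrics, and all names are assumptions; it is not the paper's implementation or its testbed integration.

```python
# Minimal sketch of multi-objective shortest-path search in the spirit of the
# Multi-objective Dijkstra Algorithm (MDA); illustrative only. Edges carry two
# costs (e.g., latency, load); the search keeps all Pareto-optimal labels per
# node instead of a single best distance.
import heapq

def dominates(a, b):
    """Cost vector a dominates b if it is no worse everywhere and better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and a != b

def multi_objective_dijkstra(graph, source):
    # graph: {node: [(neighbor, (cost1, cost2)), ...]}
    labels = {node: [] for node in graph}      # Pareto set of cost vectors per node
    heap = [((0, 0), source)]                  # lexicographic order on cost vectors
    while heap:
        cost, node = heapq.heappop(heap)
        if any(dominates(c, cost) or c == cost for c in labels[node]):
            continue                           # an equal or better label is settled
        labels[node].append(cost)
        for nxt, (c1, c2) in graph[node]:
            new_cost = (cost[0] + c1, cost[1] + c2)
            if not any(dominates(c, new_cost) or c == new_cost for c in labels[nxt]):
                heapq.heappush(heap, (new_cost, nxt))
    return labels

# Toy topology: a user reaching two candidate MEHs over different paths.
graph = {
    "user": [("r1", (2, 1)), ("r2", (1, 3))],
    "r1":   [("meh1", (2, 2)), ("meh2", (4, 1))],
    "r2":   [("meh1", (5, 1)), ("meh2", (1, 2))],
    "meh1": [], "meh2": [],
}
print(multi_objective_dijkstra(graph, "user"))  # meh2 keeps two Pareto-optimal paths
```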

    Exploring Hardware Fault Impacts on Different Real Number Representations of the Structural Resilience of TCUs in GPUs

    The most recent generations of graphics processing units (GPUs) boost the execution of the convolutional operations required by machine learning applications by resorting to specialized and efficient in-chip accelerators (Tensor Core Units, or TCUs) that operate on matrix multiplication tiles. Unfortunately, modern cutting-edge semiconductor technologies are increasingly prone to hardware defects, and the trend to highly stress TCUs during the execution of safety-critical and high-performance computing (HPC) applications increases the likelihood of TCUs producing different kinds of failures. In fact, the intrinsic resilience of arithmetic units to hardware faults plays a crucial role in safety-critical applications using GPUs (e.g., in automotive, space, and autonomous robotics). Recently, new arithmetic formats have been proposed, particularly those suited to neural network execution. However, a reliability characterization of TCUs supporting different arithmetic formats has been lacking. In this work, we quantitatively assessed the impact of hardware faults in TCU structures while employing two distinct formats (floating-point and posit) and two different configurations (16 and 32 bits) to represent real numbers. For the experimental evaluation, we resorted to an architectural description of a TCU core (PyOpenTCU) and performed 120 fault simulation campaigns, injecting around 200,000 faults per campaign and requiring around 32 days of computation. Our results demonstrate that the posit format is less affected by faults than the floating-point one (by up to three orders of magnitude for 16 bits and up to twenty orders for 32 bits). We also identified the most sensitive fault locations (i.e., those that produce the largest errors), thus paving the way to adopting smart hardening solutions.
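
    The following minimal NumPy sketch illustrates the kind of fault-injection experiment described above: a single bit flip is injected into one partial product of a small half-precision matrix-multiply tile, and the resulting output error is measured. It is illustrative only; it is not PyOpenTCU, and it covers only the floating-point case, not the posit format evaluated in the work.

```python
# Minimal sketch of a bit-flip fault-injection experiment on a 4x4 float16
# matrix-multiply tile (illustrative only; not PyOpenTCU). A single bit flip is
# injected into one product before accumulation, and the output error is measured.
import numpy as np

def flip_bit_f16(x, bit):
    """Flip one bit (0..15) in the IEEE-754 half-precision encoding of x."""
    buf = np.array([x], dtype=np.float16).view(np.uint16)
    buf[0] ^= np.uint16(1 << bit)
    return buf.view(np.float16)[0]

def tile_matmul_with_fault(A, B, fault=None):
    """Naive 4x4 tile multiply; optionally flip a bit of one partial product."""
    C = np.zeros((4, 4), dtype=np.float32)           # fp32 accumulator, as in TCUs
    for i in range(4):
        for j in range(4):
            for k in range(4):
                p = np.float16(A[i, k]) * np.float16(B[k, j])
                if fault is not None and fault[:3] == (i, j, k):
                    p = flip_bit_f16(p, fault[3])    # inject the fault here
                C[i, j] += np.float32(p)
    return C

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)).astype(np.float16)
B = rng.standard_normal((4, 4)).astype(np.float16)
golden = tile_matmul_with_fault(A, B)
faulty = tile_matmul_with_fault(A, B, fault=(1, 2, 3, 14))  # flip an exponent bit
print("max abs output error:", np.max(np.abs(golden - faulty)))
```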

    A Literature Review of Fault Diagnosis Based on Ensemble Learning

    The accuracy of fault diagnosis is an important indicator for ensuring the reliability of key equipment systems. Ensemble learning integrates different weak learners to obtain a stronger learner and has achieved remarkable results in the field of fault diagnosis. This paper reviews recent research on ensemble learning from both technical and field-application perspectives. It covers 209 papers from 87 journals indexed in recent Web of Science and other academic resources, and summarizes 78 different ensemble learning based fault diagnosis methods, involving 18 public datasets and more than 20 different equipment systems. In detail, the paper summarizes the accuracy rates, fault classification types, fault datasets, data signals used, learners (traditional machine learning or deep learning based learners), and ensemble learning methods (bagging, boosting, stacking, and other ensemble models) of these fault diagnosis models. The paper uses fault diagnosis accuracy as the main evaluation metric, supplemented by generalization and imbalanced-data handling ability, to evaluate the performance of these ensemble learning methods. The discussion and evaluation of these methods provide valuable references for identifying and developing appropriate intelligent fault diagnosis models for various equipment. The paper also discusses and explores the technical challenges, lessons learned from the review, and future development directions in the field of ensemble learning based fault diagnosis and intelligent maintenance.
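
    As a concrete illustration of the kind of method surveyed, the following minimal scikit-learn sketch stacks bagging- and boosting-style base learners under a logistic-regression meta-learner for a synthetic fault-classification task. The data, models, and hyperparameters are illustrative assumptions and are not drawn from any of the reviewed papers.

```python
# Minimal sketch of an ensemble-learning fault-diagnosis pipeline (illustrative
# only): bagging- and boosting-style base learners are stacked with a logistic-
# regression meta-learner. Synthetic data stands in for real vibration/current features.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Synthetic "fault features": 4 fault classes, 20 features.
X, y = make_classification(n_samples=2000, n_features=20, n_informative=10,
                           n_classes=4, n_clusters_per_class=1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

ensemble = StackingClassifier(
    estimators=[
        ("bagging_rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("boosting_gb", GradientBoostingClassifier(random_state=0)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,  # out-of-fold predictions feed the meta-learner
)
ensemble.fit(X_train, y_train)
print("fault-diagnosis accuracy:", accuracy_score(y_test, ensemble.predict(X_test)))
```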

    Design and Real-World Evaluation of Dependable Wireless Cyber-Physical Systems

    The ongoing effort for an efficient, sustainable, and automated interaction between humans, machines, and our environment will make cyber-physical systems (CPS) an integral part of industry and our daily lives. At their core, CPS integrate computing elements, communication networks, and physical processes that are monitored and controlled through sensors and actuators. New and innovative applications become possible by extending or replacing static and expensive cable-based communication infrastructures with wireless technology. The flexibility of wireless CPS is a key enabler for many envisioned scenarios, such as intelligent factories, smart farming, personalized healthcare systems, autonomous search and rescue, and smart cities. High dependability, efficiency, and adaptivity requirements complement the demand for wireless and low-cost solutions in such applications. For instance, industrial and medical systems should work reliably and predictably with performance guarantees, even if parts of the system fail. Because emerging CPS will feature mobile and battery-driven devices that can execute various tasks, the systems must also quickly adapt to frequently changing conditions. Moreover, as applications become ever more sophisticated, featuring compact embedded devices that are deployed densely and at scale, efficient designs are indispensable to achieve desired operational lifetimes and satisfy high bandwidth demands. Meeting these partly conflicting requirements, however, is challenging due to the imperfections of wireless communication and resource constraints along several dimensions, for example, the computing, memory, and power constraints of the devices. More precisely, frequent and correlated message losses paired with very limited bandwidth and varying delays for the message exchange significantly complicate the control design. In addition, since communication ranges are limited, messages must be relayed over multiple hops to cover larger distances, such as an entire factory. Although the resulting mesh networks are more robust against interference, efficient communication is a major challenge as wireless imperfections get amplified, and significant coordination effort is needed, especially if the networks are dynamic. CPS combine various research disciplines, which are often investigated in isolation, ignoring their complex interaction. However, to address this interaction and build trust in the proposed solutions, it is necessary to evaluate CPS using real physical systems and wireless networks, paired with formal guarantees of a system's end-to-end behavior. Existing works that take this step can only satisfy a few of the abovementioned requirements. Most notably, multi-hop communication has only been used to control slow physical processes while providing no guarantees. One of the reasons is that current communication protocols are not suited for dynamic multi-hop networks. This thesis closes the gap between existing works and the diverse needs of emerging wireless CPS. The contributions address different research directions and are split into two parts. In the first part, we specifically address the shortcomings of existing communication protocols and make the following contributions to provide a solid networking foundation:
    • We present Mixer, a communication primitive for reliable many-to-all message exchange in dynamic wireless multi-hop networks. Mixer runs on resource-constrained low-power embedded devices and combines synchronous transmissions and network coding for a highly scalable and topology-agnostic message exchange. As a result, it supports mobile nodes and can serve any possible traffic pattern, for example, to efficiently realize distributed control, as required by emerging CPS applications.
    • We present Butler, a lightweight and distributed synchronization mechanism with formally guaranteed correctness properties to improve the dependability of synchronous-transmission-based protocols. These protocols require precise time synchronization provided by a specific node. Upon failure of this node, the entire network cannot communicate. Butler removes this single point of failure by quickly synchronizing all nodes in the network without affecting the protocols' performance.
    In the second part, we focus on the challenges of integrating communication and various control concepts using classical time-triggered and modern event-based approaches. Based on the design, implementation, and evaluation of the proposed solutions using real systems and networks, we make the following contributions, which in many ways push the boundaries of previous approaches:
    • We are the first to demonstrate and evaluate fast feedback control over low-power wireless multi-hop networks. Essential for this achievement is a novel co-design and integration of communication and control. Our wireless embedded platform tames the imperfections impairing control, for example, message loss and varying delays, and considers the resulting key properties in the control design. Furthermore, the careful orchestration of control and communication tasks enables real-time operation and makes our system amenable to an end-to-end analysis. Due to this, we can provably guarantee closed-loop stability for physical processes with linear time-invariant dynamics.
    • We propose control-guided communication, a novel co-design for distributed self-triggered control over wireless multi-hop networks. Self-triggered control can save energy by transmitting data only when needed. However, there are no solutions that bring those savings to multi-hop networks and that can reallocate freed-up resources, for example, to other agents. Our control system informs the communication system of its transmission demands ahead of time so that communication resources can be allocated accordingly. Thus, we can transfer the energy savings from the control to the communication side and achieve an end-to-end benefit.
    • We present a novel co-design of distributed control and wireless communication that resolves overload situations in which the communication demand exceeds the available bandwidth. As systems scale up, featuring more agents and higher bandwidth demands, the available bandwidth will quickly be exceeded, resulting in overload. While event-triggered and self-triggered control approaches reduce the communication demand on average, they cannot prevent situations in which potentially all agents want to communicate simultaneously. We address this limitation by dynamically allocating the available bandwidth to the agents with the highest need.
    Thus, we can formally prove that our co-design guarantees closed-loop stability for physical systems with stochastic linear time-invariant dynamics.

    Contents: Abstract; Acknowledgements; List of Abbreviations; List of Figures; List of Tables; 1 Introduction (1.1 Motivation, 1.2 Application Requirements, 1.3 Challenges, 1.4 State of the Art, 1.5 Contributions and Road Map); 2 Mixer: Efficient Many-to-All Broadcast in Dynamic Wireless Mesh Networks (2.1 Introduction, 2.2 Overview, 2.3 Design, 2.4 Implementation, 2.5 Evaluation, 2.6 Discussion, 2.7 Related Work); 3 Butler: Increasing the Availability of Low-Power Wireless Communication Protocols (3.1 Introduction, 3.2 Motivation and Background, 3.3 Design, 3.4 Analysis, 3.5 Implementation, 3.6 Evaluation, 3.7 Related Work); 4 Feedback Control Goes Wireless: Guaranteed Stability over Low-Power Multi-Hop Networks (4.1 Introduction, 4.2 Related Work, 4.3 Problem Setting and Approach, 4.4 Wireless Embedded System Design, 4.5 Control Design and Analysis, 4.6 Experimental Evaluation, 4.A Control Details); 5 Control-Guided Communication: Efficient Resource Arbitration and Allocation in Multi-Hop Wireless Control Systems (5.1 Introduction, 5.2 Problem Setting, 5.3 Co-Design Approach, 5.4 Wireless Communication System Design, 5.5 Self-Triggered Control Design, 5.6 Experimental Evaluation); 6 Scaling Beyond Bandwidth Limitations: Wireless Control With Stability Guarantees Under Overload (6.1 Introduction, 6.2 Problem and Related Work, 6.3 Overview of Co-Design Approach, 6.4 Predictive Triggering and Control System, 6.5 Adaptive Communication System, 6.6 Integration and Stability Analysis, 6.7 Testbed Experiments, 6.A Proof of Theorem 4, 6.B Usage of the Network Bandwidth for Control); 7 Conclusion and Outlook (7.1 Contributions, 7.2 Future Directions); Bibliography; List of Publications
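
    To illustrate the network-coding ingredient that Mixer combines with synchronous transmissions, the following toy Python sketch encodes messages as random XOR (GF(2)) combinations and recovers them by Gaussian elimination once enough independent combinations have been overheard. It is illustrative only; Mixer's actual protocol, timing, and coding heuristics are not modeled.

```python
# Toy illustration of random linear network coding over GF(2), the core idea behind
# many-to-all exchange primitives such as Mixer (illustrative only; not Mixer itself).
# Packets are XOR combinations of the original messages; a node that collects enough
# independent combinations recovers all messages by Gaussian elimination over GF(2).
import random

def encode(messages, rng):
    """Return [coding_vector, coded_payload]: a random XOR of the messages."""
    n = len(messages)
    coeffs = [rng.randint(0, 1) for _ in range(n)]
    if not any(coeffs):
        coeffs[rng.randrange(n)] = 1            # avoid the useless all-zero combination
    payload = 0
    for c, m in zip(coeffs, messages):
        if c:
            payload ^= m
    return [coeffs, payload]

def decode(packets, n):
    """Recover the n messages from coded packets via GF(2) elimination, or None."""
    rows = [[vec[:], pay] for vec, pay in packets]
    pivot_rows = []
    for col in range(n):
        pivot = next((r for r in rows if r[0][col] == 1), None)
        if pivot is None:
            return None                          # rank deficient: keep listening
        rows = [r for r in rows if r is not pivot]
        for r in rows + pivot_rows:              # eliminate this column everywhere else
            if r[0][col] == 1:
                r[0] = [a ^ b for a, b in zip(r[0], pivot[0])]
                r[1] ^= pivot[1]
        pivot_rows.append(pivot)
    out = [None] * n
    for vec, pay in pivot_rows:                  # each pivot row now isolates one message
        out[vec.index(1)] = pay
    return out

rng = random.Random(1)
messages = [0x11, 0x2A, 0x3C]                    # three messages from three nodes
received, decoded = [], None
while decoded is None:                           # overhear coded packets until decodable
    received.append(encode(messages, rng))
    decoded = decode(received, len(messages))
print("recovered:", [hex(m) for m in decoded], "after", len(received), "packets")
```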

    Multi-objective resource optimization in space-aerial-ground-sea integrated networks

    Space-air-ground-sea integrated (SAGSI) networks are envisioned to connect satellite, aerial, ground, and sea networks to provide connectivity everywhere and all the time in sixth-generation (6G) networks. However, the success of SAGSI networks is constrained by several challenges, including resource optimization when users have diverse requirements and applications. We present a comprehensive review of SAGSI networks from a resource optimization perspective. We discuss use case scenarios and possible applications of SAGSI networks, and the resource optimization discussion considers the challenges associated with them. In our review, we categorize resource optimization techniques based on throughput and capacity maximization, delay minimization, energy consumption, task offloading, task scheduling, resource allocation or utilization, network operation cost, outage probability, average age of information, joint optimization (data rate difference, storage or caching, CPU cycle frequency), overall network performance and performance degradation, software-defined networking, and intelligent surveillance and relay communication. We then formulate a mathematical framework for maximizing energy efficiency, resource utilization, and user association. We optimize user association while satisfying the constraints on transmit power, data rate, and priority-based user association. A binary decision variable is used to associate users with system resources. Since the decision variable is binary and the constraints are linear, the formulated problem is a binary linear programming problem. Based on this framework, we simulate and analyze the performance of three different algorithms (the branch and bound algorithm, the interior point method, and the barrier simplex algorithm) and compare the results. Simulation results show that the branch and bound algorithm achieves the best results, so we use it as the benchmark; however, its complexity increases exponentially with the number of users and stations in the SAGSI network. The interior point method and the barrier simplex algorithm yield results comparable to the benchmark at much lower complexity. Finally, we discuss future research directions and challenges of resource optimization in SAGSI networks.
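
    As an illustration of such a binary linear program, the following Python sketch associates users with stations to maximize the total rate, subject to single-association and per-station capacity constraints, and solves it with scipy.optimize.milp. The rates, capacities, and constraints are illustrative assumptions and do not reproduce the paper's exact formulation.

```python
# Minimal sketch of a binary linear program for user association (illustrative
# numbers, not the paper's exact model). x[u, s] = 1 if user u is served by
# station s; maximize total rate subject to single association and station capacity.
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

rate = np.array([[5.0, 2.0],      # achievable rate of each user at each station
                 [1.0, 4.0],
                 [3.0, 3.5]])
capacity = np.array([2, 2])       # max users a station can serve
n_users, n_stations = rate.shape
n_vars = n_users * n_stations     # x flattened row-major: x[u, s] -> u * n_stations + s

c = -rate.flatten()               # milp minimizes, so negate to maximize total rate

# Each user is associated with exactly one station.
A_assign = np.zeros((n_users, n_vars))
for u in range(n_users):
    A_assign[u, u * n_stations:(u + 1) * n_stations] = 1
assign_con = LinearConstraint(A_assign, lb=1, ub=1)

# Each station serves at most its capacity.
A_cap = np.zeros((n_stations, n_vars))
for s in range(n_stations):
    A_cap[s, s::n_stations] = 1
cap_con = LinearConstraint(A_cap, lb=0, ub=capacity)

res = milp(c, constraints=[assign_con, cap_con],
           integrality=np.ones(n_vars),           # all decision variables are binary
           bounds=Bounds(0, 1))
print("association matrix:\n", res.x.reshape(n_users, n_stations).round())
print("total rate:", -res.fun)
```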

    Evaluating Architectural Safeguards for Uncertain AI Black-Box Components

    Although tremendous progress has been made in Artificial Intelligence (AI), this progress entails new challenges. The growing complexity of learning tasks requires more complex AI components, which increasingly exhibit unreliable behaviour. In this book, we present a model-driven approach to modelling architectural safeguards for AI components and analysing their effect on the overall system reliability.

    Approximate Computing Survey, Part I: Terminology and Software & Hardware Approximation Techniques

    The rapid growth of demanding applications in domains applying multimedia processing and machine learning has marked a new era for edge and cloud computing. These applications involve massive data and compute-intensive tasks, and thus typical computing paradigms in embedded systems and data centers are stressed to meet the worldwide demand for high performance. Concurrently, over the last 15 years the semiconductor field has established power as a first-class design concern. As a result, the computing systems community is forced to find alternative design approaches that facilitate high-performance and/or power-efficient computing. Among the examined solutions, Approximate Computing has attracted ever-increasing interest, with research works applying approximations across the entire traditional computing stack, i.e., at the software, hardware, and architectural levels. Over the last decade, a plethora of approximation techniques has been proposed in software (programs, frameworks, compilers, runtimes, languages), hardware (circuits, accelerators), and architectures (processors, memories). The current article is Part I of our comprehensive survey on Approximate Computing: it reviews its motivation, terminology, and principles, and it classifies and presents the technical details of state-of-the-art software and hardware approximation techniques.
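
    As one concrete example of a software-level approximation technique of the kind such surveys cover, the following Python sketch applies loop perforation to a reduction: only every k-th iteration is executed and the result is rescaled, trading a bounded accuracy loss for roughly k-fold less work. The example is illustrative and is not taken from the article.

```python
# Minimal sketch of loop perforation, a classic software approximation technique
# (illustrative only). Every k-th iteration of a reduction is executed and the
# result is rescaled, trading a small accuracy loss for ~k-fold less work.
def exact_mean(samples):
    return sum(samples) / len(samples)

def perforated_mean(samples, k=4):
    """Skip k-1 out of every k loop iterations and compensate by averaging the subset."""
    subset = samples[::k]                 # perforated loop: only 1/k of the iterations
    return sum(subset) / len(subset)

data = [((i * 37) % 101) / 100 for i in range(100_000)]   # deterministic pseudo-data
print("exact      :", exact_mean(data))
print("perforated :", perforated_mean(data, k=4), "(~4x fewer iterations)")
```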

    A Current-Source Modular Converter for Large-Scale Photovoltaic Systems

    The world is shifting toward renewable energy sources (RESs) to generate clean energy and mitigate the stress of global warming caused by CO2 emissions in recent decades. Among the several RES types, large-scale photovoltaic (LSPV) plants are a promising source for meeting ambitious clean energy targets and becoming part of power generation. With the progress of high-power modular inverters, new opportunities have arisen to integrate them into LSPV systems connected to medium-voltage (MV) grids to obtain high efficiency and reliability, better system flexibility, and improved electrical safety compared with string or central inverters. This thesis presents and implements a new current-source three-phase modular inverter (TPMI) based on a novel dual-isolated SEPIC/CUK (DISC) converter. The TPMI is designed with a single power processing stage comprising series-connected DISC submodules (SMs) to deliver MV into the utility grid. It outperforms conventional high-power inverters in terms of modularity, scalability, galvanic isolation compliance, and distributed maximum power point tracking (MPPT) capabilities. The DISC converter employed as an SM in the proposed TPMI generates a bipolar output (i.e., both positive and negative voltages). In addition to having step-up and step-down capabilities with a continuous input current, this converter shares an input-side inductor, thereby reducing the number of components. The DISC structure, modulation method, operation, novel state-space model, and parameter design procedure are analysed in detail. Then, simulation results are presented to validate the theoretical and analytical analyses of the DISC converter. The proposed TPMI is subsequently integrated into an LSPV grid connection to prove its suitability for such applications. In the theoretical analysis, the advantages of the TPMI structure over conventional topologies are discussed. Then, the modulation technique and operational concept are presented, followed by a dedicated control strategy implemented with system-level and SM-level controllers. The system-level controller generates uniform duty ratios for all SMs in order to regulate the power transfer. The SM-level controller ensures equal current and voltage distribution between SMs and compensates for minor discrepancies between the various parameters. The entire TPMI system is demonstrated through MATLAB and Simulink simulations, with the objective of delivering the rated (1 MW) power from the PV modules under normal operation, uniform shading, and partial shading conditions and matching PV generation with the grid's power demands. A downscaled 3-kW TPMI prototype was developed in the laboratory to experimentally validate its feasibility, together with its control strategy, in different operating conditions. Finally, the TPMI performance is compared with selected current-source inverter topologies, which shows that the TPMI obtains good efficiency within the context of existing state-of-the-art current-source converters. Then, the TPMI structure is modified by redesigning its DISC SMs, which provides several benefits, including a reduction in the number of switching devices operating at high frequency, thus decreasing switching losses and increasing efficiency. In this study, a half-cycle modulation (HCM) scheme is developed for the switches, and the operation of the modified DISC SM is analysed. Simulation and experimental results validate the performance of the modified TPMI topology and demonstrate its suitability for LSPV applications. According to the results of the comparison, the maximum power efficiency of the modified TPMI structure is 95.5%, which represents an improvement over the original TPMI structure.
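
    To make the distributed MPPT capability mentioned above concrete, here is a minimal Python sketch of the classic perturb-and-observe algorithm running against a simplified PV power curve. It is illustrative only; it is not the thesis's TPMI/DISC control strategy, and the PV model is an assumed stand-in.

```python
# Minimal sketch of perturb-and-observe (P&O) MPPT, the kind of per-submodule
# tracking that distributed MPPT refers to (illustrative only; not the thesis's
# TPMI/DISC controller, and the PV curve below is a simplified stand-in).
def pv_power(v):
    """Toy PV curve: current rolls off sharply near v_oc; power peaks near ~0.75*v_oc."""
    v_oc, i_sc = 40.0, 8.0
    i = i_sc * max(0.0, 1.0 - (v / v_oc) ** 7)
    return v * i

def perturb_and_observe(v0=20.0, step=0.5, iterations=60):
    v, p_prev, direction = v0, pv_power(v0), +1
    for _ in range(iterations):
        v += direction * step               # perturb the operating voltage
        p = pv_power(v)
        if p < p_prev:                      # power dropped: reverse the perturbation
            direction = -direction
        p_prev = p
    return v, p_prev

v_mpp, p_mpp = perturb_and_observe()
print(f"operating point ~{v_mpp:.1f} V, {p_mpp:.1f} W")
```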