Deep generative models for network data synthesis and monitoring
Measurement and monitoring are fundamental tasks in all networks, enabling the downstream management and optimization of the network. Although networks inherently produce abundant monitoring data, accessing and measuring it effectively is another story. The challenges are manifold. First, network monitoring data is inaccessible to external users, and it is hard to provide a high-fidelity dataset without leaking commercially sensitive information. Second, effective data collection that covers a large-scale network system can be very expensive given the growing size of networks, e.g., the number of cells in a radio network and the number of flows in an Internet Service Provider (ISP) network. Third, it is difficult to ensure fidelity and efficiency simultaneously in network monitoring, as the resources available in a network element to support the measurement function are too limited to implement sophisticated mechanisms. Finally, understanding and explaining the behavior of the network becomes challenging due to its size and complex structure. Various emerging optimization-based solutions (e.g., compressive sensing) and data-driven solutions (e.g., deep learning) have been proposed for the aforementioned challenges. However, the fidelity and efficiency of existing methods cannot yet meet current network requirements.
The contributions made in this thesis significantly advance the state of the art in the domain of network measurement and monitoring techniques. Throughout the thesis, we leverage a cutting-edge machine learning technology, deep generative modeling. First, we design and realize APPSHOT, an efficient city-scale network traffic sharing system based on a conditional generative model, which requires only open-source contextual data during inference (e.g., land use information and population distribution). Second, we develop GENDT, an efficient drive testing system based on a generative model, which combines graph neural networks, conditional generation, and quantified model uncertainty to enhance the efficiency of mobile drive testing. Third, we design and implement DISTILGAN, a high-fidelity, efficient, versatile, and real-time network telemetry system with latent GANs and spectral-temporal networks. Finally, we propose SPOTLIGHT, an accurate, explainable, and efficient anomaly detection system for Open RAN (Radio Access Network) systems. The lessons learned through this research are summarized, and interesting topics are discussed for future work in this domain. All proposed solutions have been evaluated with real-world datasets and applied to support different applications in real systems.
Modern computing: Vision and challenges
Over the past six decades, the computing systems field has experienced significant transformations, profoundly impacting society through developments such as the Internet and the commodification of computing. Underpinned by technological advancements, computer systems, far from being static, have been continuously evolving and adapting to cover multifaceted societal niches. This has led to new paradigms such as cloud, fog, and edge computing and the Internet of Things (IoT), which offer fresh economic and creative opportunities. Nevertheless, this rapid change poses complex research challenges, especially in maximizing potential and enhancing functionality. As such, to maintain an economical level of performance that meets ever-tighter requirements, one must understand the drivers of new model emergence and expansion, and how contemporary challenges differ from past ones. To that end, this article investigates and assesses the factors influencing the evolution of computing systems, covering established systems and architectures as well as newer developments, such as serverless computing, quantum computing, and on-device AI on edge devices. Tracing the technological trajectory reveals several trends, including the rapid obsolescence of frameworks due to business and technical constraints, a move towards specialized systems and models, and varying approaches to centralized and decentralized control. This comprehensive review of modern computing systems looks ahead to the future of research in the field, highlighting key challenges and emerging trends, and underscoring their importance in cost-effectively driving technological progress.
Natural and Technological Hazards in Urban Areas
Natural hazard events and technological accidents are separate causes of environmental impacts. Natural hazards are physical phenomena active in geological times, whereas technological hazards result from actions or facilities created by humans. In our time, combined natural and man-made hazards have been induced. Overpopulation and urban development in areas prone to natural hazards increase the impact of natural disasters worldwide. Additionally, urban areas are frequently characterized by intense industrial activity and rapid, poorly planned growth that threatens the environment and degrades the quality of life. Therefore, proper urban planning is crucial to minimize fatalities and reduce the environmental and economic impacts that accompany both natural and technological hazardous events.
On Age-of-Information Aware Resource Allocation for Industrial Control-Communication-Codesign
In industrial manufacturing, Industry 4.0 refers to the ongoing convergence of the real and virtual worlds, enabled by intelligently interconnecting industrial machines and processes through information and communications technology. Ultra-reliable low-latency communication (URLLC) is widely regarded as the enabling technology for Industry 4.0 because it can deliver the highest quality of service (QoS), comparable to that of industrial wireline connections. In contrast to this trend, a range of works in the research domain of networked control systems (NCS) have shown that URLLC's supreme QoS is not necessarily required to achieve high quality of control; the co-design of control and communication makes it possible to jointly optimize and balance quality-of-control parameters and network parameters by blurring the boundary between the application and network layers. However, because of this tight interlacing, the approach requires a fundamental (joint) redesign of both control systems and communication networks and may therefore not lead to short-term widespread adoption. This thesis instead embraces a novel co-design approach that keeps both domains distinct but exploits the age of information (AoI) as a valuable interface metric between control and communications.
This thesis contributes to quantifying application dependability as a consequence of exceeding a given peak AoI, with a particular focus on packet losses. The beneficial influence of negative temporal packet loss correlation on control performance is demonstrated by means of the automated guided vehicle (AGV) use case. Assuming small-scale fading as the dominant cause of communication failure, a series of communication failures is mapped to an application failure through discrete-time Markov models for single-hop (e.g., only uplink or downlink) and dual-hop (e.g., subsequent uplink and downlink) architectures. This enables the derivation of application-related dependability metrics such as the mean time to failure in closed form. For single-hop networks, an AoI-aware resource allocation strategy termed state-aware resource allocation (SARA) is proposed that increases the application reliability by orders of magnitude compared to static multi-connectivity while keeping the resource consumption in the range of best-effort single-connectivity. This dependability can also be statistically guaranteed on a system level, where multiple agents compete for a limited number of resources, if the amount of resources provided per agent is increased by approximately 10 %. For the dual-hop scenario, an AoI-aware resource allocation optimization is developed that minimizes a user-defined penalty function that punishes low application reliability, high AoI, and high average resource consumption. This optimization may be carried out offline, and each resulting optimal SARA scheme may be implemented as a look-up table in the lower medium access control layer of future wireless industrial networks.
1. Introduction
1.1. The Need for an Industrial Solution
1.2. Contributions
2. Related Work
2.1. Communications
2.2. Control
2.3. Codesign
2.3.1. The Need for Abstraction – Age of Information
2.4. Dependability
2.5. Conclusions
3. Deriving Proper Communications Requirements
3.1. Fundamentals of Control Theory
3.1.1. Sampling
3.1.2. Performance Requirements
3.1.3. Packet Losses and Delay
3.2. Joint Design of Control Loop with Packet Losses
3.2.1. Method 1: Reduced Sampling
3.2.2. Method 2: Markov Jump Linear System
3.2.3. Conclusion
3.3. Focus Application: The AGV Use Case
3.3.1. Control Loop Model
3.3.2. Control Performance Requirements
3.3.3. Joint Modeling: Applying Reduced Sampling
3.3.4. Joint Modeling: Applying MJLS
3.4. Conclusions
4. Modeling Control-Communication Failures
4.1. Communication Assumptions
4.1.1. Small-Scale Fading as a Cause of Failure
4.1.2. Connectivity Models
4.2. Failure Models
4.2.1. Single-hop network
4.2.2. Dual-hop network
4.3. Performance Metrics
4.3.1. Mean Time to Failure
4.3.2. Packet Loss Ratio
4.3.3. Average Number of Assigned Channels
4.3.4. Age of Information
4.4. Conclusions
5. Single Hop – Single Agent
5.1. State-Aware Resource Allocation
5.2. Results
5.3. Erroneous Acknowledgments
5.4. Conclusions
6. Single Hop – Multiple Agents
6.1. Failure Model
6.1.1. Admission Control
6.1.2. Transition Probabilities
6.1.3. Computational Complexity
6.1.4. Performance Metrics
6.2. Illustration Scenario
6.3. Results
6.3.1. Verification through System-Level Simulation
6.3.2. Applicability on the System Level
6.3.3. Comparison of Admission Control Schemes
6.3.4. Impact of the Packet Loss Tolerance
6.3.5. Impact of the Number of Agents
6.3.6. Age of Information
6.3.7. Channel Saturation Ratio
6.3.8. Enforcing Full Channel Saturation
6.4. Conclusions
7. Dual Hop – Single Agent
7.1. State-Aware Resource Allocation
7.2. Optimization Targets
7.3. Results
7.3.1. Extensive Simulation
7.3.2. Non-Integer-Constrained Optimization
7.4. Conclusions
8. Conclusions and Outlook
8.1. Key Results and Conclusions
8.2. Future Work
A. DC Motor Model
Bibliography
Publications of the Author
List of Figures
List of Tables
List of Operators and Constants
List of Symbols
List of Acronyms
Curriculum Vitae
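The abstract of this thesis says that runs of communication failures are mapped to an application failure through discrete-time Markov models, from which the mean time to failure follows in closed form. The following is a minimal sketch of that idea and not the thesis's actual model: it assumes independent packet losses with a fixed probability per slot and declares an application failure after a fixed number of consecutive losses (a stand-in for exceeding a peak AoI threshold). The function names and both assumptions are hypothetical; the thesis itself considers fading, loss correlation, and SARA.

```python
from fractions import Fraction


def mttf_consecutive_losses(p_loss, max_losses):
    """Expected number of slots until `max_losses` consecutive packet losses.

    Assumption (not from the source): losses are i.i.d. with probability
    p_loss in (0, 1]. Transient state i = current run of i losses; the
    absorbing failure state is reached after `max_losses` losses in a row.
    Solves (I - Q) t = 1 for the expected absorption times t.
    """
    p = Fraction(p_loss)
    n = max_losses
    a = [[Fraction(0)] * n for _ in range(n)]
    b = [Fraction(1)] * n
    for i in range(n):
        a[i][i] += 1
        a[i][0] -= 1 - p          # a delivered packet resets the loss run
        if i + 1 < n:
            a[i][i + 1] -= p      # another loss extends the run
    # Gauss-Jordan elimination with exact rational arithmetic
    for col in range(n):
        pivot = next(r for r in range(col, n) if a[r][col] != 0)
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        inv = 1 / a[col][col]
        a[col] = [x * inv for x in a[col]]
        b[col] *= inv
        for r in range(n):
            if r != col and a[r][col] != 0:
                f = a[r][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
                b[r] -= f * b[col]
    return b[0]  # start with no losses in the current run


def mttf_closed_form(p_loss, max_losses):
    """Closed form for the same i.i.d. chain: (1 - p^K) / ((1 - p) p^K)."""
    p = Fraction(p_loss)
    if p == 1:
        return Fraction(max_losses)
    return (1 - p ** max_losses) / ((1 - p) * p ** max_losses)


mttf = mttf_consecutive_losses(Fraction(1, 10), 3)
```

Under these assumptions, tolerating one more consecutive loss multiplies the mean time to failure by roughly 1/p, which gives a feel for why allocating extra resources only when the loss run is already long can pay off.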
Implementing Reinforcement Learning Datacenter Congestion Control in NVIDIA NICs
As communication protocols evolve, datacenter network utilization increases.
As a result, congestion is more frequent, causing higher latency and packet
loss. Combined with the increasing complexity of workloads, manual design of
congestion control (CC) algorithms becomes extremely difficult. This calls for
the development of AI approaches to replace the human effort. Unfortunately, it
is currently not possible to deploy AI models on network devices due to their
limited computational capabilities. Here, we offer a solution to this problem
by building a computationally-light solution based on a recent reinforcement
learning CC algorithm [arXiv:2207.02295]. We reduce the inference time of RL-CC
by a factor of 500 by distilling its complex neural network into decision trees. This
transformation enables real-time inference within the μ-sec decision-time
requirement, with a negligible effect on quality. We deploy the transformed
policy on NVIDIA NICs in a live cluster. Compared to popular CC algorithms used
in production, RL-CC is the only method that performs well on all benchmarks
tested over a large range of number of flows. It balances multiple metrics
simultaneously: bandwidth, latency, and packet drops. These results suggest
that data-driven methods for CC are feasible, challenging the prior belief that
handcrafted heuristics are necessary to achieve optimal performance.
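The abstract describes distilling a neural-network policy into decision trees to cut inference time. The sketch below illustrates the general technique only, since the abstract gives no procedural details: a small regression tree (the student) is fitted to the outputs of a teacher function. Here `teacher_policy` is a hypothetical stand-in for a trained network, and the single input feature (a normalized RTT inflation) and the output (a multiplicative rate change) are assumptions made for illustration.

```python
import math
import random


def teacher_policy(rtt_inflation):
    """Hypothetical stand-in for a trained neural-network policy."""
    return 1.0 + 0.5 * math.tanh(2.0 * (1.0 - rtt_inflation))


def fit_tree(xs, ys, depth):
    """Fit a 1-D regression tree by greedy squared-error splits.

    Returns either a float (leaf value) or a tuple
    (threshold, left_subtree, right_subtree).
    """
    mean = sum(ys) / len(ys)
    if depth == 0 or len(xs) < 4:
        return mean  # leaf: average teacher output in this region
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    xs = [xs[i] for i in order]
    ys = [ys[i] for i in order]
    total, total_sq = sum(ys), sum(y * y for y in ys)
    best = None
    left_sum = left_sq = 0.0
    for k in range(1, len(xs)):
        left_sum += ys[k - 1]
        left_sq += ys[k - 1] ** 2
        if xs[k] == xs[k - 1]:
            continue  # cannot split between identical inputs
        right_sum, right_sq = total - left_sum, total_sq - left_sq
        sse = (left_sq - left_sum ** 2 / k) + (
            right_sq - right_sum ** 2 / (len(xs) - k))
        if best is None or sse < best[0]:
            best = (sse, k)
    if best is None:
        return mean
    k = best[1]
    threshold = (xs[k - 1] + xs[k]) / 2
    return (threshold,
            fit_tree(xs[:k], ys[:k], depth - 1),
            fit_tree(xs[k:], ys[k:], depth - 1))


def predict(tree, x):
    """Inference is a handful of comparisons, one per tree level."""
    while isinstance(tree, tuple):
        threshold, left, right = tree
        tree = left if x < threshold else right
    return tree


# Distillation: query the teacher on sampled inputs, then fit the student.
rng = random.Random(0)
samples = [rng.uniform(0.0, 3.0) for _ in range(2000)]
labels = [teacher_policy(x) for x in samples]
student = fit_tree(samples, labels, depth=5)

grid = [i * 0.01 for i in range(301)]
max_err = max(abs(predict(student, x) - teacher_policy(x)) for x in grid)
```

The student replaces the teacher's floating-point math with at most `depth` threshold comparisons per decision, which is the kind of trade-off that makes inference cheap on constrained hardware; how closely a tree can track a real policy depends on the policy and the tree budget.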