4,342 research outputs found
Deep generative models for network data synthesis and monitoring
Measurement and monitoring are fundamental tasks in all networks, enabling the down-stream management and optimization of the network.
Although networks inherently
have abundant amounts of monitoring data, its access and effective measurement is
another story. The challenges exist in many aspects. First, the inaccessibility of network monitoring data for external users, and it is hard to provide a high-fidelity dataset
without leaking commercial sensitive information. Second, it could be very expensive
to carry out effective data collection to cover a large-scale network system, considering the size of network growing, i.e., cell number of radio network and the number of
flows in the Internet Service Provider (ISP) network. Third, it is difficult to ensure fidelity and efficiency simultaneously in network monitoring, as the available resources
in the network element that can be applied to support the measurement function are
too limited to implement sophisticated mechanisms. Finally, understanding and explaining the behavior of the network becomes challenging due to its size and complex
structure. Various emerging optimization-based solutions (e.g., compressive sensing)
or data-driven solutions (e.g. deep learning) have been proposed for the aforementioned challenges. However, the fidelity and efficiency of existing methods cannot yet
meet the current network requirements.
The contributions made in this thesis significantly advance the state of the art in
the domain of network measurement and monitoring techniques. Overall, we leverage
cutting-edge machine learning technology, deep generative modeling, throughout the
entire thesis. First, we design and realize APPSHOT , an efficient city-scale network
traffic sharing with a conditional generative model, which only requires open-source
contextual data during inference (e.g., land use information and population distribution). Second, we develop an efficient drive testing system — GENDT, based on generative model, which combines graph neural networks, conditional generation, and quantified model uncertainty to enhance the efficiency of mobile drive testing. Third, we
design and implement DISTILGAN, a high-fidelity, efficient, versatile, and real-time
network telemetry system with latent GANs and spectral-temporal networks. Finally,
we propose SPOTLIGHT , an accurate, explainable, and efficient anomaly detection system of the Open RAN (Radio Access Network) system. The lessons learned through
this research are summarized, and interesting topics are discussed for future work in
this domain. All proposed solutions have been evaluated with real-world datasets and
applied to support different applications in real systems
Optimization of Beyond 5G Network Slicing for Smart City Applications
Transitioning from the current fifth-generation (5G) wireless technology, the advent of beyond 5G (B5G) signifies a pivotal stride toward sixth generation (6G) communication technology. B5G, at its essence, harnesses end-to-end (E2E) network slicing (NS) technology, enabling the simultaneous accommodation of multiple logical networks with distinct performance requirements on a shared physical infrastructure. At the forefront of this implementation lies the critical process of network slice design, a phase central to the realization of efficient smart city networks. This thesis assumes a key role in the network slicing life cycle, emphasizing the analysis and formulation of optimal procedures for configuring, customizing, and allocating E2E network slices. The focus extends to catering to the unique demands of smart city applications, encompassing critical areas such as emergency response, smart buildings, and video surveillance. By addressing the intricacies of network slice design, the study navigates through the complexities of tailoring slices to meet specific application needs, thereby contributing to the seamless integration of diverse services within the smart city framework. Addressing the core challenge of NS, which involves the allocation of virtual networks on the physical topology with optimal resource allocation, the thesis introduces a dual integer linear programming (ILP) optimization problem. This problem is formulated to jointly minimize the embedding cost and latency. However, given the NP-hard nature of this ILP, finding an efficient alternative becomes a significant hurdle. In response, this thesis introduces a novel heuristic approach the matroid-based modified greedy breadth-first search (MGBFS) algorithm. This pioneering algorithm leverages matroid properties to navigate the process of virtual network embedding and resource allocation. By introducing this novel heuristic approach, the research aims to provide near-optimal solutions, overcoming the computational complexities associated with the dual integer linear programming problem. The proposed MGBFS algorithm not only addresses the connectivity, cost, and latency constraints but also outperforms the benchmark model delivering solutions remarkably close to optimal. This innovative approach represents a substantial advancement in the optimization of smart city applications, promising heightened connectivity, efficiency, and resource utilization within the evolving landscape of B5G-enabled communication technology
LIPIcs, Volume 251, ITCS 2023, Complete Volume
LIPIcs, Volume 251, ITCS 2023, Complete Volum
The Modernization Process of a Data Pipeline
Data plays an integral part in a company’s decision-making. Therefore, decision-makers must have the right data available at the right time. Data volumes grow constantly, and new data is continuously needed for analytical purposes. Many companies use data warehouses to store data in an easy-to-use format for reporting and analytics. The challenge with data warehousing is displaying data using one unified structure. The source data is often gathered from many systems that are structured in various ways.
A process called extract, transform, and load (ETL) or extract, load, and transform (ELT) is used to load data into the data warehouse. This thesis describes the modernization process of one such pipeline. The previous solution, which used an on-premises Teradata platform for computation and SQL stored procedures for the transformation logic, is replaced by a new solution. The goal of the new solution is a process that uses modern tools, is scalable, and follows programming best practises. The cloud-based Databricks platform is used for computation, and dbt is used as the transformation tool. Lastly, a comparison is made between the new and old solutions, and their benefits and drawbacks are discussed
Recombinant spidroins from infinite circRNA translation
Spidroins are a diverse family of peptides and the main components of spider silk. They can be used to produce sustainable, lightweight and durable materials for a large variety of medical and engineering applications. Spiders’ territorial behaviour and cannibalism precludes farming them for silk. Recombinant protein synthesis is the most promising way of producing these peptides. However, many approaches have been unsuccessful in obtaining large titres of recombinant spidroins or ones of sufficient molecular weight. The work described here is focused on expressing high molecular weight spidroins from short circular RNA molecules. Mammalian host cells were transfected with designed circular-RNA-producing plasmid vectors. A backsplicing approach was implemented to successfully circularise RNA in a variety of mammalian cell types. This approach could not express any recombinant spidroins based on a variety of qualitative protein assays. Further experiments investigated the reasons behind this.
Additionally, due to the diversity of spidroins in a large number of spider lineages, there are potentially many spidroin sequences left to be discovered. A bioinformatic pipeline was developed that accepts transcriptome datasets from RNA sequencing and uses tandem repeat detection and profile HMM annotation to identify novel sequences. This pipeline was specifically designed for the identification of repeat domains in expressed sequences. 21 transcriptomes from 17 different species, encompassing a wide selection of basal and derived spider lineages, were investigated using this pipeline. Six previously undescribed spidroin sequences were discovered. This pipeline was additionally tested in the context of the suckerin protein family. These proteins have recently been investigated for their potential properties in medicine and engineering including adhesion in wet environments. The computational pipeline was able to double the number of suckerins known to date. Further phylogenetic analysis was implemented to expand on the knowledge of suckerins. This pipeline enables the identification of transcripts that may have been overlooked by more mainstream analysis methods such as pairwise homology searches. The spidroins and suckerins discovered by this pipeline may contribute to the large repertoire of potentially useful properties characteristic of this diverse peptide family
Advances and Applications of DSmT for Information Fusion. Collected Works, Volume 5
This fifth volume on Advances and Applications of DSmT for Information Fusion collects theoretical and applied contributions of researchers working in different fields of applications and in mathematics, and is available in open-access. The collected contributions of this volume have either been published or presented after disseminating the fourth volume in 2015 in international conferences, seminars, workshops and journals, or they are new. The contributions of each part of this volume are chronologically ordered.
First Part of this book presents some theoretical advances on DSmT, dealing mainly with modified Proportional Conflict Redistribution Rules (PCR) of combination with degree of intersection, coarsening techniques, interval calculus for PCR thanks to set inversion via interval analysis (SIVIA), rough set classifiers, canonical decomposition of dichotomous belief functions, fast PCR fusion, fast inter-criteria analysis with PCR, and improved PCR5 and PCR6 rules preserving the (quasi-)neutrality of (quasi-)vacuous belief assignment in the fusion of sources of evidence with their Matlab codes.
Because more applications of DSmT have emerged in the past years since the apparition of the fourth book of DSmT in 2015, the second part of this volume is about selected applications of DSmT mainly in building change detection, object recognition, quality of data association in tracking, perception in robotics, risk assessment for torrent protection and multi-criteria decision-making, multi-modal image fusion, coarsening techniques, recommender system, levee characterization and assessment, human heading perception, trust assessment, robotics, biometrics, failure detection, GPS systems, inter-criteria analysis, group decision, human activity recognition, storm prediction, data association for autonomous vehicles, identification of maritime vessels, fusion of support vector machines (SVM), Silx-Furtif RUST code library for information fusion including PCR rules, and network for ship classification.
Finally, the third part presents interesting contributions related to belief functions in general published or presented along the years since 2015. These contributions are related with decision-making under uncertainty, belief approximations, probability transformations, new distances between belief functions, non-classical multi-criteria decision-making problems with belief functions, generalization of Bayes theorem, image processing, data association, entropy and cross-entropy measures, fuzzy evidence numbers, negator of belief mass, human activity recognition, information fusion for breast cancer therapy, imbalanced data classification, and hybrid techniques mixing deep learning with belief functions as well
Evaluating Architectural Safeguards for Uncertain AI Black-Box Components
Although tremendous progress has been made in Artificial Intelligence (AI), it entails new challenges. The growing complexity of learning tasks requires more complex AI components, which increasingly exhibit unreliable behaviour. In this book, we present a model-driven approach to model architectural safeguards for AI components and analyse their effect on the overall system reliability
Tiny Machine Learning Environment: Enabling Intelligence on Constrained Devices
Running machine learning algorithms (ML) on constrained devices at the extreme edge of the network is problematic due to the computational overhead of ML algorithms, available resources on the embedded platform, and application budget (i.e., real-time requirements, power constraints, etc.). This required the development of specific solutions and development tools for what is now referred to as TinyML. In this dissertation, we focus on improving the deployment and performance of TinyML applications, taking into consideration the aforementioned challenges, especially memory requirements.
This dissertation contributed to the construction of the Edge Learning Machine environment (ELM), a platform-independent open-source framework that provides three main TinyML services, namely shallow ML, self-supervised ML, and binary deep learning on constrained devices. In this context, this work includes the following steps, which are reflected in the thesis structure. First, we present the performance analysis of state-of-the-art shallow ML algorithms including dense neural networks, implemented on mainstream microcontrollers. The comprehensive analysis in terms of algorithms, hardware platforms, datasets, preprocessing techniques, and configurations shows similar performance results compared to a desktop machine and highlights the impact of these factors on overall performance. Second, despite the assumption that TinyML only permits models inference provided by the scarcity of resources, we have gone a step further and enabled self-supervised on-device training on microcontrollers and tiny IoT devices by developing the Autonomous Edge Pipeline (AEP) system. AEP achieves comparable accuracy compared to the typical TinyML paradigm, i.e., models trained on resource-abundant devices and then deployed on microcontrollers. Next, we present the development of a memory allocation strategy for convolutional neural networks (CNNs) layers, that optimizes memory requirements. This approach reduces the memory footprint without affecting accuracy nor latency. Moreover, e-skin systems share the main requirements of the TinyML fields: enabling intelligence with low memory, low power consumption, and low latency. Therefore, we designed an efficient Tiny CNN architecture for e-skin applications. The architecture leverages the memory allocation strategy presented earlier and provides better performance than existing solutions. A major contribution of the thesis is given by CBin-NN, a library of functions for implementing extremely efficient binary neural networks on constrained devices. The library outperforms state of the art NN deployment solutions by drastically reducing memory footprint and inference latency. All the solutions proposed in this thesis have been implemented on representative devices and tested in relevant applications, of which results are reported and discussed. The ELM framework is open source, and this work is clearly becoming a useful, versatile toolkit for the IoT and TinyML research and development community
Beam scanning by liquid-crystal biasing in a modified SIW structure
A fixed-frequency beam-scanning 1D antenna based on Liquid Crystals (LCs) is designed for application in 2D scanning with lateral alignment. The 2D array environment imposes full decoupling of adjacent 1D antennas, which often conflicts with the LC requirement of DC biasing: the proposed design accommodates both. The LC medium is placed inside a Substrate Integrated Waveguide (SIW) modified to work as a Groove Gap Waveguide, with radiating slots etched on the upper broad wall, that radiates as a Leaky-Wave Antenna (LWA). This allows effective application of the DC bias voltage needed for tuning the LCs. At the same time, the RF field remains laterally confined, enabling the possibility to lay several antennas in parallel and achieve 2D beam scanning. The design is validated by simulation employing the actual properties of a commercial LC medium
- …