
    Deep generative models for network data synthesis and monitoring

    Measurement and monitoring are fundamental tasks in all networks, enabling the downstream management and optimization of the network. Although networks inherently produce abundant monitoring data, accessing and measuring it effectively is another story. The challenges are manifold. First, network monitoring data is inaccessible to external users, and it is hard to provide a high-fidelity dataset without leaking commercially sensitive information. Second, effective data collection across a large-scale network system can be very expensive given the growing size of networks, e.g., the number of cells in a radio network or the number of flows in an Internet Service Provider (ISP) network. Third, it is difficult to ensure fidelity and efficiency simultaneously in network monitoring, because the resources available in a network element to support the measurement function are too limited to implement sophisticated mechanisms. Finally, understanding and explaining the behavior of the network becomes challenging due to its size and complex structure. Various emerging optimization-based solutions (e.g., compressive sensing) and data-driven solutions (e.g., deep learning) have been proposed for these challenges. However, the fidelity and efficiency of existing methods cannot yet meet current network requirements. The contributions made in this thesis significantly advance the state of the art in network measurement and monitoring techniques. Overall, we leverage a cutting-edge machine learning technology, deep generative modeling, throughout the thesis. First, we design and realize APPSHOT, an efficient city-scale network traffic sharing system based on a conditional generative model, which requires only open-source contextual data during inference (e.g., land use information and population distribution). Second, we develop GENDT, an efficient drive testing system based on generative models, which combines graph neural networks, conditional generation, and quantified model uncertainty to enhance the efficiency of mobile drive testing. Third, we design and implement DISTILGAN, a high-fidelity, efficient, versatile, and real-time network telemetry system built on latent GANs and spectral-temporal networks. Finally, we propose SPOTLIGHT, an accurate, explainable, and efficient anomaly detection system for the Open RAN (Radio Access Network) system. The lessons learned through this research are summarized, and interesting topics for future work in this domain are discussed. All proposed solutions have been evaluated with real-world datasets and applied to support different applications in real systems.

    A reinforcement learning recommender system using bi-clustering and Markov Decision Process

    Collaborative filtering (CF) recommender systems are static in nature and do not adapt well to changing user preferences. User preferences may change after interaction with a system or after buying a product. Conventional CF clustering algorithms only identify the distribution of patterns and hidden correlations globally. The inability of these algorithms to discover local patterns led to the popularization of bi-clustering algorithms. Bi-clustering algorithms can analyze all dataset dimensions simultaneously and consequently discover local patterns that deliver a better understanding of the underlying hidden correlations. In this paper, we model the recommendation problem as a sequential decision-making problem using a Markov Decision Process (MDP). To build the state representation for the MDP, we first convert the user-item voting matrix to a binary matrix. We then perform bi-clustering on this binary matrix to determine subsets of similar rows and columns. A bi-cluster merging algorithm is designed to merge similar and overlapping bi-clusters. These bi-clusters are then mapped to a squared grid (SG). Reinforcement learning (RL) is applied on this SG to determine the best policy for giving recommendations to users. The start state is determined using the Improved Triangle Similarity (ITR) similarity measure. The reward function is computed as the grid state overlap, in terms of users and items, between the current and the prospective next state. A thorough comparative analysis was conducted against RL-based, pure CF, and clustering methods. The results demonstrate that the proposed method outperforms its competitors in terms of precision, recall, and optimal policy learning.
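
    The binarization and overlap-based reward described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the rating threshold, the use of Jaccard overlap, the equal weighting of users and items, and all function names are hypothetical and are not taken from the paper.

```python
from typing import FrozenSet, List, Tuple

# A bi-cluster is modelled here as (set of user ids, set of item ids).
Bicluster = Tuple[FrozenSet[int], FrozenSet[int]]


def binarize(ratings: List[List[float]], threshold: float = 3.0) -> List[List[int]]:
    """Map each rating to 1 if it reaches the threshold, else 0.

    The threshold value is an assumption made for this sketch.
    """
    return [[1 if r >= threshold else 0 for r in row] for row in ratings]


def jaccard(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Jaccard overlap of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overlap_reward(current: Bicluster, nxt: Bicluster) -> float:
    """Average the user overlap and item overlap of two grid states.

    Equal weighting of users and items is an assumption of this sketch.
    """
    return 0.5 * (jaccard(current[0], nxt[0]) + jaccard(current[1], nxt[1]))


binary = binarize([[5, 1], [3, 2]])
state_a: Bicluster = (frozenset({0, 1}), frozenset({0}))
state_b: Bicluster = (frozenset({1, 2}), frozenset({0}))
reward = overlap_reward(state_a, state_b)
```

    Under these assumptions, moving between grid cells that share many users and items yields a reward close to 1, while a move to a disjoint bi-cluster yields 0.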

    Probabilistic design of support structures for offshore wind turbines by means of non-Gaussian spectral analysis

    Offshore wind energy is of special importance for meeting the ambitious goals of climate-neutral energy production. Therefore, an accelerated installation of offshore wind turbines is required. The design has to comply with standards and guidelines. Probabilistic design methods in particular allow an accurate and economic structural design. Not only do the environmental conditions vary during the lifetime, but the short-term loads are also subject to random scatter. For the design of offshore wind turbines, the required load simulations are usually carried out in the time domain. In comparison, it is less time-consuming to obtain loads by means of frequency-domain analysis. This is very beneficial for probabilistic design, which requires significantly more load simulations in the time domain. However, non-linearities and time-variant behaviour of offshore wind turbines cannot be represented well in frequency-domain load simulation. Extreme loads and fatigue loads can be calculated by means of frequency-domain analysis. The determination of the distribution functions of extreme values is well established on a theoretical basis. For fatigue design, different empirical models exist which describe the distribution function of fatigue loads on the basis of frequency-domain analysis. In this thesis, a new model is introduced which leads to more accurate results. Since frequency-domain analysis is not always suitable, signals given in the frequency domain have to be transformed in order to generate corresponding random time series. For the design of offshore wind turbines, standards give only limited recommendations on how to carry out this transformation. Detailed analysis shows that accurate results with respect to wave-induced loads are also obtained for a coarser discretisation of spectra. The resulting loads and their statistical properties are still accurate, while the numerical effort can be reduced in comparison to the stated recommendations. On the basis of the theoretical findings, time series from load simulations of offshore wind turbines are analysed regarding their spectral properties. Investigations are carried out to evaluate the agreement between extreme loads and fatigue loads that are either simulated or calculated on the basis of the spectral properties. It is also shown that currents within sea states lead to increased fatigue loads.

    A review of differentiable digital signal processing for music and speech synthesis

    The term “differentiable digital signal processing” describes a family of techniques in which loss function gradients are backpropagated through digital signal processors, facilitating their integration into neural networks. This article surveys the literature on differentiable audio signal processing, focusing on its use in music and speech synthesis. We catalogue applications to tasks including music performance rendering, sound matching, and voice transformation, discussing the motivations for and implications of the use of this methodology. This is accompanied by an overview of digital signal processing operations that have been implemented differentiably, which is further supported by a web book containing practical advice on differentiable synthesiser programming (https://intro2ddsp.github.io/). Finally, we highlight open challenges, including optimisation pathologies, robustness to real-world conditions, and design trade-offs, and discuss directions for future research.

    Strategy Tripod Perspective on the Determinants of Airline Efficiency in A Global Context: An Application of DEA and Tobit Analysis

    The airline industry is vital to contemporary civilization since it is a key player in the globalization process: linking regions, fostering global commerce, promoting tourism and aiding economic and social progress. However, there has been little study of the link between the operational environment and airline efficiency. Investigating the amalgamation of institutions, organisations and strategic decisions is critical to understanding how airlines operate efficiently. This research employs the strategy tripod perspective to investigate the efficiency of a global airline sample using a non-parametric linear programming method (data envelopment analysis [DEA]). The bootstrapped DEA efficiency change scores are then regressed using a Tobit regression to determine the drivers of efficiency. The strategy tripod is employed to assess the impact of institutions, industry and resources on airline efficiency. Institutions are measured by global indices of destination attractiveness; industry by competition, jet fuel and business model; and resources by the number of full-time employees, alliances, ownership and connectivity. The first part of the study uses panel data from 35 major airlines, collected from their annual reports for the period 2011 to 2018, and country attractiveness indices from global indicators. The second part of the research takes a qualitative approach, using semi-structured interviews with experts in the field to evaluate the impact of COVID-19 on the significant findings of the first part. The main findings reveal that airlines operate at a highly competitive level regardless of their competition intensity or origin. Furthermore, the unpredictability of the environment complicates airline operations. The efficiency drivers of an airline are partially determined by its type of business model, its degree of cooperation and how fuel cost is managed. Trade openness has a negative influence on airline efficiency. COVID-19 has toppled the airline industry, forcing airlines to reconsider their business models and continuously increase cooperation. Human resources, sustainability and alternative fuel sources are critical to airline survival. Finally, this study provides some evidence for the practicality of the strategy tripod and hints at the need for a broader approach in the study of international strategies.

    UMSL Bulletin 2023-2024

    The 2023-2024 Bulletin and Course Catalog for the University of Missouri St. Louis.

    Analysis and Design of Non-Orthogonal Multiple Access (NOMA) Techniques for Next Generation Wireless Communication Systems

    The current surge in wireless connectivity, anticipated to grow significantly in future wireless technologies, brings a new wave of users. Given the impracticality of endlessly expanding bandwidth, there is a pressing need for communication techniques that efficiently serve this burgeoning user base with limited resources. Multiple Access (MA) techniques, notably Orthogonal Multiple Access (OMA), have long addressed bandwidth constraints. However, with escalating user numbers, OMA’s orthogonality becomes limiting for emerging wireless technologies. Non-Orthogonal Multiple Access (NOMA), employing superposition coding, serves more users than OMA within the same bandwidth by allocating different power levels to users, whose signals can then be detected using the gap between those power levels, thus offering superior spectral efficiency and massive connectivity. This thesis examines the integration of NOMA techniques with cooperative relaying, EXtrinsic Information Transfer (EXIT) chart analysis, and deep learning for enhancing 6G and beyond communication systems. The adopted methodology aims to optimize the systems’ performance, spanning from bit-error rate (BER) versus signal-to-noise ratio (SNR) to overall system efficiency and data rates. In the cooperative relaying context, NOMA notably improved diversity gains, demonstrating the superiority of combining NOMA with cooperative relaying over NOMA alone. With EXIT chart analysis, NOMA achieved low BER at mid-range SNR and optimal user fairness in the power allocation stage. Additionally, in the deep learning scenario, a trained neural network enhanced signal detection for NOMA, yielding a simpler detector that addresses NOMA’s complex receiver problem.
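
    As a rough illustration of the power-domain idea in this abstract, the sketch below superimposes two BPSK symbols with different power shares and recovers them with successive interference cancellation (SIC), the detector commonly paired with power-domain NOMA. The abstract does not name the detector, so SIC, the 0.8/0.2 power split, the noise-free unit-gain channel, and all function names are assumptions of this sketch.

```python
import math
from typing import Tuple


def superpose(s_far: float, s_near: float, p_far: float = 0.8) -> float:
    """Superposition coding: weight each BPSK symbol (+1/-1) by sqrt(power).

    The far (weaker-channel) user gets the larger share p_far; the near
    user gets the remaining 1 - p_far. The split is an assumed example.
    """
    return math.sqrt(p_far) * s_far + math.sqrt(1.0 - p_far) * s_near


def bpsk_decide(y: float) -> float:
    """Hard decision for a BPSK symbol."""
    return 1.0 if y >= 0.0 else -1.0


def sic_receive(y: float, p_far: float = 0.8) -> Tuple[float, float]:
    """Near-user receiver with successive interference cancellation.

    1. Detect the high-power (far-user) symbol, treating the rest as noise.
    2. Subtract its reconstructed contribution.
    3. Detect the low-power (near-user) symbol from the residual.
    """
    far_hat = bpsk_decide(y)
    residual = y - math.sqrt(p_far) * far_hat
    near_hat = bpsk_decide(residual)
    return far_hat, near_hat


# Noise-free, unit-gain channel (an idealisation for this sketch).
results = {
    (s_far, s_near): sic_receive(superpose(s_far, s_near))
    for s_far in (1.0, -1.0)
    for s_near in (1.0, -1.0)
}
```

    The gap between the two power levels is what lets the first decision succeed: with a split close to 0.5/0.5 the superimposed symbols would no longer be separable this way.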

    Classical and quantum algorithms for scaling problems

    This thesis is concerned with scaling problems, which have a plethora of connections to different areas of mathematics, physics and computer science. Although many structural aspects of these problems are understood by now, we only know how to solve them efficiently in special cases. We give new algorithms for non-commutative scaling problems with complexity guarantees that match the prior state of the art. To this end, we extend the well-known (self-concordance based) interior-point method (IPM) framework to Riemannian manifolds, motivated by its success in the commutative setting. Moreover, the IPM framework does not obviously suffer from the same obstructions to efficiency as previous methods. It also yields the first high-precision algorithms for other natural geometric problems in non-positive curvature. For the (commutative) problems of matrix scaling and balancing, we show that quantum algorithms can outperform the (already very efficient) state-of-the-art classical algorithms. Their time complexity can be sublinear in the input size; in certain parameter regimes they are also optimal, whereas in others we show no quantum speedup over the classical methods is possible. Along the way, we provide improvements over the long-standing state of the art for searching for all marked elements in a list, and computing the sum of a list of numbers. We identify a new application in the context of tensor networks for quantum many-body physics. We define a computable canonical form for uniform projected entangled pair states (as the solution to a scaling problem), circumventing previously known undecidability results. We also show, by characterizing the invariant polynomials, that the canonical form is determined by evaluating the tensor network contractions on networks of bounded size.
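
    For readers unfamiliar with the (commutative) matrix scaling problem mentioned in this abstract, the sketch below shows the textbook alternating-normalisation approach, often called Sinkhorn iteration: find positive row and column multipliers so that the rescaled matrix has unit row and column sums. It is included purely as background and is not one of the thesis's algorithms; the function name and iteration count are assumptions.

```python
from typing import List


def sinkhorn_scale(a: List[List[float]], iters: int = 200) -> List[List[float]]:
    """Scale a strictly positive matrix toward unit row and column sums.

    Alternately rescales rows and columns; returns the matrix with
    entries x[i] * a[i][j] * y[j]. Assumes every entry of `a` is > 0.
    """
    n, m = len(a), len(a[0])
    x = [1.0] * n  # row multipliers
    y = [1.0] * m  # column multipliers
    for _ in range(iters):
        # Choose row multipliers so each row of the scaled matrix sums to 1.
        for i in range(n):
            x[i] = 1.0 / sum(a[i][j] * y[j] for j in range(m))
        # Choose column multipliers so each column sums to 1.
        for j in range(m):
            y[j] = 1.0 / sum(x[i] * a[i][j] for i in range(n))
    return [[x[i] * a[i][j] * y[j] for j in range(m)] for i in range(n)]


scaled = sinkhorn_scale([[1.0, 2.0], [3.0, 4.0]])
row_sums = [sum(row) for row in scaled]
col_sums = [sum(row[j] for row in scaled) for j in range(2)]
```

    For a square, strictly positive input the row and column sums both approach 1, so the result is approximately doubly stochastic.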

    Computational and experimental studies on the reaction mechanism of bio-oil components with additives for increased stability and fuel quality

    As one of the world’s largest palm oil producers, Malaysia faces a major disposal problem because vast amounts of oil palm biomass waste are produced. To overcome this problem, the biomass waste can be liquefied into biofuel with fast pyrolysis technology. However, fast pyrolysis bio-oil requires further upgrading via direct solvent addition to overcome its undesirable attributes. In addition, the high production cost of biofuels often hinders their commercialisation. Thus, the designed solvent-oil blend needs to achieve both fuel functionality and economic targets to be competitive with conventional diesel fuel. In this thesis, a multi-stage computer-aided molecular design (CAMD) framework was employed for bio-oil solvent design. In the design problem, molecular signature descriptors were applied to accommodate different classes of property prediction models. However, the complexity of the CAMD problem increases with the height of the signature due to the combinatorial nature of higher-order signatures. Thus, a consistency rule was developed to reduce the size of the CAMD problem. The CAMD problem was then extended to address the economic aspects via a fuzzy multi-objective optimisation approach. Next, a rough-set based machine learning (RSML) model was proposed to correlate the feedstock characterisation and pyrolysis conditions with the pyrolysis bio-oil properties by generating decision rules. The generated decision rules were analysed from a scientific standpoint to identify the underlying patterns, while ensuring the rules were logical. The decision rules can be used to select the optimal feedstock composition and pyrolysis conditions to produce pyrolysis bio-oil with targeted fuel properties. Next, the results obtained from the computational approaches were verified through an experimental study. The generated pyrolysis bio-oils were blended with the identified solvents at various mixing ratios. In addition, emulsification of the solvent-oil blend in diesel was conducted with the help of surfactants. Lastly, potential extensions and prospective work for this study are discussed in the later part of this thesis. To conclude, this thesis presents a combination of computational and experimental approaches for upgrading the fuel properties of pyrolysis bio-oil. As a result, high-quality biofuel can be generated as a cleaner-burning replacement for conventional diesel fuel.

    Specificity Determining Features at the Interface of Biomolecular Complexes as Regulators of Biological Functions

    Amino acid residues at the biomolecular interface play essential roles in many biological and cellular processes; relevant to this thesis, protein-protein interactions regulate signaling pathways and enzymatic activity, whereas protein-DNA interactions control gene expression, and protein-peptide interactions are central to the immune system. Biomolecular recognition and binding stability are largely determined by residues at the molecular interface. In this thesis, we focused on three biological datasets that are related to humans and human health: 1) dysregulated citrullination in the inflamed joints of rheumatoid arthritis patients, 2) a novel family of PRD-like transcription factors critical to the first few cell divisions in human life, and 3) epitopes that likely activate a cytotoxic T cell-mediated immune response against SARS-CoV-2 infection. For each dataset, in order to study the structural and functional consequences of molecular interactions, we applied a wide range of bioinformatics techniques to analyze sequences, structures and biological data retrieved from various databases, and also took into account experimental results from collaborators and from the literature. In rheumatoid arthritis, normally cytoplasmic peptidylarginine deiminase (PAD) enzymes citrullinate arginine residues in extracellular matrix (ECM) proteins. To examine specificity determining features that regulate the citrullination activity, we analyzed the sequence and structure data of the ECM proteins that were found citrullinated in chronically inflamed human joints. For citrullination, we found that an arginine side chain needs to be exposed to solvent but can arise from β-strands, α-helices, loops and β-turns. Moreover, there is no sequence motif linked to enzymatic activity. In addition, we studied the effect of citrullination on proteins important for a normal ECM, focusing on integrin binding to fibronectin and transforming growth factor-β (TGF-β). Citrullination of these proteins was found to inhibit cell attachment and spreading, since PAD treatment of the isoDGR motif in fibronectin and the RGD motif in TGF-β significantly reduced their binding to integrins αVβ3 and αVβ6, respectively. The expression of the human paired (PRD)-like transcription factors (TFs) is limited to the period of embryonic genome activation up to the 8-cell stage. We identified that one of these PRD-like TFs, LEUTX, binds to a TAATCC sequence motif. Sequence comparisons revealed that the LEUTX protein comprises two domains: the DNA-binding homeodomain and a Leutx domain containing a transactivation domain. We identified specificity determining residues in the LEUTX homeodomain that are important for recognition of the TAATCC-containing 36 bp DNA motif enriched in genes involved in embryonic genome activation. Using molecular models, we demonstrated why a heterozygotic missense mutation, A54V, at the DNA-specificity determining position of LEUTX significantly reduces overall transcriptional activity, as well as why the double-mutant (I47T and A54V) form of LEUTX restores binding to the DNA motif similarly to that seen with the I47T mutation alone. At the onset of the COVID-19 pandemic, we sought to understand the molecular factors that trigger the cytotoxic T cell-mediated immune response against the SARS-CoV-2 virus, taking advantage of binding data and 3D structures for related viruses and other pathogenic organisms. We first predicted the MHC class I (MHC-I)-specific immunogenic epitopes of length 8 to 11 amino acids from the SARS-CoV-2 proteins. Next, we predicted that the 9-mer epitopes would have the highest potential to elicit a strong immune response. For experimental validation, the predicted 9-mer epitopes were matched with the SARS-CoV-derived epitopes that are known to elicit an effective T cell response in vitro.
Furthermore, our observations provide a structural explanation for the binding of SARS-CoV-2 epitopes to MHC-I molecules, identifying conserved immunogenic epitopes essential for understanding the pathogenesis of COVID-19. The three datasets were investigated in concert with collaborative experimental studies and/or with consideration of publicly available experimental data. The experimental studies generally provided the starting point for the in silico studies, which in turn had the objective of providing a detailed explanation of the experimental results. Furthermore, the in silico results could be used to devise novel and focused experiments, suggesting that bioinformatics predictions and wet-laboratory experimental investigations optimally take place in tandem, with multiple advantages. Overall, this thesis demonstrates the synergy that is possible by applying this interdisciplinary approach to understanding the consequences of molecular interactions.
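
    The first step of the epitope analysis in this abstract, collecting candidate peptides of length 8 to 11 from a protein sequence, can be sketched as a simple sliding window. This is only an illustration: the function name is hypothetical, the toy sequence is made up and is not a SARS-CoV-2 protein, and the MHC-I binding and immunogenicity prediction that the thesis applies afterwards is not reproduced here.

```python
from typing import Iterator, List, Tuple


def candidate_peptides(
    protein: str, min_len: int = 8, max_len: int = 11
) -> Iterator[Tuple[int, str]]:
    """Yield (start index, peptide) for every window of length min_len..max_len."""
    for k in range(min_len, max_len + 1):
        for start in range(len(protein) - k + 1):
            yield start, protein[start:start + k]


# Made-up 12-residue toy sequence, used only to exercise the function.
toy_sequence = "ACDEFGHIKLMN"
peptides: List[Tuple[int, str]] = list(candidate_peptides(toy_sequence))
nine_mers = [p for _, p in peptides if len(p) == 9]
```

    Each candidate would then be passed to a binding predictor of the reader's choice; the abstract reports that 9-mers were predicted to have the highest potential to elicit a strong response.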