9,801 research outputs found
UMSL Bulletin 2023-2024
The 2023-2024 Bulletin and Course Catalog for the University of Missouri St. Louis.https://irl.umsl.edu/bulletin/1088/thumbnail.jp
Analysis and Design of Non-Orthogonal Multiple Access (NOMA) Techniques for Next Generation Wireless Communication Systems
The current surge in wireless connectivity, anticipated to amplify significantly in future wireless technologies, brings a new wave of users. Given the impracticality of an endlessly expanding bandwidth, thereâs a pressing need for communication techniques that efficiently serve this burgeoning user base with limited resources. Multiple Access (MA) techniques, notably Orthogonal Multiple Access (OMA), have long addressed bandwidth constraints. However, with escalating user numbers, OMAâs orthogonality becomes limiting for emerging wireless technologies. Non-Orthogonal Multiple Access (NOMA), employing superposition coding, serves more users within the same bandwidth as OMA by allocating different power levels to users whose signals can then be detected using the gap between them, thus offering superior spectral efficiency and massive connectivity. This thesis examines the integration of NOMA techniques with cooperative relaying, EXtrinsic Information Transfer (EXIT) chart analysis, and deep learning for enhancing 6G and beyond communication systems. The adopted methodology aims to optimize the systemsâ performance, spanning from bit-error rate (BER) versus signal to noise ratio (SNR) to overall system efficiency and data rates. The primary focus of this thesis is the investigation of the integration of NOMA with cooperative relaying, EXIT chart analysis, and deep learning techniques. In the cooperative relaying context, NOMA notably improved diversity gains, thereby proving the superiority of combining NOMA with cooperative relaying over just NOMA. With EXIT chart analysis, NOMA achieved low BER at mid-range SNR as well as achieved optimal user fairness in the power allocation stage. Additionally, employing a trained neural network enhanced signal detection for NOMA in the deep learning scenario, thereby producing a simpler signal detection for NOMA which addresses NOMAsâ complex receiver problem
Discontinuous grammar as a foreign language
[Abstract] In order to achieve deep natural language understanding, syntactic constituent parsing is a vital step, highly demanded by many artificial intelligence systems to process both text and speech. One of the most recent proposals is the use of standard sequence-to-sequence models to perform constituent parsing as a machine translation task, instead of applying task-specific parsers. While they show a competitive performance, these text-to-parse transducers are still lagging behind classic techniques in terms of accuracy,
coverage and speed. To close the gap, we here extend the framework of sequence-to-sequence models for constituent parsing, not only by providing a more powerful neural architecture for improving their performance, but also by enlarging their coverage to handle the most complex syntactic phenomena: discontinuous structures. To that end, we design several novel linearizations that can fully produce discontinuities and, for the first time, we test a sequence-to-sequence model on the main discontinuous benchmarks, obtaining competitive results on par with task-specific discontinuous constituent parsers and achieving state-of-the-art scores on the (discontinuous) English Penn Treebank.Xunta de Galicia; ED431G 2019/01Xunta de Galicia; ED431C 2020/11We acknowledge the European Research Council (ERC), which has funded this research under the European Unionâs Horizon 2020 research and innovation programme (FASTPARSE, grant agreement No 714150) and the Horizon Europe research and innovation programme (SALSA, grant agreement No 101100615), ERDF/ MICINN-AEI (SCANNER-UDC, PID2020-113230RB-C21), Xunta de Galicia (ED431C 2020/11), and Centro de InvestigaciĂłn de Galicia ââCITICâ, funded by Xunta de Galicia and the European Union (ERDF - Galicia 2014â2020 Program), by grant ED431G 2019/01. Funding for open access charge: Universidade da Coruña/CISUG
Protein-DNA binding sites prediction based on pre-trained protein language model and contrastive learning
Protein-DNA interaction is critical for life activities such as replication,
transcription, and splicing. Identifying protein-DNA binding residues is
essential for modeling their interaction and downstream studies. However,
developing accurate and efficient computational methods for this task remains
challenging. Improvements in this area have the potential to drive novel
applications in biotechnology and drug design. In this study, we propose a
novel approach called CLAPE, which combines a pre-trained protein language
model and the contrastive learning method to predict DNA binding residues. We
trained the CLAPE-DB model on the protein-DNA binding sites dataset and
evaluated the model performance and generalization ability through various
experiments. The results showed that the AUC values of the CLAPE-DB model on
the two benchmark datasets reached 0.871 and 0.881, respectively, indicating
superior performance compared to other existing models. CLAPE-DB showed better
generalization ability and was specific to DNA-binding sites. In addition, we
trained CLAPE on different protein-ligand binding sites datasets, demonstrating
that CLAPE is a general framework for binding sites prediction. To facilitate
the scientific community, the benchmark datasets and codes are freely available
at https://github.com/YAndrewL/clape
Machine learning in solar physics
The application of machine learning in solar physics has the potential to
greatly enhance our understanding of the complex processes that take place in
the atmosphere of the Sun. By using techniques such as deep learning, we are
now in the position to analyze large amounts of data from solar observations
and identify patterns and trends that may not have been apparent using
traditional methods. This can help us improve our understanding of explosive
events like solar flares, which can have a strong effect on the Earth
environment. Predicting hazardous events on Earth becomes crucial for our
technological society. Machine learning can also improve our understanding of
the inner workings of the sun itself by allowing us to go deeper into the data
and to propose more complex models to explain them. Additionally, the use of
machine learning can help to automate the analysis of solar data, reducing the
need for manual labor and increasing the efficiency of research in this field.Comment: 100 pages, 13 figures, 286 references, accepted for publication as a
Living Review in Solar Physics (LRSP
Modular lifelong machine learning
Deep learning has drastically improved the state-of-the-art in many important fields, including computer vision and natural language processing (LeCun et al., 2015). However, it is expensive to train a deep neural network on a machine learning problem. The overall training cost further increases when one wants to solve additional problems. Lifelong machine learning (LML) develops algorithms that aim to efficiently learn to solve a sequence of problems, which become available one at a time. New problems are solved with less resources by transferring previously learned knowledge. At the same time, an LML algorithm needs to retain good performance on all encountered problems, thus avoiding catastrophic forgetting. Current approaches do not possess all the desired properties of an LML algorithm. First, they primarily focus on preventing catastrophic forgetting (Diaz-Rodriguez et al., 2018; Delange et al., 2021). As a result, they neglect some knowledge transfer properties. Furthermore, they assume that all problems in a sequence share the same input space. Finally, scaling these methods to a large sequence of problems remains a challenge.
Modular approaches to deep learning decompose a deep neural network into sub-networks, referred to as modules. Each module can then be trained to perform an atomic transformation, specialised in processing a distinct subset of inputs. This modular approach to storing knowledge makes it easy to only reuse the subset of modules which are useful for the task at hand.
This thesis introduces a line of research which demonstrates the merits of a modular approach to lifelong machine learning, and its ability to address the aforementioned shortcomings of other methods. Compared to previous work, we show that a modular approach can be used to achieve more LML properties than previously demonstrated. Furthermore, we develop tools which allow modular LML algorithms to scale in order to retain said properties on longer sequences of problems.
First, we introduce HOUDINI, a neurosymbolic framework for modular LML. HOUDINI represents modular deep neural networks as functional programs and accumulates a library of pre-trained modules over a sequence of problems. Given a new problem, we use program synthesis to select a suitable neural architecture, as well as a high-performing combination of pre-trained and new modules. We show that our approach has most of the properties desired from an LML algorithm. Notably, it can perform forward transfer, avoid negative transfer and prevent catastrophic forgetting, even across problems with disparate input domains and problems which require different neural architectures.
Second, we produce a modular LML algorithm which retains the properties of HOUDINI but can also scale to longer sequences of problems. To this end, we fix the choice of a neural architecture and introduce a probabilistic search framework, PICLE, for searching through different module combinations. To apply PICLE, we introduce two probabilistic models over neural modules which allows us to efficiently identify promising module combinations.
Third, we phrase the search over module combinations in modular LML as black-box optimisation, which allows one to make use of methods from the setting of hyperparameter optimisation (HPO). We then develop a new HPO method which marries a multi-fidelity approach with model-based optimisation. We demonstrate that this leads to improvement in anytime performance in the HPO setting and discuss how this can in turn be used to augment modular LML methods.
Overall, this thesis identifies a number of important LML properties, which have not all been attained in past methods, and presents an LML algorithm which can achieve all of them, apart from backward transfer
Continual Learning, Fast and Slow
According to the Complementary Learning Systems (CLS)
theory~\cite{mcclelland1995there} in neuroscience, humans do effective
\emph{continual learning} through two complementary systems: a fast learning
system centered on the hippocampus for rapid learning of the specifics,
individual experiences; and a slow learning system located in the neocortex for
the gradual acquisition of structured knowledge about the environment.
Motivated by this theory, we propose \emph{DualNets} (for Dual Networks), a
general continual learning framework comprising a fast learning system for
supervised learning of pattern-separated representation from specific tasks and
a slow learning system for representation learning of task-agnostic general
representation via Self-Supervised Learning (SSL). DualNets can seamlessly
incorporate both representation types into a holistic framework to facilitate
better continual learning in deep neural networks. Via extensive experiments,
we demonstrate the promising results of DualNets on a wide range of continual
learning protocols, ranging from the standard offline, task-aware setting to
the challenging online, task-free scenario. Notably, on the
CTrL~\cite{veniat2020efficient} benchmark that has unrelated tasks with vastly
different visual images, DualNets can achieve competitive performance with
existing state-of-the-art dynamic architecture
strategies~\cite{ostapenko2021continual}. Furthermore, we conduct comprehensive
ablation studies to validate DualNets efficacy, robustness, and scalability.
Code will be made available at \url{https://github.com/phquang/DualNet}.Comment: arXiv admin note: substantial text overlap with arXiv:2110.0017
Fault diagnosis in aircraft fuel system components with machine learning algorithms
There is a high demand and interest in considering the social and environmental effects of the componentâs lifespan. Aircraft are one of the most high-priced
businesses that require the highest reliability and safety constraints. The complexity of aircraft systems designs also has advanced rapidly in the last decade. Consequently, fault detection, diagnosis and modification/ repair procedures are becoming more challenging. The presence of a fault within an aircraft system can result in changes to system performances and cause operational downtime or accidents in a worst-case scenario.
The CBM method that predicts the state of the equipment based on data collected is widely used in aircraft MROs. CBM uses diagnostics and prognostics models
to make decisions on appropriate maintenance actions based on the Remaining Useful Life (RUL) of the components.
The aircraft fuel system is a crucial system of aircraft, even a minor failure in the fuel system can affect the aircraft's safety greatly. A failure in the fuel system that
impacts the ability to deliver fuel to the engine will have an immediate effect on system performance and safety. There are very few diagnostic systems that
monitor the health of the fuel system and even fewer that can contain detected faults. The fuel system is crucial for the operation of the aircraft, in case of failure,
the fuel in the aircraft will become unusable/unavailable to reach the destination.
It is necessary to develop fault detection of the aircraft fuel system. The future aircraft fuel system must have the function of fault detection. Through the information of sensors and Machine Learning Techniques, the aircraft fuel systemâs fault type can be detected in a timely manner.
This thesis discusses the application of a Data-driven technique to analyse the healthy and faulty data collected using the aircraft fuel system model, which is
similar to Boeing-777. The data is collected is processed through Machine learning Techniques and the results are comparedPhD in Manufacturin
Reinforcement learning in large state action spaces
Reinforcement learning (RL) is a promising framework for training intelligent agents which learn to optimize long term utility by directly interacting with the environment. Creating RL methods which scale to large state-action spaces is a critical problem towards ensuring real world deployment of RL systems. However, several challenges limit the applicability of RL to large scale settings. These include difficulties with exploration, low sample efficiency, computational intractability, task constraints like decentralization and lack of guarantees about important properties like performance, generalization and robustness in potentially unseen scenarios.
This thesis is motivated towards bridging the aforementioned gap. We propose several principled algorithms and frameworks for studying and addressing the above challenges RL. The proposed methods cover a wide range of RL settings (single and multi-agent systems (MAS) with all the variations in the latter, prediction and control, model-based and model-free methods, value-based and policy-based methods). In this work we propose the first results on several different problems: e.g. tensorization of the Bellman equation which allows exponential sample efficiency gains (Chapter 4), provable suboptimality arising from structural constraints in MAS(Chapter 3), combinatorial generalization results in cooperative MAS(Chapter 5), generalization results on observation shifts(Chapter 7), learning deterministic policies in a probabilistic RL framework(Chapter 6). Our algorithms exhibit provably enhanced performance and sample efficiency along with better scalability. Additionally, we also shed light on generalization aspects of the agents under different frameworks. These properties have been been driven by the use of several advanced tools (e.g. statistical machine learning, state abstraction, variational inference, tensor theory).
In summary, the contributions in this thesis significantly advance progress towards making RL agents ready for large scale, real world applications
- âŠ